A decade ago, supercomputers cost about US$1 million per Gflop/s of
performance. By using standard PC parts, "Beowulf" cluster
supercomputers dramatically reduce the cost, but as processors and
other components have become faster and cheaper, the network needed to
coordinate them has become relatively expensive. The University of
Kentucky researchers made their first breakthrough in reducing network
cost in May 2000, when KLAT2, Kentucky Linux Athlon Testbed 2, used
standard 100Mb/s Fast Ethernet hardware in the world's first
machine-designed asymmetric cluster network -- and achieved $640 per
Gflop/s, breaking the $1,000 per Gflop/s barrier. Their newest machine,
KASY0, Kentucky Asymmetric Zero, uses a more advanced type of
asymmetric network design to break the $100 per Gflop/s barrier.
A well-known reference for supercomputer performance is the TOP500
list. Performance on the Linpack benchmark it uses depends partly on the
theoretical peak Gflop/s of the processors, but also on the parallel
implementation and efficiency of the network that allows the processors
to work together. In the current (June 2003) list, most systems use
expensive, specialized network hardware. The machines explicitly
listed as using standard 100Mb/s Fast Ethernet achieve an average of
less than 8.5% of peak. The average for the systems listed as using
Gigabit Ethernet is somewhat better, at about 30% of peak. In contrast,
KASY0's 100Mb/s Fast Ethernet network allows it to achieve 187.3
Gflop/s, over 35% of peak, using a double-precision version of the
benchmark (HPL). Using a single-precision version, the $39,454.31 KASY0
obtains over 471.5 Gflop/s, more than 44% of its theoretical peak, at a
cost of less than $84 per Gflop/s.
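The price/performance figures above follow from simple division of the
system cost by the measured benchmark rate. The short Python sketch below
reproduces that arithmetic using only the numbers quoted in this text; the
variable and function names are illustrative and are not part of any KASY0
tool.

```python
# Figures quoted in the text; all names here are illustrative only.
SYSTEM_COST_USD = 39_454.31
HPL_DOUBLE_GFLOPS = 187.3  # double-precision HPL result
HPL_SINGLE_GFLOPS = 471.5  # single-precision result


def dollars_per_gflops(cost_usd: float, gflops: float) -> float:
    """Return the cost in US dollars per Gflop/s of measured performance."""
    return cost_usd / gflops


single = dollars_per_gflops(SYSTEM_COST_USD, HPL_SINGLE_GFLOPS)
double = dollars_per_gflops(SYSTEM_COST_USD, HPL_DOUBLE_GFLOPS)

print(f"single precision: ${single:.2f} per Gflop/s")
print(f"double precision: ${double:.2f} per Gflop/s")
```

With these inputs the single-precision result comes out just under $84 per
Gflop/s, which is the figure behind the sub-$100 claim.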
The remarkable thing about KASY0's price/performance is that, while
network hardware is often the dominant cost for a system of its size
(128 nodes plus 4 spares), less than 11% of the system cost went to the
network hardware. The AMD Athlon XP 2600+ processors were more than 35%
of the total system cost; memory was 21%. Even more significantly, the
network design technology that made this possible can be applied with
similar benefit to cluster supercomputers with thousands of nodes.
KLAT2's network was the world's first Flat Neighborhood Network; the
enhanced version used for KASY0 is the world's first Sparse Flat
Neighborhood Network (SFNN). KASY0 also is the first supercomputer to
have its physical node and switch placement optimized by a computer
program. FNN design technology and tools have been freely available and
used by various other groups; so too will the new SFNN technology be
freely available.
KASY0 is not a toy or a "hack" -- it is a serious demonstration of a
fundamental new advance in network design. The only other supercomputer
we have seen claim close to the price/performance measured for KASY0 is
a $50,000+ system built by the National Center for Supercomputing
Applications (NCSA) using 70 PlayStation2 units. Not only does KASY0
have a vastly superior network and significantly higher peak floating
point performance per node, but KASY0's lower price yields many more
nodes and real application performance, not just high peak numbers.
For example, KASY0 has also set a new world record for rendering a
complex image using the Persistence of Vision Raytracer (POV-Ray).
Executing pvmpovray 3.5 on KASY0 to render the standard benchmark.pov
scene yielded a time of 72 seconds. According to this site, the
previous record was 107 seconds set on August 1, 2003 by a cluster
costing $79,000.
The primary architect of KASY0 is Tim Mattox, a research assistant
who has been developing the Sparse Flat Neighborhood Network concept
for his Ph.D. thesis. As an educational experience available to anyone,
the physical construction of KASY0 was done entirely by volunteers at
the University of Kentucky.
From the creation of the first Linux PC cluster in February 1994 to
the construction of KASY0, Hank Dietz and his students have continued
to improve cluster performance by making compilers, hardware
architecture, and operating system work together more efficiently. At
the University of Kentucky, as Professor of Electrical and Computer
Engineering and James F. Hardymon Chair in Networking, Dietz's goal is
to develop and freely disseminate the new technologies that will allow
scientists and engineers to solve their most important computational
problems.