#1

Forum King
Join Date: Jul 2003
Location: Forums
Posts: 2,685
Supercomputer Firm Cray to Build Red Storm
Red Storm uses 108 cabinets and 10,368 Sledgehammer 2GHz processors, with around 10TB of 333MHz DDR memory.
The monster runs Linux on the service and I/O nodes, the LWK (Catamount) on the compute nodes, and a stripped-down version of Linux on the RAS nodes. The whole beastie takes less than 2MW for total power and cooling. Full Story
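A quick back-of-the-envelope check on those figures (my own arithmetic, not from the article):

```python
# Sanity-check the Red Storm numbers quoted above.
processors = 10_368
cabinets = 108
memory_tb = 10

per_cabinet = processors / cabinets          # processors per cabinet
gb_per_cpu = memory_tb * 1024 / processors   # DDR memory per processor, in GB

print(per_cabinet)            # 96.0
print(round(gb_per_cpu, 2))   # 0.99
```

So each cabinet holds 96 processors, and the 10TB works out to roughly 1 GB of DDR per CPU.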
#2

Major Dude
Join Date: Aug 2003
Location: m/cr, UK
Posts: 1,143
Stop it! Please stop it! My head hurts... I keep hearing airplanes flying over my head... I don't understand...
It's been said that I could start an argument in an empty room... I see no reason to disbelieve this.
#3

Moderator Alumni
Join Date: Jun 2000
Location: the MANCANNON!
Posts: 22,448
Red Storm?
I smell a copyright suit.
#4

Major Dude
Join Date: Sep 2002
Location: ?
Posts: 1,473
I don't understand. What do they do with this kind of power?
Chess? ...?
#5

Forum King
Join Date: Jul 2003
Location: Forums
Posts: 2,685
Nuclear explosion simulations and chemical reaction sims.
#7

Forum King
Join Date: Jul 2002
Location: Norn Ir'nd, leek...
Posts: 6,287
making backups of forums.winamp.com
#8

Member
now I bet that wouldn't run using Windows...
#9

Post Master General
(Forum King) Join Date: Jun 2000
Location: Seattle, Now Las Vegas
Posts: 6,032
Fifteen years from now I'll be able to put something like that in my pocket, 2MW and all.
I'm Back?
#10

Forum King
Join Date: Jul 2002
Location: Norn Ir'nd, leek...
Posts: 6,287
it shall be called "the apple us-pod"
#11

Senior Member
I know a Cray supercomputer is used by the National Weather Service to predict storms, so that's one more application.
#12

Senior Member
Join Date: Jan 2003
Location: Teaching Starfleet how to fire a phaser with a hand rather than a pogo stick
Posts: 324
Quote:
Rip the shackles from your ankles and live your life damnit
"Never be bullied into silence. Never allow yourself to be made a victim. Accept no one's definition of your life; define yourself."
#13

Post Master General
(Forum King) Join Date: Jun 2000
Location: Seattle, Now Las Vegas
Posts: 6,032
If I get rich, I'll buy a Cray to check my email and do my taxes on.
I'm Back?
#14

Rudolf the Red.
(Forum King) Join Date: Nov 2000
Posts: 9,314
Quote:
"We think science is interesting and if you disagree, you can fuck off."
#15

Senior Member
Join Date: Jul 2003
Posts: 209
Quote:
Cray also builds X1 systems, which are based on vector processors. These are far more powerful. In many real-life situations you gain very little from everyday clusters built out of PCs. An architecture like the one used in Red Storm offers some advantages; vector chips offer even more. Future quantum processors would function at the theoretical computational limit (and experimental devices already operate at that limit). For pure serial calculations the theoretical limit is set by a black hole, since the Schwarzschild radius that defines the black hole is the shortest path a signal can travel. For serial calculations, modern computers are far from any such peak performance.

This is best seen in atmospheric simulations, where a single PC is a catastrophe and even big PC clusters do badly; the Earth Simulator uses some 5,000 vector chips (very different from PC processors) and still has to split the atmosphere into blocks of roughly 500 m x 500 m x 500 m. If you wish to increase the effective resolution of your forecast, the number of operations rises steeply, and because each result depends heavily on other calculations, a purely parallel computer or cluster will NOT cope with all that.
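To make the resolution cost concrete, here is a small sketch (my own illustration, not from the post): halving the grid spacing in 3-D gives 8 times as many cells, and the CFL stability condition roughly halves the usable time step as well, so each refinement multiplies the total work by about 16.

```python
def forecast_cost_factor(refinement_levels: int, dims: int = 3) -> int:
    """Relative cost of a grid-based simulation after halving the
    grid spacing `refinement_levels` times.

    Each halving multiplies the cell count by 2**dims, and the CFL
    condition halves the usable time step, adding one more factor
    of 2 per level.
    """
    per_level = 2 ** dims * 2          # 16x per halving in 3-D
    return per_level ** refinement_levels

# Going from 500 m blocks down to 125 m blocks (two halvings)
# already costs 256 times the original work.
print(forecast_cost_factor(2))  # 256
```

This is why "just add more PCs" stops working long before you reach the resolution you actually want.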
#16

Post Master General
(Forum King) Join Date: Jun 2000
Location: Seattle, Now Las Vegas
Posts: 6,032
Ugh, what he said. How'd you get to know so much about computer technology and such?
I'm Back?
#17

Forum King
Join Date: Jul 2003
Location: Forums
Posts: 2,685
Quote:
http://news.com.com/2100-7337_3-5097398.html
Last edited by Starbucks; 12th November 2003 at 02:00.
#18

Senior Member
Join Date: Jul 2003
Posts: 209
Quote:
I'll explain the vector technology a little, so you understand why it is superior to classical multiprocessing.

Suppose we have a complex problem to solve and we wish to let many CPUs make calculations. The first CPU controls the process. It assigns CPU 2 a block of code. It makes sense to let one CPU calculate up to the first branch point and have other CPUs execute code beyond it. In most current programs a branch point occurs every 100-200 asm instructions, which on x86 means (idealized) they are 1,000-5,000 cycles apart. To take advantage of more CPUs, you assign CPU 3 one path out of the branch point and CPU 4 the other, so that before execution actually reaches the branch point, you have already made some calculations beyond it. The first thing to notice is that one of the CPUs (3 or 4) is calculating nonsense, but you can't tell which one, so one CPU's calculations were futile. Some 200 instructions later another branch point will be encountered, so CPU 3 needs another two CPUs (5 and 6) and CPU 4 needs two more (7 and 8). BUT three of these four CPUs (5, 6, 7 and 8) will execute nonsense. You see, with 8 CPUs, at any time 4 of them are executing nonsense that will be discarded. In the limit, only about log(number of CPUs) processors do real work; most of the others just waste resources. Some optimizations have been made here, but it remains an inefficient process.

There is another problem: when CPU 2 reaches the branch point and has the result, it must notify CPU 3 or 4 that one is correct and the other is wrong (this is actually best done by CPU 1, which controls all processes). Then CPUs 5/6 or 7/8 must be reset. Let's analyze this a little: CPU 2 must execute additional code to notify the other CPUs that they are wrong. This is code not initially present in the program; it is usually a few pushed arguments and a call to some subroutine. The subroutine connects to the other CPUs and sends them the proper messages, then returns, which is extremely costly (that's why it makes more sense to have one CPU do all these jobs, so that CPU 2 can keep functioning normally). Now CPU 2 must be reinitialized with different code (CPU 3 or CPU 4 has already executed past the branch point, and it would be costly to transfer that code to CPU 2; better to make CPU 3/4 the current code executor and reinitialize CPU 2 with new code somewhere beyond the next free branch point). The wrong CPU 3 or 4 must also be reinitialized with new code. These are all costly operations. The physical distance between CPUs alone costs some 2-3 cycles; going through L1 cache on x86 adds about 3 more, and through L2 cache some 100 cycles, so transfers between CPUs take at best about 100 cycles, usually longer. So for every gain of 1,000-5,000 cycles you lose some 500-2,000 cycles, for a net gain of about 2,000 cycles per effective processor (and of 8 CPUs, only 3 plus the control CPU function effectively). You see, for most programs a cluster is a very bad choice (not covered here, but if the connections between CPUs in one machine are already too slow, the connections in a cluster must be a catastrophe). A multiprocessor machine is also quite bad.

There is still another problem: the data beyond the branch point must be perfectly independent of earlier, still-unexecuted calculations. Any optimization to enhance the inter-CPU speed only helps so much. One way to partly overcome this is multiple CPUs on one die with some specialized form of shared cache. A more elegant solution: the processor performs an operation simultaneously on each member of an array of data. This is what a vector chip does. Some explanation is needed here: to apply one operation to many elements, you want only a small, basic set of asm instructions, which is why vector chips are more like RISC processors. They group similar data together and then perform the instruction on all of it. It sounds weird, but in machine code this makes a lot of sense, as complex instructions are combinations of primitive ones. And because everything is done on the same chip/FPU, it is damn fast.

A last note: there are very few situations where a cluster makes sense:
1. a big enterprise where all the desktop machines stand idle at night
2. where approximations are feasible and not dangerous, e.g. inexact graphics rendering

But for most serious computations a cluster is a catastrophe. A multi-CPU machine is better, but still far from a vector chip.

Kind regards, discoleo.
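The "one operation over a whole array" idea can be sketched in NumPy, whose whole-array operations are dispatched to compiled loops (and often SIMD units); a rough software analogy to what the post describes a vector chip doing in hardware, not actual vector-CPU code:

```python
import numpy as np

def saxpy_scalar(a, x, y):
    """Scalar style: one multiply-add per loop iteration, with loop
    and branch overhead between every pair of operations."""
    out = [0.0] * len(x)
    for i in range(len(x)):
        out[i] = a * x[i] + y[i]
    return out

def saxpy_vector(a, x, y):
    """Vector style: the same multiply-add expressed as whole-array
    operations, the way a vector unit applies one instruction to
    many elements at once."""
    return a * x + y

x = np.arange(4, dtype=float)   # [0., 1., 2., 3.]
y = np.ones(4)
print(saxpy_scalar(2.0, list(x), list(y)))   # [1.0, 3.0, 5.0, 7.0]
print(saxpy_vector(2.0, x, y))               # [1. 3. 5. 7.]
```

Same arithmetic, but the vector form has no per-element branch points at all, which is exactly the property the post is praising.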
#19

Forum King
Join Date: Jul 2003
Location: Forums
Posts: 2,685
Quote:
In comparison, the 2,200-CPU IBM PPC970 cluster ranks 3rd, cost only $5.3 million, and was completed in 5 months. The Earth Simulator cost $200 million and took 1-2 years to complete.
http://news.com.com/2100-1006_3-5107...l?tag=nefd_top
#20

Forum King
Join Date: Jan 2002
Location: the nether reaches of bonnie scotland
Posts: 13,375
Quote:
*ba-zing*