Winamp & Shoutcast Forums

Winamp & Shoutcast Forums (http://forums.winamp.com/index.php)
-   Breaking News (http://forums.winamp.com/forumdisplay.php?f=80)
-   -   Supercomputer Firm Cray to build Red Storm (http://forums.winamp.com/showthread.php?t=154376)

Starbucks 28th October 2003 20:21

Supercomputer Firm Cray to build Red Storm
 
The Red Storm uses 108 cabinets and 10,368 Sledgehammer 2GHz processors, and around 10TB of 333MHz DDR memory.

The monster runs Linux on the service and I/O nodes, LWK (Catamount) on the compute nodes, and a stripped-down version of Linux on the RAS nodes. The whole beastie takes less than 2MW of total power and cooling.

Full Story

marvinbarcelona 28th October 2003 20:37

Stop it! Please stop it! My head hurts....I keep hearing airplanes flying over my head...I don't understand...http://www.fileheaven.org/forum/imag...s/headache.gif

ElChevelle 28th October 2003 20:44

Red Storm?
I smell a copyright suit.

xenosomething 28th October 2003 22:31

i dont understand what they do with this kind of power?

chess?

Starbucks 28th October 2003 23:02

Nuclear explosion simulations and chemical reaction sims.

godoncrack 28th October 2003 23:09

Compiling Sarge's "World's Biggest Smilie"
Viewing Ultraporn
Debugging Windows
Balancing the national deficit
Cataloging hgnis's collection of just-not-right pics

mark 29th October 2003 15:06

making backups of forums.winamp.com

liquidmetallic 30th October 2003 00:15

now I bet that wouldn't run using Windows...

whiteflip 30th October 2003 02:39

15 years from now I will be able to put something like that in my pocket. 2MW and all :)

mark 30th October 2003 10:39

it shall be called "the apple us-pod" :igor:

Bob the Tomato 2nd November 2003 02:00

I know a Cray supercomputer is used by the National Weather Service to predict storms, so that's one more application.

SemiTechGeek 2nd November 2003 03:40

Quote:

Originally posted by godoncrack
Debugging Windows
I Wish

whiteflip 2nd November 2003 08:53

if i get rich ill buy a cray to check my email and do my taxes on.

fwgx 2nd November 2003 10:08

Quote:

Originally posted by Bob the Tomato
I know a Cray supercomputer is used by the National Weather Service to predict storms, so that's one more application.
Down at number 180 on the world's fastest computers list.

discoleo 7th November 2003 16:40

Quote:

i dont understand what they do with this kind of power?
It's not that much power.

Cray also builds X1 systems, which are based on vector processors. These are far more powerful.

In many real-life situations, you gain very little from systems like the everyday clusters based on PCs. An architecture like that used in Red Storm offers some advantages. Vector chips offer even more. Future quantum processors would function at the theoretical computational limit (and some experimental devices already work at such limits).

For pure serial calculations the theoretical limit is attained with a black hole, as the Schwarzschild radius which defines the black hole is the shortest path an impulse can travel. For serial calculations, modern computers are far away from any peak performance. This is best seen in atmospheric simulations, where a PC is a catastrophe, even big clusters based on PCs are bad, and the Earth Simulator uses some 5000 vector chips (very different from PC processors) and still has to split the atmosphere into blocks of some 500 m x 500 m x 500 m. If you wish to increase the quasi-resolution of your forecast, the number of operations rises exponentially and, because the data is highly dependent on other calculations, a pure parallel computer or cluster will NOT cope with all that.
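To put a rough number on that exponential rise, here is a toy estimate in Python. It assumes cost scales with cell count times timestep count, and that each halving of the grid spacing also forces twice as many timesteps (a CFL-style factor that is my assumption, not stated in the post):

```python
def relative_cost(halvings):
    """Relative cost of halving the grid spacing `halvings` times,
    assuming cost ~ (number of cells) * (number of timesteps):
    each halving gives 2**3 = 8x more cells and ~2x more timesteps."""
    return (2 ** 3 * 2) ** halvings

for h in range(4):
    print(f"{h} halvings -> {relative_cost(h)}x the work")
```

So doubling the resolution in each dimension is roughly 16x the work, and it compounds from there.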

whiteflip 11th November 2003 23:33

ugh what he said. how'd you get to know so much about computer technology and such?

Starbucks 12th November 2003 01:41

Quote:

Cray also builds X1 systems, which are based on vector processors. These are far more powerful.
Cray's Red Storm will displace the #1 on Top500.org. You'd need 60 Cray X1s to match Red Storm's performance (assuming the X1's CPUs can scale as well as AMD's Opterons).

http://news.com.com/2100-7337_3-5097398.html

discoleo 15th November 2003 11:43

Quote:

Cray's Red Storm will displace the #1 on Top500.org
Silly, but the Earth Simulator was finished in 2001/2002, its construction took 1-2 years, its processors date from the end of the '90s, and it is still the fastest computer in the world!!! (And I seriously believe that Red Storm won't displace it, nor will any computer very soon.)

I'll explain vector technology a little, so you understand why it is superior to any classical computing.

Suppose we have a complex problem to solve and we wish to allow many CPUs to make calculations.

The first CPU controls the process. It assigns a block of code to CPU 2. It makes sense to let one CPU calculate up to the first branch point and have other CPUs execute code beyond the branch point.

In most current programs, a branch point occurs every 100-200 asm instructions, which on x86 means (idealized) 1000-5000 cycles apart. To take advantage of more CPUs, you assign one path of the branch point to CPU 3 and the second path to CPU 4, so before the actual execution reaches this branch point, you have already made some advance calculations beyond it.

The first thing one notices is that one of the CPUs (3 or 4) calculates nonsense, but you can't tell which one, so one CPU's calculations were futile. Some 200 instructions later, another branch point will be encountered, so CPU 3 needs another 2 CPUs (5 and 6) and CPU 4 a further 2 CPUs (7 and 8). BUT 3 of these CPUs (5, 6, 7 and 8) will execute nonsense.

You see, with 8 CPUs, at any time 4 of them will execute nonsense which is then discarded. In the limit, only about log(number of CPUs) processors do real work; most of the others just waste resources.
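That log scaling can be sketched as a toy model in Python (purely illustrative; the branch-doubling assumption follows the description above):

```python
import math

def speculative_waste(total_cpus):
    """Toy model of speculative execution across CPUs: each branch point
    doubles the CPUs in flight, but only one path per branch is kept,
    so only ~log2(N) of N CPUs ever produce work that survives."""
    useful = int(math.log2(total_cpus))
    wasted = total_cpus - useful
    return useful, wasted

for n in (8, 64, 1024):
    useful, wasted = speculative_waste(n)
    print(f"{n} CPUs: ~{useful} useful, {wasted} wasted")
```

At 1024 CPUs the model says only about 10 do work that is kept, which is why throwing more processors at branchy code pays off so poorly.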

Some optimizations have been made to this, but it is still an inefficient process. And there is another problem:

When CPU 2 reaches the branch point and has the result, it must notify CPU 3 or 4 that one of them is correct and the other wrong (this is actually best done by CPU 1, which controls all processes). Then CPUs 5/6 or 7/8 must be reset.

Let's analyze this a little: CPU 2 must execute some additional code to notify the other CPUs that they are wrong. This code is not initially present in the program. It is usually a push of some arguments and a call to a subroutine. The subroutine connects to the other CPUs and sends them the proper messages, then returns, which is extremely costly (that's why it makes more sense to have one CPU doing all these jobs, so that CPU 2 can still function normally).

Now, CPU 2 must be reinitialized with different code (CPU 3 or CPU 4 has already executed past the branch point, and it would be costly to transfer the code from CPU 3/4 to CPU 2; better to assign CPU 3/4 as the current code executor and reinitialize CPU 2 with new code, somewhere beyond the next free branch point). Also, the wrong CPU 3 or 4 must be reinitialized with new code. These are all costly operations. The physical distance between CPUs alone costs some 2-3 cycles, with L1 cache on x86 an additional 3 cycles, with L2 cache some 100 cycles, so transfers between CPUs take at best 100 cycles, but usually longer.

So, for every gain of 1000-5000 cycles, you lose some 500-2000 cycles, which makes a net gain of about 2000 cycles for every efficient processor (and of 8 CPUs, only 3 plus the control CPU function effectively).

You see, for most programs a cluster is a very bad choice (not covered here, but if the connection between CPUs on one board is already too slow, the connection in a cluster must be a catastrophe). A multiprocessor machine is also very bad. And there is yet another problem: the data beyond the branch point must be perfectly independent of previous, still-unexecuted calculations.

Any optimization to enhance the inter-CPU speed still falls short. There is a way to partly overcome this: multiple CPUs on one chip and some specialized form of common cache.

There is a more elegant solution: the processor performs an operation simultaneously on each member of an array of data. This is what a vector chip does.

Some explanation is needed here: to have many similar operations, you need only very few asm instructions, only a basic set. That's why vector chips are more like RISC processors.

But they group similar data together and then perform the instruction on all of that data. It sounds weird, but in machine code this makes much sense, as all complex instructions are combinations of primary instructions. Also, because everything is done on the same chip/FPU, it is damn fast.
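The difference between per-element (scalar) and whole-array (vector) work can be illustrated with NumPy (a software analogy only; a vector chip does this in hardware, not in a library):

```python
import numpy as np

x = np.arange(8, dtype=np.float64)
y = np.arange(8, dtype=np.float64)

# Scalar style: one multiply-add per loop iteration, a branch test each time.
def scalar_dot(a, b):
    total = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total

# Vector style: one operation issued over the whole array, no per-element
# branching -- the same shape of work a vector unit performs in hardware.
def vector_dot(a, b):
    return float(np.dot(a, b))

print(scalar_dot(x, y), vector_dot(x, y))  # both 140.0
```

Same result either way; the point is that the vector form issues one operation over the whole array instead of a branch-and-multiply per element.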

A last note: there are very few situations where a cluster makes sense:
1. a big enterprise, where at night all computers stand idle
2. where approximations are feasible and not dangerous, e.g. inexact graphic rendering

But for most serious computations, a cluster is a catastrophe. While a multi-CPU machine is better, it is still far away from a vector chip.

Kind regards,

discoleo.

Starbucks 16th November 2003 08:07

Quote:

I seriously believe that Red Storm won't displace it
We still have to see about Red Storm when it is released
Quote:

nor any computer very soon
IBM's Blue Gene/L is expected to give 360 teraflops.

In comparison, the 2,200-CPU IBM PPC970 cluster ranks 3rd, only cost $5.3 million, and was completed in 5 months. ES cost $200 million and took 1-2 years to complete.

http://news.com.com/2100-1006_3-5107...l?tag=nefd_top

zootm 17th November 2003 23:12

Quote:

Originally posted by godoncrack
Compiling Sarge's "World's Biggest Smilie"
Viewing Ultraporn
Debugging Windows
Balancing the national deficit
Cataloging hgnis's collection of just-not-right pics

running winamp3

*ba-zing*


All times are GMT. The time now is 20:24.

Copyright © 1999 - 2010 Nullsoft. All Rights Reserved.