Intel has announced a research chip that contains 80 (yes, eighty!) cores, each with two floating-point execution units. That's a record level of integration! The chip achieves 1 teraflops at 3.16 GHz. That is supercomputer-class performance, and it could put a supercomputer on the desk if it ever becomes commercially available.
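The teraflops figure can be sanity-checked from the numbers in the announcement. Here is a minimal sketch, assuming each floating-point unit completes one multiply-accumulate per cycle and that this counts as two floating-point operations (the announcement as quoted here does not spell that out). The function and constant names are made up for illustration.

```python
# Back-of-the-envelope peak throughput for the 80-core research chip.
CORES = 80                # from the announcement
FPUS_PER_CORE = 2         # two floating-point units per core
FLOPS_PER_FPU_CYCLE = 2   # ASSUMPTION: one multiply-accumulate = 2 flops
CLOCK_GHZ = 3.16          # from the announcement


def peak_gflops(cores, fpus_per_core, flops_per_fpu_cycle, clock_ghz):
    """Peak rate in gigaflops: flops per cycle times billions of cycles per second."""
    return cores * fpus_per_core * flops_per_fpu_cycle * clock_ghz


peak = peak_gflops(CORES, FPUS_PER_CORE, FLOPS_PER_FPU_CYCLE, CLOCK_GHZ)
print(f"peak = {peak:.1f} gigaflops = {peak / 1000:.2f} teraflops")
```

Under that assumption the product lands just over 1,000 gigaflops, which matches the 1 teraflops claim. This is a peak figure; real applications would get less.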
All that performance comes at an energy cost of only 62 watts, lower than some of today's PC chips! As I have mentioned a couple of times in past posts, this is where things are headed: ever more performance at as low a power consumption as possible. Otherwise, as the demand for performance grows, whether it is met by high-performance devices like this or by a multitude of servers, the energy consumption and the corresponding cooling requirements will become unmanageable.
The only comparable thing has been a supercomputer Intel built with about 10,000 Pentium Pro chips, which reached teraflop performance yet consumed 500 kW of power! It was housed at Sandia National Laboratories in the USA. The mind boggles at the possibilities when such power becomes available for personal use! Some of the applications being talked about are real-time speech recognition, multimedia data mining, photo-realistic games, etc. That could lead to immediate speech-based interaction with the computer, like HAL in 2001: A Space Odyssey! Or digging out the photo where someone is smiling from among the ones where he is frowning! Photo-realism, particularly in real-time animation, would add a totally different dimension to the gaming experience!
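To put the 62 W versus 500 kW comparison into numbers, here is a rough sketch. It treats both machines as delivering exactly 1 teraflops, which is a simplification, and the function name is hypothetical.

```python
def gflops_per_watt(teraflops, watts):
    """Energy efficiency in gigaflops per watt."""
    return teraflops * 1000 / watts


# Figures as quoted in this post; both rounded to 1 teraflops (simplification).
research_chip = gflops_per_watt(1.0, 62)        # 62 W
sandia_machine = gflops_per_watt(1.0, 500_000)  # 500 kW

ratio = research_chip / sandia_machine
print(f"research chip : {research_chip:.2f} gigaflops per watt")
print(f"Sandia machine: {sandia_machine:.4f} gigaflops per watt")
print(f"improvement   : about {ratio:.0f}x")
```

That works out to roughly 16 gigaflops per watt for the chip, an improvement of around 8,000 times in performance per watt.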
With several processors running together, the system bus saturates very quickly. Caches help, but even then the limits are reached soon as the number of cores increases. Memory interface speeds, data movement between the cores, and keeping their data current are some of the problem areas, particularly the interface speeds achievable on the system bus.
Intel's stated aim is to research these areas with a chip like this. According to Intel, the innovations generated in the project are as follows:
- Rapid design – The tiled–design approach allows designers to use smaller cores that can easily be repeated across the chip. A single–core chip of this size (100 million transistors) would take roughly twice as long and twice as many people to design.
- Network on a chip – In addition to the compute element, each core contains a 5–port message–passing router. These are connected in a 2D mesh network that implements message–passing. This mesh interconnect scheme could prove much more scalable than today’s multi–core chip interconnects, allowing for better communications between the cores and delivering more processor performance.
- Fine–grain power management – The individual compute engines and data routers in each core can be activated or put to sleep based on the performance required by the application a person is running. In addition, new circuit techniques give the chip world–class power efficiency—1 teraflops requires only 62W, comparable to desktop processors sold today.
- And other innovations – Such as sleep transistors, mesochronous clocking, and clock gating.
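
The network-on-a-chip idea in the list above can be illustrated with a toy model. This sketch assumes the 80 tiles are laid out as an 8 x 10 grid and uses dimension-order (XY) routing, a common textbook scheme chosen here as a stand-in. Neither the layout nor the routing scheme is stated in Intel's description, and all names are hypothetical.

```python
from typing import List, Tuple

Tile = Tuple[int, int]  # (column, row)

# ASSUMPTION: 80 tiles arranged as 8 columns x 10 rows.
COLS, ROWS = 8, 10


def router_ports(tile: Tile) -> List[str]:
    """Ports in use at a tile: the local core plus one per existing neighbour."""
    x, y = tile
    ports = ["local"]
    if x > 0:
        ports.append("west")
    if x < COLS - 1:
        ports.append("east")
    if y > 0:
        ports.append("north")
    if y < ROWS - 1:
        ports.append("south")
    return ports


def xy_route(src: Tile, dst: Tile) -> List[Tile]:
    """Dimension-order route: step along x until the column matches, then along y.

    ASSUMPTION: XY routing is an illustrative choice, not the chip's documented scheme.
    """
    for x, y in (src, dst):
        if not (0 <= x < COLS and 0 <= y < ROWS):
            raise ValueError(f"tile {(x, y)} is outside the {COLS}x{ROWS} mesh")
    x, y = src
    path = [src]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


route = xy_route((0, 0), (7, 9))
print(f"interior tile ports: {router_ports((3, 4))}")
print(f"corner to corner: {len(route) - 1} hops")
```

An interior tile uses all five ports (four neighbours plus its own core), which is one way to read the "5–port" router. A message hops only between neighbouring tiles, so no single shared bus has to carry all the traffic.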