Recently, I had the privilege of being part of the inauguration ceremony of the new CRAY XC-40 PetaFLOPS SuperComputer at the SuperComputer Education and Research Center (SERC) at the Indian Institute of Science (Bangalore). The inauguration was on 11th May and I was invited largely because of my Prof (hint hint, Chairman SERC) and my chosen research field - High Performance Computing. The new machine was named Sahasrat - Sahasra meaning thousand spokes or arms.
For some people the term "SuperComputer" remains shrouded in mystery. Computer pundits, pardon my oversimplified attempt at jargon busting here. A SuperComputer is a cohesive, integrated group of a large number of CPUs and accelerator clusters (GPGPU cards, Many Integrated Cores [MIC], etc.) tied together with a superfast network interconnect and a proportionately fast storage solution. All this hardware is supported by scalable software libraries and binaries that run applications transparently. We are talking about CPU cores in the tens of thousands, network speeds of multiple gigabits per second and a guaranteed availability of 99.999% (quick math - a downtime of about 5 minutes in a whole year). Obviously, such a large system generates a tremendous amount of heat and gulps the power nectar, and hence the supporting datacenter, cooling design/setup and maintenance costs are tremendous. Such large systems can only be afforded by a privileged few countries.
At a theoretical maximum of 1.2 PetaFLOPS, this new machine is the fastest in India and should place around 50th to 60th in the world in the soon to be released June 2015 top500 SuperComputer list. This is a huge improvement over the previous fastest SuperComputer in India, also at SERC: the IBM BlueGene/L with an Rpeak (theoretical maximum) of 22.94 teraFLOPS. The current fastest supercomputer in the world (in China) churns out about 33 PetaFLOPS.
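For anyone who wants to redo the quick math, here is a small Python sketch. It only uses the figures quoted above (99.999% availability, 1.2 PetaFLOPS and 22.94 teraFLOPS); the function and variable names are my own and purely illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # minutes in a non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Yearly downtime (in minutes) permitted by an availability target."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


five_nines = downtime_minutes_per_year(99.999)
print(f"99.999% availability allows about {five_nines:.2f} minutes of downtime per year")

# Ratio of the new machine's quoted peak to the older BlueGene/L's quoted peak
peak_ratio = 1.2e15 / 22.94e12
print(f"Peak ratio, new vs. old: roughly {peak_ratio:.0f}x")
```

So "five nines" works out to a little over 5 minutes a year, and the new machine's quoted peak is roughly 52 times that of the older one.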
To give you a sense of what this accomplishment means to us and why we should be proud of it, here is an excerpt from top500.org, which ranks the 500 most powerful computer systems in the world.
So far, there have been 42 editions of the Top500 List. Over those editions, a total of 58 countries have appeared on the list.
Depending on how one counts, there are somewhere between 191 and 260 “countries” in the world. So, the list of those that at one time or another chose to join the list is in the range of 22% to 30% of total countries.
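The percentages in the excerpt are easy to verify. A minimal Python check, using only the numbers quoted above (58 countries on the list, 191 to 260 countries in the world); the helper name is hypothetical:

```python
def share_percent(countries_on_list: int, total_countries: int) -> float:
    """Percentage of all countries that have ever appeared on the list."""
    return 100.0 * countries_on_list / total_countries


low = share_percent(58, 260)   # using the most generous count of countries
high = share_percent(58, 191)  # using the strictest count of countries
print(f"Between {low:.0f}% and {high:.0f}% of countries have appeared on the list")
```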
This is an elite list. A look at the countries owning the top 100 of these systems tells us there are only about 18 of them, and being part of that group is as elite as being one of the 9 countries that have sent satellites to Mars! I have witnessed first hand the effort and pains of setting up and assembling Sahasrat every day when I come to the lab, and believe me, it's been a superb effort by the team here. Having narrated the technical specifications of Sahasrat at the inauguration ceremony, let me elucidate the same here to blow your minds:
Sahasrat draws its massive computational power from 3 clusters as below:
COMPUTE
- CPU Cluster - Intel Xeon E5-2680v3 @ 2.5GHz (Haswell) based 1376 compute nodes with a total count of 33024 cores (24 cores per node) with a sustained performance of 950 TFLOPS
- GPU Cluster - NVIDIA Tesla K-40 based 44 nodes (2880 cores per node) with a sustained performance of 52 TFLOPS
- MIC Cluster - Intel Xeon Phi 5120D Knights Corner based 48 nodes with a sustained performance of 28 TFLOPS
STORAGE
- High Speed 2 PetaByte storage space connected by InfiniBand FDR using Cray's parallel Lustre filesystem in RAID6 configuration.
NETWORK
- Proprietary Cray Aries Interconnect with Dragonfly Topology
OS and Software
- Cray's customised SUSE Linux Enterprise Server (SLES) 11 called Cray Linux Environment (CLE)
- Several parallel program development tools, architecture specific compilers, parallel scientific and mathematical libraries.
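As a sanity check on the compute numbers above, here is a short Python sketch. It only combines figures from the list (node count, cores per node and the three sustained-performance values); the variable names are illustrative, and the per-core value is a plain average, not a measured result.

```python
# Figures quoted in the specification list
cpu_nodes = 1376
cores_per_node = 24
sustained_tflops = {"CPU": 950, "GPU": 52, "MIC": 28}

# Node count times cores per node should reproduce the quoted core count
total_cores = cpu_nodes * cores_per_node

# Combined sustained performance of the three clusters
total_sustained = sum(sustained_tflops.values())

# Share of the sustained performance that comes from the CPU cluster
cpu_share = 100 * sustained_tflops["CPU"] / total_sustained

# Average sustained GFLOPS per CPU core (1 TFLOPS = 1000 GFLOPS)
gflops_per_core = sustained_tflops["CPU"] * 1000 / total_cores

print(f"Total CPU cores: {total_cores}")
print(f"Combined sustained performance: {total_sustained} TFLOPS")
print(f"CPU cluster share: {cpu_share:.1f}%")
print(f"Average per CPU core: {gflops_per_core:.1f} GFLOPS")
```

The core count matches the quoted 33024, and the CPU cluster accounts for the overwhelming majority of the sustained performance.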
At the inauguration some of the country's greatest scientists were present, and they made us aware of the immense challenge that lies ahead of us. Now that we have this huge instrument, how do we best utilize it and obtain the maximum results? This, my friends, is no meager challenge thrown at us. Writing software that is massively parallel and distributed, needs fewer synchronizations and scales well to this number of cores takes domain knowledge and a true computer scientist at heart. My fellow computer engineers, come join us at IISc if you are up to the task and meet the new bleeding edge limits of computer "science". There is a definite need for manpower at this level, as highlighted by IISc ex-associate director Prof. N Balakrishnan.
The inauguration also featured 2 user groups that ran their computations during the 60 day trial run of the CRAY machine. The 2 groups presented interesting results from computational aerodynamics and computational astrophysics. So far the results have been very promising: the CRAY system shows good architectural scaling properties for these 2 sample programs.
The aerodynamics user group simulated the entire landing sequence of a high lift wing, which requires simulating complex physics on large grids, using the flow solver HiFUN. They simulated a grid of 36 million volumes for more than 11 simulation seconds of a landing sequence, at a granularity of 1 millisecond, using 10000 cores. There are three distinct phases: gliding and flaring, during which the wind incidence increases, and the post-touchdown phase, in which the wind incidence is lost during the ground roll.
The astro-physics user group simulated the overlap of supernovae, which forms a hot, over-pressured bubble, with the public domain PLUTO hydrodynamics code running on about 12,000 cores (roughly one-third of the system).
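To get a feel for the scale of these two trial runs, here is a rough back-of-the-envelope sketch in Python. It assumes that a granularity of 1 millisecond means one time step per millisecond and that the grid is split evenly across cores; both are my simplifying assumptions, not details given by the user groups.

```python
# Aerodynamics run (figures quoted above)
simulated_ms = 11_000        # 11 simulation seconds, expressed in milliseconds
step_ms = 1                  # assumed: one time step per millisecond
grid_cells = 36_000_000
aero_cores = 10_000

time_steps = simulated_ms // step_ms       # lower bound, the run was "more than 11" seconds
cells_per_core = grid_cells // aero_cores  # assumed: perfectly even partitioning
cell_updates = grid_cells * time_steps     # total cell updates over the whole run

# Astrophysics run: fraction of the CPU cluster's 33024 cores in use
astro_cores = 12_000
cpu_cores = 33_024
fraction_used = astro_cores / cpu_cores

print(f"Time steps: at least {time_steps}")
print(f"Grid cells per core: {cells_per_core}")
print(f"Total cell updates: {cell_updates:.2e}")
print(f"Astrophysics run used about {fraction_used:.0%} of the CPU cores")
```

Under these assumptions each core handles only a few thousand cells, so the run spends much of its effort on communication between neighbouring partitions, which is where the interconnect earns its keep.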
There are, however, some clear caveats that come with the machine. As we procured this machine from CRAY systems (USA) and haven't manufactured it indigenously, the US is obviously afraid of such large computational power falling into our hands and being used for the wrong purposes (defense, space research, etc.), and hence by contract this machine will be strictly off limits for all non-academic users. Yeah, it's a bummer, but I hope we quietly learn what we can from this system and take it in our stride. Maybe one day we can build an indigenous supercomputer to rub it in their face. There is already a National SuperComputing Mission set up by CDAC in India that will invest Rs.4,500 crore ($730 million) towards advancing our interests in the High Performance Computing field, both in terms of resources and qualified manpower.
I leave you with some images (in the video below) taken during the installation and commissioning of the CRAY system at SERC and the tech spec slides I narrated at the inauguration.
EDIT: Excerpts of this blog appeared in Economic Times (Bangalore Edition) on 22nd May - "Data Drive: IISc Brings Supercomp Right Here"
Also read:
On Quora - What would be the reason behind IISc purchasing the new CRAY XC-40 supercomputer for their SERC department?
The Meraki Soul - A walkthrough my days at IISc through these pictures.