“Supercomputer” is a trendy word you may have heard once or twice before. And it doesn’t take a genius to infer that supercomputers are to regular PCs roughly what Superman is to a Neanderthal. The idea behind the technology appears simple enough: you take a lot of integrated circuits, or microchips, and you put them together to create a massive computer, one much bigger than your PC. But, really, can it be that straightforward? Is that all it takes to build a working supercomputer?
The computers you know and use are almost all based on the von Neumann (Princeton) architecture, which, in turn, consists of three main components: CPU (central processing unit), I/O (input/output) devices, and a memory unit.
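To make the von Neumann model concrete, here is a toy fetch-decode-execute loop in Python. The single memory holds both the program and its data, the CPU reads and executes one instruction at a time, and I/O is reduced to printing the result. The four-instruction set (LOAD/ADD/PRINT/HALT) is invented purely for illustration:

```python
# A toy von Neumann machine: one memory for program and data,
# a CPU with a single accumulator register, and trivial I/O.
memory = [
    ("LOAD", 5),     # put 5 into the accumulator
    ("ADD", 7),      # add 7 to the accumulator
    ("PRINT", None), # I/O: output the accumulator
    ("HALT", None),  # stop the machine
]

def run(memory):
    acc = 0          # the CPU's single register (accumulator)
    pc = 0           # program counter
    while True:
        op, arg = memory[pc]  # fetch
        pc += 1
        if op == "LOAD":      # decode + execute
            acc = arg
        elif op == "ADD":
            acc += arg
        elif op == "PRINT":
            print(acc)
        elif op == "HALT":
            return acc

run(memory)  # prints 12
```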
Until recently, most computers used serial processing, meaning that a single CPU processed every piece of input data, one instruction at a time. Or, to quote Blaise Barney, the author of Introduction to Parallel Computing:
“Traditionally, computer software has been written for serial computation. To solve a problem, an algorithm is constructed and implemented as a serial stream of instructions. These instructions are executed on a central processing unit on one computer. Only one instruction may execute at a time – after that instruction is finished, the next one is executed.”
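Barney’s serial stream of instructions can be sketched in a few lines of Python. Each iteration must finish before the next one starts, exactly as he describes:

```python
# Serial computation: one stream of instructions on one CPU,
# each step finishing before the next begins.
def serial_sum_of_squares(n):
    total = 0
    for i in range(n):  # iteration k+1 cannot start until iteration k is done
        total += i * i
    return total

print(serial_sum_of_squares(1000))  # 332833500
```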
Nowadays, though, we’ve grown accustomed to much faster computers. The difference is that this new generation of computers (first introduced a decade ago) uses multi-core processing technology, which means that multiple processing units (called cores) are placed on a single microchip. Multi-core processing is a type of parallel processing, because it allows a single processor to run multiple instructions on separate cores simultaneously.
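The same sum-of-squares problem can be spread across cores. This sketch uses Python’s standard multiprocessing module; the four-way chunking is an assumption for the demo, not anything dictated by the hardware:

```python
# Multi-core parallelism: split the range into chunks and sum each
# chunk on a separate core at the same time.
from multiprocessing import Pool

def sum_of_squares(bounds):
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

if __name__ == "__main__":
    chunks = [(0, 250), (250, 500), (500, 750), (750, 1000)]
    with Pool(processes=4) as pool:        # one worker per core
        partials = pool.map(sum_of_squares, chunks)
    print(sum(partials))  # 332833500, same answer as the serial loop
```

The answer is identical to the serial version; only the order (and overlap) of the work changes, which is the whole point of parallel processing.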
Though precedents exist (IBM’s Cyclops64), supercomputers wouldn’t be super if they settled for multi-core processing alone. And this is where massively parallel processing (MPP) comes in. How massive? Just for illustrative purposes, China’s Sunway TaihuLight, the world’s fastest supercomputer (as of 2017), uses 40,960 processors, each of them containing 260 cores, amounting to more than 10.6 million CPU cores across the system. Now, that’s what you can rightly call “massive”.
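The back-of-the-envelope arithmetic for TaihuLight’s scale, using its published figures:

```python
# TaihuLight's published configuration: 40,960 SW26010 processors,
# 260 cores per processor.
processors = 40_960
cores_per_processor = 260
total_cores = processors * cores_per_processor
print(total_cores)  # 10649600 — over 10.6 million cores
```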
As far as supercomputers are concerned, massively parallel processing is the norm, but it doesn’t paint the whole picture. This is mainly because different modes of processing are optimized for different types of instructions: some tasks are well suited to parallel processing, while others are better handled in a “serial processing” fashion.
Consequently, in terms of supercomputer architecture, clusters and grids deserve the same degree of reverence as massively parallel processing. Both names are self-explanatory.
Loosely speaking, to make a computer cluster, it’s enough to take a few PCs, put them in one room, and connect them to each other over a LAN (local area network). Controlled by software, the computers in the cluster can then work together in a way that lets them be viewed as a single machine, meaning a computer cluster performs tasks as if it were a single unit. Many of the fastest supercomputers in the world include multiple clusters.
As opposed to cluster computing, in a grid each node is typically set to perform a different task or application. This is because in grid computing, PCs are connected not via LAN but over the Internet, which means the location of each PC doesn’t matter, and it is no longer possible to treat the connected computers as a single unit. But since a grid connects remote single units, and a cluster can be treated as one, a single grid can be built from multiple geographically independent computer clusters. Grid computing is a type of distributed computing, the main idea of which is to divide a problem into many tasks,
“each of which is solved by one or more computers, which communicate with each other by message passing” – Tolga Soyata ed., Enabling Real-Time Mobile Cloud Computing through Emerging Technologies, p. 180
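The message-passing idea in that quote can be sketched with Python’s multiprocessing pipes. A coordinator divides a problem into tasks, sends each one to a worker as a message, and collects the replies. A real grid would communicate over a network (with MPI, HTTP, or similar) rather than local pipes; the pipe here just stands in for the communication channel:

```python
# Distributed computing in miniature: divide a problem into tasks,
# hand each task to a worker, communicate by message passing.
from multiprocessing import Process, Pipe

def worker(conn):
    lo, hi = conn.recv()           # receive a task as a message
    conn.send(sum(range(lo, hi)))  # send the partial result back
    conn.close()

if __name__ == "__main__":
    tasks = [(0, 500), (500, 1000)]
    results = []
    for task in tasks:
        parent, child = Pipe()
        p = Process(target=worker, args=(child,))
        p.start()
        parent.send(task)               # message out ...
        results.append(parent.recv())   # ... and the reply back
        p.join()
    print(sum(results))  # 499500, the same as sum(range(1000))
```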
Supercomputers, however, are usually faster than grids, and can be built to perform a single task as close to perfectly as possible. For instance, IBM’s Watson was developed to take part in the Jeopardy! TV show, Deep Blue was built specifically to play against world chess champion Garry Kasparov, and Google’s AlphaGo to compete against the best Go players in the world.
Yup, they all won, which left many people wondering whether the time when machines become as intelligent as humans is near.
A few years ago, it was virtual reality; nowadays, it is Artificial Intelligence (AI) that dictates software and hardware requirements. Nothing strange there, once you bear in mind that advanced machine learning (ML) algorithms and complex data models require massively parallel processing. And even though NVIDIA’s GPUs turned out to be significantly more efficient for ML than traditional CPUs, they are not yet fully optimized for ML tasks.
Things are changing, however. Heavyweights such as Google are already paying much more attention to creating hardware built specifically for AI. Google’s CEO Sundar Pichai recently revealed that the company has developed a powerful new AI chip, the Tensor Processing Unit (TPU):
“We are building what we think of as AI-first data centers. Cloud TPUs are optimized for both training and inference. This lays the foundation for significant progress”.
Conventionally, computer performance is measured in FLOPS (floating point operations per second). Google’s Tensor Processing Unit is capable of delivering 128 teraflops, where one teraflop equals a trillion floating point operations per second. Compared to Intel’s CPUs and NVIDIA’s GPUs, Google’s TPUs look much faster and more efficient for this kind of workload.
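The FLOPS metric is easy to estimate yourself: time a known number of floating point operations and divide by the elapsed time. This crude sketch counts only the multiply-add in its inner loop, so it wildly underestimates the hardware’s real peak (and measures the Python interpreter more than the chip), but it shows what the unit means:

```python
# A rough FLOPS estimate: known operation count / elapsed time.
import time

def measure_flops(n=1_000_000):
    x = 0.0
    start = time.perf_counter()
    for _ in range(n):
        x = x * 1.0000001 + 1.0  # 2 floating point ops per iteration
    elapsed = time.perf_counter() - start
    return (2 * n) / elapsed     # operations per second

print(f"{measure_flops():.2e} FLOPS on this interpreter")
```

A teraflop machine performs 10¹² such operations per second; TaihuLight’s 93 petaflops is 9.3 × 10¹⁶.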
And if something is happening in the world of computers, you can be damn sure that something much bigger is already becoming reality in the world of supercomputers. Two of the ten fastest supercomputers in the world were built by Japan’s Fujitsu, but the TOP500 list is topped by two Chinese supercomputers. And if China can, why shouldn’t Japan at least try?
A year ago, the Land of the Rising Sun announced a plan to build the fastest supercomputer in the world. Named the AI Bridging Cloud Infrastructure (ABCI), at 130 petaflops it should leave TaihuLight’s 93 petaflops far behind.
But that’s not the most interesting part of ABCI’s development. This is: it’s being built specifically for AI. It’s time we started preparing for a radically different world, isn’t it?
The field of AI is extremely dependent on hardware. So much so, in fact, that we seem to have reached a point where hardware is built and optimized specifically for AI purposes. It took a while to evolve from ordinary CPUs to massively parallel GPUs, but the next step shouldn’t take nearly as long. AI was never interested in ordinary computers, so why should hardware manufacturers be? It’s the dawn of a new age, the age of supercomputers. Be alert, and expect massive breakthroughs in the near future.