According to Reuters, Cerebras Systems launched its 13.5-million-core Andromeda AI supercomputer on Monday. Cerebras says Andromeda delivers AI compute at 16-bit half precision at speeds exceeding one exaflop (1 quintillion operations per second).
Andromeda is a networked cluster of 16 Cerebras CS-2 computers. At the heart of each CS-2 is the Wafer Scale Engine chip (known as the "WSE-2"), the largest silicon chip ever made, measuring approximately 8.5 inches square and containing 2.6 trillion transistors organized into 850,000 cores.
Cerebras built Andromeda for $35 million at a data center in Santa Clara, California. It has already been used for both academic and commercial work, and it is tailored for workloads such as large language models. In a press release, Cerebras states that "Andromeda enables near-perfect scaling via easy data parallelism across GPT-class large language models, including GPT-3, GPT-J, and GPT-NeoX."
According to Cerebras, "near-perfect scaling" means that training time for neural networks drops almost linearly as more CS-2 units are added to Andromeda. On GPU-based systems, by contrast, the benefits of scaling up a deep-learning model by adding compute typically diminish as hardware costs grow. Additionally, Cerebras asserts that its supercomputer can handle jobs that GPU-based systems cannot: "GPU impossible work was demonstrated by one of Andromeda's first users, who achieved near perfect scaling on GPT-J at 2.5 billion and 25 billion parameters with long sequence lengths—MSL of 10,240. The users attempted to do the same work on Polaris, a 2,000 Nvidia A100 cluster, and the GPUs were unable to do the work because of GPU memory and memory bandwidth limitations."
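To make the scaling claim concrete: "near-perfect scaling" is usually quantified as the measured speedup divided by the ideal (linear) speedup. The sketch below uses hypothetical timing numbers, not figures reported by Cerebras, purely to show how that efficiency metric is computed.

```python
# Illustrative sketch of scaling efficiency (hypothetical numbers,
# not Cerebras benchmark data).

def scaling_efficiency(t_one: float, t_n: float, n: int) -> float:
    """Measured speedup on n units divided by the ideal speedup n.

    A value near 1.0 corresponds to "near-perfect" (linear) scaling.
    """
    return (t_one / t_n) / n

# Hypothetical example: training takes 160 hours on 1 unit
# and 10.5 hours on 16 units.
t_one, t_sixteen = 160.0, 10.5
speedup = t_one / t_sixteen
efficiency = scaling_efficiency(t_one, t_sixteen, 16)
print(f"speedup: {speedup:.1f}x, efficiency: {efficiency:.0%}")
```

Under these made-up numbers the speedup is about 15.2x on 16 units, an efficiency of roughly 95 percent; perfect linear scaling would yield exactly 16x.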
Whether those assertions hold up under outside scrutiny remains to be seen, but Cerebras appears to be offering an alternative at a time when businesses frequently train deep-learning models on ever-larger clusters of Nvidia GPUs.