Building a Wafer-Scale Deep Learning System: Lessons Learned
Time: Tuesday, 19 November 2019, 2pm - 2:30pm
Description: Deep learning has emerged as one of the most important workloads of our time. While it has applicability to many problems, its computational demands are profound. Compute requirements to train the largest deep learning models increased by 300,000x between 2012 and 2018 [1]. Traditional processors are not well-suited to meet this demand.
To address this challenge, Cerebras has developed a new computer system optimized for deep learning, the CS-1. This system is powered by the largest chip ever built: the Cerebras Wafer-Scale Engine (WSE). Cerebras’ WSE is a single integrated 46,225 mm^2 silicon chip with >1.2 trillion transistors and 400,000 compute cores. It is >56x larger than today’s largest GPU, with 3,000x more on-chip memory and >10,000x more memory bandwidth.
Here, we provide a technical overview of the CS-1 and discuss the unique engineering challenges associated with packaging, powering, cooling, and I/O for a wafer-scale processor.
1. Amodei, D., Hernandez, D. (2018). https://openai.com/blog/ai-and-compute/.