Google has launched Ironwood, its seventh-generation TPU (Tensor Processing Unit), a chip built to accelerate artificial intelligence (AI) applications. Ironwood is expected to become generally available to Google Cloud customers later this year, and Google is positioning it to change how AI inference is powered and scaled.
What sets Ironwood apart is that it is the first TPU designed specifically for inference, meaning it is built to run trained models rather than to train them. The chip includes an enhanced specialized core, named SparseCore, which is designed to efficiently process the types of data common in “advanced ranking” and “recommendation” workloads. This focus on inference marks a strategic shift in Google’s approach to AI processing, reflecting the growing demand for real-time data analysis and decision-making.
Each Ironwood chip delivers a peak of 4,614 TFLOPs of computing power and carries 192GB of dedicated RAM, with bandwidth approaching 7.4 terabits per second (Tbps). These figures represent a major performance improvement over previous TPU generations. The high bandwidth allows fast transfer and processing of data, a necessity for large-scale AI applications.
Ironwood’s architecture minimizes data movement and latency on-chip, yielding average power savings of 40%. That gain puts the chip’s efficiency well beyond that of past TPU generations and is intended to help it keep pace with the growing demands of cutting-edge AI workloads.
“Ironwood is our most powerful, capable, and energy-efficient TPU yet,” said Amin Vahdat, Google Cloud Vice President. He emphasized the innovation behind Ironwood, stating, “Ironwood represents a unique breakthrough in the age of inference.”
The new TPU will be available in two configurations: a 256-chip cluster and a larger 9,216-chip cluster. This flexibility lets companies choose the configuration that fits their compute requirements. Google also plans to integrate Ironwood into its AI Hypercomputer in the near future.
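To put the two cluster sizes in perspective, the per-chip figures quoted in this article can be multiplied out. The short Python sketch below does that arithmetic. It assumes that per-chip peak compute and RAM simply add up across chips, which is an idealized upper bound and not a measured result. The function name `cluster_totals` is made up for this illustration.

```python
# Per-chip figures as quoted in the article.
PEAK_TFLOPS_PER_CHIP = 4_614
RAM_GB_PER_CHIP = 192


def cluster_totals(num_chips):
    """Return (peak exaFLOPs, total RAM in TB) for a cluster.

    Assumption: per-chip peaks add linearly, so this is a theoretical
    ceiling that ignores interconnect and utilization overheads.
    """
    peak_exaflops = num_chips * PEAK_TFLOPS_PER_CHIP / 1_000_000  # 1 exaFLOP = 1e6 TFLOPs
    ram_tb = num_chips * RAM_GB_PER_CHIP / 1_000  # decimal units: 1 TB = 1,000 GB
    return peak_exaflops, ram_tb


for chips in (256, 9_216):
    exaflops, ram_tb = cluster_totals(chips)
    print(f"{chips:>5} chips: ~{exaflops:.2f} peak exaFLOPs, ~{ram_tb:.1f} TB of RAM")
```

Under that assumption, the 9,216-chip configuration works out to roughly 42.5 exaFLOPs of peak compute, about 36 times the 256-chip configuration.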