Microsoft has announced a new accelerator chip for artificial intelligence (AI), the Maia 200, which the company says delivers up to three times the performance of rival hardware from Google and Amazon.
The chip is built for AI inference rather than training: it powers the systems and agents that make predictions, answer queries, and generate outputs from new data fed to them.
Scott Guthrie, Microsoft’s executive vice president of cloud and AI, said in a blog post that the new chip will deliver performance in excess of 10 petaflops. A petaflop, 10¹⁵ (one quadrillion) floating point operations per second, is a standard measure of supercomputing performance; the world’s most powerful supercomputers exceed 1,000 petaflops.
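For scale, those quoted figures work out roughly as follows. This is a back-of-the-envelope sketch using only the numbers above; the 1,000-petaflop value is an order of magnitude for leading supercomputers, not a specific machine:

```python
# Back-of-the-envelope scale check using the figures quoted above.
PETAFLOP = 1e15  # floating point operations per second

maia_200_fp4 = 10 * PETAFLOP          # ">10 petaflops" at FP4, per Guthrie
top_supercomputer = 1_000 * PETAFLOP  # order of magnitude for leading systems

print(f"Maia 200 (FP4): {maia_200_fp4:.1e} FLOP/s")
print(f"Fraction of a 1,000-petaflop supercomputer: {maia_200_fp4 / top_supercomputer:.0%}")
```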
The new chip achieved this performance level at a data precision known as 4-bit floating point (FP4), a highly compressed number format designed to accelerate AI inference. At 8-bit precision (FP8), the Maia 200 delivers 5 petaflops. The trade-off is that FP4 is much more energy efficient but less accurate.
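The accuracy cost of dropping from 8 bits to 4 can be seen in a toy experiment. The sketch below uses simple uniform quantization as a stand-in; real FP4 and FP8 are floating point formats with sign, exponent, and mantissa fields, so the numbers are illustrative only:

```python
import numpy as np

# Illustrative sketch of why fewer bits trade accuracy for efficiency.
# Uniform quantization is used here as a simplification; it is NOT the
# exact FP4/FP8 encoding used by AI accelerators.
rng = np.random.default_rng(0)
weights = rng.normal(0, 1, 10_000).astype(np.float32)

def quantize(x, bits):
    # Snap values onto a 2**bits-level uniform grid spanning the data range.
    levels = 2 ** bits
    lo, hi = x.min(), x.max()
    step = (hi - lo) / (levels - 1)
    return np.round((x - lo) / step) * step + lo

for bits in (8, 4):
    err = np.abs(weights - quantize(weights, bits)).mean()
    print(f"{bits}-bit grid: mean absolute error {err:.4f}")
```

Running this shows the 4-bit grid introducing roughly an order of magnitude more error than the 8-bit one, which is why FP4 is typically reserved for inference workloads that tolerate it.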
“Practically speaking, a single Maia 200 node can comfortably run our largest models today, with plenty of headroom for even larger models in the future,” Guthrie said. “This means Maia 200 delivers three times the FP4 performance of 3rd generation Amazon Trainium and FP8 performance over Google’s 7th generation TPUs.”
Maia 200 may be used for specialized AI workloads in the future, such as running large language models (LLMs) at scale. So far, Microsoft’s Maia chips have only been used in Azure cloud infrastructure to run large-scale workloads for Microsoft’s own AI services, particularly Copilot. But Guthrie said “there will be much broader availability for customers in the future,” hinting that other organizations may be able to use the Maia 200 via the Azure cloud, or that the chip could one day be deployed in standalone data centers or server stacks.
Guthrie said the adoption of a 3-nanometer process from Taiwan Semiconductor Manufacturing Company (TSMC), the world’s largest contract chipmaker, allows Microsoft to pack 100 billion transistors onto each chip, with a 30% increase in performance per dollar compared with existing systems. In practice, that means Maia 200 can run the most demanding AI workloads more cost-effectively and efficiently than existing chips.
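To put the 30% figure in concrete terms, here is the simple arithmetic, assuming the gain applies uniformly at a fixed performance target:

```python
# A 30% performance-per-dollar gain means a fixed amount of work costs
# about 1 / 1.3 ≈ 77% of what it did before (illustrative arithmetic only).
baseline_perf_per_dollar = 1.00
maia_perf_per_dollar = baseline_perf_per_dollar * 1.30

relative_cost = baseline_perf_per_dollar / maia_perf_per_dollar
print(f"Cost for the same workload: {relative_cost:.0%} of baseline")  # -> 77%
```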
In addition to performance and efficiency improvements, Maia 200 has several other features. For example, it includes a memory system that helps keep AI model weights and data local, so the model requires less hardware to run. It is also designed for immediate integration into existing data centers.
Maia 200 allows AI models to run faster and more efficiently. That means Azure OpenAI users, including scientists, developers, and enterprises, can expect higher throughput and speed when developing AI applications and running models such as GPT-4 in production.
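Developers curious whether those gains show up in their own workloads can time requests directly. Below is a minimal sketch using the openai Python SDK’s Azure client; the endpoint, key, and deployment names are placeholders, and Azure does not reveal which accelerator served a given request:

```python
import time
from openai import AzureOpenAI  # pip install openai

# The endpoint, API key, and deployment name below are hypothetical placeholders.
client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",
    api_key="YOUR-API-KEY",
    api_version="2024-02-01",
)

start = time.perf_counter()
response = client.chat.completions.create(
    model="YOUR-GPT4-DEPLOYMENT",  # the Azure deployment name, not the model family
    messages=[{"role": "user", "content": "Summarize the Maia 200 announcement in one sentence."}],
)
elapsed = time.perf_counter() - start

print(f"Round-trip latency: {elapsed:.2f}s")
print(response.choices[0].message.content)
```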
Because the Maia 200 is designed for data centers rather than consumer hardware, this next-generation AI hardware is unlikely to disrupt most people’s daily AI and chatbot usage in the short term. However, end users will see the impact of Maia 200 in the form of faster response times and potentially more advanced capabilities for Copilot and other AI tools built into Windows and Microsoft products.
Maia 200 also has the potential to bring performance improvements to developers and scientists who run AI inference through Microsoft’s platform. That, in turn, could accelerate large-scale research projects and the application of AI in areas such as advanced climate modeling and the simulation of biological and chemical systems.
