In the heart of Microsoft’s Redmond campus lies a closely guarded lab where engineers test and refine silicon, the fundamental building block of the digital era. After years of development out of public view, Microsoft unveiled two custom-designed chips and integrated systems at Microsoft Ignite in November 2023: the Azure Maia AI Accelerator, optimized for artificial intelligence tasks and generative AI, and the Azure Cobalt CPU, an Arm-based processor tailored for general-purpose compute workloads on the Microsoft Cloud.
The chips are the last piece Microsoft needed to deliver infrastructure systems designed entirely in-house. Every layer, from silicon choices to software, servers, racks, and cooling systems, can now be tuned for both internal and customer workloads.
Scott Guthrie, Executive Vice President of Microsoft’s Cloud + AI Group, emphasized the company’s commitment to building infrastructure that supports AI innovation and meets the diverse needs of its customers. “At the scale we operate, it’s important for us to optimize and integrate every layer of the infrastructure stack to maximize performance, diversify our supply chain, and give customers infrastructure choice.”
Chips: The Backbone of Cloud Computing
Chips are the workhorses of the cloud: each one packs billions of transistors that process vast streams of data. Microsoft sees adding its own chips as a way to ensure that every element is tuned for its cloud and AI workloads. The chips will be mounted on custom server boards and placed in tailor-made racks, with the hardware designed alongside Microsoft’s software.
The goal is an Azure hardware system that offers maximum flexibility and can be optimized for power, performance, sustainability, or cost, according to Rani Borkar, Corporate Vice President for Azure Hardware Systems and Infrastructure (AHSI). The chips are part of Microsoft’s larger strategy to co-design hardware and software so that the combination delivers more than either could alone.
Optimizing the Tech Stack
During the Ignite event, Microsoft also announced the general availability of Azure Boost, a system designed to accelerate storage and networking by offloading these processes from host servers onto purpose-built hardware and software.
Expanding its collaboration with industry partners, Microsoft introduced the NC H100 v5 Virtual Machine Series, built for NVIDIA H100 Tensor Core GPUs. It will also add the latest NVIDIA H200 Tensor Core GPU to its fleet next year to support larger model inferencing with no increase in latency. In addition, Microsoft plans to bring AMD MI300X accelerated VMs to Azure for AI workloads.
Borkar said that adding first-party silicon to a growing ecosystem of chips and hardware from industry partners gives customers more infrastructure options. The aim is to put customer needs first by offering choices across performance, price, and other dimensions that matter to them.
Co-Evolving Hardware and Software
The Maia 100 AI Accelerator is designed to power some of the largest internal AI workloads running on Azure. Microsoft’s collaboration with OpenAI has helped refine the chip so that it runs those workloads efficiently and cost-effectively on Microsoft’s infrastructure.
According to Brian Harry, a Microsoft technical fellow leading the Azure Maia team, vertical integration, meaning the alignment of chip design with the larger AI infrastructure, can yield large gains in performance and efficiency and helps ensure the hardware is fully utilized.
The Cobalt 100 CPU, built on Arm architecture, targets efficiency and performance in cloud-native offerings. Wes McCullough, Corporate Vice President of hardware product development, tied the choice to Microsoft’s sustainability goal of improving “performance per watt,” that is, getting more computing power for each unit of energy consumed.
Custom Hardware: A Holistic Approach
Microsoft’s move toward custom hardware began in 2016, when it started building its own servers and racks to optimize its cloud infrastructure; silicon was the remaining piece. Designing its own chips lets Microsoft target specific qualities and tune performance for its most important workloads.
Each chip is tested under varying conditions that mirror real-world scenarios in Microsoft’s data centers. The chips are then integrated into custom-designed racks built for the intensive computational demands of AI tasks, reflecting a systems approach to infrastructure development.
Liquid cooling, implemented in the Maia 100 rack with a companion “sidekick” unit, addresses the thermal challenges of these high-performance chips while maintaining efficiency.
Future Innovations and Mission
Microsoft does not intend to stop at the first generation: plans for second-generation Azure Maia AI Accelerators and Azure Cobalt CPUs are already underway. The mission remains the same: optimize every layer of the technology stack, from core silicon to end services, to give Azure customers the best possible experience.
By combining in-house chips with hardware from industry partners, Microsoft aims to give businesses and developers more choice in performance, power efficiency, and cost for their cloud and AI workloads.