Many will look at 2023 as the year that generative AI (GAI) made silicon cool again. AMD, Intel, and NVIDIA made major announcements in machine learning and deep learning, with product launches and strategic roadmaps accompanying bold predictions. And, of course, the GAI wave shined a spotlight on a long list of silicon startups with heavy financial backing, including Cerebras, Groq, Graphcore, and so many others.
While the GAI hype dominated the attention of marketers and pundits alike, it was easy to overlook all the activity in the traditional CPU market. AMD, Ampere, and Intel were all very active in 2023, as were the major cloud service providers (CSPs). It’s this last bit of activity—from the CSPs—that was perhaps most interesting, and it raises the question I pose in the title of this article. Is the merchant silicon stranglehold on the market over? How do the chip makers, especially AMD and Intel, respond to a customer base increasingly looking inward for tailored silicon to meet specific compute and acceleration needs?
In the following sections, I’ll examine the CSP market dynamics that have changed the merchant silicon market and offer some predictions.
Arm and its Neoverse portfolio have changed the equation
At the risk of sounding like an old-timer, the CPU game was much simpler “back in the day.” When hyperscalers and the cloud were still in the hypergrowth phase, the choices for CPU were simple—AMD or Intel. When AMD’s Opteron became competitive, customers had a genuine alternative. Frankly, even when AMD was on the downside of its competitive peak, there was still real competition for business with CSPs and hyperscalers, because price-performance measurements were quite different from raw performance or synthetic benchmarks.
In the 2012–13 timeframe, Arm began to emerge as a potential server platform as companies including Qualcomm, Calxeda, Cavium, Applied Micro, and AMD (yes, even AMD) all adopted the Arm architecture for at least some of their chips. While these early projects all failed over time, they served an essential purpose in getting the software ecosystem to start embracing the Arm architecture. During this time, we saw groups such as Linaro drive open-source adoption of the Arm architecture. That was when we could see the currents starting to shift.
When Arm announced Neoverse (its datacenter portfolio) a few years later, adoption by the open-source community—and cloud providers—accelerated. Seemingly every open-source project began optimizing for both Arm and x86, with Arm seeing considerable adoption for cloud-native workloads. Beyond Neoverse, Arm (and the Arm community) did a lot to set common standards around system readiness to ensure software compatibility and base-level performance consistency.
Fast-forward to 2024, and we can see that Arm is everywhere in the cloud datacenter, either through in-house designs or through CPU vendors, especially Ampere.
Custom silicon — everybody’s doing it
In-house silicon design is nothing new for CSPs. In 2015, Google started using its internally designed Tensor Processing Unit (TPU) to accelerate machine learning. The company went public with the TPU in 2016 and made it available for third-party use in 2018.
Meanwhile, AWS acquired Annapurna Labs in 2015 for a reported price between $350 million and $370 million. In the nine years since the acquisition, the Annapurna team has delivered:
- Five generations of the AWS Nitro system
- Two generations of the AWS Trainium (training) accelerator
- Two generations of the AWS Inferentia (inferencing) accelerator
- Four generations of the AWS Graviton CPU
This is a serious return on investment for AWS. While the company also deploys CPUs, GPUs, and accelerators from traditional suppliers, by this point it can stand up its own silicon environment to support virtually any workload a customer may want to run.
Lastly, Microsoft has long been dabbling in custom silicon design and began supporting alternatives to the traditional x86 silicon vendors years ago. (Does anybody remember the Surface RT?) Its recent launch of the Maia AI accelerator and Cobalt CPU was covered extensively by Moor Insights & Strategy CEO Patrick Moorhead and me. You can read that analysis here.
As we exited 2023, AWS and Azure were deploying in-house-designed CPUs to support their single-socket workload demands. Google deploys Ampere’s Altra CPU, though I wouldn’t be surprised to see the company follow AWS and Azure’s lead with a CPU designed in-house. Finally, Oracle has also deployed Ampere at scale in Oracle Cloud Infrastructure (OCI). Given Oracle’s financial stake in Ampere, I don’t believe it will follow the in-house silicon design route.
If we zoom out a little, we can see the potential impact of these trends on chip makers, starting with AMD and Intel. Replacing, or at least significantly reducing, reliance on these companies for single-socket CPUs could cut deeply into their revenues, which in turn would constrain the investments the chip companies can make. This applies whether those investments are strategic bets or simply funding the continued delivery of a long roadmap.
AWS pushes the boundary
As mentioned above, virtually every cloud player is leveraging Arm for many single-socket workloads. Each of them is doing this for the cost savings and the ability to customize for its own specific power-performance requirements.
However, AWS’s launch of Graviton4 revealed both single- and dual-socket Graviton configurations built on the high-performing Neoverse V2 architecture. The real surprise was the two-socket configuration, which addresses the lion’s share of cloud datacenter workloads. If past performance is any indicator, the company will aggressively deploy these platforms wherever possible. And this could have a material impact on its current chip suppliers.
What’s next?
This is the million-dollar—okay, billion-dollar—question. Given the momentum of Arm in the single-socket space and the relatively recent launch of Microsoft’s Cobalt and AWS’s Graviton4, how will AMD and Intel be impacted in 2024?
Here’s my prediction. AWS will be very aggressive in rolling out Graviton4 across its datacenters. In time, Azure will release a dual-socket version of its Cobalt CPU and follow suit. As mentioned previously, Google will announce its own CPU. And these moves will diminish the stranglehold the x86 vendors have had on cloud computing.
Further, I believe we will see Arm gain traction in AI as NVIDIA demonstrates high performance for training with the Grace Hopper superchip (which uses the Arm architecture). This, combined with consistent, strong single-threaded performance, will make Arm a strong alternative in the wave of AI inference that has yet to hit in full force.
This prediction is not to say that AMD and Intel are in danger. Both companies have leaned heavily into enabling AI and other workloads with distinct acceleration requirements. An example of this is the design work Intel has put into building discrete acceleration engines into its Xeon server CPUs to drive better performance for AI, analytics, and other data-intensive workloads.
Additionally, both companies have made significant investments in building out their respective acceleration portfolios—including GPUs, DPUs, ASICs, FPGAs, and other accelerator types. They are doing this through both in-house design and acquisition. And these investments are why I believe both companies are going to continue to be significant players in the datacenter.
Closing thoughts
If there has ever been a golden age of silicon, we are living in it. The market dynamics (cloud, edge, mobile) intersecting with the workloads driving the modern enterprise have led to a wave of innovation unlike any we’ve experienced before.
This innovation, along with all the new market entrants rushing in, makes for a crazy but exciting time for those who are tracking the market—each of us with our unique views and predictions. Only time will tell who was looking through the right crystal ball.