When Arm launched its Neoverse family of compute platforms in 2018, it set the stage for a fundamental shift in the cloud service provider (CSP) market. The shift itself came when the Neoverse launch was combined with AWS’s deployment of its Arm-based Graviton processor. Arm’s rich, dedicated set of cores and IP, together with real-world proof that it could serve the biggest of CSP customers, started this rush to infrastructure diversity.
Almost six years later, Arm’s presence in the cloud datacenter is well-established and growing. Support that began with cloud-native and scale-out workloads is expanding to higher-performance and scale-up workloads. With the launch of its third-generation Neoverse platform, Arm is focusing on performance (ahem, AI), flexibility, and ecosystem, the three things that have led to its success so far.
In the following sections, I’ll dig into this third generation of Neoverse, its delivery via Arm’s compute subsystem (CSS) program, and what this means for CSPs. I’ll also dig a little deeper into this CSP–Arm relationship.
A Quick Overview of Neoverse and How the Current Lineup Is Positioned
Understanding that its customers’ needs differ, Arm offers three Neoverse platforms from which partners can design CPUs. The E-Series is a power-efficient core targeted at networking and edge deployments. The N-Series balances performance and power efficiency. CPUs built on the N2 are what we see primarily deployed in CSP datacenters (e.g., Azure Cobalt, AWS Graviton 3). Finally, the V-Series is optimized for maximum performance. The V-Series core is the underlying architecture for NVIDIA’s Grace CPU and the recently launched AWS Graviton 4.
Neoverse has also been used to develop merchant silicon by Ampere Computing, which has emerged as the leading U.S. developer of CPUs for cloud providers. Ampere products include Altra (N-Series) and AmpereOne (V-Series). The company’s cloud customers include Azure (before the launch of Cobalt), Google Cloud Platform, and Oracle Cloud Infrastructure. Its OEM partners include HPE, Lenovo, and Supermicro.
With the release of the third-generation Neoverse, we’ve seen significant shifts in how each core is positioned. The V-Series, once primarily targeting HPC and other highly performant workloads, now has expanded coverage to include many of the workloads that power the cloud datacenter. The V3 can support up to 256 cores per socket. This gives designers many options for building workload- and environment-specific silicon.
The N3 is all about performance-per-watt (PPW), demonstrating a 20% PPW advantage over the N2. Arm also claims a 196% gain in machine learning performance over the previous generation, thanks mainly to a private 2MB L2 cache per core. The N3 targets cloud, telco networking, and edge. It is also a strong fit for data processing unit (DPU) roles such as networking (SmartNIC), security, and storage services. Note that the N3, like the V3, can support up to 256 cores per socket. However, given its focus on PPW and on deployments such as telco, RAN, and enterprise networking, I don’t expect to see N3 CPUs with this core count hit the market.
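To put the 20% PPW figure in concrete terms, here is a minimal Python sketch of the arithmetic. It assumes the gain applies uniformly across a workload, which real deployments will not guarantee, and the function name `scale_for_ppw_gain` is purely illustrative rather than anything from Arm.

```python
def scale_for_ppw_gain(ppw_gain: float) -> dict[str, float]:
    """Express a relative performance-per-watt gain in two equivalent ways.

    ppw_gain is a fraction, e.g. 0.20 for a 20% PPW advantage.
    """
    if ppw_gain <= -1.0:
        raise ValueError("ppw_gain must be greater than -1.0")
    ratio = 1.0 + ppw_gain
    return {
        # Same power budget: throughput scales by the PPW ratio.
        "throughput_at_same_power": ratio,
        # Same throughput: power draw scales by the inverse of the ratio.
        "power_at_same_throughput": 1.0 / ratio,
    }


# Hypothetical comparison using the 20% figure quoted for N3 vs. N2.
n3_vs_n2 = scale_for_ppw_gain(0.20)
print(f"Throughput at equal power: {n3_vs_n2['throughput_at_same_power']:.2f}x")
print(f"Power at equal throughput: {n3_vs_n2['power_at_same_throughput']:.2f}x")
```

Under that assumption, a designer could take the gain as roughly 1.2x the throughput in the same power envelope, or as about 17% less power for the same work. Which of the two matters more depends on whether the deployment is power-constrained or throughput-constrained.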
The E-Series is focused on delivering low power and high throughput for networking services.
![](png/neoverse-workload-coverage.png)
If you have some history with Neoverse, you may notice that the V-Series coverage has expanded to include what was traditionally N-Series coverage. Likewise, the N-Series seems to have crept down into coverage traditionally handled by the E-Series. This makes sense, given the emergence of the many different workloads now populating the cloud datacenter.
Arm CSS — Why It Matters
One of the shifts I noticed with the release of third-generation Neoverse is the emphasis on Neoverse Compute Subsystems (CSS). CSS is a program through which Arm takes Neoverse IP, optimizes it, and delivers to customers a platform that they can further optimize for their specific needs.
CSS is the answer for silicon designers or organizations that want to reduce time-to-product, time-to-deployment, and time-to-value. Arm makes some bold claims about CSS, citing one customer that saved up to 80 engineer-years in design and development. While this will surely not apply to every organization, the ability to shorten that all-important time-to-tape-out from years to months is a significant benefit.
![](png/arm-css-time-to-market-advantages.png)
As I mentioned, Arm leans heavily on CSS as the delivery model for V3 and N3. And as I listened to the reasoning during briefings, I got it. Cloud providers and silicon designers alike want to better align software and hardware roadmaps to deliver fully optimized and highly efficient operating environments. As they do that, CSS helps remove the big gaps that have traditionally existed between the two development cycles—where software trends and innovations routinely outpace hardware design.
One more thing to note with CSS. While the V-Series and N-Series can support up to 256 cores each, there is no such CSS implementation at this time. If designers want to create a 256-core CPU based on V-Series or N-Series, they will have to do it from the ground up. Again, this makes sense because CSS is designed to address a large swath of the market, not the more specific scenarios that would require such a large number of cores.
Closing Thoughts
Arm has forever changed the CSP market, and the CSP market has, in turn, forever changed the CPU market. Because of Arm Neoverse and the company’s perseverance, cloud providers can now drive differentiation in their services like never before. Further, they can deliver this differentiation while driving down costs.
The adoption of Arm by CSPs continues to impact the CPU market. The prevalence of Arm across the cloud datacenter is undoubtedly putting pricing pressure on the incumbent suppliers. And as Arm’s footprint grows, so too will those pressures. Consider this: some estimates show Arm covering more than 10% of the cloud market already. That share has been reached with just one major CSP (AWS) having deployed Arm for a significant length of time, and by addressing only single-socket servers.
As Graviton 4 expands into two sockets, as Azure further develops Cobalt, as Oracle expands its Ampere deployments, and as Google gets more serious about its Arm deployments, we should look for even more significant gains from Arm.
This growth could, in turn, spur several different dynamics. For starters, I don’t expect to see AMD and Intel sit idly by as their market share goes elsewhere.
Whatever happens, buckle up. This is going to be a fun ride.