In March 2024, Broadcom launched its new observability platform, WatchTower Platform, to provide organizations with a better way to manage and improve the efficiency and reliability of mainframe operations.
Mainframe computers, especially from IBM, have powered the most mission-critical workloads for many prominent organizations for decades. As technology trends have come and gone and cloud computing has taken center stage, the mainframe is the one constant in global organizations’ datacenters. The reason for this is simple: mainframes are highly performant, reliable, and secure. WatchTower uses observability and AIOps to provide organizations with a comprehensive overview of their mainframes’ performance. It predicts potential problems from past trends, stops minor issues from turning into big problems, and helps you actively oversee your mainframe operations.
In this article, I’ll explain what WatchTower does and how it fits Broadcom’s overall strategy. I have also asked my Moor Insights & Strategy colleagues Matt Kimball and Will Townsend—who have long experience in storage, networking, and datacenter operations—to weigh in about the impact of WatchTower on the mainframe industry and the challenges ahead.
Broadcom’s Transformation From Silicon To Systems
Broadcom offers a wide range of semiconductor and infrastructure software products, focusing on datacenters, enterprise networking, and the broadband, wireless, and industrial sectors. Originally Avago Technologies, the company bought Broadcom Corporation for $37 billion in 2016 and took on the Broadcom name.
Over the past several years, Broadcom has expanded its portfolio (and reach) through a series of acquisitions in the enterprise infrastructure software market. Notable acquisitions include CA Technologies, formerly Computer Associates (2018), Symantec’s enterprise security business (2019), and VMware (2023). The company has leveraged these acquisitions to address the range of platforms and operations that drive the enterprise datacenter.
WatchTower is the latest example of the value-add the company seeks to create. It represents an opportunity for global organizations to gain deeper insights and operational efficiencies for their IT environments as Broadcom enters the observability market.
Mainframe Management
Mainframe computers are designed from the ground up with reliability in mind, which is why they have been a staple in IT organizations since their inception. (In fact, mainframes were around for several decades before the term “information technology” was first used.) However, like any compute platform, mainframes are not infallible. Even while they are running reliably, and despite the best efforts of IT professionals and COBOL programmers, mainframes can sometimes operate at less than peak performance. Managing the underlying compute, storage, and networking to support IT programs optimally is difficult, and virtually any IT organization can benefit from observability tools that abstract complexity and improve performance through automation.
WatchTower improves the management of mainframes by offering deep visibility and automation. It consolidates the detection and resolution of problems by aggregating and examining data from different sources, and it provides dashboard views that illustrate the health of the mainframe and its applications. The platform uses AIOps to enhance efficiency and reliability, minimizing the need for manual intervention. WatchTower combines operational tools, workflows, data, and ML insights from across the IT environment into a single, easy-to-navigate platform that suits users of all skill levels. With that combined view, it can detect patterns that signal emerging problems so they can be addressed quickly. Meanwhile, OpenTelemetry collects detailed data on application performance, providing a clearer view of the mainframe’s operation and how the environment is interconnected. (More on OpenTelemetry below.)
The diagram above outlines the WatchTower platform’s primary components and capabilities. It includes essential elements such as OPS/MVS for automating responses to system events, SYSVIEW for overseeing mainframe performance, NetMaster Network for improving network access, Vantage Storage for more efficient storage management, and DCI/MAT for managing capacity. Among its capabilities are Alert Insights for incident workflow management, ML Insights for preemptive problem-solving intelligence, Topology for supporting better business results, and Real-Time Streaming for continuous transaction workflow observation.
The WatchTower platform is built to shift IT operations from reactive to proactive, enhancing performance and ensuring business continuity. The product’s open architecture also allows it to work with observability and analytics tools such as Datadog and Splunk, facilitating deeper integrations. WatchTower brings a broad set of observability features to bear and does so within mainframe architectures that are saddled with legacy infrastructure integration challenges.
Mainframe Customers
As mentioned earlier, the largest organizations use mainframes to drive their most mission-critical business operations. The biggest banks, insurance providers, healthcare companies, retailers, and government entities all employ mainframes for a reason—because they need to process substantial amounts of data to handle thousands of transactions per second. These systems support high-availability operations such as processing financial transactions, managing large databases, and running complex applications. Banks use mainframes to process credit card transactions and maintain accurate account balances, while government agencies use them to deliver many services that constituents rely on daily.
For many years, mainframe operations have been managed with standalone tools and disconnected workflows. The challenge with such an environment is that data stays locked in silos, unshared and never considered in the fuller context that aggregation would provide. WatchTower delivers that aggregated view so IT professionals can more easily see correlation and causation across the application, data, and storage environments. This allows them to find new efficiencies and stave off performance degradation or system failures.
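To make the silo problem concrete, here is a minimal Python sketch of what aggregating events from separate tools might look like. It is an illustration only, not how WatchTower works internally: the `correlate` function, the event fields (`ts`, `source`, `msg`), and the sample events are all hypothetical. It groups events into fixed time windows and keeps only the windows in which more than one source reported something.

```python
from collections import defaultdict


def correlate(events, window_seconds=60):
    """Group events into fixed time windows; keep windows touched by 2+ sources."""
    buckets = defaultdict(list)
    for ev in events:
        # Integer division assigns each timestamp to a fixed-size window.
        buckets[ev["ts"] // window_seconds].append(ev)

    correlated = []
    for key in sorted(buckets):
        group = buckets[key]
        sources = {ev["source"] for ev in group}
        if len(sources) > 1:
            # Order events in time so a likely sequence is easier to read.
            correlated.append(sorted(group, key=lambda ev: ev["ts"]))
    return correlated


# Hypothetical events, each as it might arrive from a separate, siloed tool.
events = [
    {"ts": 965, "source": "storage", "msg": "volume latency high"},
    {"ts": 980, "source": "application", "msg": "transaction timeout"},
    {"ts": 1500, "source": "network", "msg": "link flap"},
    {"ts": 1010, "source": "performance", "msg": "I/O wait rising"},
]

groups = correlate(events)
for group in groups:
    print([f'{ev["source"]}: {ev["msg"]}' for ev in group])
```

Viewed one tool at a time, each of the three clustered events looks like an isolated alert. Seen together, they suggest a storage slowdown rippling up to the application. Fixed windows are a simplification, since events that straddle a window boundary are split apart, and a time-based grouping shows correlation only, not causation.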
A good datacenter observability strategy needs to extend to both mainframe and non-mainframe elements. Mainframe observability offers real-time insight into mainframe performance, while non-mainframe observability monitors, troubleshoots, and analyzes servers, applications, and networks. Together, they provide a comprehensive view of the IT infrastructure, allowing for quick identification and resolution of issues.
Benefits from OpenTelemetry
OpenTelemetry is an open-source set of tools, APIs, and SDKs for gathering and exporting telemetry data—such as traces, metrics, and logs—from cloud-native applications. It is designed to be compatible with many observability platforms, enabling developers to add telemetry to their code easily. This helps them monitor system performance and behavior.
Integrating OpenTelemetry into WatchTower brings several benefits. For starters, it gives IT analysts real-time insights to identify and assess bottlenecks affecting user experience. The integration also improves visibility throughout applications, enhancing the efficiency of monitoring and troubleshooting. In addition, OpenTelemetry provides standardization with its common set of data formats, which helps integrate mainframes into service management strategies. By using OpenTelemetry, WatchTower can integrate with broader enterprise observability tools, providing deeper levels of assurance and security.
AIOps Capabilities
While AIOps is a term that has been around for nearly a decade, in truth, this IT function has existed more in slideware than in the real world. The promise of automating IT operations through machine learning and advanced analytics—and the complexity of delivering on that promise—is what has fueled the success of companies such as ServiceNow. The popularity of that company is due in part to the strength of its solution, but it is augmented by the simple observation that the promise of AIOps has mostly gone unmet for years, and for good reason: AIOps is hard to deploy. Grabbing telemetry and performance data from across the enterprise to find correlation and causation before problems even manifest requires an advanced approach to systems management. Further, turning those findings into concrete corrective or even preventative actions elevates this approach to an entirely new level of difficulty.
AIOps is perhaps even more complex for the mainframe market, given that it has operated as an underserved segment for some time—despite supporting such critical workloads. WatchTower addresses this by using ML algorithms trained specifically for mainframes to better baseline and monitor operations. Besides its other benefits, this approach enables IT organizations to embrace automated IT operations at their own pace, moving from augmentation of traditional operations to full automation over time.
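For readers who want a feel for what baselining means in practice, the following Python sketch flags samples that deviate sharply from a rolling baseline. It is a deliberately simple statistical stand-in, not Broadcom’s ML approach, and the `find_anomalies` function, its parameters, and the sample CPU figures are assumptions made up for illustration.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return (index, value) pairs that deviate from a rolling baseline."""
    anomalies = []
    for i in range(window, len(samples)):
        # The baseline is the `window` samples immediately before this one.
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # A perfectly flat baseline gives no scale for deviation; skip it.
            continue
        z_score = (samples[i] - mu) / sigma
        if abs(z_score) > threshold:
            anomalies.append((i, samples[i]))
    return anomalies


# Hypothetical CPU utilization percentages sampled at a fixed interval.
cpu_busy = [41, 43, 42, 44, 42, 43, 41, 88, 43, 42]
print(find_anomalies(cpu_busy))  # prints [(7, 88)]
```

Even this toy version hints at why the problem is hard: once the spike enters the window, it inflates the baseline for the samples that follow, and a fixed threshold ignores daily or monthly workload cycles. Handling such effects is where models trained on real mainframe behavior would be expected to do better.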
Broadcom’s support for OpenTelemetry demonstrates the company’s understanding of how WatchTower will be used in the enterprise. Beyond that, it shows that the company knows what will enable the solution to deliver on the openness and robustness that make AIOps work best.
Wrapping Up
The fact that mainframes have existed for so many decades is a testament to their performance and resiliency. However, as with servers in the modern datacenter, a modernized approach to monitoring and managing mainframe operations enables even greater operational efficiency. While observability platforms have emerged to drive incredible value for the cloudified IT landscape, Moor Insights & Strategy has seen the mainframe environment as underserved in this respect.
With WatchTower, Broadcom is meeting the needs of large enterprise organizations that rely on mainframes for crucial functions and require a modernized IT operations platform across the entire datacenter. The company is delivering a highly capable observability platform that provides customers with deeper visibility to manage the complexity of mainframe operations—and in an automated way. By improving awareness and resilience like this, Broadcom has delivered a potential game changer.
Note: Moor Insights & Strategy principal analysts Will Townsend and Matt Kimball also contributed to this article.