RESEARCH PAPER: The Case for a Managed AI and ML Model Infrastructure

By Jason Andersen, Patrick Moorhead - December 5, 2024

As artificial intelligence (AI) and machine learning (ML) gain market momentum, foundation models are being consumed and managed differently. Increasingly, organizations want to build and manage their own models that balance cost, reliability, and specificity of model outputs. Over the past 12 months, this balance has become easier to achieve for these reasons:

  • More diverse and efficient foundation models — The recent proliferation of different sizes and types of AI and ML models has lowered the barriers for executing industry- or organization-specific training and fine-tuning. This leads to higher quality and more accurate outputs from existing foundation models or new models derived from them.
  • Improved tooling and IT operational alignment — We have seen a maturing of data scientist tools in the form of both notebooks and IDEs. The industry has further improved tooling by taking steps to better align traditional IT tools such as Kubernetes and CI/CD pipelines with AI and ML needs. This has removed barriers between data scientists and IT ops resources.
  • More robust infrastructure — Another key ingredient in reducing the cost of custom models has been the rapid expansion and improvement of AI and ML infrastructure, including the development of GPUs, TPUs, and accelerators. This has led to a drastic reduction in costs, such that some training implementations cost a small fraction of what they did even a year ago.

While these factors are sure to further increase the pace of development for AI and ML applications, challenges remain if an organization wants to host its own AI or ML infrastructure, including:

  • Choice — An internally built and managed AI and ML environment may limit an organization’s choices for infrastructure, networking, models, and tooling due to availability or budgetary constraints. Also, rapid changes in the market may prompt the organization to regret recent decisions.
  • Costs — The costs of change as infrastructure matures are well understood. But AI and ML infrastructure is a particularly complex and high-cost effort that requires constant human attention. There are also costs associated with how the GPUs are utilized and governed amid competing priorities and projects.
  • Time to market — An additional challenge associated with large GPU-driven neural networks is managing reliability issues as components inevitably fail. These failures raise the potential for training runs to be halted or lost, which slows time to market.

So, although the barriers to entry for building and training AI and ML models continue to shrink, organizations may want to consider a managed end-to-end offering such as Amazon SageMaker HyperPod to improve time to market and overcome the complexities and costs associated with on-premises models.

Download the research paper to read more.

Table of Contents

  • Introduction
  • The Amazon Web Services Approach to AI
  • A Closer Look at Infrastructure for Training and Inference
  • SageMaker and the Maturation of AI and ML
  • SageMaker HyperPod and Reducing Time to Market
  • Other Benefits of SageMaker HyperPod
  • Conclusion

Companies Cited:

  • AWS

Jason Andersen

Jason Andersen is vice president and principal analyst covering application development platforms, technologies, and services. Jason brings over 25 years of experience in product management, product marketing, corporate strategy, sales, and business development at Red Hat, IBM, and Stratus to his work for MI&S and its advisory clients. Working both in the field and in the headquarters of some of the most innovative technology companies, Jason has a wealth of experience in building great products and driving their adoption across a broad spectrum of industries and use cases.

Patrick Moorhead

Patrick founded the firm based on his real-world technology experiences and his understanding of what he wasn’t getting from analysts and consultants. Ten years later, Patrick is ranked #1 among technology industry analysts in terms of “power” (ARInsights) and “press citations” (Apollo Research). Moorhead is a contributor at Forbes and frequently appears on CNBC. He is a broad-based analyst covering a wide variety of topics, including the cloud, enterprise SaaS, collaboration, client computing, and semiconductors. He has 30 years of experience, including 15 years of executive experience at high-tech companies (NCR, AT&T, Compaq, now HP, and AMD) leading strategy, product management, product marketing, and corporate marketing, including three industry board appointments.