
Getting infrastructure right for generative AI

Facts, it has been said, are stubborn things. For generative AI, a stubborn fact is that it consumes very large quantities of compute cycles, data storage, network bandwidth, electrical power, and air conditioning. As CIOs respond to corporate mandates to “just do something” with genAI, many are launching cloud-based or on-premises initiatives. But while the payback promised by many genAI projects is nebulous, the costs of the infrastructure to run them are finite, and too often, unacceptably high.

Infrastructure-intensive or not, generative AI is on the march. According to IDC, genAI workloads are growing from 7.8% of the overall AI server market in 2022 to 36% in 2027. The curve in storage is similar, with genAI growing from 5.7% of the AI storage market in 2022 to 30.5% in 2027. IDC research finds that roughly half of worldwide genAI expenditures in 2024 will go toward digital infrastructure, and IDC projects the worldwide infrastructure market (server and storage) for all kinds of AI will double, from $28.1 billion in 2022 to $57 billion in 2027.

But the sheer quantity of infrastructure needed to process genAI’s large language models (LLMs), along with its power and cooling requirements, is fast becoming unsustainable. “You will spend on clusters with high-bandwidth networks to build almost HPC [high-performance computing]-like environments,” warns Peter Rutten, research vice president for performance-intensive computing at IDC. “Every organization should think hard about investing in a large cluster of GPU nodes,” Rutten says, asking: “What is your use case? Do you have the data center and data science skill sets?”
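One way to put these projections in perspective is the compound annual growth rate they imply. The short Python sketch below works that out from the IDC figures quoted above; the cagr helper is our own illustration for this article, not IDC methodology.

```python
# Rough arithmetic check on the IDC projections cited above.
# Dollar figures and percentages are IDC's as quoted; the helper is ours.

def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by a start/end value pair."""
    return (end / start) ** (1 / years) - 1

# Worldwide AI infrastructure market (server + storage), USD billions.
rate = cagr(28.1, 57.0, years=2027 - 2022)
print(f"Implied market growth: {rate:.1%} per year")  # ~15.2% annually

# genAI's share of the AI server market: 7.8% -> 36% over the same span.
share = cagr(7.8, 36.0, years=2027 - 2022)
print(f"Implied share growth: {share:.1%} per year")  # ~35.8% annually
```

Roughly 15% annual growth in the overall market, against nearly 36% annual growth in genAI’s share of it, is what makes the “on the march” claim concrete.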

Full feature: How companies are ensuring a cost-effective approach to delivering the massive storage, bandwidth, and computing resources necessary for genAI.