
In 2024, we witnessed an unprecedented explosion in artificial intelligence (AI) innovation, leaving many people in awe of the rapid advancements. Tech behemoths raced to secure the most powerful GPUs to train even more capable large language models (LLMs), and AI is now seamlessly entering every nook and cranny of the world as we know it.
Amid the whirlwind of new AI companies, models, and applications, one trend emerged with clarity: the pendulum is swinging from AI training toward AI inference. Bloomberg projects that AI inference will grow into a US$1.3 trillion market by 2032, a forecast echoed by a number of other recent reports. This market shift suggests that 2025 is the year distributed AI inferencing accelerates.
While training will continue to be pivotal in the production of robust AI models, the future will be focused on inference: the art of deploying these models to deliver real-time, actionable insights and outcomes for businesses and consumers alike. Inference at the edge also injects dynamic feedback loops back into the training process, fostering a cycle of continuous model improvement and adaptation.
AI inference is the point at which AI transforms from the promise of potential into practical application with real-world impact. Our customers are employing AI inference across a spectrum of industries and use cases. These include:
These examples merely scratch the surface of what customers can achieve with AI inferencing. As edge computing continues to evolve, we anticipate even more innovative applications across diverse sectors.
The delivery of such innovation has created some common challenges, including latency, cost, and scalability. At Akamai, we’ve been solving these issues in various contexts for decades.
Consolidating swathes of generalized, overpowered GPUs into centralized data centers is no longer sufficient to deliver the outputs of well-trained AI models at the scale and with the responsiveness that users demand. We need an entirely new paradigm that brings inference architectures closer to users: a distributed cloud model.
There are unique considerations when delivering AI inference via a distributed cloud model, including:
These considerations are critical aspects of deploying AI on distributed, decentralized cloud infrastructure and are drawing the focus of organizations looking to use AI effectively in their businesses.
At Akamai, we’re building the world’s most distributed cloud. Our extensive global infrastructure, developed over nearly 30 years, includes 25+ core compute regions, a rapidly expanding set of distributed compute locations, and more than 4,000 edge points of presence. This robust ecosystem is primed to meet the AI inference needs of organizations, for today and tomorrow.
We recognize that while customers demand high performance, they’re increasingly wary of the exorbitant cost overruns that are common with traditional cloud vendors. Akamai Cloud is designed to address this growing concern.
Instead of stockpiling expensive, generalized GPUs that are overkill for AI inference tasks, we’ve opted to provide customers with a balanced GPU alternative: Nvidia’s RTX 4000 Ada series GPUs, which offer a blend of performance and cost efficiency, making them ideal for AI inferencing, running small language models, and specialized workloads like media transcoding.
This approach allows us to deliver superior AI capabilities closer to users while maintaining cost-effectiveness for our customers. Our testing has shown cost savings of more than 80% when running a generative AI Stable Diffusion model, compared with equivalent GPU alternatives available from traditional public cloud providers.
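To make that kind of benchmark workload concrete, here is a minimal sketch of running Stable Diffusion inference on a single GPU using the open-source Hugging Face diffusers library. This is an illustrative assumption rather than a description of the exact stack used in our testing; the model checkpoint, prompt, and sampling settings are placeholders.

```python
# Minimal Stable Diffusion text-to-image inference on one GPU.
# Assumption: Hugging Face diffusers stack; the checkpoint, prompt, and
# step count below are illustrative placeholders, not benchmark settings.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # example checkpoint
    torch_dtype=torch.float16,          # half precision keeps VRAM needs modest
)
pipe = pipe.to("cuda")                  # run on the attached GPU

prompt = "a photorealistic city skyline at sunset"
image = pipe(prompt, num_inference_steps=30).images[0]
image.save("skyline.png")
```

A single request like this is short-lived and latency-sensitive, which is why it maps naturally onto the smaller, inference-class GPUs described above rather than onto large training clusters.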
We believe this approach yields the most powerful and cost-efficient outcomes, and can encourage novel AI use cases.
As we continue to enhance AI’s usefulness, we believe that distributed inferencing is more than just a technological advancement — it’s a fundamental reimagining of how we use AI. The shift from centralized, resource-intensive computing to a continuum of distributed, efficient edge computing isn’t just inevitable, it’s already underway.
At Akamai, we’re not just observing the transformation — we’re actively shaping it. By combining the strength of our global distributed network, strategic cloud computing investments (including inference-optimized GPUs), and a deep understanding of performance and cost-efficiency, we’re focused on enabling organizations to unlock the true potential of AI inference.
Organizations have recognized that it isn’t a question of if but rather how they should embrace AI inference. The edge is no longer just a destination for data — it’s becoming the primary arena where AI delivers its most impactful, real-time insights. Welcome to the next generation of computing.
Interested in learning more about AI inference performance benchmarks? Read our white paper.