
Two Ways to Build Out Enterprise AI Infrastructure
May 15

Enterprises are using technologies such as AI, ML, data analytics, and automation to power new product development, improve product quality, streamline operations, and strengthen marketing results, among other business functions.
According to BCC Research, the global enterprise AI market will grow at a 44% CAGR from $8.3 billion in 2022 to $68.9 billion by 2028.
To capture these benefits, enterprises must decide how to design the networking and compute infrastructure that supports their AI initiatives while also meeting their business and technical goals.
A critical architectural choice for enterprises is where to perform data inferencing: on premises or in the cloud. A major factor in this decision is the latency threshold of the target use cases. Robotics and manufacturing, for example, need a real-time response in about 10 milliseconds.
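To make that threshold concrete, here is a toy latency-budget check in Python; the transport and inference numbers are illustrative assumptions, not measured or vendor figures:

```python
# Toy latency-budget check (illustrative numbers, not vendor benchmarks).
# Decide whether a use case can tolerate cloud inferencing or needs
# on-prem edge processing.

CLOUD_TRANSPORT_MS = 20.0  # assumed round trip to a public cloud region
EDGE_TRANSPORT_MS = 2.0    # assumed hop to an on-prem edge server

def meets_budget(inference_ms: float, transport_ms: float, budget_ms: float) -> bool:
    """True if transport plus inference time fits within the latency budget."""
    return inference_ms + transport_ms <= budget_ms

# A robotics use case: 10 ms real-time budget, 5 ms of model inference.
print("cloud ok:", meets_budget(5.0, CLOUD_TRANSPORT_MS, 10.0))  # False
print("edge ok:", meets_budget(5.0, EDGE_TRANSPORT_MS, 10.0))    # True
```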
Public Cloud Leads Market
Most of today’s enterprise AI/ML applications depend on the public cloud for data processing and inferencing. This approach benefits from the cloud’s scalability and computational power, but it introduces transport-network latency as data packets travel to the cloud server for processing and then back to the premises with the inference result.
A 2020 National Science Foundation paper studied the latency difference between edge servers and the cloud. It found that more than half of edge-server users experienced about 10 ms of latency, whereas more than 50% of cloud users experienced 20 ms. Only 3% to 21% of public cloud users saw 10 ms of latency, and only because those users happened to be physically close to a cloud server.
Adding 20 ms of transport latency on top of data processing time means performance will not meet the stringent real-time requirements of applications like robotic automation or security monitoring. Network congestion and potential bottlenecks in cloud processing can lengthen response times further, making the public cloud a poor fit for ultra-low-latency needs.
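A minimal sketch of how an enterprise might measure that round-trip cost before committing to a cloud architecture; the endpoint URL and payload below are hypothetical placeholders, not a real service:

```python
import time
import requests  # third-party HTTP client: pip install requests

# Measure end-to-end latency to a cloud inference endpoint.
# ENDPOINT and the payload are hypothetical placeholders; substitute
# whatever service the deployment actually calls.
ENDPOINT = "https://inference.example.com/v1/predict"  # hypothetical
payload = {"sensor_id": "cam-01", "frame": "<base64-encoded image>"}

samples_ms = []
for _ in range(20):
    start = time.perf_counter()
    requests.post(ENDPOINT, json=payload, timeout=5)
    samples_ms.append((time.perf_counter() - start) * 1000.0)

samples_ms.sort()
print(f"median: {samples_ms[len(samples_ms) // 2]:.1f} ms")
print(f"worst:  {samples_ms[-1]:.1f} ms")
```

For a real-time control loop, the worst sample matters more than the median: the deadline has to be met on every frame, not on average.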
On-Premises Provides Low Latency
The ideal solution is an on-premises AI architecture using edge servers connected via private 5G to deliver bounded latency for real-time inferencing workloads. This setup minimizes data transmission delays by keeping both sensor data and inferencing models within the local network, eliminating the need to send data to an external cloud.
Private 5G further enhances performance by offering a dedicated, high-bandwidth, and low-latency wireless connection between sensors and edge servers. This approach is ideal for applications like security cameras, robotics, and manufacturing, where split-second decision-making is critical. By processing data at the edge, enterprises can achieve real-time responsiveness while maintaining full control over the infrastructure.
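As a sketch of what keeping inferencing local looks like in practice, assuming an ONNX vision model running on the edge server (the model file name and input shape are placeholders):

```python
import time
import numpy as np
import onnxruntime as ort  # pip install onnxruntime

# Inference runs entirely on the local edge server: camera frames arrive
# over the private 5G LAN and never leave the premises. The model file
# and input shape are placeholders for whatever the deployment uses.
session = ort.InferenceSession("vision_model.onnx")
input_name = session.get_inputs()[0].name

frame = np.random.rand(1, 3, 224, 224).astype(np.float32)  # stand-in camera frame

start = time.perf_counter()
outputs = session.run(None, {input_name: frame})
print(f"local inference: {(time.perf_counter() - start) * 1000.0:.1f} ms")
# No WAN round trip in the path, so transport latency is just the local hop.
```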
Qualcomm’s New Edge AI Solution
One company that believes in the edge for enterprise AI is Qualcomm, which debuted its Qualcomm AI On-Prem Appliance Solution at CES in January 2025. The appliance is one of the first all-in-one enterprise AI edge servers on the market.
The server comes packaged in a standalone appliance that delivers local AI services: voice agents in a box; offload of small language models (SLMs), large language models (LLMs), and large multimodal models (LMMs); retrieval-augmented generation (RAG) for intelligent indexed search and summarization; agentic AI; AI workflow automation; image generation; code generation; computer vision; and camera processing.
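RAG is worth unpacking briefly: it retrieves relevant local documents and feeds them to a language model as grounding context. Here is a toy Python sketch of the retrieval step, using naive keyword overlap purely for illustration; this is a generic example, not Qualcomm's implementation:

```python
# Toy retrieval step for local RAG (naive keyword overlap, purely for
# illustration; a real system would use embeddings and a vector index).
docs = {
    "hr-policy.txt": "employees may request remote work through the hr portal",
    "it-runbook.txt": "restart the edge appliance via the management console",
    "safety-guide.txt": "robots must halt within 10 ms of an e-stop signal",
}

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank indexed documents by how many query words they contain."""
    words = set(query.lower().split())
    return sorted(docs, key=lambda d: len(words & set(docs[d].split())), reverse=True)[:k]

# Build a grounded prompt and hand it to the locally hosted SLM/LLM.
context = "\n".join(docs[d] for d in retrieve("how fast must robots halt"))
prompt = f"Answer using only this context:\n{context}\n\nQ: how fast must robots halt?"
print(prompt)
```

Because the index and the model both live on the appliance, the documents never leave the premises, which matters for the privacy and compliance concerns discussed below.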
This appliance is a step toward simplifying edge-based enterprise AI architectures. Qualcomm’s recent acquisition of Edge Impulse reinforces that commitment and sets the stage for more innovation.
For many of the fastest-growing enterprise AI applications, the need for low-latency networks can’t be overstated.
Conclusion
Bounded latency is a requirement for real-time applications, as cloud-based inferencing cannot match the near-instantaneous response of a fully localized solution. Additionally, reliance on external infrastructure raises concerns about data privacy, compliance, and potential connectivity disruptions.
With edge networking becoming easier to deploy, thanks to new products from companies like Qualcomm, an on-premises edge AI architecture with private 5G is the superior choice.