White Paper: Build with Relentless Intelligence

Fast AI inference. Maximum efficiency. Real-world deployment.

As AI inference becomes the dominant workload, organisations face a new constraint: delivering high-performance AI at scale without exceeding available power, cooling, and operational capacity. SambaNova addresses this challenge with a purpose-built, full-stack AI platform designed from silicon to software for fast inference with unmatched energy efficiency.

At the core of the platform is SambaNova’s Reconfigurable Dataflow Unit (RDU), engineered specifically for AI workloads. Combined with air-cooled, power-optimised racks and a unified orchestration layer, SambaNova enables organisations to run the largest open-source models, including Llama 4 and DeepSeek 671B, with lightning-fast inference and significantly lower power consumption.

This paper outlines how SambaNova’s platform supports developers, enterprises, governments, and data-centre operators through three integrated offerings: SambaCloud for rapid development, SambaStack for sovereign and on-premises deployment, and SambaManaged for launching fully managed inference clouds in as little as 90 days.

The result is AI infrastructure that is efficient, scalable, and deployable in standard data-centre environments, without exotic cooling or specialist facilities, enabling organisations to confidently own and operate their AI future.

Product Brief: SambaRack™ SN40L-16

High-performance AI inference, engineered for efficiency at scale

SambaRack™ SN40L-16 is a rack-scale hardware system designed to run the most demanding AI inference workloads with exceptional performance and industry-leading power efficiency. Built around SambaNova’s purpose-built SN40L Reconfigurable Dataflow Unit (RDU), SambaRack enables organisations to run the latest and largest open-source models, including multi-model and agentic workloads, within a compact, air-cooled footprint.

Unlike legacy GPU-based systems, SambaRack uses a dataflow architecture that dynamically maps AI algorithms directly to the processor, eliminating architectural redundancy and maximising tokens per watt. A single rack can run dozens of models simultaneously, switching between them in microseconds, while typically consuming around 10 kW, making it suitable for deployment in standard data-centre environments. 

The system’s three-tiered memory architecture combines large SRAM, high-bandwidth memory, and extensive attached DDR to support up to trillions of parameters across hundreds of models. This makes SambaRack particularly well suited to large-scale inference, custom checkpoints, chained models, and agentic AI configurations, without sacrificing latency or efficiency. 

SambaRack is deployable on-premises or in hosted data centres and forms the hardware foundation of both SambaCloud and SambaStack, enabling organisations to scale AI inference quickly, efficiently, and with full control over their data and models.

Platform Brief: SambaStack

The purpose-built, full-stack platform for high-speed AI inference

SambaStack is a full-stack AI inference platform designed to deliver accurate, production-scale inference at high speed and with exceptional efficiency. Built to support the most demanding workloads, SambaStack enables organisations to run ever-larger models, including Mixture of Experts (MoE), reasoning, and chain-of-thought models, without compromising latency or user experience. 

Unlike legacy platforms that run a single model per rack or require specialised liquid cooling, SambaStack is engineered for multi-model operation on a single, air-cooled system, typically consuming around 10 kW of power. This makes it suitable for deployment in standard data-centre environments while significantly reducing space, power, and operational cost. 

Powered by SambaNova’s SN40L Reconfigurable Dataflow Unit (RDU), SambaStack uses a dataflow architecture and three-tier memory design to maximise inference efficiency. Models are held in memory and can be switched in as little as 2 microseconds, enabling complex agentic AI configurations where multiple models are orchestrated together in real time. 
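At the application level, the multi-model orchestration described above can be sketched with a simple staged pipeline. The snippet below is a minimal, illustrative example only: the model names and routing stages are hypothetical, and the sole interface assumption is an OpenAI-compatible chat-completions endpoint, which SambaNova's cloud API follows. It is a sketch of the pattern, not the platform's implementation.

```python
# Minimal sketch of application-level multi-model orchestration.
# The stage-to-model mapping below is illustrative, not an actual
# SambaNova configuration.

from typing import Callable

# Hypothetical mapping from pipeline stage to a hosted model.
STAGE_MODELS = {
    "plan":   "Llama-4-Maverick",  # fast planning step (illustrative name)
    "reason": "DeepSeek-R1",       # deep reasoning step (illustrative name)
    "answer": "Llama-4-Scout",     # final drafting step (illustrative name)
}


def run_pipeline(prompt: str, call: Callable[[str, str], str]) -> str:
    """Chain several models in sequence.

    `call(model, prompt)` performs one inference request, e.g. via an
    OpenAI-compatible client pointed at the platform's endpoint. Each
    stage's output becomes the next stage's input.
    """
    text = prompt
    for stage in ("plan", "reason", "answer"):
        text = call(STAGE_MODELS[stage], text)
    return text
```

In practice, `call` could wrap a chat-completions request from any OpenAI-compatible client with the platform's base URL; because the models stay resident in memory and switch in microseconds, each stage change adds effectively no model-loading latency.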

SambaStack supports both model bundles and bring-your-own checkpoints, allowing organisations to pre-train models on existing infrastructure and run them efficiently on SambaNova systems. The result is a platform that combines performance, efficiency, and control, giving enterprises and governments the ability to operate AI securely, at scale, and on their own terms.

White Paper: AI Model Ownership – Three Critical Considerations

Why owning your AI model matters more than ever

As generative AI moves from experimentation to enterprise and government deployment, organisations face a critical strategic decision: whether to rely on shared, vendor-owned models or to own and control their AI models as core business assets.

This paper outlines three fundamental considerations that should guide that decision.

First, governance and regulatory requirements. Across many industries, transparency and explainability are no longer optional. Regulations such as the GDPR, along with financial-supervision rules, increasingly require organisations to document model training lineage, data sources, and decision logic. Owning the model is often essential to meeting these obligations and maintaining customer trust.

Second, accuracy and domain-specific intelligence. Achieving high accuracy in enterprise use cases requires fine-tuning models with industry- and organisation-specific data. Shared, general-purpose models may perform well on broad tasks but frequently fall short in specialised domains, increasing the risk of errors and hallucinations where the consequences can be severe.

Third, long-term value creation. While point AI tools can improve individual tasks, they rarely create lasting competitive advantage. Organisations that invest in AI as an owned asset, trained on their own data and deployed on dedicated infrastructure, build intellectual property that grows in value over time and becomes a foundation for future innovation.

The paper concludes that organisations will increasingly be defined by how well they adopt and govern generative AI, and that building AI as an asset, not a dependency, is central to sustainable value, compliance, and competitive differentiation.