SambaNova SN40L technology

Inference infrastructure built around data movement.

SambaNova SN40L Reconfigurable Dataflow Units are built for the practical bottleneck in production AI: keeping large models moving fast, efficiently, and under operational control.

Talk to Argyll See Deployment Model

The technical case

Inference is not just a compute problem.

When AI moves from experiments into live products, the pressure shifts to sustained inference: moving model data, serving tokens, switching models, and keeping latency predictable. SN40L is designed around that data movement problem.

Conventional GPU path

Traditional accelerator designs often move data repeatedly between memory and compute. That can make production inference more power hungry and operationally complex at scale.

RDU dataflow path

RDU architecture maps workloads as a continuous dataflow, reducing unnecessary movement and improving the balance between throughput, latency, and energy use.

How the serving path works

API request

SambaStack orchestration

SN40L dataflow fabric

Three-tier memory

Fast token output

Architecture

Built for model residency, switching, and agent workloads.

The SN40L story is not just a chip story. It is a chip, rack, memory, and software stack designed to run large open models in production environments where utilisation matters.

Reconfigurable dataflow

The RDU fabric is designed to map AI workloads directly to a dataflow execution path. Less avoidable movement means better sustained serving economics.

Three-tier memory

SN40L combines on-chip SRAM, high-bandwidth memory, and attached DDR memory so large models and expert systems can be served efficiently.

Full-stack software

SambaStack turns the hardware into an enterprise platform: model serving, orchestration, deployment control, and API access in one stack.

AirDesigned for existing air-cooled data centre environments.

10kWAverage power reference for SambaRack in SambaNova platform material.

MultiMultiple models can remain available for agentic applications.

OpenSupports major open-weight models and customer checkpoints.

Why Argyll

We turn the rack into a sovereign AI service.

The commercial value is not owning unusual hardware. It is turning efficient inference into a reliable platform for UK-hosted workloads, regulated teams, and organisations that need control over data, deployment, and operating model.

Hosted inference

Start with managed access to fast open-weight models before committing to reserved infrastructure or a dedicated environment.

Dedicated capacity

Reserve infrastructure for predictable demand, production support, performance planning, and governance requirements.

Sovereign deployment

Design private, on-premises, hosted, or hybrid deployments around UK data residency, resilience, and organisational control.

Make SN40L practical for your organisation.

Argyll helps teams move from evaluation to hosted inference, dedicated capacity, or fully sovereign deployment without rebuilding their stack around hyperscaler assumptions.

Start a Technical Review

Source basis: Argyll pages for RDU chip, whitepapers, and services; SambaNova official material for SN40L RDU and SambaStack.