Supported Models

Argyll currently runs MiniMax M2.7 on our dedicated UK infrastructure. The models below are supported for customers who require custom configurations, tailored deployments, or specific routing requirements.

Talk to Sales

⚡

424.8 tok/s on M2.7

Up to 6× faster than GPU-based providers. SambaNova RDU architecture delivers industry-leading throughput with the lowest time-to-first-token.

MiniMax-M2.7

MiniMax

Primary Model Text

Context Lengths

8K–32K (BS 2, 4, 6, 8) · 160K (BS 2) · 192K (BS 2)

Capabilities

Function calling · Structured output

Coding agent SWE-Pro 56.2% GDPval-AA ELO 1,494

Model card → Chat completions

gpt-oss-120b

OpenAI

Featured Text

Context Lengths

8K–32K (BS 2, 4, 6, 8) · 64K (BS 2, 4) · 128K (BS 2)

Capabilities

Reasoning · Function calling · JSON mode · Logit masking

Main / planner agent Tool-calling agent

Model card → Chat completions

gpt-oss-20b

OpenAI

Text

Context Lengths

8K–32K (BS 2, 4, 6, 8) · 64K (BS 2, 4) · 128K (BS 2)

Capabilities

Function calling · JSON mode

Main / planner agent Tool-calling agent Reasoning

Model card → Chat completions

MiniMax-M2.5

MiniMax

Text

Context Lengths

8K–32K (BS 2, 4, 6, 8) · 160K (BS 2) · 192K (BS 2)

Capabilities

Function calling · Structured output

Coding agent

Model card → Chat completions

DeepSeek-R1-0528

DeepSeek

Reasoning Text

Context Lengths

4K (BS 4) · 8K (BS 1) · 16K (BS 1) · 32K (BS 1)

Capabilities

Function calling · JSON mode

Complex reasoning

Model card → Chat completions

DeepSeek-V3.2

DeepSeek

Text

Context Lengths

8K (BS 1, 4) · 16K (BS 1) · 32K (BS 1) · 128K (BS 1)

Capabilities

Optional thinking mode · Function calling · JSON mode

Main / planner agent Tool-calling agent

Model card → Chat completions

DeepSeek-V3.1

DeepSeek

Reasoning Text

Context Lengths

4K (BS 4) · 8K (BS 1, 4) · 16K (BS 1) · 32K (BS 1)

Capabilities

Function calling · JSON mode

Main / planner agent Tool-calling agent

Model card → Chat completions

DeepSeek-V3-0324

DeepSeek

Text

Context Lengths

4K (BS 4) · 8K (BS 1, 4) · 16K (BS 1) · 32K (BS 1)

Capabilities

Function calling · JSON mode

Main / planner agent Tool-calling agent

Model card → Chat completions

DeepSeek-R1-Distill-Llama-70B

DeepSeek

Reasoning Text

Context Lengths

4K–128K · Up to BS 32 at shorter contexts

Capabilities

Speculative decoding · Custom checkpoints

Complex reasoning

Model card → Chat completions

Meta-Llama-3.3-70B-Instruct

Meta

Text

Context Lengths

4K–128K · Up to BS 32 at shorter contexts

Capabilities

Function calling · JSON mode · Speculative decoding · Custom checkpoints

Task agent Tool-calling agent Text to SQL / Cipher

Model card → Chat completions

Meta-Llama-3.1-8B-Instruct

Meta

Text

Context Lengths

4K (BS up to 128) · 8K (BS up to 64) · 16K (BS up to 8)

Capabilities

Function calling · JSON mode · Custom checkpoints

Gateway agent Validation agent

Model card → Chat completions

Meta-Llama-3.1-405B-Instruct

Meta

Text

Context Lengths

4K (BS 1, 2, 4) · 8K (BS 1) · 16K (BS 1)

Capabilities

Function calling · JSON mode · Speculative decoding

Task agent Tool-calling agent Code generation

Model card → Chat completions

Llama-4-Maverick-17B-128E-Instruct

Meta

Image Text

Context Lengths

8K–128K (BS 1)

Capabilities

Function calling · JSON mode

Image understanding Task agent Tool-calling agent

Model card → Chat completions

Mistral-Large-3-675B-Instruct-2512

Mistral AI · Preview

Text

Context Lengths

8K (BS 1)

Capabilities

Function calling

Multilingual Task agent Tool-calling agent

Model card → Chat completions

Qwen3-235B-A22B-Instruct-2507

Alibaba Cloud

Reasoning Text

Context Lengths

32K (BS 2, 4, 6, 8) · 128K (BS 2)

Capabilities

Non-thinking mode · Function calling · JSON mode

Agentic planner Multilingual

Model card → Chat completions

Qwen3-32B

Alibaba Cloud

Reasoning Text

Context Lengths

8K (BS 1, 4) · 16K (BS 1) · 32K (BS 1, 2)

Capabilities

Thinking mode · Tool use · Multilingual

Task agent Multilingual

Model card → Chat completions

gemma-3-27b-it

Google

Text

Context Lengths

4K–128K (BS 2, 4, 6, 8)

Capabilities

Image understanding · JSON mode

Image understanding Task agent

Model card → Chat completions

gemma-3-12b-it

Google

Image Text

Context Lengths

128K (BS 2, 4, 6, 8)

Capabilities

Image understanding · JSON mode

Image understanding Task agent

Model card → Chat completions

gemma-4-31B-it

Google · Preview

Text

Context Lengths

128K (BS 2, 4, 6, 8)

Capabilities

Function calling · JSON mode

Task agent

Model card → Chat completions

Whisper-Large-v3

OpenAI

Audio

Context Lengths

4K (BS 1, 16, 32)

Capabilities

Audio transcription · Speech translation · Multilingual ASR

Speech recognition (ASR) Audio transcription

Model card → Translation · Transcription

E5-Mistral-7B-Instruct

intfloat

Embedding

Context Lengths

4K (BS 1, 4, 8, 16, 32)

Capabilities

Text embeddings · Semantic search · Retrieval

Vector storage & retrieval (RAG)

Model card → Embeddings

All models served from UK sovereign infrastructure via an OpenAI-compatible API.
Swap your base URL. Keep your code.

Start Building Talk to Sales