Supported Models

Argyll currently runs MiniMax M2.7 on our dedicated UK infrastructure. The models below are supported for customers who require custom configurations, tailored deployments, or specific routing requirements.

424.8 tok/s on M2.7
Up to 6× faster than GPU-based providers. SambaNova RDU architecture delivers industry-leading throughput with the lowest time-to-first-token.

MiniMax-M2.7

MiniMax
Primary Model Text
Context Lengths
8K–32K (BS 2, 4, 6, 8) · 160K (BS 2) · 192K (BS 2)
Capabilities
Function calling · Structured output
Coding agent SWE-Pro 56.2% GDPval-AA ELO 1,494
Model card → Chat completions

gpt-oss-120b

OpenAI
Featured Text
Context Lengths
8K–32K (BS 2, 4, 6, 8) · 64K (BS 2, 4) · 128K (BS 2)
Capabilities
Reasoning · Function calling · JSON mode · Logit masking
Main / planner agent Tool-calling agent
Model card → Chat completions

gpt-oss-20b

OpenAI
Text
Context Lengths
8K–32K (BS 2, 4, 6, 8) · 64K (BS 2, 4) · 128K (BS 2)
Capabilities
Function calling · JSON mode
Main / planner agent Tool-calling agent Reasoning
Model card → Chat completions

MiniMax-M2.5

MiniMax
Text
Context Lengths
8K–32K (BS 2, 4, 6, 8) · 160K (BS 2) · 192K (BS 2)
Capabilities
Function calling · Structured output
Coding agent
Model card → Chat completions

DeepSeek-R1-0528

DeepSeek
Reasoning Text
Context Lengths
4K (BS 4) · 8K (BS 1) · 16K (BS 1) · 32K (BS 1)
Capabilities
Function calling · JSON mode
Complex reasoning
Model card → Chat completions

DeepSeek-V3.2

DeepSeek
Text
Context Lengths
8K (BS 1, 4) · 16K (BS 1) · 32K (BS 1) · 128K (BS 1)
Capabilities
Optional thinking mode · Function calling · JSON mode
Main / planner agent Tool-calling agent
Model card → Chat completions

DeepSeek-V3.1

DeepSeek
Reasoning Text
Context Lengths
4K (BS 4) · 8K (BS 1, 4) · 16K (BS 1) · 32K (BS 1)
Capabilities
Function calling · JSON mode
Main / planner agent Tool-calling agent
Model card → Chat completions

DeepSeek-V3-0324

DeepSeek
Text
Context Lengths
4K (BS 4) · 8K (BS 1, 4) · 16K (BS 1) · 32K (BS 1)
Capabilities
Function calling · JSON mode
Main / planner agent Tool-calling agent
Model card → Chat completions

DeepSeek-R1-Distill-Llama-70B

DeepSeek
Reasoning Text
Context Lengths
4K–128K · Up to BS 32 at shorter contexts
Capabilities
Speculative decoding · Custom checkpoints
Complex reasoning
Model card → Chat completions

Meta-Llama-3.3-70B-Instruct

Meta
Text
Context Lengths
4K–128K · Up to BS 32 at shorter contexts
Capabilities
Function calling · JSON mode · Speculative decoding · Custom checkpoints
Task agent Tool-calling agent Text to SQL / Cipher
Model card → Chat completions

Meta-Llama-3.1-8B-Instruct

Meta
Text
Context Lengths
4K (BS up to 128) · 8K (BS up to 64) · 16K (BS up to 8)
Capabilities
Function calling · JSON mode · Custom checkpoints
Gateway agent Validation agent
Model card → Chat completions

Meta-Llama-3.1-405B-Instruct

Meta
Text
Context Lengths
4K (BS 1, 2, 4) · 8K (BS 1) · 16K (BS 1)
Capabilities
Function calling · JSON mode · Speculative decoding
Task agent Tool-calling agent Code generation
Model card → Chat completions

Llama-4-Maverick-17B-128E-Instruct

Meta
Image Text
Context Lengths
8K–128K (BS 1)
Capabilities
Function calling · JSON mode
Image understanding Task agent Tool-calling agent
Model card → Chat completions

Mistral-Large-3-675B-Instruct-2512

Mistral AI · Preview
Text
Context Lengths
8K (BS 1)
Capabilities
Function calling
Multilingual Task agent Tool-calling agent
Model card → Chat completions

Qwen3-235B-A22B-Instruct-2507

Alibaba Cloud
Reasoning Text
Context Lengths
32K (BS 2, 4, 6, 8) · 128K (BS 2)
Capabilities
Non-thinking mode · Function calling · JSON mode
Agentic planner Multilingual
Model card → Chat completions

Qwen3-32B

Alibaba Cloud
Reasoning Text
Context Lengths
8K (BS 1, 4) · 16K (BS 1) · 32K (BS 1, 2)
Capabilities
Thinking mode · Tool use · Multilingual
Task agent Multilingual
Model card → Chat completions

gemma-3-27b-it

Google
Text
Context Lengths
4K–128K (BS 2, 4, 6, 8)
Capabilities
Image understanding · JSON mode
Image understanding Task agent
Model card → Chat completions

gemma-3-12b-it

Google
Image Text
Context Lengths
128K (BS 2, 4, 6, 8)
Capabilities
Image understanding · JSON mode
Image understanding Task agent
Model card → Chat completions

gemma-4-31B-it

Google · Preview
Text
Context Lengths
128K (BS 2, 4, 6, 8)
Capabilities
Function calling · JSON mode
Task agent
Model card → Chat completions

Whisper-Large-v3

OpenAI
Audio
Context Lengths
4K (BS 1, 16, 32)
Capabilities
Audio transcription · Speech translation · Multilingual ASR
Speech recognition (ASR) Audio transcription
Model card → Translation · Transcription

E5-Mistral-7B-Instruct

intfloat
Embedding
Context Lengths
4K (BS 1, 4, 8, 16, 32)
Capabilities
Text embeddings · Semantic search · Retrieval
Vector storage & retrieval (RAG)
Model card → Embeddings

All models served from UK sovereign infrastructure via an OpenAI-compatible API.
Swap your base URL. Keep your code.

Start Building Talk to Sales