Supported Models
Argyll currently runs MiniMax M2.7 on our dedicated UK infrastructure. The models below are supported for customers who require custom configurations, tailored deployments, or specific routing requirements.
424.8 tok/s on M2.7
Up to 6× faster than GPU-based providers. SambaNova RDU architecture delivers industry-leading throughput with the lowest time-to-first-token.
MiniMax-M2.7
MiniMax
Context Lengths
8K–32K (BS 2, 4, 6, 8) · 160K (BS 2) · 192K (BS 2)Capabilities
Function calling · Structured output
Coding agent
SWE-Pro 56.2%
GDPval-AA ELO 1,494
Model card →
Chat completions
gpt-oss-120b
OpenAI
Context Lengths
8K–32K (BS 2, 4, 6, 8) · 64K (BS 2, 4) · 128K (BS 2)Capabilities
Reasoning · Function calling · JSON mode · Logit masking
Main / planner agent
Tool-calling agent
Model card →
Chat completions
gpt-oss-20b
OpenAI
Context Lengths
8K–32K (BS 2, 4, 6, 8) · 64K (BS 2, 4) · 128K (BS 2)Capabilities
Function calling · JSON mode
Main / planner agent
Tool-calling agent
Reasoning
Model card →
Chat completions
MiniMax-M2.5
MiniMax
Context Lengths
8K–32K (BS 2, 4, 6, 8) · 160K (BS 2) · 192K (BS 2)Capabilities
Function calling · Structured output
Coding agent
Model card →
Chat completions
DeepSeek-R1-0528
DeepSeek
Context Lengths
4K (BS 4) · 8K (BS 1) · 16K (BS 1) · 32K (BS 1)Capabilities
Function calling · JSON mode
Complex reasoning
Model card →
Chat completions
DeepSeek-V3.2
DeepSeek
Context Lengths
8K (BS 1, 4) · 16K (BS 1) · 32K (BS 1) · 128K (BS 1)Capabilities
Optional thinking mode · Function calling · JSON mode
Main / planner agent
Tool-calling agent
Model card →
Chat completions
DeepSeek-V3.1
DeepSeek
Context Lengths
4K (BS 4) · 8K (BS 1, 4) · 16K (BS 1) · 32K (BS 1)Capabilities
Function calling · JSON mode
Main / planner agent
Tool-calling agent
Model card →
Chat completions
DeepSeek-V3-0324
DeepSeek
Context Lengths
4K (BS 4) · 8K (BS 1, 4) · 16K (BS 1) · 32K (BS 1)Capabilities
Function calling · JSON mode
Main / planner agent
Tool-calling agent
Model card →
Chat completions
DeepSeek-R1-Distill-Llama-70B
DeepSeek
Context Lengths
4K–128K · Up to BS 32 at shorter contextsCapabilities
Speculative decoding · Custom checkpoints
Complex reasoning
Model card →
Chat completions
Meta-Llama-3.3-70B-Instruct
Meta
Context Lengths
4K–128K · Up to BS 32 at shorter contextsCapabilities
Function calling · JSON mode · Speculative decoding · Custom checkpoints
Task agent
Tool-calling agent
Text to SQL / Cipher
Model card →
Chat completions
Meta-Llama-3.1-8B-Instruct
Meta
Context Lengths
4K (BS up to 128) · 8K (BS up to 64) · 16K (BS up to 8)Capabilities
Function calling · JSON mode · Custom checkpoints
Gateway agent
Validation agent
Model card →
Chat completions
Meta-Llama-3.1-405B-Instruct
Meta
Context Lengths
4K (BS 1, 2, 4) · 8K (BS 1) · 16K (BS 1)Capabilities
Function calling · JSON mode · Speculative decoding
Task agent
Tool-calling agent
Code generation
Model card →
Chat completions
Llama-4-Maverick-17B-128E-Instruct
Meta
Context Lengths
8K–128K (BS 1)Capabilities
Function calling · JSON mode
Image understanding
Task agent
Tool-calling agent
Model card →
Chat completions
Mistral-Large-3-675B-Instruct-2512
Mistral AI · Preview
Context Lengths
8K (BS 1)Capabilities
Function calling
Multilingual
Task agent
Tool-calling agent
Model card →
Chat completions
Qwen3-235B-A22B-Instruct-2507
Alibaba Cloud
Context Lengths
32K (BS 2, 4, 6, 8) · 128K (BS 2)Capabilities
Non-thinking mode · Function calling · JSON mode
Agentic planner
Multilingual
Model card →
Chat completions
Qwen3-32B
Alibaba Cloud
Context Lengths
8K (BS 1, 4) · 16K (BS 1) · 32K (BS 1, 2)Capabilities
Thinking mode · Tool use · Multilingual
Task agent
Multilingual
Model card →
Chat completions
gemma-3-27b-it
Google
Context Lengths
4K–128K (BS 2, 4, 6, 8)Capabilities
Image understanding · JSON mode
Image understanding
Task agent
Model card →
Chat completions
gemma-3-12b-it
Google
Context Lengths
128K (BS 2, 4, 6, 8)Capabilities
Image understanding · JSON mode
Image understanding
Task agent
Model card →
Chat completions
gemma-4-31B-it
Google · Preview
Context Lengths
128K (BS 2, 4, 6, 8)Capabilities
Function calling · JSON mode
Task agent
Model card →
Chat completions
Whisper-Large-v3
OpenAI
Context Lengths
4K (BS 1, 16, 32)Capabilities
Audio transcription · Speech translation · Multilingual ASR
Speech recognition (ASR)
Audio transcription
Model card →
Translation · Transcription
E5-Mistral-7B-Instruct
intfloat
Context Lengths
4K (BS 1, 4, 8, 16, 32)Capabilities
Text embeddings · Semantic search · Retrieval
Vector storage & retrieval (RAG)
Model card →
Embeddings
All models served from UK sovereign infrastructure via an OpenAI-compatible API.
Swap your base URL. Keep your code.