Mistral
Mistral Medium 3.5
NIM API Target: mistralai/mistral-medium-3.5-128b
Performance: ⚡ Very Fast (~0.6s)
Coding Rating: ⭐⭐⭐⭐
Ideal use case: Daily driver — fast, clean English
9 free NIM models, curated for Claude Code — map any to Sonnet, Opus, or Haiku.
Model Slot Mapping
Claude Code routes its work through three model slots: Sonnet, Opus, and Haiku. The table below shows which free NVIDIA NIM endpoint each slot is mapped to, along with the specialty aliases you can switch to explicitly with /model.
| Claude Slot | CLI Command | Target NVIDIA Model | Optimization |
|---|---|---|---|
| Sonnet (default) | /model claude-sonnet-4-6 | mistralai/mistral-medium-3.5-128b | Daily coding — fast and reliable |
| Opus (powerful) | /model claude-opus-4-6 | deepseek-ai/deepseek-v4-pro | Complex reasoning & multi-file work |
| Haiku (quick) | /model claude-haiku-4-5 | deepseek-ai/deepseek-v4-flash | Background tasks — fast |
| Specialty — Mistral | /model claude-mistral | mistralai/mistral-medium-3.5-128b | Fast general coding alternative |
| Specialty — DeepSeek | /model claude-deepseek | deepseek-ai/deepseek-v4-pro | Deep reasoning (explicit access) |
| Specialty — DeepSeek Flash | /model claude-deepseek-flash | deepseek-ai/deepseek-v4-flash | Fast reasoning, 1M context |
| Specialty — GLM | /model claude-glm | z-ai/glm-5.1 | Long agentic sessions |
| Specialty — MiniMax | /model claude-minimax | minimaxai/minimax-m3 | General purpose coding + vision |
| Specialty — Gemma | /model claude-gemma | google/gemma-4-31b-it | Vision + code (screenshots, UI work) |
| Specialty — Step | /model claude-step | stepfun-ai/step-3.7-flash | Heavy reasoning specialist (slow) |
| Specialty — Kimi | /model claude-kimi | moonshotai/kimi-k2.6 | Vision specialist — use sparingly |
| Specialty — Nemotron | /model claude-nemotron | nvidia/nemotron-3-ultra-550b-a55b | NVIDIA flagship — complex instruction following |
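One way to keep this mapping in a single place is to generate the proxy's model list from it. The sketch below is a minimal illustration, not the project's actual configuration: it assumes the LiteLLM proxy is configured with a `model_list` of `model_name` / `litellm_params` entries, that NIM models are addressed with the `nvidia_nim/` provider prefix, and that the key is read from an environment variable named `NVIDIA_NIM_API_KEY`. The helper name `build_model_list` is hypothetical. Check these against your own proxy setup before relying on them.

```python
import json

# Alias -> NIM model ID pairs, copied from the mapping table above.
SLOT_TO_NIM = {
    "claude-sonnet-4-6": "mistralai/mistral-medium-3.5-128b",
    "claude-opus-4-6": "deepseek-ai/deepseek-v4-pro",
    "claude-haiku-4-5": "deepseek-ai/deepseek-v4-flash",
    "claude-mistral": "mistralai/mistral-medium-3.5-128b",
    "claude-deepseek": "deepseek-ai/deepseek-v4-pro",
    "claude-deepseek-flash": "deepseek-ai/deepseek-v4-flash",
    "claude-glm": "z-ai/glm-5.1",
    "claude-minimax": "minimaxai/minimax-m3",
    "claude-gemma": "google/gemma-4-31b-it",
    "claude-step": "stepfun-ai/step-3.7-flash",
    "claude-kimi": "moonshotai/kimi-k2.6",
    "claude-nemotron": "nvidia/nemotron-3-ultra-550b-a55b",
}


def build_model_list(slot_to_nim,
                     provider_prefix="nvidia_nim",  # assumed provider prefix
                     api_key_ref="os.environ/NVIDIA_NIM_API_KEY"):  # assumed env var
    """Turn the alias -> NIM ID mapping into proxy-style model entries."""
    return [
        {
            "model_name": alias,
            "litellm_params": {
                "model": f"{provider_prefix}/{nim_id}",
                "api_key": api_key_ref,
            },
        }
        for alias, nim_id in slot_to_nim.items()
    ]


config = {"model_list": build_model_list(SLOT_TO_NIM)}

# Print as JSON; most YAML parsers accept JSON, so the output can usually be
# pasted into a YAML config file as-is.
print(json.dumps(config, indent=2))
```

Generating the entries from one dictionary means that remapping a slot, for example pointing Haiku at a different NIM model, is a one-line change.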
Available Models
Browse the open-weights models on NVIDIA's platform that you can reach through your LiteLLM proxy.
| Provider | NIM API Target | Performance | Coding Rating | Ideal use case |
|---|---|---|---|---|
| Mistral | mistralai/mistral-medium-3.5-128b | ⚡ Very Fast (~0.6s) | ⭐⭐⭐⭐ | Daily driver: fast, clean English |
| DeepSeek | deepseek-ai/deepseek-v4-pro | ⚡ Fast (~6s) | ⭐⭐⭐⭐⭐ | Deep reasoning, hard bugs, multi-file work |
| DeepSeek | deepseek-ai/deepseek-v4-flash | ⚡ Fast | ⭐⭐⭐⭐ | Fast coding & background tasks |
| Google | google/gemma-4-31b-it | ⚡ Very Fast | ⭐⭐⭐⭐ | Vision + fast coding, screenshots/UI |
| Z.AI | z-ai/glm-5.1 | 🔵 Medium | ⭐⭐⭐⭐ | Long agentic sessions, tool-heavy workflows |
| MiniMax | minimaxai/minimax-m3 | ⚡ Fast | ⭐⭐⭐⭐ | General coding + vision, strong reasoning |
| StepFun | stepfun-ai/step-3.7-flash | 🔵 Medium (~2.6s, reasoning-heavy) | ⭐⭐⭐⭐ | Reasoning specialist: switch to it explicitly |
| Moonshot AI | moonshotai/kimi-k2.6 | 🐢 Slow (can exceed 2min on free tier) | ⭐⭐⭐⭐⭐ | Vision specialist: heavy, use sparingly |
| NVIDIA | nvidia/nemotron-3-ultra-550b-a55b | 🔵 Medium | ⭐⭐⭐⭐⭐ | NVIDIA's flagship: complex reasoning & instruction following |
Coding Capability Rating
| Rating | Meaning |
|---|---|
| ⭐⭐⭐⭐⭐ | Exceptional for agentic coding & complex refactors |
| ⭐⭐⭐⭐ | Reliable fallback and fast general generation |
| ⭐⭐⭐ | Basic coding support and documentation |
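The speed tiers and star ratings listed for each model can also be treated as a small lookup table when deciding which model to switch to. The sketch below is illustrative only: the `NimModel` class, the `pick` helper, and the ordering of speed tiers are assumptions introduced here, while the targets, tiers, and ratings are the ones listed on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NimModel:
    target: str  # NIM API target
    speed: str   # "very fast", "fast", "medium", or "slow"
    stars: int   # coding rating, 3 to 5


# Speed tiers and ratings as listed under Available Models.
CATALOG = [
    NimModel("mistralai/mistral-medium-3.5-128b", "very fast", 4),
    NimModel("deepseek-ai/deepseek-v4-pro", "fast", 5),
    NimModel("deepseek-ai/deepseek-v4-flash", "fast", 4),
    NimModel("google/gemma-4-31b-it", "very fast", 4),
    NimModel("z-ai/glm-5.1", "medium", 4),
    NimModel("minimaxai/minimax-m3", "fast", 4),
    NimModel("stepfun-ai/step-3.7-flash", "medium", 4),
    NimModel("moonshotai/kimi-k2.6", "slow", 5),
    NimModel("nvidia/nemotron-3-ultra-550b-a55b", "medium", 5),
]

# Assumed ordering of the speed tiers, fastest first.
SPEED_ORDER = {"very fast": 0, "fast": 1, "medium": 2, "slow": 3}


def pick(catalog, min_stars=4, max_speed="medium"):
    """Return models at or above min_stars and no slower than max_speed,
    fastest first, with higher ratings breaking ties."""
    limit = SPEED_ORDER[max_speed]
    matches = [
        m for m in catalog
        if m.stars >= min_stars and SPEED_ORDER[m.speed] <= limit
    ]
    return sorted(matches, key=lambda m: (SPEED_ORDER[m.speed], -m.stars))


# Five-star models that are not in the "slow" tier.
for model in pick(CATALOG, min_stars=5, max_speed="medium"):
    print(model.target)
```

With the data above, this prints `deepseek-ai/deepseek-v4-pro` followed by `nvidia/nemotron-3-ultra-550b-a55b`; raising `max_speed` to `"slow"` would also include `moonshotai/kimi-k2.6`.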
Stay Updated
NVIDIA NIM constantly adds new models. Visit the official registry to explore the full catalog and discover the latest models for your coding needs.