Step-by-step setup

A beginner-friendly setup for macOS, Linux, WSL, and Windows PowerShell using NVIDIA NIM + LiteLLM.

Step 1: Get NVIDIA API Key

Create a free account at build.nvidia.com. No credit card required. Verify your phone number, navigate to any model, and click 'Get API Key'.

# 1. Go to: https://build.nvidia.com
# 2. Create free account & verify phone
# 3. Click any model → "Get API Key"
# 4. Copy the key — it looks like:
nvapi-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Keep your key private — never share it or commit it to git.
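
One simple way to keep the key out of your repositories is to store it in a separate file with restricted permissions and source that file from your shell config, instead of pasting the key into dotfiles you might publish. A minimal sketch (the ~/.nvidia-nim.env path is an arbitrary choice):

# Store the key in a dedicated file outside any repo
echo 'export NVIDIA_NIM_API_KEY="nvapi-YOUR_KEY_HERE"' >> ~/.nvidia-nim.env
chmod 600 ~/.nvidia-nim.env   # readable only by you

# Load it from ~/.zshrc or ~/.bashrc instead of hard-coding the key:
# source ~/.nvidia-nim.env

# If a project uses a local .env file, make sure git ignores it:
echo '.env' >> .gitignore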

Step 2: Install Claude Code CLI

Install the Claude Code CLI using the method for your platform. The native installer is recommended — it auto-updates in the background.

macOS / Linux
# macOS / Linux (recommended)
curl -fsSL https://claude.ai/install.sh | bash

# Verify install
claude --version
Windows
# Windows — open PowerShell and run:
irm https://claude.ai/install.ps1 | iex

# Requires Git for Windows first:
# https://git-scm.com/downloads/win
macOS (Homebrew)
# macOS via Homebrew
brew install --cask claude-code
claude --version

After installing, open a new terminal window before running claude --version.
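
If the new terminal still reports "command not found", the installer's bin directory is probably missing from your PATH. A quick check, assuming the native installer's usual ~/.local/bin location (verify the actual path on your system):

# See whether the binary is on PATH, and where it lives
which claude || ls -l "$HOME/.local/bin/claude"

# If it exists but isn't found, add its directory to PATH
export PATH="$HOME/.local/bin:$PATH"   # add this line to ~/.zshrc or ~/.bashrc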

Step 3: Create config.yaml

This file tells LiteLLM which NVIDIA NIM models to use for each Claude model slot. Create a folder and save this file inside it.

mkdir -p ~/litellm-nim
cd ~/litellm-nim
config.yaml
model_list:
  # ─── SONNET - Fast daily coding ───
  - model_name: claude-sonnet-4-6
    litellm_params:
      model: nvidia_nim/qwen/qwen3.5-122b-a10b
      api_key: os.environ/NVIDIA_NIM_API_KEY

  # ─── OPUS - Complex multi-file work ───
  - model_name: claude-opus-4-6
    litellm_params:
      model: nvidia_nim/nvidia/nemotron-3-super-120b-a12b
      api_key: os.environ/NVIDIA_NIM_API_KEY

  # ─── HAIKU - Quick answers ───
  - model_name: claude-haiku-4-5
    litellm_params:
      model: nvidia_nim/mistralai/mistral-small-4-119b-2603
      api_key: os.environ/NVIDIA_NIM_API_KEY

  # ─── CUSTOM 1: Kimi - Code from UI/screenshots ───
  - model_name: kimi-vision
    litellm_params:
      model: nvidia_nim/moonshotai/kimi-k2.5
      api_key: os.environ/NVIDIA_NIM_API_KEY

  # ─── CUSTOM 2: GLM-5 - Long agentic sessions ───
  - model_name: glm5-agentic
    litellm_params:
      model: nvidia_nim/z-ai/glm5
      api_key: os.environ/NVIDIA_NIM_API_KEY

  # ─── CUSTOM 3: Llama 3.3 - General coding ───
  - model_name: llama-3.3-70b
    litellm_params:
      model: nvidia_nim/meta/llama-3.3-70b-instruct
      api_key: os.environ/NVIDIA_NIM_API_KEY

  # ─── CUSTOM 4: DeepSeek V3.2 - Advanced reasoning ───
  - model_name: deepseek-v3.2
    litellm_params:
      model: nvidia_nim/deepseek-ai/deepseek-v3_2
      api_key: os.environ/NVIDIA_NIM_API_KEY

  # ─── CUSTOM 5: Mistral - Stable fallback ───
  - model_name: mistral-stable
    litellm_params:
      model: nvidia_nim/mistralai/mistral-small-4-119b-2603
      api_key: os.environ/NVIDIA_NIM_API_KEY

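  # ─── CUSTOM 6: MiniMax M2.7 ───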
  - model_name: minimax-m2.7
    litellm_params:
      model: nvidia_nim/minimaxai/minimax-m2.7
      api_key: os.environ/NVIDIA_NIM_API_KEY

litellm_settings:
  drop_params: true

general_settings:
  master_key: "sk-litellm-local"

Save this as config.yaml inside the ~/litellm-nim folder.
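
Because YAML is indentation-sensitive, a single misplaced space in model_list is the most common cause of a failed proxy start. You can optionally sanity-check the syntax before starting the container; this one-liner assumes Python 3 with PyYAML installed:

python3 -c "import yaml; yaml.safe_load(open('config.yaml')); print('config.yaml parses OK')"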

Step 4: Start LiteLLM via Docker

Run LiteLLM as a local proxy. This translates Claude Code's API calls into NVIDIA NIM API calls. Make sure Docker Desktop is running first.

macOS / Linux
cd ~/litellm-nim

docker run -d \
  -p 4000:4000 \
  -e NVIDIA_NIM_API_KEY="nvapi-YOUR_KEY_HERE" \
  -v $(pwd)/config.yaml:/app/config.yaml \
  --name litellm-nim \
  --restart always \
  docker.litellm.ai/berriai/litellm:main-stable \
  --config /app/config.yaml

# Verify it started correctly
docker logs litellm-nim
Windows
cd $env:USERPROFILE\litellm-nim

docker run -d `
  -p 4000:4000 `
  -e NVIDIA_NIM_API_KEY="nvapi-YOUR_KEY_HERE" `
  -v ${PWD}/config.yaml:/app/config.yaml `
  --name litellm-nim `
  --restart always `
  docker.litellm.ai/berriai/litellm:main-stable `
  --config /app/config.yaml

# Verify it started
docker logs litellm-nim
Expected Output
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:4000

Replace nvapi-YOUR_KEY_HERE with your actual NVIDIA API key. The --restart always flag means the container auto-starts whenever Docker starts (on macOS and Windows, whenever Docker Desktop opens).
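
Once the container is up, you can send one test request straight to the proxy to confirm the whole chain (LiteLLM auth, model routing, NVIDIA NIM) works before touching Claude Code. This uses LiteLLM's OpenAI-compatible chat endpoint; the prompt itself is arbitrary:

curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-litellm-local" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Reply with one word."}]
  }'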

Step 5: Add Shell Alias

Add the claude-nim alias to your shell config file. This sets the environment variables that route Claude Code through your local LiteLLM proxy instead of Anthropic's servers.

macOS / Linux
# Add to ~/.zshrc (macOS) or ~/.bashrc (Linux)
export NVIDIA_NIM_API_KEY="nvapi-YOUR_KEY_HERE"

alias claude-nim='\
  ANTHROPIC_BASE_URL="http://localhost:4000" \
  ANTHROPIC_API_KEY="sk-litellm-local" \
  ANTHROPIC_MODEL="claude-sonnet-4-6" \
  ANTHROPIC_DEFAULT_OPUS_MODEL="claude-opus-4-6" \
  ANTHROPIC_DEFAULT_SONNET_MODEL="claude-sonnet-4-6" \
  ANTHROPIC_DEFAULT_HAIKU_MODEL="claude-haiku-4-5" \
  claude'
Windows
# Add to PowerShell profile — run: notepad $PROFILE
$env:NVIDIA_NIM_API_KEY = "nvapi-YOUR_KEY_HERE"

function claude-nim {
  $env:ANTHROPIC_BASE_URL = "http://localhost:4000"
  $env:ANTHROPIC_API_KEY = "sk-litellm-local"
  $env:ANTHROPIC_MODEL = "claude-sonnet-4-6"
  $env:ANTHROPIC_DEFAULT_OPUS_MODEL = "claude-opus-4-6"
  $env:ANTHROPIC_DEFAULT_SONNET_MODEL = "claude-sonnet-4-6"
  $env:ANTHROPIC_DEFAULT_HAIKU_MODEL = "claude-haiku-4-5"
  claude @args
}

Replace nvapi-YOUR_KEY_HERE with your actual key. On Windows, open PowerShell profile with: notepad $PROFILE
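
Once you reload your shell config (the next step does this), a quick way to confirm the definitions were actually picked up, shown here for macOS/Linux with the PowerShell equivalents as comments:

# Should print the alias definition and your nvapi-... key
alias claude-nim
echo "$NVIDIA_NIM_API_KEY"

# Windows PowerShell equivalents:
#   Get-Command claude-nim
#   $env:NVIDIA_NIM_API_KEY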

Step 6: Launch Claude Code

Reload your shell config and launch Claude Code. When asked about the API key, select Yes to use the LiteLLM key.

macOS / Linux
# macOS / Linux
source ~/.zshrc   # or source ~/.bashrc on Linux
claude-nim
Windows
# Windows PowerShell
. $PROFILE
claude-nim

When Claude Code asks "Do you want to use this API key (sk-litellm-local)?" — select Yes (option 1).

Quick validation (recommended)

Run these checks to confirm the base setup (container and proxy) is working; they are also the first thing to try if claude-nim misbehaves.

# 1) Confirm container is running
docker ps --filter "name=litellm-nim"

# 2) Confirm proxy can list models
# macOS / Linux / WSL
curl http://localhost:4000/v1/models

# Windows PowerShell
iwr http://localhost:4000/v1/models

# 3) Launch Claude through LiteLLM
# macOS / Linux / WSL
ANTHROPIC_BASE_URL="http://localhost:4000" ANTHROPIC_API_KEY="sk-litellm-local" ANTHROPIC_MODEL="claude-sonnet-4-6" claude

# Windows PowerShell
$env:ANTHROPIC_BASE_URL="http://localhost:4000"; $env:ANTHROPIC_API_KEY="sk-litellm-local"; $env:ANTHROPIC_MODEL="claude-sonnet-4-6"; claude

Tip: once these checks pass, the alias and PowerShell function from Step 5 should work without further changes.
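
If any check fails, the container logs are usually the fastest way to see why; bad YAML indentation and an invalid nvapi- key are the usual suspects:

# Tail the most recent proxy logs
docker logs --tail 50 litellm-nim

# After editing config.yaml, restart the container to pick up changes
docker restart litellm-nim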

Important: API Key Setup

When you first run claude-nim, Claude Code will ask you:

Do you want to use this API key (sk-litellm-local)?

Important: You must select Yes (option 1). This tells Claude Code to use the LiteLLM proxy key instead of looking for an Anthropic API key in your environment.

Warning: If you select "No", Claude Code will not know where to find your API key and will fail to connect. You may need to restart the session with claude-nim if you accidentally select No.

How it works: The sk-litellm-local key is just a local proxy password that tells LiteLLM to accept and route your requests to NVIDIA NIM. It's safe and required for this setup to work.
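
The same mapping applies to the custom slots: the model name in a request selects the matching model_name entry in config.yaml. For example, to run a one-off Claude Code session against the kimi-vision entry defined earlier (same pattern as the validation commands above):

ANTHROPIC_BASE_URL="http://localhost:4000" \
ANTHROPIC_API_KEY="sk-litellm-local" \
ANTHROPIC_MODEL="kimi-vision" \
claude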