Model providers

This page covers LLM/model providers (not chat channels like WhatsApp/Telegram). For model selection rules, see /concepts/models.

Quick rules

  • Model refs use provider/model (example: ollama/qwen3-coder:32b).
  • If you set agents.defaults.models, it becomes the allowlist.
  • CLI helpers: datzi onboard, datzi models list, datzi models set <provider/model>.
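When agents.defaults.models is set, only the refs listed there can be selected. A minimal sketch (the 32b entry is a hypothetical second allowed model; the models map shape follows the local-proxy example later on this page):

```
{
  agents: {
    defaults: {
      model: {
        primary: 'ollama/qwen3-coder:14b'
      },
      // Listing models here turns this map into the allowlist:
      models: {
        'ollama/qwen3-coder:14b': {},
        'ollama/qwen3-coder:32b': {}
      }
    }
  }
}
```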

Built-in providers

Datzi runs on Ollama (qwen3-coder:14b). No external API keys required.

Providers via models.providers (custom/base URL)

Ollama

Ollama is a local LLM runtime that provides an OpenAI-compatible API:
  • Provider: ollama
  • Auth: None required (local server)
  • Example model: ollama/qwen3-coder:14b
  • Installation: https://ollama.ai
# Install Ollama, then pull the recommended model:
ollama pull qwen3-coder:14b
Then set it as the default model in your config:
{
  agents: {
    defaults: {
      model: {
        primary: 'ollama/qwen3-coder:14b'
      }
    }
  }
}
Ollama is automatically detected when running locally at http://127.0.0.1:11434/v1. See Local models for model recommendations and custom configuration.
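Auto-detection assumes the default local address. If your Ollama server runs elsewhere (for example, on another machine on your network), you can point the provider at it via models.providers. A sketch, assuming the same provider-block shape as the LM Studio example below; the host address is hypothetical:

```
{
  models: {
    providers: {
      ollama: {
        baseUrl: 'http://192.168.1.50:11434/v1',
        api: 'openai-completions'
      }
    }
  }
}
```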

vLLM

vLLM is a local (or self-hosted) OpenAI-compatible server:
  • Provider: vllm
  • Auth: Optional (depends on your server)
  • Default base URL: http://127.0.0.1:8000/v1
To opt in to auto-discovery locally (any value works if your server doesn’t enforce auth):
export VLLM_API_KEY="vllm-local"
Then set a model (replace with one of the IDs returned by /v1/models):
{
  agents: {
    defaults: {
      model: {
        primary: 'vllm/your-model-id'
      }
    }
  }
}
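To find valid IDs, you can query the server's /v1/models endpoint directly. A minimal standard-library sketch (the base URL is the vLLM default above; it returns an empty list when no server is reachable):

```python
import json
import urllib.error
import urllib.request


def list_model_ids(base_url="http://127.0.0.1:8000/v1"):
    """Return model ids reported by an OpenAI-compatible /v1/models
    endpoint, or an empty list if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=2) as resp:
            data = json.load(resp)
        return [m["id"] for m in data.get("data", [])]
    except (urllib.error.URLError, OSError, ValueError):
        return []


print(list_model_ids())
```

Any ID this prints can be used as vllm/&lt;id&gt; in the config above.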
See Local models for details on local proxy configuration.

Local proxies (LM Studio, vLLM, LiteLLM, etc.)

Example (OpenAI-compatible):
{
  agents: {
    defaults: {
      model: {
        primary: 'lmstudio/minimax-m2.1-gs32'
      },
      models: {
        'lmstudio/minimax-m2.1-gs32': {
          alias: 'Minimax'
        }
      }
    }
  },
  models: {
    providers: {
      lmstudio: {
        baseUrl: 'http://localhost:1234/v1',
        apiKey: 'LMSTUDIO_KEY',
        api: 'openai-completions',
        models: [
          {
            id: 'minimax-m2.1-gs32',
            name: 'MiniMax M2.1',
            reasoning: false,
            input: ['text'],
            cost: {
              input: 0,
              output: 0,
              cacheRead: 0,
              cacheWrite: 0
            },
            contextWindow: 200000,
            maxTokens: 8192
          }
        ]
      }
    }
  }
}
Notes:
  • For custom providers, reasoning, input, cost, contextWindow, and maxTokens are optional. When omitted, Datzi defaults to:
    • reasoning: false
    • input: ["text"]
    • cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }
    • contextWindow: 200000
    • maxTokens: 8192
  • Recommended: set explicit values that match your proxy/model limits.
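Those defaults behave like a shallow merge beneath your entry: explicit fields win, anything omitted falls back to the values above. An illustrative sketch (not Datzi's actual code; the model id is hypothetical):

```python
# Default fields applied to custom provider model entries
# (values from the notes above).
DEFAULTS = {
    "reasoning": False,
    "input": ["text"],
    "cost": {"input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0},
    "contextWindow": 200000,
    "maxTokens": 8192,
}


def with_defaults(entry):
    # Explicit fields in the entry win; omitted fields fall back to DEFAULTS.
    return {**DEFAULTS, **entry}


resolved = with_defaults({"id": "my-model", "contextWindow": 131072})
print(resolved["contextWindow"])  # 131072 (explicit value kept)
print(resolved["maxTokens"])      # 8192 (default applied)
```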

CLI examples

datzi onboard
datzi models set ollama/qwen3-coder:32b
datzi models list
See also: /gateway/configuration for full configuration examples.