Configuration Reference

Configuration files are loaded in order, with later sources overriding earlier ones:

  1. ~/.config/kodacode/config.yaml — global defaults
  2. ./kodacode.yaml — project-specific overrides

Environment variables (${VAR_NAME}) are expanded automatically in api_key, base_url, env, and headers fields.
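For instance, with OPENAI_API_KEY exported in your shell, this provider entry picks up the key's value at load time:

```yaml
providers:
  - id: openai
    api_key: ${OPENAI_API_KEY} # replaced with the environment value when the config is loaded
```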

Global config is loaded first, then project-local kodacode.yaml is merged on top. Merge behavior is field-specific:

| Shape | Behavior | Examples |
| --- | --- | --- |
| Scalars and simple nested fields | Override when the field is present in project config | default_agent, session.max_retries, tui.auto_resume |
| Keyed collections | Merge by stable key; matching entries are replaced or merged by field rules | providers by id, mcp.servers by name/command, lsp.servers by name, session.models by model key, skills.models by model key |
| Additive lists | Appended uniquely to the inherited value | allowed_paths, ignore_patterns |
| Ordered lists | Replaced wholesale | fallback_models, instruction_files |

Some keyed collections have additional field rules:

  • lsp.servers: project entries merge field-by-field with the inherited server of the same name.
  • skills.models: allow_all uses the project value when present; deny lists are appended uniquely.
  • permission: merged per tool at the pattern level — patterns from all layers are concatenated (defaults first, then global, then project/agent). Last matching pattern wins, so project patterns override inherited ones for the same glob while preserving unmentioned patterns. String shorthand rules (e.g. bash: allow) are expanded to * → allow before merging.
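As a concrete sketch of these rules (the agent name and patterns below are illustrative, not defaults), a scalar override and an additive list behave differently when the two files are merged:

```yaml
# ~/.config/kodacode/config.yaml (global)
default_agent: engineer
ignore_patterns:
  - "vendor/**"

# ./kodacode.yaml (project)
default_agent: planner    # scalar: replaces the inherited value
ignore_patterns:          # additive list: appended uniquely
  - "*.generated.go"

# merged result:
#   default_agent: planner
#   ignore_patterns: ["vendor/**", "*.generated.go"]
```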
```yaml
providers:
  - id: anthropic
    api_key: ${ANTHROPIC_API_KEY} # leave empty for OAuth
    thinking_budget: 10000 # default for all models from this provider
    thinking_type: adaptive # "adaptive" (default) or "enabled"
  - id: openai
    api_key: ${OPENAI_API_KEY}
    models: # optional static model definitions
      - id: custom-model
        name: Custom Model
        context_size: 128000
        thinking_budget: 8000 # per-model reasoning token budget
  - id: google
    api_key: ${GOOGLE_API_KEY} # leave empty for OAuth
    models:
      - id: gemini-3-pro-preview
        thinking_budget: 10000 # enable thinking for this model
  - id: groq
    api_key: ${GROQ_API_KEY}
    base_url: https://api.groq.com/openai/v1
  - id: ollama
    base_url: http://localhost:11434/v1 # local models auto-discovered
  - id: lmstudio
    base_url: http://localhost:8000/v1 # local models auto-discovered
```

The thinking_type field controls how thinking is injected for Anthropic OAuth/subscription requests:

| Value | Behavior |
| --- | --- |
| adaptive (default) | Model decides how much to think based on query complexity. More token-efficient. |
| enabled | Fixed token budget from thinking_budget. Predictable but uses more tokens. |

```yaml
# Adaptive: model decides thinking depth (recommended)
providers:
  - id: anthropic
    thinking_type: adaptive
```

```yaml
# Fixed budget: always reserve 10K tokens for thinking
providers:
  - id: anthropic
    thinking_type: enabled
    thinking_budget: 10000
```

When thinking_type is adaptive, the thinking_budget is ignored for OAuth requests (the model decides). When thinking_type is enabled, the thinking_budget sets the token budget (defaults to 10,000 if not specified).

The thinking_budget field controls reasoning tokens. Models without this field do not use extended thinking. Priority order:

  1. Agent frontmatter reasoning_budget (highest)
  2. Variant toggle (/variant cycling low/high/max)
  3. Per-model config thinking_budget
  4. Provider-level thinking_budget
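The precedence above amounts to a walk from most to least specific, taking the first configured value. A minimal Python sketch (function and parameter names are ours, not KodaCode's):

```python
def resolve_thinking_budget(agent_frontmatter=None, variant_toggle=None,
                            model_budget=None, provider_budget=None):
    """Return the first configured budget, highest priority first.

    None means "not set at this level"; if nothing is set,
    the model runs without extended thinking.
    """
    for budget in (agent_frontmatter, variant_toggle,
                   model_budget, provider_budget):
        if budget is not None:
            return budget
    return None

# Per-model config beats the provider default:
resolve_thinking_budget(model_budget=8000, provider_budget=10000)  # 8000
# Agent frontmatter beats everything else:
resolve_thinking_budget(agent_frontmatter=32000, model_budget=8000)  # 32000
```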
```yaml
utility_model: anthropic/claude-haiku-4-5-20251001 # for background tasks (titles, compaction)
reviewer_model: anthropic/claude-sonnet-4-6 # for the reviewer subagent (defaults to utility_model)
default_agent: engineer # agent for new sessions
fallback_models: # tried in order when primary fails
  - anthropic/claude-sonnet-4-6
  - openai/gpt-4o
```

The model for each session is selected via the /models command or the model picker on the home screen. The utility_model is used for lightweight background tasks (title generation, context compaction) and by agents configured with model: utility. The reviewer_model (format: provider/model-id) is used by the reviewer subagent; if not set, it defaults to utility_model.

```yaml
session:
  compaction_threshold: 0.8 # compact at 80% of context window
  compaction_keep_turns: 10 # preserve last 10 turns after compaction
  prune_protect_tokens: 40000 # protect last 40K tokens from pruning
  prune_min_savings: 20000 # only prune if saves 20K+ tokens
  context_limit: 0.9 # stop tool loop at 90% of context
  max_retries: 5 # retry transient provider/API failures
  tool_call_argument_timeout: 300 # seconds to finish streaming one tool call's JSON args
  engineer_review_limit: 3 # engineer review/stall retries before asking you what to do
  max_subagents: 10 # max concurrent subagent sessions
  subagent_timeout: 5 # minutes before a subagent is cancelled (default: 5)
  plan_approval: true # require user approval before executing plans (default: true)
  background_auto_react: true # auto-prompt model when background task completes
  snapshot: false # enable per-turn git snapshots (/replay to navigate)
  primary_max_steps: 250 # max tool calls per primary turn (default: 250)
  subagent_max_steps: 30 # max tool calls per subagent turn (default: 30)
  budget: 0.0 # max dollars per session (0 = unlimited)
  budget_warn: 0.8 # warn at this fraction of budget
  # Per-model overrides
  models:
    openai/gpt-4o:
      compaction_threshold: 0.9
      prune_protect_tokens: 80000
      context_limit: 0.95
    gpt-4.1:
      max_input_tokens: 64000 # override when provider caps below context
    ollama/qwen2.5-coder:
      tool_call_argument_timeout: 900 # slower local model: allow 15 minutes
```

| Field | Default | Description |
| --- | --- | --- |
| compaction_threshold | 0.8 | Trigger compaction at this fraction of context window |
| compaction_keep_turns | 10 | Number of recent turns to preserve after compaction |
| prune_protect_tokens | 40000 | Tokens protected from pruning |
| prune_min_savings | 20000 | Minimum token savings to trigger pruning |
| context_limit | 0.9 | Stop tool loop at this fraction of context |
| max_retries | 5 | Max retries for transient provider/API errors |
| tool_call_argument_timeout | 300 | Max seconds to let a model finish streaming one tool call's JSON arguments |
| engineer_review_limit | 3 | Max automatic engineer reviewer reruns and stalled-task retries before KodaCode asks how to proceed |
| max_subagents | 10 | Max concurrent subagent sessions |
| subagent_timeout | 5 | Minutes before a subagent is cancelled |
| plan_approval | true | Require user approval before executing plans |
| background_auto_react | true | Auto-send background task results to model |
| snapshot | false | Enable per-turn git snapshots |
| primary_max_steps | 250 | Max tool calls per primary agent turn |
| subagent_max_steps | 30 | Max tool calls per subagent turn |
| budget | 0 | Max session cost in dollars (0 = unlimited) |
| budget_warn | 0.8 | Fraction of budget that triggers a warning |

tool_call_argument_timeout only controls how long kodacode waits for the model to finish streaming one tool call’s JSON arguments.

  • It does not limit how long the tool itself may run.
  • It is most useful for slower local models that emit tool-call JSON gradually.
  • If a local model keeps tripping this limit, raise it globally under session or per model under session.models.

max_retries and engineer_review_limit cover different failure modes:

  • max_retries

    • retries transient provider/API failures such as rate limits or temporary upstream errors
    • does not control engineer task review or execution-stall handling
  • engineer_review_limit

    • controls how many times engineer will automatically retry reviewer verification for a task
    • also controls how many times engineer will retry a stalled active task before asking you what to do
    • default is 3
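Both knobs live under session; for example, to allow more automatic reviewer reruns on a codebase where reviews often need several passes (the value 5 here is illustrative):

```yaml
session:
  max_retries: 5           # transient provider/API failures only
  engineer_review_limit: 5 # reviewer reruns / stall retries per task
```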

The engineer workflow can ask these questions when it reaches a decision point:

  1. How would you like to proceed?

    • Save plan and proceed
    • Proceed without saving plan files
    • Reject plan
  2. Reviewer retry limit reached. How should KodaCode proceed?

    • Continue fixing this task without more automatic reviews
    • Accept unresolved reviewer findings and proceed
    • Stop execution and wait for me
  3. Execution is stalled on the current task. How should KodaCode proceed?

    • Keep working on the current task
    • Mark the current task blocked and stop
    • Stop and wait for me
```yaml
server:
  port: 0 # 0 = random available port (recommended)
tui:
  input_max_height: 8 # max rows for input textarea
  theme: default # theme name from ~/.config/kodacode/themes/
  display_turns: 4 # recent turns to keep rendered (0 = unlimited)
  error_display_time: 0 # seconds before error toast dismisses (0 = persist)
  auto_resume: false # resume last session on startup instead of home screen
  max_attachment_size: 20971520 # max file attachment size in bytes (default: 20 MB)
```

| Field | Default | Description |
| --- | --- | --- |
| input_max_height | 8 | Maximum height of the input textarea in rows |
| theme | "" | Theme name from ~/.config/kodacode/themes/ |
| display_turns | 4 | Number of recent turns to render (0 = all) |
| error_display_time | 0 | Seconds before error toasts auto-dismiss (0 = persist until next action) |
| auto_resume | false | Skip home screen and resume last session |
| max_attachment_size | 20971520 | Maximum file attachment size in bytes (default: 20 MB) |

Each theme file in ~/.config/kodacode/themes/ can specify a syntax_style to control code block highlighting. This maps to a chroma style name:

~/.config/kodacode/themes/my-theme.yaml

```yaml
name: my-theme
syntax_style: dracula # any valid chroma style name
palette:
  primary: "#bd93f9"
  surface: "#282a36"
  text: "#f8f8f2"
  # ...
```

If syntax_style is omitted, kodacode picks one automatically based on the theme name or palette brightness.

| Theme | syntax_style |
| --- | --- |
| Catppuccin Mocha | catppuccin-mocha |
| Dracula | dracula |
| Gruvbox Dark | gruvbox |
| Nord | nord |
| One Dark | onedark |
| Rosé Pine Moon | rose-pine-moon |
| Solarized Dark | solarized-dark |
| Tokyo Night | tokyonight-night |
| Any light theme | github |

Browse the chroma style gallery to preview available styles.

```yaml
permission:
  bash:
    "*": ask
    "ls *": allow
    "go test *": allow
    "rm *": deny
  read:
    "*": allow
    "*.env": deny
    "*.env.example": allow
  write: ask
  edit: allow
  glob: allow
  grep: allow
```

Permission rules are merged across layers (defaults → global → project → agent) at the pattern level. Patterns from each layer are appended in order, and the last matching pattern wins. This means a project config can override specific globs without losing patterns from earlier layers.

For example, if defaults define read: { "*": allow, "*.env": ask } and a project config adds read: { "*.env": deny }, the merged result is [* → allow, *.env → ask, *.env → deny] — the project’s deny wins for .env files while the default catch-all allow is preserved for other files.

String shorthand (e.g. bash: allow) is expanded to * → allow before merging.
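The layering above can be sketched in a few lines of Python. This is a simplified model, assuming glob-style matching via fnmatch; KodaCode's actual matcher and function names may differ:

```python
from fnmatch import fnmatch

def expand(rules):
    # String shorthand like "allow" becomes {"*": "allow"}.
    return {"*": rules} if isinstance(rules, str) else dict(rules)

def merge_layers(*layers):
    # Concatenate (pattern, action) pairs in layer order:
    # defaults first, then global, then project/agent.
    merged = []
    for layer in layers:
        merged.extend(expand(layer).items())
    return merged

def decide(merged, target, default="ask"):
    # Walk every pattern; the last matching pattern wins.
    action = default
    for pattern, rule in merged:
        if fnmatch(target, pattern):
            action = rule
    return action

defaults = {"*": "allow", "*.env": "ask"}
project = {"*.env": "deny"}
rules = merge_layers(defaults, project)
decide(rules, "secrets.env")  # "deny": the project pattern matches last
decide(rules, "main.go")      # "allow": default catch-all still applies
```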

See Permissions for full documentation.

```yaml
allowed_paths: # directories allowed without prompting
  - /tmp/shared-data
  - ~/shared-libs
ignore_patterns: # appended to built-in defaults
  - "*.generated.go"
  - "vendor/**"
```

See Sandbox for built-in defaults and details.

```yaml
memory_budget: 4000 # max characters for auto-memories in system prompt
```

Three servers are configured by default (gopls, vtsls, pyright). Additional servers are auto-discovered from your project. Servers start on demand — the first query to a matching file extension starts the server.

```yaml
lsp:
  servers:
    - name: gopls
      command: gopls
      extensions: [".go"]
      env:
        GOFLAGS: "-mod=mod"
    - name: vtsls
      command: vtsls
      args: ["--stdio"]
      extensions: [".ts", ".tsx", ".js", ".jsx"]
    - name: pyright
      command: pyright-langserver
      args: ["--stdio"]
      extensions: [".py", ".pyi"]
    - name: rust-analyzer
      command: rust-analyzer
      extensions: [".rs"]
      enabled: true # set false to disable
```

| Field | Required | Description |
| --- | --- | --- |
| name | Yes | Unique server identifier |
| command | Yes | Executable command |
| args | No | Command-line arguments |
| env | No | Additional environment variables |
| extensions | Yes | File extensions this server handles |
| enabled | No | Set false to disable (default: true) |

Project-local kodacode.yaml entries merge with global lsp.servers by name. Matching servers are overridden field-by-field, so you can disable or tweak one inherited server without re-declaring the rest.
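For example, a project that doesn't want Go tooling can switch off the inherited gopls entry alone:

```yaml
# ./kodacode.yaml
lsp:
  servers:
    - name: gopls    # matches the global server by name
      enabled: false # other inherited servers stay active
```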

```yaml
mcp:
  servers:
    - name: filesystem
      type: stdio
      command: npx
      args: ["-y", "@modelcontextprotocol/server-filesystem", "/path"]
      env:
        NODE_ENV: production
    - name: github
      type: stdio
      command: npx
      args: ["-y", "@modelcontextprotocol/server-github"]
      env:
        GITHUB_PERSONAL_ACCESS_TOKEN: ${GITHUB_TOKEN}
    - name: custom-sse
      type: sse
      url: https://example.com/mcp
      headers:
        Authorization: Bearer ${MCP_TOKEN}
      enabled: true # set false to disable
```

| Field | Required | Description |
| --- | --- | --- |
| name | Yes | Unique server name |
| type | Yes | stdio or sse |
| command | For stdio | Command to run |
| args | No | Command arguments |
| url | For sse | SSE endpoint URL |
| headers | No | HTTP headers (SSE only) |
| env | No | Environment variables (stdio only) |
| enabled | No | Set false to disable (default: true) |

See MCP Integration for setup details.

```yaml
instruction_files: # additional files loaded into system prompt
  - AGENTS.md
  - CLAUDE.md
  - CONTEXT.md
max_instruction_size: 32768 # max bytes per instruction file (0 = unlimited)
```

KodaCode also auto-detects KODACODE.md, KODACODE.local.md, AGENTS.md, CLAUDE.md, and CONTEXT.md in your project root. See Project Memory for details.

```yaml
skills:
  models:
    "openai/gpt-4o":
      allow_all: true # allow all skills for this model
      deny: ["dangerous-skill"] # except these
```

Project-local skills.models entries merge by model key. allow_all uses the project value when set, while deny lists are appended uniquely so project config can further restrict a globally allowed model.
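As an illustration (the skill names are hypothetical), a project can extend the global deny list without touching allow_all:

```yaml
# ~/.config/kodacode/config.yaml (global)
skills:
  models:
    "openai/gpt-4o":
      allow_all: true
      deny: ["dangerous-skill"]

# ./kodacode.yaml (project)
skills:
  models:
    "openai/gpt-4o":
      deny: ["experimental-skill"]

# merged result:
#   allow_all: true
#   deny: ["dangerous-skill", "experimental-skill"]
```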

```yaml
debug: false # enable debug logging to kodacode.log
log_prompts: false # log full prompts sent to providers
model_refresh_interval: 7 # days between model metadata refresh (0 = every startup)
```

```yaml
providers:
  - id: anthropic
    api_key: ${ANTHROPIC_API_KEY}
  - id: openai
    api_key: ${OPENAI_API_KEY}
  - id: ollama
    base_url: http://localhost:11434/v1

utility_model: anthropic/claude-haiku-4-5-20251001
reviewer_model: anthropic/claude-sonnet-4-6
default_agent: engineer
fallback_models:
  - openai/gpt-4o

session:
  compaction_threshold: 0.8
  compaction_keep_turns: 10
  context_limit: 0.9
  max_retries: 5
  max_subagents: 10
  subagent_timeout: 5
  background_auto_react: true
  snapshot: false
  primary_max_steps: 250
  subagent_max_steps: 30
  budget: 5.00
  budget_warn: 0.8

tui:
  input_max_height: 8
  display_turns: 4
  auto_resume: false

permission:
  bash:
    "*": ask
    "go test *": allow
    "go build *": allow
    "rm *": deny
  read: allow
  write: ask
  edit: allow

memory_budget: 4000
model_refresh_interval: 7
```