Configuration Reference

Configuration files are loaded in order, with later sources overriding earlier ones:

  1. ~/.config/kodacode/config.yaml — global defaults
  2. ./kodacode.yaml — project-specific overrides

Environment variables (${VAR_NAME}) are expanded automatically in api_key, base_url, env, and headers fields.
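For instance, with OPENAI_API_KEY exported in your shell, this provider entry picks up the key's value at load time:

```yaml
providers:
  - id: openai
    api_key: ${OPENAI_API_KEY} # replaced with the environment value when the config is loaded
```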

Global config is loaded first, then project-local kodacode.yaml is merged on top. Merge behavior is field-specific:

| Shape | Behavior | Examples |
| --- | --- | --- |
| Scalars and simple nested fields | Override when the field is present in project config | default_agent, session.max_retries, tui.auto_resume |
| Keyed collections | Merge by stable key; matching entries are replaced or merged by field rules | providers by id, mcp.servers by name/command, lsp.servers by name, session.models by model key, skills.models by model key |
| Additive lists | Appended uniquely to the inherited value | allowed_paths, ignore_patterns |
| Ordered lists | Replaced wholesale | fallback_models, instruction_files |

Some keyed collections have additional field rules:

  • lsp.servers: project entries merge field-by-field with the inherited server of the same name.
  • skills.models: allow_all uses the project value when present; deny lists are appended uniquely.
  • permission: merged per tool at the pattern level — patterns from all layers are concatenated (defaults first, then global, then project/agent). Last matching pattern wins, so project patterns override inherited ones for the same glob while preserving unmentioned patterns. String shorthand rules (e.g. bash: allow) are expanded to * → allow before merging.
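As a concrete sketch of these rules (the agent name and patterns below are illustrative, not defaults), a scalar override and an additive list behave differently when the two files are merged:

```yaml
# ~/.config/kodacode/config.yaml (global)
default_agent: engineer
ignore_patterns:
  - "vendor/**"

# ./kodacode.yaml (project)
default_agent: planner    # scalar: replaces the inherited value
ignore_patterns:          # additive list: appended uniquely
  - "*.generated.go"

# merged result:
#   default_agent: planner
#   ignore_patterns: ["vendor/**", "*.generated.go"]
```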
```yaml
providers:
  - id: anthropic
    api_key: ${ANTHROPIC_API_KEY} # leave empty for OAuth
    thinking_budget: 10000 # default for all models from this provider
    thinking_type: adaptive # "adaptive" (default) or "enabled"
  - id: openai
    api_key: ${OPENAI_API_KEY}
    models: # optional static model definitions
      - id: custom-model
        name: Custom Model
        context_size: 128000
        thinking_budget: 8000 # per-model reasoning token budget
  - id: google
    api_key: ${GOOGLE_API_KEY} # leave empty for OAuth
    models:
      - id: gemini-3-pro-preview
        thinking_budget: 10000 # enable thinking for this model
  - id: groq
    api_key: ${GROQ_API_KEY}
    base_url: https://api.groq.com/openai/v1
  - id: ollama
    base_url: http://localhost:11434/v1 # local models auto-discovered
  - id: lmstudio
    base_url: http://localhost:8000/v1 # local models auto-discovered
```

The thinking_type field controls how thinking is injected for Anthropic OAuth/subscription requests:

| Value | Behavior |
| --- | --- |
| adaptive (default) | Model decides how much to think based on query complexity. More token-efficient. |
| enabled | Fixed token budget from thinking_budget. Predictable but uses more tokens. |

```yaml
# Adaptive: model decides thinking depth (recommended)
providers:
  - id: anthropic
    thinking_type: adaptive
```

```yaml
# Fixed budget: always reserve 10K tokens for thinking
providers:
  - id: anthropic
    thinking_type: enabled
    thinking_budget: 10000
```

When thinking_type is adaptive, the thinking_budget is ignored for OAuth requests (the model decides). When thinking_type is enabled, the thinking_budget sets the token budget (defaults to 10,000 if not specified).

The thinking_budget field controls reasoning tokens. Models without this field do not use extended thinking. Priority order:

  1. Agent frontmatter reasoning_budget (highest)
  2. Variant toggle (/variant cycling low/high/max)
  3. Per-model config thinking_budget
  4. Provider-level thinking_budget
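The precedence above amounts to a walk from most to least specific, taking the first configured value. A minimal Python sketch (function and parameter names are ours, not KodaCode's):

```python
def resolve_thinking_budget(agent_frontmatter=None, variant_toggle=None,
                            model_budget=None, provider_budget=None):
    """Return the first configured budget, highest priority first.

    None means "not set at this level"; if nothing is set,
    the model runs without extended thinking.
    """
    for budget in (agent_frontmatter, variant_toggle,
                   model_budget, provider_budget):
        if budget is not None:
            return budget
    return None

# Per-model config beats the provider default:
resolve_thinking_budget(model_budget=8000, provider_budget=10000)  # 8000
# Agent frontmatter beats everything else:
resolve_thinking_budget(agent_frontmatter=32000, model_budget=8000)  # 32000
```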
```yaml
utility_model: anthropic/claude-haiku-4-5-20251001 # for background tasks (titles, compaction)
reviewer_model: anthropic/claude-sonnet-4-6 # for the reviewer subagent (defaults to utility_model)
default_agent: engineer # agent for new sessions
fallback_models: # tried in order when primary fails
  - anthropic/claude-sonnet-4-6
  - openai/gpt-4o
```

The model for each session is selected via the /models command or the model picker on the home screen. The utility_model is used for lightweight background tasks (title generation, context compaction) and by agents configured with model: utility. The reviewer_model (format: provider/model-id) is used by the reviewer subagent; if not set, it defaults to utility_model.

```yaml
session:
  compaction_threshold: 0.8 # compact at 80% of context window
  compaction_keep_turns: 10 # preserve last 10 turns after compaction
  prune_protect_tokens: 40000 # protect last 40K tokens from pruning
  prune_min_savings: 20000 # only prune if saves 20K+ tokens
  context_limit: 0.9 # stop tool loop at 90% of context
  max_retries: 5 # retry transient provider/API failures
  tool_call_argument_timeout: 300 # seconds to finish streaming one tool call's JSON args
  engineer_review_limit: 3 # engineer review/stall retries before asking you what to do
  max_subagents: 10 # max concurrent subagent sessions
  subagent_timeout: 5 # minutes before a subagent is cancelled (default: 5)
  plan_approval: true # require user approval before executing plans (default: true)
  background_auto_react: true # auto-prompt model when background task completes
  snapshot: false # enable per-turn git snapshots (/replay to navigate)
  primary_max_steps: 250 # max tool calls per primary turn (default: 250)
  subagent_max_steps: 30 # max tool calls per subagent turn (default: 30)
  budget: 0.0 # max dollars per session (0 = unlimited)
  budget_warn: 0.8 # warn at this fraction of budget
  # Per-model overrides
  models:
    openai/gpt-4o:
      compaction_threshold: 0.9
      prune_protect_tokens: 80000
      context_limit: 0.95
    gpt-4.1:
      max_input_tokens: 64000 # override when provider caps below context
    ollama/qwen2.5-coder:
      tool_call_argument_timeout: 900 # slower local model: allow 15 minutes
```

| Field | Default | Description |
| --- | --- | --- |
| compaction_threshold | 0.8 | Trigger compaction at this fraction of context window |
| compaction_keep_turns | 10 | Number of recent turns to preserve after compaction |
| prune_protect_tokens | 40000 | Tokens protected from pruning |
| prune_min_savings | 20000 | Minimum token savings to trigger pruning |
| context_limit | 0.9 | Stop tool loop at this fraction of context |
| max_retries | 5 | Max retries for transient provider/API errors |
| tool_call_argument_timeout | 300 | Max seconds to let a model finish streaming one tool call's JSON arguments |
| engineer_review_limit | 3 | Max automatic engineer reviewer reruns and stalled-task retries before KodaCode asks how to proceed |
| max_subagents | 10 | Max concurrent subagent sessions |
| subagent_timeout | 5 | Minutes before a subagent is cancelled |
| plan_approval | true | Require user approval before executing plans |
| background_auto_react | true | Auto-send background task results to model |
| snapshot | false | Enable per-turn git snapshots |
| primary_max_steps | 250 | Max tool calls per primary agent turn |
| subagent_max_steps | 30 | Max tool calls per subagent turn |
| budget | 0 | Max session cost in dollars (0 = unlimited) |
| budget_warn | 0.8 | Fraction of budget that triggers a warning |

tool_call_argument_timeout only controls how long kodacode waits for the model to finish streaming one tool call’s JSON arguments.

  • It does not limit how long the tool itself may run.
  • It is most useful for slower local models that emit tool-call JSON gradually.
  • If a local model keeps tripping this limit, raise it globally under session or per model under session.models.

max_retries and engineer_review_limit cover different failure modes:

  • max_retries

    • retries transient provider/API failures such as rate limits or temporary upstream errors
    • does not control engineer task review or execution-stall handling
  • engineer_review_limit

    • controls how many times engineer will automatically retry reviewer verification for a task
    • also controls how many times engineer will retry a stalled active task before asking you what to do
    • default is 3
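Both knobs live under session; for example, to allow more automatic reviewer reruns on a codebase where reviews often need several passes (the value 5 here is illustrative):

```yaml
session:
  max_retries: 5           # transient provider/API failures only
  engineer_review_limit: 5 # reviewer reruns / stall retries per task
```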

The engineer workflow can ask these questions when it reaches a decision point:

  1. How would you like to proceed?

    • Save plan and proceed
    • Proceed without saving plan files
    • Reject plan
  2. Reviewer retry limit reached. How should KodaCode proceed?

    • Continue fixing this task without more automatic reviews
    • Accept unresolved reviewer findings and proceed
    • Stop execution and wait for me
  3. Execution is stalled on the current task. How should KodaCode proceed?

    • Keep working on the current task
    • Mark the current task blocked and stop
    • Stop and wait for me
```yaml
server:
  port: 0 # 0 = random available port (recommended)
tui:
  input_max_height: 8 # max rows for input textarea
  theme: default # theme name from ~/.config/kodacode/themes/
  display_turns: 4 # recent turns to keep rendered (0 = unlimited)
  error_display_time: 0 # seconds before error toast dismisses (0 = persist)
  auto_resume: false # resume last session on startup instead of home screen
  max_attachment_size: 20971520 # max file attachment size in bytes (default: 20 MB)
```

| Field | Default | Description |
| --- | --- | --- |
| input_max_height | 8 | Maximum height of the input textarea in rows |
| theme | "" | Theme name from ~/.config/kodacode/themes/ |
| display_turns | 4 | Number of recent turns to render (0 = all) |
| error_display_time | 0 | Seconds before error toasts auto-dismiss (0 = persist until next action) |
| auto_resume | false | Skip home screen and resume last session |
| max_attachment_size | 20971520 | Maximum file attachment size in bytes (default: 20 MB) |

Each theme file in ~/.config/kodacode/themes/ can specify a syntax_style to control code block highlighting. This maps to a chroma style name:

~/.config/kodacode/themes/my-theme.yaml

```yaml
name: my-theme
syntax_style: dracula # any valid chroma style name
palette:
  primary: "#bd93f9"
  surface: "#282a36"
  text: "#f8f8f2"
  # ...
```

If syntax_style is omitted, kodacode picks one automatically based on the theme name or palette brightness.

| Theme | syntax_style |
| --- | --- |
| Catppuccin Mocha | catppuccin-mocha |
| Dracula | dracula |
| Gruvbox Dark | gruvbox |
| Nord | nord |
| One Dark | onedark |
| Rosé Pine Moon | rose-pine-moon |
| Solarized Dark | solarized-dark |
| Tokyo Night | tokyonight-night |
| Any light theme | github |

Browse the chroma style gallery to preview available styles.

```yaml
permission:
  bash:
    "*": ask
    "ls *": allow
    "go test *": allow
    "rm *": deny
  read:
    "*": allow
    "*.env": deny
    "*.env.example": allow
  write: ask
  edit: allow
  glob: allow
  grep: allow
```

Permission rules are merged across layers (defaults → global → project → agent) at the pattern level. Patterns from each layer are appended in order, and the last matching pattern wins. This means a project config can override specific globs without losing patterns from earlier layers.

For example, if defaults define read: { "*": allow, "*.env": ask } and a project config adds read: { "*.env": deny }, the merged result is [* → allow, *.env → ask, *.env → deny] — the project’s deny wins for .env files while the default catch-all allow is preserved for other files.

String shorthand (e.g. bash: allow) is expanded to * → allow before merging.
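The layering above can be sketched in a few lines of Python. This is a simplified model, assuming glob-style matching via fnmatch; KodaCode's actual matcher and function names may differ:

```python
from fnmatch import fnmatch

def expand(rules):
    # String shorthand like "allow" becomes {"*": "allow"}.
    return {"*": rules} if isinstance(rules, str) else dict(rules)

def merge_layers(*layers):
    # Concatenate (pattern, action) pairs in layer order:
    # defaults first, then global, then project/agent.
    merged = []
    for layer in layers:
        merged.extend(expand(layer).items())
    return merged

def decide(merged, target, default="ask"):
    # Walk every pattern; the last matching pattern wins.
    action = default
    for pattern, rule in merged:
        if fnmatch(target, pattern):
            action = rule
    return action

defaults = {"*": "allow", "*.env": "ask"}
project = {"*.env": "deny"}
rules = merge_layers(defaults, project)
decide(rules, "secrets.env")  # "deny": the project pattern matches last
decide(rules, "main.go")      # "allow": default catch-all still applies
```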

See Permissions for full documentation.

```yaml
allowed_paths: # directories allowed without prompting
  - /tmp/shared-data
  - ~/shared-libs
ignore_patterns: # appended to built-in defaults
  - "*.generated.go"
  - "vendor/**"
```

See Sandbox for built-in defaults and details.

```yaml
memory_budget: 4000 # max characters for auto-memories in system prompt
```

Three servers are configured by default (gopls, vtsls, pyright). Additional servers are auto-discovered from your project. Servers start on demand — the first query to a matching file extension starts the server.

```yaml
lsp:
  servers:
    - name: gopls
      command: gopls
      extensions: [".go"]
      env:
        GOFLAGS: "-mod=mod"
    - name: vtsls
      command: vtsls
      args: ["--stdio"]
      extensions: [".ts", ".tsx", ".js", ".jsx"]
    - name: pyright
      command: pyright-langserver
      args: ["--stdio"]
      extensions: [".py", ".pyi"]
    - name: rust-analyzer
      command: rust-analyzer
      extensions: [".rs"]
      enabled: true # set false to disable
```

| Field | Required | Description |
| --- | --- | --- |
| name | Yes | Unique server identifier |
| command | Yes | Executable command |
| args | No | Command-line arguments |
| env | No | Additional environment variables |
| extensions | Yes | File extensions this server handles |
| enabled | No | Set false to disable (default: true) |

Project-local kodacode.yaml entries merge with global lsp.servers by name. Matching servers are overridden field-by-field, so you can disable or tweak one inherited server without re-declaring the rest.
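For example, a project that doesn't want Go tooling can switch off the inherited gopls entry alone:

```yaml
# ./kodacode.yaml
lsp:
  servers:
    - name: gopls    # matches the global server by name
      enabled: false # other inherited servers stay active
```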

```yaml
mcp:
  servers:
    - name: filesystem
      type: stdio
      command: npx
      args: ["-y", "@modelcontextprotocol/server-filesystem", "/path"]
      env:
        NODE_ENV: production
    - name: github
      type: stdio
      command: npx
      args: ["-y", "@modelcontextprotocol/server-github"]
      env:
        GITHUB_PERSONAL_ACCESS_TOKEN: ${GITHUB_TOKEN}
    - name: custom-sse
      type: sse
      url: https://example.com/mcp
      headers:
        Authorization: Bearer ${MCP_TOKEN}
      enabled: true # set false to disable
```

| Field | Required | Description |
| --- | --- | --- |
| name | Yes | Unique server name |
| type | Yes | stdio or sse |
| command | For stdio | Command to run |
| args | No | Command arguments |
| url | For sse | SSE endpoint URL |
| headers | No | HTTP headers (SSE only) |
| env | No | Environment variables (stdio only) |
| enabled | No | Set false to disable (default: true) |

See MCP Integration for setup details.

```yaml
instruction_files: # additional files loaded into system prompt
  - AGENTS.md
  - CLAUDE.md
  - CONTEXT.md
max_instruction_size: 32768 # max bytes per instruction file (0 = unlimited)
```

KodaCode also auto-detects KODACODE.md, KODACODE.local.md, AGENTS.md, CLAUDE.md, and CONTEXT.md in your project root. See Project Memory for details.

```yaml
skills:
  models:
    "openai/gpt-4o":
      allow_all: true # allow all skills for this model
      deny: ["dangerous-skill"] # except these
```

Project-local skills.models entries merge by model key. allow_all uses the project value when set, while deny lists are appended uniquely so project config can further restrict a globally allowed model.
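As an illustration (the skill names are hypothetical), a project can extend the global deny list without touching allow_all:

```yaml
# ~/.config/kodacode/config.yaml (global)
skills:
  models:
    "openai/gpt-4o":
      allow_all: true
      deny: ["dangerous-skill"]

# ./kodacode.yaml (project)
skills:
  models:
    "openai/gpt-4o":
      deny: ["experimental-skill"]

# merged result:
#   allow_all: true
#   deny: ["dangerous-skill", "experimental-skill"]
```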

```yaml
debug: false # enable debug logging to kodacode.log
log_prompts: false # log full prompts sent to providers
model_refresh_interval: 7 # days between model metadata refresh (0 = every startup)
```

```yaml
providers:
  - id: anthropic
    api_key: ${ANTHROPIC_API_KEY}
  - id: openai
    api_key: ${OPENAI_API_KEY}
  - id: ollama
    base_url: http://localhost:11434/v1

utility_model: anthropic/claude-haiku-4-5-20251001
reviewer_model: anthropic/claude-sonnet-4-6
default_agent: engineer
fallback_models:
  - openai/gpt-4o

session:
  compaction_threshold: 0.8
  compaction_keep_turns: 10
  context_limit: 0.9
  max_retries: 5
  max_subagents: 10
  subagent_timeout: 5
  background_auto_react: true
  snapshot: false
  primary_max_steps: 250
  subagent_max_steps: 30
  budget: 5.00
  budget_warn: 0.8

tui:
  input_max_height: 8
  display_turns: 4
  auto_resume: false

permission:
  bash:
    "*": ask
    "go test *": allow
    "go build *": allow
    "rm *": deny
  read: allow
  write: ask
  edit: allow

memory_budget: 4000
model_refresh_interval: 7
```