Configuration
KodaCode keeps runtime configuration in ~/.config/kodacode/config.yaml and credentials in ~/.config/kodacode/auth.yaml.
This page documents the current persisted config.yaml surface. If a key is
not listed here, current releases do not persist or apply it from
config.yaml.
The bootstrapped config.yaml includes a yaml-language-server schema comment
that points at the published schema in schema/config.schema.json, so editors
with YAML schema support can validate keys as you type.
One important omission is intentional: current releases do not load a
top-level lsp: block from config.yaml. Language-server defaults and project
discovery are runtime-managed today. See
Code Intelligence & LSP.
Storage locations
KodaCode uses separate config, data, and state roots:
- config root: $XDG_CONFIG_HOME/kodacode or ~/.config/kodacode
- data root: $XDG_DATA_HOME/kodacode or ~/.local/share/kodacode
- state root: $XDG_STATE_HOME/kodacode or ~/.local/state/kodacode
Important default files and directories:
- config.yaml: runtime configuration
- auth.yaml: provider credentials and OAuth state
- AGENTS.md: optional global instructions file loaded from the config root
- prompts/: optional prompt overrides in the config root
- kodacode.db: main SQLite-backed session, event, and artifact store in the data root
- models-cache.json: model catalog cache in the data root
- ops.log: standard runtime log in the data root
- debug.log: verbose debug log in the data root when logging.debug: true
- search/: default hybrid search index root in the state root
Current releases document the SQLite-backed runtime storage layout above.
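The fallback rule for the three roots can be sketched in a few lines of Python. This is only an illustration of the lookup order listed above, not KodaCode's own code: the function name kodacode_roots is hypothetical, and treating an empty XDG variable the same as an unset one is an assumption.

```python
import os
from pathlib import Path


def kodacode_roots(env=None, home=None):
    """Return the config, data, and state roots using the XDG fallbacks listed above."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    def root(var, *fallback):
        base = env.get(var)
        # Assumption: an unset or empty XDG variable falls back to the home-relative default.
        return (Path(base) if base else home.joinpath(*fallback)) / "kodacode"

    return {
        "config": root("XDG_CONFIG_HOME", ".config"),
        "data": root("XDG_DATA_HOME", ".local", "share"),
        "state": root("XDG_STATE_HOME", ".local", "state"),
    }
```

Calling kodacode_roots() with no arguments reads the real environment; passing env and home makes the lookup easy to check in isolation.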
Easy example
This is a good starting point for one person working in the TUI with one main model:

    version: 1

    providers:
      - id: openai
      - id: anthropic

    model:
      primary: openai/gpt-5

    utility_model: openai/gpt-5-mini

    sessions:
      budget: 5 # default: 0 (no limit)
      budget_warn: 0.8 # default: 0 (no warning)
      response_style: terse # default: terse

    execution:
      permission_mode: auto # default: auto
      network: disabled # default: disabled

    permissions:
      external_directory:
        "*": ask
        "/Users/you/src/personal/**": allow
      bash:
        "*": ask
        "git status*": allow
      web_fetch:
        "*": ask
        "https://docs.go101.org/**": allow
      network_target:
        "*": ask
        "api.github.com": allow

    tui:
      theme: ayu-dark
      # display_turns: 6

Why this works well:
- model.primary gives you one clear default route
- utility_model keeps runtime utility text work cheaper
- sessions.budget and budget_warn make cost visible before it becomes a surprise
- sessions.response_style defaults to terse, which trims normal model prose without affecting safety-critical explanations
- execution.network: disabled keeps command-backed tools conservative by default
- permissions lets runtime default repeated safe cases to allow, ask, or deny without moving permission ownership into prompts
OpenAI account types
KodaCode keeps OpenAI Platform API access and ChatGPT/Codex OAuth access as separate provider routes:
| Provider ID | Account type | Example model ref |
|---|---|---|
| openai | OpenAI Platform API key | openai/gpt-5 |
| openai-codex | ChatGPT/Codex OAuth account | openai-codex/gpt-5 |
The TUI /connect OpenAI OAuth flow configures the ChatGPT/Codex account
route. Use openai-codex/... for models selected through that account.
If both account types are available, both routes can be connected at the same
time. If only ChatGPT/Codex OAuth is available, stored openai/... model refs
are normalized to openai-codex/... at runtime so older selections do not fail
with provider not configured.
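The normalization described above can be pictured as a small pure function. This is a minimal sketch under stated assumptions: normalize_model_ref and the connected_providers set are hypothetical names, and the real runtime may track connected routes differently.

```python
def normalize_model_ref(ref, connected_providers):
    """Rewrite openai/... to openai-codex/... when only the ChatGPT/Codex route is connected."""
    provider, sep, model = ref.partition("/")
    if not sep:
        return ref  # not a provider/model ref; leave it untouched
    only_codex = (
        "openai-codex" in connected_providers
        and "openai" not in connected_providers
    )
    if provider == "openai" and only_codex:
        return f"openai-codex/{model}"
    return ref
```

When both routes are connected the stored ref is returned unchanged, which matches the statement that both account types can be used side by side.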
The openai-codex model list comes from the ChatGPT/Codex account endpoint.
models.dev currently publishes OpenAI metadata under openai; KodaCode uses
that metadata only to enrich matching openai-codex model IDs with context,
output, capability, and cost information. It does not make Platform-only models
available through openai-codex.
Advanced example
This is a better fit for a longer-running workspace where you want stronger session controls, hybrid search, and a local compatible provider:

    version: 1

    providers:
      - id: openai
        prompt_cache_retention: 24h # optional OpenAI prompt cache retention
        responses_store: false # default: false
        encrypted_reasoning_replay: true # default: true
      - id: anthropic
      - id: local
        base_url: http://localhost:11434/v1

    model:
      primary: openai/gpt-5

    utility_model: openai/gpt-5-mini
    utility_model_timeout_seconds: 20 # default: 0 (no timeout)
    utility_model_retry_attempts: 1 # default: 1
    utility_model_retry_after_max_seconds: 5 # default: 5

    model_overrides:
      - ref: local/qwen2.5-coder-32b-instruct
        name: Local Qwen Coder
        context_size: 32768
        max_output_tokens: 8192
        default_output_tokens: 4096
        tool_calls: true
        cost_input: 0
        cost_output: 0

    output_budgets:
      session_title: 128
      utility_text: 2048
      review: 4096
      agent_turn: 8192
      agent_turn_thinking: 16000
      workspace_compress: 1200
      session_compaction: 4096

    sessions:
      db_path: /Users/you/.local/share/kodacode/work.db
      budget: 15 # default: 0 (no limit)
      budget_warn: 0.75 # default: 0 (no warning)
      total_budget: 100 # default: 0 (no limit)
      total_budget_warn: 0.8 # default: 0 (no warning)
      response_style: terse # default: terse
      compaction_threshold: 0.8 # default: 0.8
      compaction_target_threshold: 0.60 # default: 0.60
      max_provider_requests_per_turn: 32 # default: 32
      max_output_continuations: 1 # default: 1, max: 2
      max_retries: 2 # default: 2

    search:
      skip_dirs:
        - coverage
        - .next
      embeddings_model: openai/text-embedding-3-small
      embeddings_dimensions: 1536
      prewarm_embeddings: true # default: false

    context_packet:
      enabled_sections: # default: []
        - repo
        - git
        - git_dirty_summary
        - diagnostics

    model_cache:
      expiry_days: 7 # default: 7

    retention:
      expiry_days: 14

    logging:
      debug: true # default: false

    execution:
      permission_mode: auto # default: auto
      network: disabled # default: disabled

    permissions:
      external_directory:
        "*": ask
        "/Users/you/src/personal/**": allow
      bash:
        "*": ask
        "git status*": allow
        "git diff*": allow
        "git commit*": deny
      web_fetch:
        "*": ask
        "https://docs.go101.org/**": allow
      network_target:
        "*": ask
        "api.github.com": allow
        "package registries": ask

    tui:
      theme: ayu-dark
      display_turns: 6

Why you would reach for this:
- a local compatible provider gives you a low-cost fallback path
- model_overrides fills in missing metadata so cost display and routing stay useful
- output_budgets keeps requested output tokens below provider ceilings while leaving room for explicit per-model overrides
- cross-session budgets help when you work in the same repo over many days
- leaving response_style unset keeps the default terse behavior; set it to default only when you want fuller ordinary narration
- compaction thresholds keep long sessions usable without hiding what changed
- debug logging gives you a real trail when provider routing or tool execution needs investigation
- permissions reduces repeat approval churn while keeping the runtime as the permission authority
- search.prewarm_embeddings: true spends background work up front so hybrid search can feel faster in active workspaces
- search.skip_dirs adds extra exact directory names to ignore while keeping built-in skips for .git, node_modules, and vendor
Search embedding examples
Two good example embedding routes are:
- openai/text-embedding-3-small
- nomic-embed-text-v1.5 through either Ollama or LM Studio

Complete search example:

    version: 1

    providers:
      - id: openai

    search:
      index_dir: /Users/you/.local/state/kodacode/search
      skip_dirs:
        - coverage
        - dist
        - .next
      embeddings_model: openai/text-embedding-3-small
      embeddings_dimensions: 1536
      prewarm_embeddings: true

OpenAI:

    providers:
      - id: openai

    search:
      skip_dirs:
        - coverage
        - .next
      embeddings_model: openai/text-embedding-3-small
      embeddings_dimensions: 1536
      prewarm_embeddings: true

Ollama:

    providers:
      - id: ollama
        base_url: http://localhost:11434/v1

    search:
      embeddings_model: ollama/nomic-embed-text-v1.5
      # Omit embeddings_dimensions unless your local provider requires it.
      prewarm_embeddings: true

LM Studio:

    providers:
      - id: lmstudio
        base_url: http://localhost:1234/v1

    search:
      embeddings_model: lmstudio/nomic-embed-text-v1.5
      # Omit embeddings_dimensions unless your local provider requires it.
      prewarm_embeddings: true

The provider prefix in search.embeddings_model must match the provider ID you
configured above. embeddings_dimensions is useful when the provider expects an
explicit width; for text-embedding-3-small that is 1536. For local
OpenAI-compatible models such as Ollama or LM Studio, leave
embeddings_dimensions unset unless the provider or model explicitly requires a
fixed width.
prewarm_embeddings is optional and defaults to false. When you set it to
true, kodacode will build and refresh file embedding caches in the
background for tracked workspaces. It requires search.embeddings_model.
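The two consistency rules above (the embeddings_model prefix must name a configured provider, and prewarm_embeddings needs an embeddings_model) can be checked before starting the runtime. The following is a rough sketch that operates on an already-parsed config dictionary; check_search_config is a hypothetical helper, and the message wording is invented for illustration.

```python
def check_search_config(config):
    """Return a list of problems with the search block; an empty list means it looks consistent."""
    provider_ids = {p["id"] for p in config.get("providers", [])}
    search = config.get("search", {})
    ref = search.get("embeddings_model")
    problems = []
    if ref is None:
        # prewarm_embeddings only makes sense when an embedding route is configured.
        if search.get("prewarm_embeddings"):
            problems.append("prewarm_embeddings requires search.embeddings_model")
        return problems
    prefix, sep, model = ref.partition("/")
    if not sep or not model:
        problems.append(f"{ref!r} is not of the form provider/model")
    elif prefix not in provider_ids:
        problems.append(f"provider {prefix!r} is not configured under providers")
    return problems
```

A config that pairs providers: [{id: ollama}] with embeddings_model: ollama/nomic-embed-text-v1.5 produces no problems, while a prefix with no matching provider entry is reported.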
skip_dirs is optional. It adds extra exact directory names to skip during
lexical traversal, hybrid indexing, and background refresh. Built-in skips for
.git, node_modules, and vendor always apply.
If you do not set search.index_dir, the default index location is
$XDG_STATE_HOME/kodacode/search or ~/.local/state/kodacode/search.
Hybrid ranking uses fixed internal reciprocal-rank fusion and simple path-aware adjustments so source files tend to rank above docs, tests, and generated paths. Those ranking details are not part of the public config surface.
The public search config surface is intentionally small:
- search.index_dir
- search.skip_dirs
- search.embeddings_model
- search.embeddings_dimensions
- search.prewarm_embeddings
Web search
web_search is a separate runtime capability for query-based web discovery. It
is not configured through the top-level providers: block.
Keep these boundaries in mind:
- providers: configures model and embedding providers
- web_search.providers: configures web-search backends such as Exa and Parallel
- web_fetch remains the direct-URL retrieval tool
web_search is only registered when runtime has at least one valid
web_search provider plus matching auth. Auth alone does not enable it.
Current built-in web_search backends:
- exa
- parallel

Provider requirements:
- each configured web_search.providers entry needs a kind
- supported kinds are exa and parallel
- base_url is optional
- timeout_ms is optional
- if exactly one provider is configured, default_provider may be omitted
- if more than one provider is configured, default_provider is required
- every configured provider needs an API key at startup, even if it is not the default provider

Default backend values when you omit them:
- Exa base_url: https://api.exa.ai
- Parallel base_url: https://api.parallel.ai
- timeout_ms: 15000
Single-provider example:

    web_search:
      providers:
        exa:
          kind: exa

Two-provider example with Exa as the default:

    web_search:
      default_provider: exa
      providers:
        exa:
          kind: exa
          base_url: https://api.exa.ai
        parallel:
          kind: parallel
          base_url: https://api.parallel.ai

Current multi-provider behavior is runtime-owned and simple: kodacode always
uses web_search.default_provider. The model does not choose between Exa and
Parallel per call.
One subtle but important rule: auth lookup uses the provider ID key under
web_search.providers, not the vendor kind. For example:

    web_search:
      providers:
        research:
          kind: exa

That config expects an auth entry named research, not exa.
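The provider requirements and the auth-lookup rule can be combined into one pre-flight check. This sketch works on parsed web_search and auth.yaml dictionaries; validate_web_search is a hypothetical name, and the final check (that default_provider names a configured entry) is an assumption the text does not state explicitly.

```python
SUPPORTED_KINDS = {"exa", "parallel"}


def validate_web_search(web_search, auth):
    """Return a list of problems; an empty list means web_search could be registered."""
    problems = []
    providers = web_search.get("providers", {})
    for provider_id, entry in providers.items():
        if entry.get("kind") not in SUPPORTED_KINDS:
            problems.append(f"{provider_id}: kind must be exa or parallel")
        # Auth is looked up by the provider ID key, not by the vendor kind.
        if provider_id not in auth:
            problems.append(f"{provider_id}: no auth entry named {provider_id!r}")
    default = web_search.get("default_provider")
    if len(providers) > 1 and default is None:
        problems.append("default_provider is required when more than one provider is configured")
    # Assumption: a default that names no configured provider is treated as an error.
    if default is not None and default not in providers:
        problems.append(f"default_provider {default!r} is not a configured provider")
    return problems
```

With the research example above, an auth.yaml that only contains an exa entry is reported as missing the research key.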
Top-level config areas
- providers: provider IDs plus optional base URLs; credentials stay in auth.yaml
- web_search: query-based web discovery backends; credentials also stay in auth.yaml
- model: primary model selection
- model_overrides: local model metadata and pricing overrides
- output_budgets: default requested output-token budgets by runtime role; provider/model max tokens remain hard ceilings
- utility_model: cheaper route for runtime utility text work; unset by default
- utility_model_timeout_seconds: timeout in seconds for utility-model requests; default 0 (no timeout)
- utility_model_retry_attempts: retries allowed after the initial utility request attempt; default 1
- utility_model_retry_after_max_seconds: maximum provider backoff delay accepted before utility falls back; default 5
- mcp: MCP server configuration
- workflow: review mode and optional review model; default review_mode: manual
- search: search index path, extra skipped directories, embedding model settings, and optional background cache prewarm
- sessions: persistence, budgets, terse/default response style, compaction, retries, turn limits
- execution: permission mode (default: auto), network policy (default: disabled), and optional shell defaults for command-backed tools
- permissions: runtime-owned default approval policy for external directories, bash previews, web fetch URLs, and network targets
- model_cache: cloud catalog cache expiry; default expiry_days: 7
- retention: unified expiry policy for derived DB artifacts, search cache files, and app logs; default expiry_days: 0
- logging: log directory and debug logging
- tui: current theme selection and optional transcript display limit
There is currently no persisted top-level lsp config area.
MCP server example
mcp.servers configures external Model Context Protocol tools. Current
releases support stdio servers:

    mcp:
      servers:
        - name: docs
          type: stdio
          command: node
          args:
            - /Users/me/tools/company-docs-mcp/dist/server.js
          env:
            DOCS_TOKEN: "replace-with-your-token"
          tool_hints:
            search:
              summary: Search internal product documentation.
              guidance: Use for billing, plan, and support-procedure questions.

Startup trust is per workspace and per server fingerprint. If type,
command, args, or environment variable keys change, KodaCode asks for trust
again. See MCP Servers for the user workflow and agent
permission examples.
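One way to picture a server fingerprint is a hash over exactly the fields named above. This is a hypothetical illustration, not KodaCode's actual fingerprint format: server_fingerprint, the SHA-256 choice, and the JSON encoding are all assumptions. The point it shows is that environment variable keys take part while their values do not.

```python
import hashlib
import json


def server_fingerprint(server):
    """Hash the fields whose change would trigger a new trust prompt."""
    material = {
        "type": server.get("type"),
        "command": server.get("command"),
        "args": list(server.get("args", [])),
        # Only the variable names are included; secret values stay out of the hash.
        "env_keys": sorted(server.get("env", {})),
    }
    encoded = json.dumps(material, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()
```

Under this sketch, rotating DOCS_TOKEN leaves the fingerprint unchanged, while adding a new variable or changing args produces a different one.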
OpenAI Responses state
The OpenAI Platform provider may opt into longer-lived prompt cache retention and choose how Responses API state is handled:

    providers:
      - id: openai
        prompt_cache_retention: 24h
        responses_store: false
        encrypted_reasoning_replay: true

prompt_cache_retention allowed values are in_memory and 24h. Leave the
setting unset to use the provider default.
responses_store defaults to false, keeping OpenAI Responses requests
stateless and locally replayable. With responses_store: false,
encrypted_reasoning_replay defaults to true; KodaCode requests
reasoning.encrypted_content, stores the returned encrypted reasoning item in
the local event log, and replays it in later tool-loop requests. Set
encrypted_reasoning_replay: false only if you do not want encrypted reasoning
items requested, persisted, or replayed locally. Set responses_store: true
only when OpenAI-managed response storage is acceptable for your workspace.
Credentials still stay in auth.yaml.
QwenCloud model overrides
QwenCloud is OpenAI-compatible, but current metadata sources are incomplete for automatic capability detection:
- models.dev does not currently publish a QwenCloud provider catalog
- QwenCloud’s OpenAI-compatible /models endpoint returns model IDs but not capability metadata such as context size, max output, thinking budget, or tool-call support
Use model_overrides for the QwenCloud models you rely on:

    providers:
      - id: qwencloud
        base_url: https://dashscope-intl.aliyuncs.com/compatible-mode/v1

    model_overrides:
      - ref: qwencloud/qwen3.7-max
        name: Qwen3.7 Max
        context_size: 1000000
        max_input_tokens: 1000000
        max_output_tokens: 64000
        default_output_tokens: 16000
        reasoning: true
        tool_calls: true
        vision: true

      - ref: qwencloud/qwen3.6-plus
        name: Qwen3.6 Plus
        context_size: 1000000
        max_input_tokens: 1000000
        max_output_tokens: 64000
        default_output_tokens: 16000
        reasoning: true
        tool_calls: true
        vision: true

      - ref: qwencloud/qwen3.6-flash
        name: Qwen3.6 Flash
        context_size: 1000000
        max_input_tokens: 1000000
        max_output_tokens: 64000
        default_output_tokens: 16000
        reasoning: true
        tool_calls: true
        vision: true

QwenCloud documents thinking budget separately from visible max output. Keep
max_output_tokens set to the documented visible output ceiling and do not add
thinking budget to it.
Output budgets
Provider catalogs expose hard model ceilings such as context size and maximum
output tokens. KodaCode does not treat those ceilings as ordinary defaults.
Instead, output_budgets controls the requested output-token budget by runtime
role, and each request is still clamped to the selected model’s provider or
model_overrides.max_output_tokens ceiling:

    output_budgets:
      session_title: 128 # default: 128
      utility_text: 2048 # default: 2048
      review: 4096 # default: 4096
      agent_turn: 8192 # default: 8192
      agent_turn_thinking: 16000 # default: 16000
      workspace_compress: 1200 # default: 1200
      session_compaction: 4096 # default: 4096

Use model_overrides.default_output_tokens when one model needs a different
ordinary agent-turn budget:

    model_overrides:
      - ref: anthropic/claude-sonnet-4-6
        max_output_tokens: 64000 # hard ceiling
        default_output_tokens: 12000 # ordinary agent turn request

If a provider reports a lower max output limit than the configured budget, KodaCode requests the lower provider limit. Anthropic max-token retries may expand to the selected model’s hard ceiling after a provider response proves the initial budget was too small.
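The relationship between role budgets, per-model defaults, and hard ceilings comes down to "pick the budget, then clamp". The sketch below uses the documented default budgets; requested_output_tokens is a hypothetical helper, ceiling stands for whichever hard limit applies to the selected model, and applying default_output_tokens only to the agent_turn role is an assumption drawn from the wording above.

```python
DEFAULT_BUDGETS = {
    "session_title": 128,
    "utility_text": 2048,
    "review": 4096,
    "agent_turn": 8192,
    "agent_turn_thinking": 16000,
    "workspace_compress": 1200,
    "session_compaction": 4096,
}


def requested_output_tokens(role, ceiling, budgets=None, default_output_tokens=None):
    """Pick the role budget, apply a per-model agent-turn default, then clamp to the ceiling."""
    merged = {**DEFAULT_BUDGETS, **(budgets or {})}
    requested = merged[role]
    # Assumption: the per-model default replaces only the ordinary agent-turn budget.
    if role == "agent_turn" and default_output_tokens is not None:
        requested = default_output_tokens
    # The hard ceiling always wins over the requested budget.
    return min(requested, ceiling)
```

With the claude-sonnet-4-6 override above, an ordinary agent turn would request 12000 tokens, and a model whose ceiling is 8192 would cap a 16000-token thinking budget at 8192.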
Granular permission policy
The optional top-level permissions block lets runtime default certain actions
to allow, ask, or deny without replacing durable approvals.
Supported subjects today:
- external_directory: resolved absolute paths outside the workspace and any already-granted roots
- bash: normalized command preview strings
- web_fetch: canonical request URLs
- network_target: normalized hosts, command-scoped targets, and symbolic targets such as external network, package registries, and git remotes
web_search is intentionally not a granular permission subject. If you enable
it, the configured backend plus matching auth is the approval surface, and
calls do not create per-query approval prompts.
Example:

    permissions:
      external_directory:
        "*": ask
        "/Users/you/src/personal/**": allow
        "/tmp/**": deny

      bash:
        "*": ask
        "git status*": allow
        "git diff*": allow
        "git commit*": deny

      web_fetch:
        "*": ask
        "https://docs.go101.org/**": allow

      network_target:
        "*": ask
        "api.github.com": allow
        "registry.npmjs.org": deny

How runtime evaluates these rules:
- rule order matters within each subject; the last matching rule wins
- when multiple subjects apply to one action, runtime combines them by deny > ask > allow
- allow skips the extra approval prompt for that concern
- ask uses the normal durable approval flow
- deny blocks immediately with an explicit runtime error
- explicit session approvals and session grants still remain durable state
- permissions is evaluated only in auto and read_only
- full_access bypasses routine approval prompts and granular permission rules
The practical use is to keep workspace-first defaults while removing repeated prompts for cases you have already decided are fine.
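The two evaluation rules (last matching rule wins within a subject, then deny > ask > allow across subjects) can be sketched as follows. This is an illustration only: evaluate_subject and combine are hypothetical names, Python's fnmatch stands in for whatever pattern matcher the runtime really uses, and falling back to ask when nothing matches is an assumption.

```python
from fnmatch import fnmatchcase

PRECEDENCE = {"allow": 0, "ask": 1, "deny": 2}


def evaluate_subject(rules, value):
    """rules is an ordered list of (pattern, decision) pairs; the last match wins."""
    decision = None
    for pattern, outcome in rules:
        if fnmatchcase(value, pattern):
            decision = outcome  # later rules override earlier ones
    return decision


def combine(decisions):
    """Combine per-subject decisions for one action by deny > ask > allow."""
    matched = [d for d in decisions if d is not None]
    if not matched:
        return "ask"  # assumption: no matching rule falls back to the normal approval flow
    return max(matched, key=PRECEDENCE.__getitem__)
```

Using the bash rules from the example, git status --short resolves to allow, git commit -m wip resolves to deny, and any other command falls through to the "*" rule and resolves to ask.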
Session settings explained
The sessions block is where you trade off cost, persistence, and long-turn
behavior.
- db_path: move the event log database if you want a custom storage location; default is ~/.local/share/kodacode/kodacode.db
- budget: cap estimated spend for one session; default 0 (no limit)
- budget_warn: fraction of budget at which a warning is shown; default 0 (no warning); requires budget > 0
- total_budget: cap estimated spend across all stored sessions; default 0 (no limit)
- total_budget_warn: fraction of total_budget at which a warning is shown; default 0 (no warning); requires total_budget > 0
- response_style: default or terse; default terse
- compaction_threshold: start rebuilding older session history into the durable History Summary when replayed context reaches this fraction of the context limit; default 0.8
- compaction_target_threshold: accept any history-summary candidate at or below this fraction so the session gets breathing room; must be lower than compaction_threshold; default 0.60
- max_provider_requests_per_turn: stop one turn from making provider requests forever; use -1 to remove the cap entirely; default 32
- max_output_continuations: automatically continue once when a provider explicitly reports an output-length stop; use 0 to disable; maximum 2; default 1
- max_retries: limit provider retry attempts when a route is flaky; default 2
The practical advantage of these settings is that kodacode can stay useful in
long-lived workspaces without silently running up cost or letting one bad turn
spin forever.
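The budget and compaction fractions are easier to reason about as arithmetic. The sketch below is a rough model of the thresholds described in this section, not the runtime's implementation: the function names, the status strings, and the exact comparison operators at the boundaries are assumptions.

```python
def budget_status(spent, budget, budget_warn):
    """Classify estimated spend against a budget and its warning fraction."""
    if budget <= 0:
        return "ok"  # 0 means no limit, so the warning fraction has nothing to scale
    if spent >= budget:
        return "over_budget"
    if budget_warn > 0 and spent >= budget * budget_warn:
        return "warn"
    return "ok"


def should_compact(replayed_tokens, context_limit, compaction_threshold=0.8):
    """Compaction starts once replayed context reaches the threshold fraction."""
    return replayed_tokens >= context_limit * compaction_threshold


def candidate_accepted(candidate_tokens, context_limit, compaction_target_threshold=0.60):
    """A history-summary candidate is accepted at or below the target fraction."""
    return candidate_tokens <= context_limit * compaction_target_threshold
```

For example, with budget: 5 and budget_warn: 0.8 the warning zone starts at an estimated spend of 4, and with a 200000-token context limit compaction starts at 160000 replayed tokens and accepts candidates of 120000 tokens or fewer.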
The related default storage layout is:
- runtime database: $XDG_DATA_HOME/kodacode/kodacode.db or ~/.local/share/kodacode/kodacode.db
- model catalog cache: $XDG_DATA_HOME/kodacode/models-cache.json
- search index: $XDG_STATE_HOME/kodacode/search or ~/.local/state/kodacode/search
Context packet settings
The context_packet block controls disabled-by-default deterministic context
that runtime can add before a provider request. It is meant for small,
inspectable summaries that reduce avoidable discovery calls without dumping the
repository into the prompt or repeating stable metadata every turn.
- enabled_sections: list of bounded runtime sections to inject; default []
- supported section keys: repo, git, git_dirty_summary, diagnostics

Section meanings:
| Section | Includes | Helps with |
|---|---|---|
| repo | Workspace name and Git detected/not detected | Basic orientation without an initial discovery call |
| git | Branch and changed-file count | Quick awareness of current worktree scale |
| git_dirty_summary | Bounded changed-file list with Git status codes | Review, debugging, planning, and prompts that depend on the current changed files |
| diagnostics | Bounded LSP diagnostics for changed files when the request is diagnostics-related | Failure, lint, compile, typecheck, and test questions |
The current sections only include repository metadata, branch/change counts, at most 20 changed-file entries, and a diagnostics summary for at most three changed files when the user request is diagnostics-related. They never include file contents or patch hunks.
Because these sections add input tokens, they are useful only when they avoid more expensive repeated discovery. If the request is already under input pressure, runtime omits the packet and records the omitted token estimate.
Section refresh is intentionally conservative:
- repo is sent on the first session turn, then only if its deterministic content changes.
- git and git_dirty_summary are sent on the first session turn, when their deterministic content changes, or when the user asks for current Git/worktree state.
- diagnostics is collected only for bounded changed files when the user asks about failures, diagnostics, compile/typecheck/lint issues, or tests.
- runtime continuation turns do not re-inject the packet.
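The "send only when the content changes" rule for repo, git, and git_dirty_summary can be modelled by remembering a digest of what was last sent. This is a hypothetical sketch: sections_to_send, the SHA-256 digest, and the last_sent dictionary are illustrative assumptions, and the diagnostics section is left out because it follows a different trigger.

```python
import hashlib

GIT_SECTIONS = ("git", "git_dirty_summary")


def sections_to_send(enabled, rendered, last_sent, asked_for_git_state=False):
    """Return the sections to inject this turn and record their digests in last_sent."""
    out = []
    for name in enabled:
        if name != "repo" and name not in GIT_SECTIONS:
            continue  # diagnostics is request-driven and not modelled here
        digest = hashlib.sha256(rendered[name].encode("utf-8")).hexdigest()
        changed = last_sent.get(name) != digest  # also true on the first session turn
        forced = asked_for_git_state and name in GIT_SECTIONS
        if changed or forced:
            out.append(name)
            last_sent[name] = digest
    return out
```

On the first turn every enabled section is sent; an unchanged second turn sends nothing, and a user question about the current worktree re-sends only the Git sections.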
Workflow settings
The workflow block controls runtime review behavior for the structured
engineer and reviewer path:

    workflow:
      review_mode: manual # off | manual | auto; default: manual
      review_model:
        primary: openai/gpt-5-mini

- off: no review pass is ever triggered
- manual: review remains agent-driven; engineer can still delegate to reviewer, but runtime does not inject a follow-up review turn automatically
- auto: after an engineer turn completes, runtime starts a reviewer turn only when all current tasks are already completed and at least one completed task still lacks a durable review outcome
review_model is optional. When set, the review turn uses that model instead of the session primary. A cheaper model here reduces the cost of review passes without affecting agent turns.
Execution settings
The execution block sets runtime defaults for command-backed tools such as
bash and test:
- permission_mode: auto, read_only, or full_access; default auto
- network: disabled or enabled; default disabled
- allow_login_shell: whether shell-backed executions may honor login: true; default false
- shell_program: optional shell path override; when unset, runtime falls back to the platform default shell resolution
Example:

    execution:
      permission_mode: auto
      network: disabled
      allow_login_shell: false
      shell_program: /bin/zsh

Logging settings
The logging block controls where runtime logs are written and whether the
debug log is enabled:
- dir: log directory; defaults to $XDG_DATA_HOME/kodacode or ~/.local/share/kodacode
- debug: enable verbose debug logging; default false
When logging is enabled, current releases write:
- ops.log: normal operations log
- debug.log: verbose debug log when debug: true
Example:

    logging:
      dir: /Users/you/.local/share/kodacode
      debug: true

Retention settings
The retention block controls one expiry policy surface for derived runtime
artifacts:
- expiry_days: remove old tool-result blobs and background logs from kodacode.db, remove old persisted search-cache files, and remove old app log files; default 0 (disabled)
This policy does not prune durable session history in session_events.
Session deletion is explicit. Use /sessions when you want to remove or purge
stored sessions themselves.
Retention is applied during runtime startup. Search-cache cleanup is file-based
and uses persisted file modification times because the search cache still lives
on disk instead of inside kodacode.db.
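File-based cleanup by modification time can be previewed with a short script before you enable retention. This is a sketch of the general idea rather than the runtime's cleanup code: expired_files is a hypothetical helper, it only lists candidates and deletes nothing, and comparing mtimes against a cutoff of expiry_days times 86400 seconds is an assumption about how the age is measured.

```python
import time
from pathlib import Path


def expired_files(directory, expiry_days, now=None):
    """List files under directory whose modification time is older than expiry_days."""
    if expiry_days <= 0:
        return []  # 0 disables retention
    now = time.time() if now is None else now
    cutoff = now - expiry_days * 86400
    return sorted(
        path
        for path in Path(directory).rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )
```

Pointing it at the search index directory with expiry_days set to 14 shows which cache files are more than two weeks old.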
Example:

    retention:
      expiry_days: 14

Credentials
config.yaml is intentionally not the secret store. Provider credentials and OAuth state live in auth.yaml.
You can populate those credentials through the TUI with /connect, or by editing auth.yaml directly if you want fully manual control.
For web_search, the auth entry key must match the provider ID you used under
web_search.providers.
Example:

    exa:
      type: api
      access: YOUR_EXA_API_KEY

    parallel:
      type: api
      access: YOUR_PARALLEL_API_KEY

This pairs with:

    web_search:
      default_provider: exa
      providers:
        exa:
          kind: exa
        parallel:
          kind: parallel

If you use custom provider IDs, the auth keys must match those custom IDs:

    research:
      type: api
      access: YOUR_EXA_API_KEY

paired with:

    web_search:
      providers:
        research:
          kind: exa

Top-level providers: entries are not required just to enable web_search.