Agent execution
How an agent should drive work in this codebase
Execution discipline for an agent working this codebase. Engineering posture lives in philosophy; this is how to run a turn.
MUST
- Continue to the next task while autonomous-feasible work remains; identify it and start. Why: idle and handoff are the costliest parts of the loop.
- Lock the full surfaced scope into the project doc repo the moment it rounds out and ship every item in one pass; the locked set is the immovable target. Why: a “this round / next round” split is where items go to die.
- Parallelize independent work against any known wait (build, codegen, install, network) — the parent grinds another file while a subagent runs. Why: idle wait is the costliest non-stop state in the loop.
- Self-decide reversible, config-only, or rule-settled choices; surface only a genuine fork as one question + options (each with pros/cons) + recommendation + reasoning. Why: most “decisions” are already settled by the rules, and a stacked or unreasoned MCQ drops answer quality.
- Exhaust code, docs, git history, and memory before asking; ask only what cannot be discovered. Why: the discovery cost is already paid.
- Run the action yourself; never ask the user to run a command the agent can run. Why: handing work upward breaks the loop.
- State an expected outcome and deadline before any observable action (build, navigate, poll); flag stuck the moment it deviates. Why: no criterion means silent stuck loops.
- Scan vendor issue trackers and changelogs before declaring a third-party blocker. Why: the training cutoff lags the ecosystem by months.
- Commit the moment a bug is found and again when fixed during any verification loop. Why: a per-bug trail maps failure to fix.
- Foreground any command under ~2 min; background only with concurrent work in flight, never background-then-poll. Why: background-then-poll is an idle pattern.
- Dispatch concurrent subagents sliced by file/dir/rule boundary, packed in one message; restrict edit-only subagents to read/edit/write/grep/glob; verify build-green on the shared branch first; audit their self-reports before any destructive cleanup. Why: throughput without thrash, and agents misreport.
- Parent-author the shared types/signatures before dispatching an edit wave; subagents edit call sites against the stable signature. Why: divergent parallel authoring causes stuck-thrash.
- Use a write-capable subagent type (
general-purposeor a custom edit agent) for any wave that produces edits; neverExplore(read-only by construction). Why: read-only agents return plans, zero edits — the whole wave is wasted. - Run dependent subagent waves sequentially after the sibling lands; never brief a subagent to wait on a notification. Why: the subagent runtime has no wait-for-event — it terminates without work.
- Keep command output terse — a single
okor silence on success, full detail only on failure. Why: every line spends the context budget. - Wrap streaming-CLI chatter (
bun install,docker,gh,wrangler,convex deploy) with redirection —… >/dev/null 2>&1 && echo ok || cat err. Why: progress text burns the context budget. - Hold at most one long-running background slot; reap it after consuming its output. Why: the runtime leaves stale “(running)” slots that mislead.
- Check image dimensions before Read on
.png/.jpg/.heic/.webp; downscale viasips -Z 2000 "$f" --out /tmp/$(basename "$f")when over 2000px. Why: a many-image request over 2000px locks the whole session until/compact. - Confirm the exact target with the user before any destructive or irreversible action — delete, overwrite,
git push --force,reset --hard,git clean,DROP,TRUNCATE, prune, or mass mutation; name precisely what will be affected and wait for an explicit go. Why: irreversible loss has no undo, so the autonomy default never extends to it. - Scope a delete/cleanup task to the precise paths named and confirmed first; remove only the named child, never a parent directory that also holds unrelated data. Why: deleting a folder to clear its contents takes every sibling inside it, unrecoverably.
NEVER
- Stop at a status summary while autonomous-feasible work remains, or close with “want me to / should I / which one / ready?” (confirming a destructive or irreversible action is the sole exception). Cost: a wasted turn seeking permission instead of progress.
- Enumerate remaining items and ask which to do. Cost: the cue is to do all of them.
- Treat effort, size, or “diminishing returns” as a stop reason. Cost: real work dressed up as a judgment call.
- Propose dropping scope to simplify — deleting an enum case/type/prop, removing a UI affordance, replacing a control with a label, collapsing screens, or hiding behind a flag. Cost: ONLY MORE, NEVER LESS — a surface shipped narrower than approved makes the absence the new baseline; fix the cause, never amputate the feature.
- Ask for a fact the agent could discover after one consent, or guess one not in source. Cost: re-explaining paid-for evidence, or shipping code on an unverified value.
- Invent a file path, export name, config key, version pin, or API signature, or pattern-match against what similar projects “usually” do — read to confirm first, quote verbatim or do not cite. Cost: a fabricated identifier or assumed convention ships as a real bug.
- Idle through a wait — background-then-poll, heartbeat, or “a subagent will handle it”. Cost: an idle agent is the most expensive state in the loop.
- Delegate diagnostics (“paste the log”, “tell me what you see”). Cost: inverts loop ownership, slows everything.
- Orchestrate other AI providers in the dev loop (GPT/Gemini juries, cross-provider supervisors). Cost: Claude-Code subagents are the parallelism primitive; multi-provider adds coordination cost without gain.
rmor overwrite a parent directory to remove something within it — target the specific child. Cost: sibling data is destroyed with it (e.g. session transcripts beside amemory/folder).- Read “be autonomous, never ask” as permission for a destructive or irreversible step. Cost: autonomy covers reversible work only; irreversible loss needs explicit consent.
Valid stops — only these
- The user says stop or pivots.
- A hard external blocker — a credential or decision the agent cannot obtain.
- All work done, verified, and green.
Pairs with
philosophy(engineering posture);testing(cheapest harness; verify by running).