pm4ai

Agent execution

How an agent should drive work in this codebase

Execution discipline for an agent working this codebase. Engineering posture lives in philosophy; this is how to run a turn.

MUST

  • Continue to the next task while autonomous-feasible work remains; identify it and start. Why: idle and handoff are the costliest parts of the loop.
  • Lock the full surfaced scope into the project doc repo the moment it rounds out and ship every item in one pass; the locked set is the immovable target. Why: a “this round / next round” split is where items go to die.
  • Parallelize independent work against any known wait (build, codegen, install, network) — the parent grinds another file while a subagent runs. Why: idle wait is the costliest non-stop state in the loop.
  • Self-decide reversible, config-only, or rule-settled choices; surface only a genuine fork as one question + options (each with pros/cons) + recommendation + reasoning. Why: most “decisions” are already settled by the rules, and a stacked or unreasoned MCQ drops answer quality.
  • Exhaust code, docs, git history, and memory before asking; ask only what cannot be discovered. Why: the discovery cost is already paid.
  • Run the action yourself; never ask the user to run a command the agent can run. Why: handing work upward breaks the loop.
  • State an expected outcome and deadline before any observable action (build, navigate, poll); flag stuck the moment it deviates. Why: no criterion means silent stuck loops.
  • Scan vendor issue trackers and changelogs before declaring a third-party blocker. Why: the training cutoff lags the ecosystem by months.
  • Commit the moment a bug is found and again when fixed during any verification loop. Why: a per-bug trail maps failure to fix.
  • Foreground any command under ~2 min; background only with concurrent work in flight, never background-then-poll. Why: background-then-poll is an idle pattern.
  • Dispatch concurrent subagents sliced by file/dir/rule boundary, packed in one message; restrict edit-only subagents to read/edit/write/grep/glob; verify build-green on the shared branch first; audit their self-reports before any destructive cleanup. Why: throughput without thrash, and agents misreport.
  • Parent-author the shared types/signatures before dispatching an edit wave; subagents edit call sites against the stable signature. Why: divergent parallel authoring causes stuck-thrash.
  • Use a write-capable subagent type (general-purpose or a custom edit agent) for any wave that produces edits; never Explore (read-only by construction). Why: read-only agents return plans, zero edits — the whole wave is wasted.
  • Run dependent subagent waves sequentially after the sibling lands; never brief a subagent to wait on a notification. Why: the subagent runtime has no wait-for-event — it terminates without work.
  • Keep command output terse — a single ok or silence on success, full detail only on failure. Why: every line spends the context budget.
  • Wrap streaming-CLI chatter (bun install, docker, gh, wrangler, convex deploy) with redirection — … >/dev/null 2>&1 && echo ok || cat err. Why: progress text burns the context budget.
  • Hold at most one long-running background slot; reap it after consuming its output. Why: the runtime leaves stale “(running)” slots that mislead.
  • Check image dimensions before Read on .png/.jpg/.heic/.webp; downscale via sips -Z 2000 "$f" --out /tmp/$(basename "$f") when over 2000px. Why: a many-image request over 2000px locks the whole session until /compact.
  • Confirm the exact target with the user before any destructive or irreversible action — delete, overwrite, git push --force, reset --hard, git clean, DROP, TRUNCATE, prune, or mass mutation; name precisely what will be affected and wait for an explicit go. Why: irreversible loss has no undo, so the autonomy default never extends to it.
  • Scope a delete/cleanup task to the precise paths named and confirmed first; remove only the named child, never a parent directory that also holds unrelated data. Why: deleting a folder to clear its contents takes every sibling inside it, unrecoverably.

NEVER

  • Stop at a status summary while autonomous-feasible work remains, or close with “want me to / should I / which one / ready?” (confirming a destructive or irreversible action is the sole exception). Cost: a wasted turn seeking permission instead of progress.
  • Enumerate remaining items and ask which to do. Cost: the cue is to do all of them.
  • Treat effort, size, or “diminishing returns” as a stop reason. Cost: real work dressed up as a judgment call.
  • Propose dropping scope to simplify — deleting an enum case/type/prop, removing a UI affordance, replacing a control with a label, collapsing screens, or hiding behind a flag. Cost: ONLY MORE, NEVER LESS — a surface shipped narrower than approved makes the absence the new baseline; fix the cause, never amputate the feature.
  • Ask for a fact the agent could discover after one consent, or guess one not in source. Cost: re-explaining paid-for evidence, or shipping code on an unverified value.
  • Invent a file path, export name, config key, version pin, or API signature, or pattern-match against what similar projects “usually” do — read to confirm first, quote verbatim or do not cite. Cost: a fabricated identifier or assumed convention ships as a real bug.
  • Idle through a wait — background-then-poll, heartbeat, or “a subagent will handle it”. Cost: an idle agent is the most expensive state in the loop.
  • Delegate diagnostics (“paste the log”, “tell me what you see”). Cost: inverts loop ownership, slows everything.
  • Orchestrate other AI providers in the dev loop (GPT/Gemini juries, cross-provider supervisors). Cost: Claude-Code subagents are the parallelism primitive; multi-provider adds coordination cost without gain.
  • rm or overwrite a parent directory to remove something within it — target the specific child. Cost: sibling data is destroyed with it (e.g. session transcripts beside a memory/ folder).
  • Read “be autonomous, never ask” as permission for a destructive or irreversible step. Cost: autonomy covers reversible work only; irreversible loss needs explicit consent.

Valid stops — only these

  • The user says stop or pivots.
  • A hard external blocker — a credential or decision the agent cannot obtain.
  • All work done, verified, and green.

Pairs with

  • philosophy (engineering posture); testing (cheapest harness; verify by running).

On this page