Worktrees are not required for parallel Claude Code agents under active human steering
What you expect
The received wisdom in the agent-parallelism space is that running multiple coding agents on the same repository requires worktree isolation. The dominant tooling (ccpm, cmux, crystal, 1code) and Anthropic’s own experimental AGENT_TEAMS feature all frame worktrees as the prerequisite for safe parallel execution. The collision avoidance argument: agents writing to the same files will corrupt each other’s output, so filesystem separation is mandatory.
What actually happens
Peter Steinberger (iOS developer with ~2M reach, creator of PSPDFKit) documented running 4 parallel Claude Code instances on the same repo without worktrees, after explicitly rejecting them. The core finding: worktree isolation solves a collision problem that does not exist when a human is actively steering all agents and partitioning work by file domain.
The mechanism is discipline, not tooling: the human picks tasks carefully so agents operate on non-overlapping files. This is not accidental. It is a deliberate architectural choice with a different tradeoff profile than worktree isolation.
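The partitioning discipline can be checked mechanically before a session starts. The sketch below is illustrative only and is not part of Steinberger's documented workflow: the agent names, glob patterns, and file list are hypothetical, and `find_collisions` is a helper invented here. It flags any file that two agents' assigned patterns could both touch.

```python
from fnmatch import fnmatch
from itertools import combinations


def find_collisions(assignments, repo_files):
    """Return {(agent_a, agent_b): [files both agents' patterns match]}."""
    # Resolve each agent's glob patterns to the concrete files they cover.
    claimed = {
        agent: {f for f in repo_files if any(fnmatch(f, p) for p in patterns)}
        for agent, patterns in assignments.items()
    }
    collisions = {}
    for a, b in combinations(sorted(claimed), 2):
        shared = sorted(claimed[a] & claimed[b])
        if shared:
            collisions[(a, b)] = shared
    return collisions


# Hypothetical repo layout and task assignments, for illustration only.
repo_files = [
    "src/ui/button.tsx",
    "src/ui/modal.tsx",
    "src/auth/session.ts",
    "tests/ui/button.test.tsx",
    "docs/setup.md",
]
assignments = {
    "agent-ui": ["src/ui/*"],
    "agent-tests": ["tests/*"],
    "agent-docs": ["docs/*"],
    "agent-cleanup": ["src/*"],  # too broad: also covers everything under src/ui/
}

collisions = find_collisions(assignments, repo_files)
for pair, files in collisions.items():
    print(pair, files)
```

Note that `fnmatch` lets `*` match across `/`, so `src/*` covers nested directories. An empty result means the planned domains do not overlap on the listed files; it says nothing about files an agent touches outside its assignment, which is what the human steering is for.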
The two modes are not competing: they fit different supervision levels
| Approach | Isolation mechanism | Human role | Right when |
|---|---|---|---|
| Worktree isolation (ccpm, cmux) | Git worktrees — filesystem separation | Async oversight, approve PRs | Agents run semi-autonomously, blast radius high |
| Active steering (Steinberger) | Domain partitioning — careful task selection | Continuous in-the-loop | Human watching all agents, low blast radius |
Agent count depends on blast radius, not a fixed ceiling
The pattern is not “run 4 agents always.” The count depends on task type:
- Refactoring: 1–2 agents. File domains overlap by definition; collisions are more likely.
- Cleanup / tests / UI: up to 4 agents. Low blast radius work partitions cleanly.
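As a rough sketch, the counts above can be written down as a lookup. Two things here are assumptions of this example rather than claims from the source: that a session mixing task types should take the most restrictive cap, and that unknown task types default to a single agent.

```python
# Upper bounds taken from the list above.
TASK_CAPS = {"refactoring": 2, "cleanup": 4, "tests": 4, "ui": 4}


def max_agents(task_types):
    """Most restrictive agent cap among the planned task types."""
    task_types = list(task_types)
    if not task_types:
        return 0  # nothing planned, nothing to run
    # Unknown task types default to 1 agent (a conservative assumption of this sketch).
    return min(TASK_CAPS.get(t.lower(), 1) for t in task_types)


cap_mixed = max_agents(["tests", "refactoring"])
cap_light = max_agents(["cleanup", "ui"])
print(cap_mixed, cap_light)
```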
Known failure modes the docs don’t surface
Credit exhaustion produces degraded behavior, not a clean stop. When an agent hits account credit limits mid-task, it does not halt: it keeps executing in a degraded state and actively modifies files in ways that break previously passing code. In a 4-agent setup with split attention, this is harder to detect than in a single-agent session. Credit limits must be monitored per agent window; they cannot be assumed to produce safe failures.
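The source does not describe a detection method. One possible guard, sketched here under that caveat, is to snapshot test results before handing work to an agent and compare afterwards, so that breakage of previously passing code surfaces quickly. The test names and results below are hypothetical, and collecting them from a real test runner is left out.

```python
def regressions(baseline, current):
    """Tests that passed in the baseline but now fail or have disappeared."""
    return sorted(
        name
        for name, passed in baseline.items()
        if passed and not current.get(name, False)
    )


# Hypothetical before/after snapshots for one agent's window.
baseline = {"test_login": True, "test_render": True, "test_export": False}
current = {"test_login": True, "test_render": False, "test_export": False}

broken = regressions(baseline, current)
print(broken)
```

A test that was already failing in the baseline is not reported, so the check isolates damage introduced during the session rather than pre-existing failures.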
Context cache resume is a cost multiplier across all sessions simultaneously. Sessions idle for more than one hour trigger context cache invalidation on resume, producing a 10x+ token cost spike to rebuild the working context. In a 4-agent setup, a break longer than an hour multiplies this cost across all open sessions at once. The correct mitigation: keep idle gaps under one hour, or stagger resume times across agents rather than resuming all four simultaneously. This cost spike appears only in the invoice; there is no runtime signal when it fires.
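Because nothing signals the spike at runtime, the only early warning is tracking idle time yourself. The following is a minimal sketch, assuming you record a last-activity timestamp per agent window by some means of your own. The one-hour limit and the 10x multiplier come from the text above (10x being the lower bound of "10x+"); the timestamps, session names, and baseline token figure are hypothetical.

```python
from datetime import datetime, timedelta

IDLE_LIMIT = timedelta(hours=1)  # threshold stated in the text
REBUILD_MULTIPLIER = 10          # lower bound of the "10x+" figure in the text


def stale_sessions(last_active, now):
    """Sessions idle for strictly more than IDLE_LIMIT."""
    return sorted(name for name, t in last_active.items() if now - t > IDLE_LIMIT)


def resume_cost_floor(baseline_tokens, stale_count):
    """Lower-bound token cost of resuming every stale session."""
    return baseline_tokens * REBUILD_MULTIPLIER * stale_count


# Hypothetical session log, for illustration only.
now = datetime(2026, 1, 1, 12, 0)
last_active = {
    "agent-a": datetime(2026, 1, 1, 10, 30),  # idle 90 min
    "agent-b": datetime(2026, 1, 1, 11, 30),  # idle 30 min
    "agent-c": datetime(2026, 1, 1, 10, 55),  # idle 65 min
    "agent-d": datetime(2026, 1, 1, 11, 0),   # idle exactly 60 min, not over the limit
}

stale = stale_sessions(last_active, now)
cost = resume_cost_floor(20_000, len(stale))  # 20_000 is a made-up baseline
print(stale, cost)
```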
Plan mode before parallel execution reduces divergence. Launching agents straight into parallel execution without a shared plan causes divergence: agents interpret open-ended tasks differently and produce incoherent results. Iterating in plan mode first and reaching agreement on scope before parallel execution is the discipline that prevents this.
What this means for you
If you are running unsupervised or semi-supervised parallel agents, worktrees remain the correct isolation primitive. The active-steering pattern requires sustained human attention — it is not a way to run more agents in the background with fewer infrastructure requirements.
If you are actively steering a small parallel session (4 agents or fewer, you are watching all windows), the worktree setup overhead is a net negative. Domain partitioning is lighter, faster to set up, and does not require the merge coordination that worktree-based workflows impose.
The tradeoff is real, not resolved. The agent-parallelism literature and tooling overstate the universality of worktree isolation. The correct question is: how supervised is this session? Supervised → domain discipline. Unsupervised → worktrees.
What to do
- Assess supervision level before choosing isolation strategy. If you will watch all agents continuously, domain partitioning is sufficient.
- Partition by file domain, not by feature. UI components, test files, and documentation are natural non-overlapping domains. Cross-cutting concerns (auth, config) require serialization regardless of isolation strategy.
- Run plan mode on all agents first. Agree on scope before parallel execution to prevent divergence.
- Monitor credit per agent window, not globally. Credit exhaustion fails silently at the task level; catching it requires per-window observation.
- Stagger session resume times if sessions will be idle for more than one hour to avoid simultaneous cache rebuild cost spikes across all agent windows.
- Cap at 4 agents for low-blast-radius work; use 1–2 for refactoring. More agents do not produce proportionally more throughput and increase the monitoring burden past what active steering can sustain.
Falsification criterion: This finding would be disproved by evidence that domain-partitioned parallel agents without worktrees produce file collisions or merge conflicts at materially higher rates than worktree-isolated agents in practitioner deployments — specifically, data showing that active steering is insufficient to prevent cross-agent collisions in real sessions. It would also be partially falsified by Claude Code shipping a native enforcement mechanism that makes worktree isolation mandatory regardless of supervision level.
Evidence
| Tool | Version | Evidence | Result |
|---|---|---|---|
| Claude Code | Validated 2026-05-24 | independently-confirmed | 4 parallel instances on domain-partitioned files operate without worktrees; source: steipete.me |
| Claude Code | Validated 2026-05-24 | independently-confirmed | Credit exhaustion produces degraded active modification behavior, not a clean stop — observed in live 4-agent sessions |
| Claude Code | Validated 2026-05-24 | independently-confirmed | Context cache invalidation on sessions idle >1 hour produces 10x+ token cost spike, multiplied across all open agent sessions |
| Inngest | Reviewed 2026-04-29 | docs-reviewed | Concurrent step limits in orchestration platforms cap effective agent parallelism at the infrastructure layer, independent of LLM account limits |
Confidence: empirical. 8 claims from a practitioner-documented, reproducible workflow. Core pattern independently confirmed from a published post with verifiable setup details. Evidence type is independently-confirmed (practitioner observation), not runtime-tested by Theory Delta directly.
Strongest case against: The pattern depends entirely on consistent human attention across all 4 agent windows. Any distraction, context switch, or break longer than an hour degrades the supervision quality that makes domain partitioning safe. Practitioners without Steinberger’s experience level (extensive background in parallel agent workflows) may misapply this pattern to tasks with higher blast radius than they recognize, producing exactly the collisions worktrees are designed to prevent. The pattern is not a shortcut — it is a skill.
Open questions: Does the pattern hold at 6–8 agents, or is 4 the practical ceiling for single-human active steering? What is the rate at which credit exhaustion triggers degraded-behavior failures in real sessions? Has anyone replicated this pattern on a Windows or Linux host, or is it specific to the macOS+Ghostty+ultra-wide display combination documented in the original post?