/refocus — concept-bound session forking for Claude Code, designed for a multi-year ecosystem.
You're working in ~/projects/customer-portal/ on a checkout-flow polish. Mid-session, you notice the order-history page's refund button needs a partial-refund variant — and that's an API contract change in ~/projects/billing-service/refund-flow/, a subsystem with its own deploy cycle, its own stakeholders, its own multi-month roadmap.
The work just shifted concepts. The customer-portal session is the wrong place to design a refund-flow improvement. /refocus moves the conversation to where the artifacts will land — and where the reasoning will be findable later.
customer-portal/
├── CLAUDE.md
├── src/checkout/          ← you started here
│   ├── ReviewStep.tsx
│   └── PaymentStep.tsx
├── src/orders/            ← noticed the issue here
│   └── RefundButton.tsx
├── docs/
└── tests/
Concept: the customer's UI experience. UI work, copy changes, A/B tests.
billing-service/
├── CLAUDE.md
├── refund-flow/           ← spawn HERE
│   ├── CLAUDE.md          ← own context
│   ├── api/contracts/
│   ├── docs/
│   └── tests/
├── invoicing/
└── reconciliation/
Concept: the refund subsystem. Own API contract, own DB schema, own SLA — a distinct conceptual thought with its own life.
Stay at customer-portal: the refund-flow reasoning gets buried in someone else's transcript and the design rationale is lost the moment you close the session. Spawn to refund-flow: it accumulates its own focused history; future sessions there inherit the full reasoning by just cd-ing in.
The directories on disk don't change. What changes is whether one session accumulates everything and mis-tags most of it, or two sessions each accumulate what belongs to them. Claude Code stores every session by the cwd it was opened in — so the transcript lands wherever you launched it, regardless of where the work eventually touches.
~/.claude/projects/
└── -home-administrator-projects-customer-portal/
    └── <session>.jsonl
        ├── UI design discussion ← belongs here
        ├── Refund-flow API debate ← mis-tagged
        ├── Reviewer pushback rationale ← mis-tagged
        └── Final design decisions ← mis-tagged
Future-me opens refund-flow/ looking for the design rationale. Finds none — it's buried in customer-portal's transcript, indistinguishable from UI work.
~/.claude/projects/
├── -home-administrator-projects-customer-portal/
│   └── <parent>.jsonl
│       └── UI design discussion only ← clean
│
└── -home-administrator-projects-billing-service-refund-flow/
    └── <child SID>.jsonl
        ├── Refund-flow API debate ← right home
        ├── Reviewer pushback rationale ← right home
        └── Final design decisions ← right home
Future-me opens refund-flow/, runs claude --resume <SID>, gets the entire reasoning history. Inherits the design context naturally.
The technical pivot that makes this work: when /refocus spawns a child, it generates a UUIDv4 session ID up front and tells the user to run cd <dest> && claude --session-id <SID>. The child's transcript lands at the destination's encoded-cwd location — exactly where future sessions there will discover it.
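A minimal sketch of that spawn step, under stated assumptions. The `encode_cwd` rule (every `/` becomes `-`) is inferred only from the example directory names in this deck and may not cover every path character; `plan_child_launch` and its `home` parameter are hypothetical names, not part of the skill.

```python
import uuid
from pathlib import PurePosixPath


def encode_cwd(cwd: str) -> str:
    """Encode an absolute path the way the deck's example names suggest.

    Assumption: every "/" becomes "-", inferred from
    /home/administrator/projects/customer-portal ->
    -home-administrator-projects-customer-portal.
    """
    return cwd.replace("/", "-")


def plan_child_launch(dest: str, home: str = "/home/administrator") -> dict:
    """Pre-generate a session ID and derive where the child's transcript lands."""
    sid = str(uuid.uuid4())  # UUIDv4, generated before the child session exists
    transcript = (
        PurePosixPath(home) / ".claude" / "projects" / encode_cwd(dest) / f"{sid}.jsonl"
    )
    return {
        "session_id": sid,
        "relaunch": f"cd {dest} && claude --session-id {sid}",
        "transcript": str(transcript),
    }


plan = plan_child_launch("/home/administrator/projects/billing-service/refund-flow")
print(plan["relaunch"])
print(plan["transcript"])
```

Because the ID exists before the child does, the brief can record it and the transcript location is known up front.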
Same refund-flow example, after /refocus-complete finishes. Three distinct places hold three different shapes of knowledge: the transcript (full reasoning), the per-event audit (what work happened), and the canonical state (current truth that future sessions auto-load).
~/projects/billing-service/refund-flow/     ← the destination
├── src/...                                  (code: the actual work)
├── tests/...
├── CLAUDE.md                                ← auto-loads docs/context/* on session start
└── docs/
    ├── context/                             ← LAYER 3: canonical state
    │   ├── architecture.md                  ← updated by /context-save
    │   ├── interfaces.md                    ← updated by /context-save
    │   ├── conventions.md
    │   └── gotchas.md                       ← updated by /context-save
    └── refocus/                             ← LAYER 2: per-event audit
        ├── 2026-04-26-refund-redesign-a3f91c.md   (Brief + Result + material_changes)
        └── INDEX.md                         (gitignored, auto-regenerated)
~/.claude/projects/                          ← LAYER 1: full transcript
└── -home-administrator-projects-billing-service-refund-flow/
    └── a3f91c2e-7b8d-…-jsonl                ← every word of reasoning, resumable
Color-coded by layer: green = canonical state, teal = per-event audit, orange = full transcript. Next slide: how content actually gets into each layer, and the trigger chain that makes Layer 3 happen reliably.
Each layer is populated by a different mechanism. The load-bearing one is Layer 3 (canonical state) — it's how material decisions escape the per-event brief and become the current truth that the next session at this directory auto-loads on startup.
"What did we discuss?" Full reasoning, every word. Created by the pre-generated --session-id. Opt-in inheritance via claude --resume <SID>.
"What work happened, what changed?" Brief + Result, immutable, git-tracked. Written by /refocus and /refocus-complete.
"What does this directory look like NOW?" Architecture, interfaces, gotchas. Promoted by chained /context-save. Auto-loaded by every future session via CLAUDE.md.
The chain: child enumerates material_changes → /refocus-complete appends Result → invokes /context-save → docs/context/* updated → next session inherits the new truth. Three triggers ensure it happens: cognitive (mandatory enumeration in Result), mechanical (chained invocation), belt-and-suspenders (SessionEnd hook for abandonment).
This deck was designed in a session that started on agent-memory, drifted into a platform-wide MCP standard, then drifted again into this skill. Three distinct conceptual scopes, with three sets of artifacts in three different directories. But the reasoning behind every decision lives in one transcript, mis-tagged under the first directory.
This is what every session looks like in a multi-year ecosystem with thousands of project directories. Drift is normal. The cost is invisible until future-me opens a directory looking for the reasoning that produced its artifacts — and finds the artifacts but not the reasoning.
Philosophy: if the work belongs in a different directory, the conversation does too. Every other decision flows from this.
Topology: strict tree, not a DAG. Mirrors call-stack semantics. Eliminates loops and orchestration ambiguity by construction.
Substrate: briefs and results in docs/refocus/, discoverable by every standard tool. Built for thousands of directories across years.
Principles ordered by dependency: philosophy first (why), topology second (how the work behaves), substrate third (where it lives). Reverse the order and the rest doesn't make sense.
The model that makes the rest of this deck make sense: every concept is developed by a conversation that produces artifacts in a directory. When all three line up, knowledge accumulates cleanly. When they don't, conversations bleed into wrong directories and reasoning gets lost.
Concept: a bounded thought with its own scope, lifecycle, and stakeholders. Examples: "the platform MCP standard", "the agent-memory storage layer", "the auth subsystem of myapp".
Conversation: a persistent transcript that develops one concept over time. Includes the dead ends, the reviewer pushback, the design pivots: the why, not just the what.
Directory: where the concept's artifacts live. Has its own CLAUDE.md, its own deployment, its own tests. The concept's home in the filesystem.
The drift problem is what happens when one conversation develops three concepts whose homes are three different directories. The /refocus answer is to make conversation boundaries match concept boundaries, automatically and with auditable history.
Five stages. The parent stays in its session; the child gets a curated brief, runs to completion, and returns its result for the parent to act on. Call-stack semantics.
1. Parent runs /refocus <dest>. Skill drafts brief, generates child session ID, writes <dest>/docs/refocus/<id>.md.
2. User runs cd <dest> && claude --session-id <SID>. Child opens, auto-discovers brief, status flips to in-progress.
3. Child works toward Definition-of-Done. Surfaces follow-ups for the parent and never spawns its own /refocus.
4. Child runs /refocus-complete. Enumerates material_changes, appends Result, then chains /context-save to promote canonical updates into <dest>/docs/context/.
5. Parent resumes, auto-discovers the returned result. Decides: ship, spawn siblings to clear blockers, or absorb the work itself.
Manual launch (step 2) is the default. The pre-generated session ID is the architectural pivot — it ensures the child's transcript lands at the destination's encoded-cwd location, so future sessions in <dest> can claude --resume <SID> and inherit the full reasoning.
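The lifecycle above can be read as a small state machine on the brief's status field. The sketch below is illustrative only: `in-progress`, `result`, and `blocked` appear in this deck, while `pending` (a drafted brief that no child has opened yet) and the `advance` helper are hypothetical names.

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"          # hypothetical: brief written, child not yet launched
    IN_PROGRESS = "in-progress"  # child session opened and discovered the brief
    RESULT = "result"            # /refocus-complete appended a Result
    BLOCKED = "blocked"          # child returned to the parent without finishing


# Allowed transitions; RESULT and BLOCKED are terminal for the child.
ALLOWED = {
    Status.PENDING: {Status.IN_PROGRESS},
    Status.IN_PROGRESS: {Status.RESULT, Status.BLOCKED},
    Status.RESULT: set(),
    Status.BLOCKED: set(),
}


def advance(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the terminal states without outgoing edges matches the call-stack framing: once a child returns, any further work is a new spawn decided by the parent.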
The granularity question: when does a sub-service deserve its own conversation? Five signals + four counter-cases.
Strict-tree topology. Why parent owns all spawn decisions. The hard guardrail that enforces it.
Pre-generated --session-id at destination cwd. Three options considered, one chosen. Why subagent-as-fork was rejected.
All five layers in one table: transcript, per-event audit, INDEX cache, canonical state, and the agent-memory MCP scope boundary.
Two firing conditions, both required. Codex's framing: "if work stopped now, would the next session naturally start in dest rather than src?"
Four real concerns, each with a mitigation. Plus the open judgment-call problem we haven't fully solved.
v2 design done, review-board cycle 1 integrated. Build comes next. Dogfood candidate already identified.
Nine extensions deferred from v1. Single-developer ships now; multi-engineer / 24×7 / audit-grade upgrades documented for when friction appears.
The meta-question every power user eventually asks. When a custom skill earns its keep, and when it's just dead weight waiting for the next CLI release.
Imagine you're working at ~/projects/myapp/ and the work shifts to authentication. Three options: stay at myapp/, spawn to myapp/auth/, or carve out an entirely new project. Granularity is a judgment call, not a formula, but five signals push spawn and four push stay.
This is the part of the design we'll keep refining as we use the skill. The signals on the next slide are the starting heuristic; expect them to evolve as patterns emerge.
If three or more hold, spawn. If only one or two, ask: would future-me, opening that subdirectory cold, expect to find a focused history there? If yes, spawn. If "I'd just look at the parent," stay.
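The threshold rule above can be written down as a tiny decision helper. This is a sketch, not the skill's logic: the function name and parameters are hypothetical, and treating zero signals as "stay" is an assumption the text does not state explicitly.

```python
def granularity_call(signals_held: int, future_me_expects_history: bool) -> str:
    """Apply the spawn/stay heuristic.

    signals_held: how many of the five spawn signals apply (0 to 5).
    future_me_expects_history: answer to "would future-me, opening that
    subdirectory cold, expect to find a focused history there?"
    """
    if signals_held >= 3:
        return "spawn"
    if signals_held >= 1:
        # One or two signals: the future-me question breaks the tie.
        return "spawn" if future_me_expects_history else "stay"
    return "stay"  # assumption: no signals means no reason to spawn


print(granularity_call(4, False))
print(granularity_call(2, True))
print(granularity_call(1, False))
```

The tie-break only matters in the one-or-two band; with three or more signals the answer to the future-me question is not consulted.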
The subdirectory has its own CLAUDE.md (or equivalent local context). The directory already announces "I am a separate concept", so the conversation should match.
Granularity errors run in both directions. Spawning too eagerly fragments reasoning across micro-sessions; staying too long bleeds context into wrong directories. These are the four cases where not refocusing is the right call.
sub/ change
myapp/auth/: multi-week, has own CLAUDE.md, distinct test cycle
myapp/billing/: own deploy, own stakeholders
When in doubt: start at the parent and spawn later. It's cheaper to refocus mid-conversation than to discover three weeks later that the reasoning lives in the wrong directory.
The spawn graph is a strict tree from a single root, not a DAG. This is enforced by a hard guardrail in the skill — not just a written rule in the brief. It mirrors call-stack semantics every enterprise programmer already knows: a function call returns to its caller; the caller decides what to call next.
parent
├── child A ← runs, returns
├── child B ← runs in parallel, returns
└── child C ← runs, returns
(parent reads results, decides what to spawn next)
parent
└── child A
    └── grandchild ← refused
ERROR: Architectural violation.
Children cannot fork.
Surface this in suggested_follow_ups
and return to parent.
Detection: when a session is launched via a refocus brief, a SessionStart hook sets SESSION_REFOCUS_ROLE=child. The /refocus command reads this env var and refuses to execute.
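A minimal sketch of that guardrail check. Only the `SESSION_REFOCUS_ROLE=child` variable and the error wording come from this deck; the exception class, the function name, and the injectable `env` parameter are hypothetical and exist to keep the example testable.

```python
import os
from typing import Mapping, Optional


class ArchitecturalViolation(RuntimeError):
    """Raised when a child session tries to spawn its own refocus."""


def ensure_can_spawn(env: Optional[Mapping[str, str]] = None) -> None:
    """Refuse to run /refocus from inside a child session."""
    env = os.environ if env is None else env
    if env.get("SESSION_REFOCUS_ROLE") == "child":
        raise ArchitecturalViolation(
            "Architectural violation. Children cannot fork. "
            "Surface this in suggested_follow_ups and return to parent."
        )


ensure_can_spawn({})  # a root session: no role set, spawning is allowed
try:
    ensure_can_spawn({"SESSION_REFOCUS_ROLE": "child"})
except ArchitecturalViolation as err:
    print(f"ERROR: {err}")
```

Checking an environment variable keeps the rule mechanical: the refusal does not depend on the child having read or remembered the brief.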
The honest failure mode: a child session, while doing its assigned work, discovers that a blocker requires changes in a different directory it has no jurisdiction over. The strict-tree rule says it can't fork to clear the blocker. The procedure:
/refocus-complete with Status: blocked. Result section explicitly distinguishes "completed successfully" from "blocked, returning to parent for orchestration."
parent_refocus_id chained to the original. The full chain is reconstructable.
The architectural pivot from review-board cycle 1: pre-generate a UUIDv4 child session ID at spawn time, embed it in the brief, and pass it to claude --session-id when the user launches the child. The transcript lands at ~/.claude/projects/<encoded-dest-cwd>/<SID>.jsonl, exactly where future sessions in <dest> can find it via claude --resume.
$ /refocus ~/projects/mcp --slug mcp-standard
→ wrote brief at ~/projects/mcp/docs/refocus/2026-04-26-mcp-standard-a3f91c.md
→ child_session_id: a3f91c2e-7b8d-4e5f-9a1b-c2d3e4f5a6b7
→ relaunch: cd ~/projects/mcp && claude --session-id a3f91c2e-7b8d-4e5f-9a1b-c2d3e4f5a6b7
(later, after child completes)
$ cd ~/projects/mcp
$ claude --resume a3f91c2e-7b8d-4e5f-9a1b-c2d3e4f5a6b7
→ resumed in destination directory's encoded-cwd location
→ full transcript history available
This is the property that makes the directory-bound history actually work. Without --session-id, the child gets a random ID, the user can't find it later by name, and the "future-me opens dest, gets the reasoning" promise breaks.
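The walkthrough above also implies a naming pattern for the brief file. The sketch below assumes the pattern is `<date>-<slug>-<first six characters of the session ID>.md`; that is an inference from the two example filenames in this deck, not a documented rule, and `brief_path` is a hypothetical helper.

```python
from datetime import date
from pathlib import PurePosixPath


def brief_path(dest: str, slug: str, session_id: str, on: date) -> str:
    """Build the brief's path under <dest>/docs/refocus/.

    Assumption: the filename suffix is the first six characters of the
    pre-generated session ID, as the deck's examples suggest.
    """
    short_id = session_id[:6]
    name = f"{on.isoformat()}-{slug}-{short_id}.md"
    return str(PurePosixPath(dest) / "docs" / "refocus" / name)


print(
    brief_path(
        "~/projects/mcp",
        "mcp-standard",
        "a3f91c2e-7b8d-4e5f-9a1b-c2d3e4f5a6b7",
        date(2026, 4, 26),
    )
)
```

If the suffix really is derived from the session ID, the brief filename alone is enough to locate the matching transcript by prefix.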
Three criteria decide the mechanism: cwd-bound transcript, interactivity for the child, and a parent that doesn't block. The fallback and the rejected option each fail at least one; the picked one is the only path that satisfies all three.
| Option | Cwd-bound transcript | Interactive child | Parent free | Verdict |
|---|---|---|---|---|
| Manual relaunch with pre-generated --session-id (PRIMARY) | ✓ transcript at encoded-dest-cwd | ✓ child opens interactively | ✓ parent continues (or pauses cognitively) | Pick. Optimizes for the multi-year history-accumulation goal. |
| Bash subprocess claude --session-id -p (FALLBACK) | ✓ same encoded-dest-cwd location | ✗ -p is non-interactive (headless) | ✗ parent's Bash blocks for child duration | Useful as --oneshot for non-interactive deliverables. Not the default. |
| Subagent-as-fork via Agent tool (REJECTED) | ✗ transcript stored under PARENT's session dir | ✗ subagents can't prompt user mid-run | ✓ synchronous return | Breaks the directory-bound-history property. Rejected even though cwd override and persistence both work today. |
Subagents have a real role — drafting the brief itself is a clean subagent task. But they aren't the fork mechanism, because their transcripts don't land at the right directory.
Five places hold knowledge in this ecosystem. Each has a specific scope, a specific updater, and a specific lifetime. Refocus's mainline outputs live in rows 1–4 (files); row 5 is a different scope and rarely involved.
| # | Layer | Lives at | Updated by | Git |
|---|---|---|---|---|
| 1 | Full transcript | ~/.claude/projects/<encoded-cwd>/<SID>.jsonl | Claude Code (auto, as session runs) | not tracked |
| 2 | Per-event audit (Brief + Result) | <dir>/docs/refocus/<id>.md | /refocus, /refocus-complete | tracked |
| 3 | Refocus index (cache) | <dir>/docs/refocus/INDEX.md | skill regen on every action | gitignored |
| 4 | Canonical state "what is true now?" | <dir>/docs/context/* | /context-save (chained from /refocus-complete) | tracked |
| 5 | agent-memory MCP (cross-project facts only; rare for refocus) | postgres + ~/.claude/projects/<cwd>/memory/ | mcp__agent-memory__write (opt-in only) | DB-backed |
Garbage collection (rows 2 + 3): per-event files at status=result for >90 days move to docs/refocus/archive/YYYY-QN/; INDEX surfaces only live entries plus the last 30 days. Archive is grep-able offline — keeps the index scannable across years.
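The archive rule can be sketched as three small predicates. Assumptions to note: the age is measured from the date the Result was written, that same date picks the quarter bucket, and "live" means any entry not yet at status=result. The deck states none of these, and all function names are hypothetical.

```python
from datetime import date

ARCHIVE_AFTER_DAYS = 90  # from the rule: at status=result for more than 90 days
INDEX_RECENT_DAYS = 30   # INDEX also shows results from the last 30 days


def archive_bucket(completed: date) -> str:
    """Return the archive directory for a completion date (YYYY-QN)."""
    quarter = (completed.month - 1) // 3 + 1
    return f"docs/refocus/archive/{completed.year}-Q{quarter}/"


def should_archive(status: str, completed: date, today: date) -> bool:
    """True when a finished entry is old enough to move to the archive."""
    return status == "result" and (today - completed).days > ARCHIVE_AFTER_DAYS


def shown_in_index(status: str, completed: date, today: date) -> bool:
    """True for live entries and for results from the last 30 days."""
    if status != "result":
        return True  # assumption: anything not yet at result counts as live
    return (today - completed).days <= INDEX_RECENT_DAYS


print(archive_bucket(date(2026, 4, 26)))
print(should_archive("result", date(2026, 1, 1), date(2026, 4, 26)))
```

Between day 31 and day 90 an entry is neither archived nor listed in INDEX under this reading; it stays in docs/refocus/ and remains reachable by filename or grep.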
Scope boundary (row 5): the agent-memory MCP server (~/projects/agents/memory) holds cross-project user/feedback facts — bullet-fact-shaped, outside any single project's scope. Refocus's project-scoped outputs (rows 2 & 4) live in the files above, never in agent-memory. The two systems compose at different scopes; they don't overlap. Test for the rare exception: outcome is bullet-fact-shaped and can't live in any project's docs/context/*.
Two firing conditions, both required. The second one is the test that makes the call instinctual instead of rules-based:
Condition 1: the work would create or modify files in a different ~/projects/<X>/ subtree than the current cwd. Necessary but not sufficient.
Condition 2: "If work stopped now, would the next session naturally start in dest rather than src?" If yes, the work earns its own session at dest. If no, finish here.
Tighten the call: estimated work >30 min · source already >40% context · >1 artifact in destination subtree. Any one helps; none required.
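As a rough illustration, the two required conditions and the three optional tighteners could be encoded as below. The `Situation` dataclass and its field names are hypothetical; only the thresholds (30 minutes, 40% context, more than one artifact) come from the text above.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    touches_other_subtree: bool             # condition 1
    next_session_would_start_in_dest: bool  # condition 2
    est_minutes: int = 0                    # estimated remaining work
    source_context_used: float = 0.0        # fraction of source context consumed
    dest_artifacts: int = 0                 # artifacts landing in the destination


def should_refocus(s: Situation) -> bool:
    """Both firing conditions are required."""
    return s.touches_other_subtree and s.next_session_would_start_in_dest


def supporting_signals(s: Situation) -> int:
    """Count the optional tighteners; any one helps, none is required."""
    return sum(
        [
            s.est_minutes > 30,
            s.source_context_used > 0.40,
            s.dest_artifacts > 1,
        ]
    )


case = Situation(True, True, est_minutes=45, source_context_used=0.5, dest_artifacts=1)
print(should_refocus(case), supporting_signals(case))
```

The tighteners are deliberately kept out of `should_refocus`, so they can raise confidence in a call without ever vetoing it.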
When NOT to refocus: quick lookups, debug interludes, cross-cutting work that genuinely needs the source's full reasoning context (the canonical example: a design that's three review-board cycles deep — the child won't make sense without the rejection history).
Manual launch: the user runs cd <dest> && claude --session-id <SID> in a new terminal. Acceptable for a v1: subagent-as-fork was rejected for breaking the cwd-bound-history property; manual is the cost of getting that property right.
cat always works.
The skill isn't built yet. This is a design walkthrough, not a launch announcement. v2 of the design is locked after one full review-board cycle (Gemini + Codex + Claude), and the pieces are in sequence.
Foundational structure, three locked decisions. Dispatched to review board.
Three reviewers, two re-challenges. --session-id insight changed architecture.
Convergent feedback integrated. Strict-tree topology added (D4). Substrate locked.
Stakeholder review of the design. You're reading it.
SKILL.md + supporting scripts. First production use likely the code-executor migration already shaped as a refocus.
Once /refocus exists, every session in the ecosystem can stop being an unintentional accumulator of cross-cutting context. Knowledge lives where work is. Future sessions inherit history naturally just by cd-ing in.
v1 ships single-developer. Items below close gaps that matter only at multi-engineer / 24×7 scale — deferred until friction is real, mapped here before we need them.
| # | Extension | Solves | Trigger to adopt |
|---|---|---|---|
| 1 | assigned_to: brief frontmatter | 24×7 ownership across shifts & timezones | Rotation forms |
| 2 | Commit→refocus linkage (commit-msg hook prepends [refocus:<id>]) | git log --grep=refocus:<id> recovers every commit even after the agent is gone | Audit asks "what shipped from this work?" |
| 3 | alternatives_considered: brief field | Postmortem "did we consider X?" answerable without sharing transcripts | Postmortem demands richer audit |
| 4 | decision_rationale: + reviewer_pushback: fields | Preserve dissent that briefs currently flatten | A decision is later challenged |
| 5 | Cross-dir global INDEX (opt-in nightly cron) | No central registry across 100+ refocuses spanning many repos | ~50+ INDEX entries org-wide |
| 6 | Opt-in transcript-to-repo commit (prompt at /refocus-complete; copies child .jsonl to docs/refocus/transcripts/; sets transcript_saved: true) | Full reasoning trail visible to other engineers / auditors when needed | Brief alone insufficient for audit |
| 7 | /scrub-transcript <path> helper skill | Privacy/secrets gate for #6: regex-redact tokens, Authorization:, password=; diff-confirm before commit | Ship together with #6 |
| 8 | *.jsonl filter=lfs in .gitattributes | Transcript blobs (often MB) bloat normal git history | Repos that opt into #6 routinely |
| 9 | Default-NO consent on save prompt | Privacy-first: engineer must explicitly type y; never default-on | Hard requirement of #6 |
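For extension 2, the core of a commit-msg hook is a one-line transformation of the message. Git passes the path of the message file to a commit-msg hook as its first argument; the sketch below shows only the tagging step, with a hypothetical function name and an example id taken from the brief filename used earlier in this deck.

```python
def tag_commit_message(message: str, refocus_id: str) -> str:
    """Prepend the [refocus:<id>] tag unless it is already present."""
    tag = f"[refocus:{refocus_id}]"
    if message.startswith(tag):
        return message  # idempotent: amending a commit must not double-tag
    return f"{tag} {message}"


msg = tag_commit_message("Add partial-refund contract", "2026-04-26-refund-redesign-a3f91c")
print(msg)
```

Since the tag contains the literal text `refocus:<id>`, the `git log --grep` query from the table would match any commit tagged this way.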
Not on this list — concurrent /context-save: each engineer works off a local clone and merges via git. Parallel updates surface as ordinary merge conflicts; no locking machinery needed.
Sequencing if/when adopted: 1, 3, 4 (brief-schema additions) are zero-risk and can ship anytime. 6–9 cluster: don't ship #6 without #7 and #9. Items 2 and 5 are independent quality-of-life additions that can land last.
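Since #6 must not ship without #7, here is a hedged sketch of what the redaction step in a /scrub-transcript helper might look like. It covers only the two patterns the table names concretely (Authorization: and password=); token formats are not specified in this deck, so none are guessed here. The patterns, the placeholder text, and the function name are all assumptions.

```python
import re

# (pattern, replacement) pairs; group 1 keeps the key so the redaction is visible.
_PATTERNS = [
    # Stop at a quote so a JSON string around the header stays well-formed.
    (re.compile(r"(Authorization:\s*)[^\"'\n]+", re.IGNORECASE), r"\1[REDACTED]"),
    # Stop at whitespace, "&", or a quote so neighbouring fields survive.
    (re.compile(r"(password=)[^\s&\"']+", re.IGNORECASE), r"\1[REDACTED]"),
]


def scrub_line(line: str) -> str:
    """Redact secrets matched by the patterns above in one transcript line."""
    for pattern, replacement in _PATTERNS:
        line = pattern.sub(replacement, line)
    return line


print(scrub_line("Authorization: Bearer abc123"))
print(scrub_line("POST /login?user=a&password=hunter2&x=1"))
```

Regex redaction is best-effort, which is why the table pairs it with a diff-confirm step before anything is committed.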
Anthropic optimizes for the median user (one session that grows; auto-compaction; subagents; long context). Power users with specific workflows are expected to deviate. The real question isn't "does Anthropic do this?" — it's "does this pay for itself vs the default I'd reach for instead?"
Examples: /refocus, /context-save.
--resume <id>) already covers it; you're just wrapping for ergonomics.
Examples: branch-auto-resume wrapper, "force one session per repo" hook.
That alignment is the whole reason for /refocus. Everything else — the topology, the fork mechanism, the file layout — is just the engineering required to make it work at the scale of a multi-year ecosystem.