disinto/docs
dev-bot 41f0210abf
docs: document Claude Code OAuth concurrency model and external flock rationale (#637)
## Summary

Adds `docs/CLAUDE-AUTH-CONCURRENCY.md` documenting why the external `flock` on `${HOME}/.claude/session.lock` in `lib/agent-sdk.sh` is load-bearing rather than belt-and-suspenders, and provides a decision matrix for adding new containers that run Claude Code.

Pure docs change. No code touched.

## Why

The factory runs N+1 concurrent Claude Code processes across containers (`disinto-agents` plus every transient container spawned by `docker/edge/dispatcher.sh`), all sharing `~/.claude` via bind mount. The historical "agents losing auth, frequent re-logins" issue that motivated the original `session.lock` flock is the OAuth refresh race, and the flock is currently the only protection against it.

A reasonable assumption when looking at Claude Code is that its internal `proper-lockfile.lock(claudeDir)` (in `src/utils/auth.ts:1491` of the leaked TS source) handles the refresh race, making the external flock redundant. **It does not** in our specific bind-mount layout. Verified empirically:

- `proper-lockfile` defaults to `<target>.lock` as a sibling file when no `lockfilePath` is given
- For `claudeDir = /home/agent/.claude`, the lock lands at `/home/agent/.claude.lock`
- `/home/agent/` is **not** bind-mounted in our setup — it is the container's local overlay filesystem
- Each container creates its own private `.claude.lock`, none shared
- Cross-container OAuth refresh race is therefore unprotected by Claude Code's internal lock
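
The path reasoning in the list above can be sketched in shell. This is an illustration only, not Claude Code's or `proper-lockfile`'s actual code: the helper names `sibling_lock_path` and `is_inside_dir` are hypothetical, and the sketch assumes the bind mount covers exactly `/home/agent/.claude`.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the path rule described in the list above.
# Helper names are hypothetical and do not exist in the repo.

# Default lock location as described above: "<target>.lock", a sibling of the target.
sibling_lock_path() {
  local target="${1%/}"   # drop a single trailing slash, if present
  printf '%s.lock\n' "$target"
}

# Succeeds (exit 0) when path $1 is located inside directory $2.
is_inside_dir() {
  local path="$1" dir="${2%/}"
  case "$path" in
    "$dir"/*) return 0 ;;
    *)        return 1 ;;
  esac
}

claude_dir="/home/agent/.claude"                      # assumed bind-mount root
internal_lock="$(sibling_lock_path "$claude_dir")"    # where the internal lock lands
external_lock="${claude_dir}/session.lock"            # the external flock target

for lock in "$internal_lock" "$external_lock"; do
  if is_inside_dir "$lock" "$claude_dir"; then
    echo "$lock: inside the bind mount, shared across containers"
  else
    echo "$lock: outside the bind mount, private to each container"
  fi
done
```

A purely string-based check like this ignores symlinks and nested mounts, so it only illustrates the argument; inspecting the mount table inside a running container is the authoritative check.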

The external flock works because the lock file path `${HOME}/.claude/session.lock` is **inside** the bind-mounted directory, so all containers see the same inode.
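
For readers unfamiliar with the pattern, a minimal sketch of an external `flock` wrapper follows. It is not a copy of `lib/agent-sdk.sh`, which may differ in detail: the function name `with_session_lock` and the `LOCK_FILE` override are assumptions made for illustration, and `flock(1)` from util-linux is assumed to be installed.

```shell
#!/usr/bin/env bash
# Sketch only: the real wrapper lives in lib/agent-sdk.sh and may differ.
# LOCK_FILE and with_session_lock are illustrative names.

# The lock file sits inside the shared directory, so every container
# that bind-mounts ~/.claude locks the same inode.
LOCK_FILE="${LOCK_FILE:-${HOME}/.claude/session.lock}"

# Run the given command while holding an exclusive lock on LOCK_FILE.
with_session_lock() {
  mkdir -p "$(dirname "$LOCK_FILE")"
  (
    # fd 9 is opened on the lock file by the redirection after the subshell;
    # flock blocks here until the exclusive lock is granted.
    flock -x 9 || exit 1
    "$@"
  ) 9>"$LOCK_FILE"
  # The lock is released when the subshell exits and fd 9 is closed.
}
```

Usage would look like `with_session_lock some-command --with-args`, where the command is whatever needs serialized access to the shared credentials. The wrapper returns the wrapped command's exit status.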

This came up during design discussion of the chat container in #623, where the temptation was to mount the existing `~/.claude` and skip the external flock for interactive responsiveness. The doc captures the analysis so future implementers don't take that shortcut.

## Changes

- New file: `docs/CLAUDE-AUTH-CONCURRENCY.md` (~135 lines): rationale, empirical evidence, decision matrix for new containers, pointer to the upstream fix
- `lib/AGENTS.md`: one-line **Concurrency** addendum to the `lib/agent-sdk.sh` row pointing at the new doc

## Test plan

- [ ] Markdown renders correctly in Forgejo
- [ ] Relative link from `lib/AGENTS.md` to `docs/CLAUDE-AUTH-CONCURRENCY.md` resolves (`../docs/CLAUDE-AUTH-CONCURRENCY.md`)
- [ ] Code references in the doc still match the current state of `lib/agent-sdk.sh:139,144` and `docker/agents/entrypoint.sh:119-125`

## Refs

- #623: chat container, the issue that drove this analysis. A comment on #623 carries the same analysis and points back here once this is merged.

Co-authored-by: Claude <noreply@anthropic.com>
Reviewed-on: #637
Co-authored-by: dev-bot <dev-bot@disinto.local>
Co-committed-by: dev-bot <dev-bot@disinto.local>
2026-04-10 18:01:18 +00:00
| File | Last commit | Date |
| --- | --- | --- |
| AGENT-DESIGN.md | fix: chore(26a): delete action-agent.sh, action-poll.sh, and action/AGENTS.md (#65) | 2026-03-31 19:42:25 +00:00 |
| BLAST-RADIUS.md | fix: docs/BLAST-RADIUS.md + vault/SCHEMA.md: document blast-radius tier system (#440) | 2026-04-08 19:59:51 +00:00 |
| CLAUDE-AUTH-CONCURRENCY.md | docs: document Claude Code OAuth concurrency model and external flock rationale (#637) | 2026-04-10 18:01:18 +00:00 |
| EVAL-MCP-SERVER.md | fix: tech-debt: sweep cron-isms from code comments, helpers, lib, and public site copy (#548) | 2026-04-10 08:54:11 +00:00 |
| EVIDENCE-ARCHITECTURE.md | fix: {project}-ops repo — separate operations from code (#757) (#767) | 2026-03-26 19:55:12 +01:00 |
| OBSERVABLE-DEPLOY.md | fix: feat: observable addressables — engagement measurement for deployed artifacts (#718) | 2026-03-26 11:57:19 +00:00 |
| PHASE-PROTOCOL.md | fix: chore: remove dead tmux-based session code (agent-session.sh, phase-handler.sh) (#262) | 2026-04-05 22:25:53 +00:00 |
| updating-factory.md | fix: fix: make _generate_compose_impl the canonical compose source — remove tracked docker-compose.yml + update docs (#603) | 2026-04-10 16:40:44 +00:00 |
| VAULT.md | fix: feat: vault PRs should auto-merge after approval (#170) | 2026-04-03 06:29:35 +00:00 |