disinto/lib
dev-bot 41f0210abf
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
docs: document Claude Code OAuth concurrency model and external flock rationale (#637)
## Summary

Adds `docs/CLAUDE-AUTH-CONCURRENCY.md` documenting why the external `flock` on `${HOME}/.claude/session.lock` in `lib/agent-sdk.sh` is load-bearing rather than belt-and-suspenders, and provides a decision matrix for adding new containers that run Claude Code.

Pure docs change. No code touched.

## Why

The factory runs N+1 concurrent Claude Code processes across containers (`disinto-agents` plus every transient container spawned by `docker/edge/dispatcher.sh`), all sharing `~/.claude` via bind mount. The historical "agents losing auth, frequent re-logins" issue that motivated the original `session.lock` flock is the OAuth refresh race — and the flock is the only thing currently protecting against it.

A reasonable assumption when looking at Claude Code is that its internal `proper-lockfile.lock(claudeDir)` (in `src/utils/auth.ts:1491` of the leaked TS source) handles the refresh race, making the external flock redundant. **It does not**, in our specific bind-mount layout. Empirically verified:

- `proper-lockfile` defaults to `<target>.lock` as a sibling file when no `lockfilePath` is given
- For `claudeDir = /home/agent/.claude`, the lock lands at `/home/agent/.claude.lock`
- `/home/agent/` is **not** bind-mounted in our setup — it is the container's local overlay filesystem
- Each container creates its own private `.claude.lock`, none shared
- Cross-container OAuth refresh race is therefore unprotected by Claude Code's internal lock

The external flock works because the lock file path `${HOME}/.claude/session.lock` is **inside** the bind-mounted directory, so all containers see the same inode.

This came up during design discussion of the chat container in #623, where the temptation was to mount the existing `~/.claude` and skip the external flock for interactive responsiveness. The doc captures the analysis so future implementers don't take that shortcut.

## Changes

- New file: `docs/CLAUDE-AUTH-CONCURRENCY.md` (~135 lines): rationale, empirical evidence, decision matrix for new containers, pointer to the upstream fix
- `lib/AGENTS.md`: one-line **Concurrency** addendum to the `lib/agent-sdk.sh` row pointing at the new doc

## Test plan

- [ ] Markdown renders correctly in Forgejo
- [ ] Relative link from `lib/AGENTS.md` to `docs/CLAUDE-AUTH-CONCURRENCY.md` resolves (`../docs/CLAUDE-AUTH-CONCURRENCY.md`)
- [ ] Code references in the doc still match the current state of `lib/agent-sdk.sh:139,144` and `docker/agents/entrypoint.sh:119-125`

## Refs

- #623 — chat container, the issue this analysis was driven by; #623 has a comment with the same analysis pointing back here once merged

Co-authored-by: Claude <noreply@anthropic.com>
Reviewed-on: #637
Co-authored-by: dev-bot <dev-bot@disinto.local>
Co-committed-by: dev-bot <dev-bot@disinto.local>
2026-04-10 18:01:18 +00:00
..
hooks fix: Remove Matrix integration — notifications move to forge + OpenClaw (#732) 2026-03-26 14:53:56 +00:00
agent-sdk.sh fix: correct flock idiom to hold lock during claude invocation 2026-04-10 17:48:33 +00:00
AGENTS.md docs: document Claude Code OAuth concurrency model and external flock rationale (#637) 2026-04-10 18:01:18 +00:00
branch-protection.sh fix: bug: init branch-protection setup gives up after 3 short retries — forgejo needs more time to index freshly-created branches (#588) 2026-04-10 15:41:55 +00:00
build-graph.py fix: use undirected reachability for reviewer affected-objectives tracing 2026-03-24 21:31:55 +00:00
ci-debug.sh fix: SECURITY: Unquoted curl URLs with variables in API calls (#60) 2026-03-31 18:48:29 +00:00
ci-helpers.sh fix: fix: duplicated label ID lookup — ensure_blocked_label_id vs _ilc_ensure_label_id (#282) 2026-04-06 10:05:04 +00:00
ci-log-reader.py fix: feat: CI log access — disinto ci-logs + dev-agent CI failure context (#136) 2026-04-02 08:20:21 +00:00
ci-setup.sh fix: mock-forgejo path parsing bug + non-fatal cron in smoke-init (#586) 2026-04-10 15:08:43 +00:00
env.sh fix: fix: env.sh unbound WOODPECKER_TOKEN crashes all cron agents under set -u (#475) 2026-04-09 07:26:07 +00:00
forge-push.sh fix: fix: stop baking credentials into git remote URLs — use clean URLs + existing credential helper everywhere (#604) 2026-04-10 17:04:10 +00:00
forge-setup.sh fix: bug: bin/disinto init rotates all bot tokens and passwords on every run, invalidating existing cloned repos with embedded credentials (#584) 2026-04-10 14:18:27 +00:00
formula-session.sh fix: fix flock/binding issues with claude_run_with_watchdog 2026-04-10 17:41:39 +00:00
generators.sh fix: fix: agents container should clone project repo on first startup; treat init's host clone as operator-side only (#605) 2026-04-10 17:19:33 +00:00
git-creds.sh fix: fix: stop baking credentials into git remote URLs — use clean URLs + existing credential helper everywhere (#604) 2026-04-10 17:04:10 +00:00
guard.sh fix: tech-debt: sweep cron-isms from code comments, helpers, lib, and public site copy (#548) 2026-04-10 08:54:11 +00:00
hire-agent.sh fix: separate poll_interval from compact_pct in local-model agent TOML config 2026-04-10 05:56:18 +00:00
issue-lifecycle.sh fix: bug: dev-poll stale detection races with issue_claim — blocks freshly claimed issues (#471) 2026-04-09 14:45:30 +00:00
load-project.sh fix: feat: generate_compose() should support local-model agent containers (#520) 2026-04-09 20:09:38 +00:00
mirrors.sh fix: SECURITY: Replace eval usage with safer alternatives (#59) 2026-03-31 18:21:55 +00:00
ops-setup.sh fix: fix: stop baking credentials into git remote URLs — use clean URLs + existing credential helper everywhere (#604) 2026-04-10 17:04:10 +00:00
parse-deps.sh fix: parse-deps.sh inline regex matches every line — awk /pattern/i flag is invalid (#600) 2026-03-23 10:59:47 +00:00
pr-lifecycle.sh fix: fix: standardize logging across all agents — capture errors, log exit codes, consistent format (#367) 2026-04-07 21:15:36 +00:00
release.sh fix: keep GITHUB_TOKEN/CODEBERG_TOKEN secrets in release vault action 2026-04-10 06:36:59 +00:00
secret-scan.sh fix: Replace Codeberg dependency with local Forgejo instance (#611) 2026-03-23 16:57:12 +00:00
stack-lock.sh fix: feat: stack lock protocol for singleton project stack access (#255) 2026-04-06 07:09:26 +00:00
tea-helpers.sh fix: tea_relabel uses edit subcommand, add sha256 checksum for tea binary (#666) 2026-03-25 13:34:58 +00:00
vault.sh fix: dispatcher.sh: handle direct-commit low-tier vault actions (#439) 2026-04-08 20:15:26 +00:00
worktree.sh fix: Extract lib/worktree.sh — create, recover, cleanup (#797) 2026-03-27 19:06:31 +00:00