feat: drop chat rate-limiting — remove per-user hour/day request caps and token cap (reverts #711) #1084

Closed
opened 2026-04-20 15:18:27 +00:00 by disinto-admin · 0 comments

Goal

Remove all rate-limiting logic from the chat server. Reverts the hour/day request caps and the daily token cap added in #711.

Rationale (from operator): chat is single-operator, not a public service. The rate limits were added defensively in case of prompt-injection leading to Claude loops, but that concern is better addressed by simply ending the chat session. The rate-limit code adds complexity and false failure modes (e.g. overnight reflection loops getting capped) without catching the real risks.

The change

In docker/chat/server.py

Delete:

  • _check_rate_limit(user) (around line 220)
  • _record_request(user) (around line 262)
  • _record_tokens(user, tokens) (around line 267)
  • Any module-level state these functions use (look for dicts keyed by user storing timestamps / counters)
  • Call sites in handle_chat(user) — search for _check_rate_limit and _record_request / _record_tokens; delete the conditional blocks and the increment calls

Keep the existing _parse_stream_json and conversation-history logic — those are independent.

In docker-compose.yml

Delete the three env vars from the chat: service (or edge: if #NNNN "chat-in-edge" has already landed):

  • CHAT_MAX_REQUESTS_PER_HOUR
  • CHAT_MAX_REQUESTS_PER_DAY
  • CHAT_MAX_TOKENS_PER_DAY

In .env.example (if present)

Remove the same three entries.
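
The same three names appear in both files, so one line-based filter could cover `docker-compose.yml` and `.env.example`. This is a sketch under assumptions: `strip_chat_limit_vars` is a hypothetical helper, and it assumes each variable sits on its own line in `KEY=value`, `- KEY=value`, or `KEY: value` form (optionally commented out). It does not parse YAML, so an `environment:` key left empty after the removal still needs a manual look.

```python
import re

CHAT_LIMIT_VARS = (
    "CHAT_MAX_REQUESTS_PER_HOUR",
    "CHAT_MAX_REQUESTS_PER_DAY",
    "CHAT_MAX_TOKENS_PER_DAY",
)

# Optional indentation, optional "#" comment marker, optional YAML list dash,
# optional opening quote, then one of the three variable names as a whole word.
_LINE = re.compile(
    r"^\s*(?:#\s*)?(?:-\s*)?['\"]?(?:" + "|".join(CHAT_LIMIT_VARS) + r")\b"
)


def strip_chat_limit_vars(text: str) -> str:
    """Return text with every line that sets one of the three variables removed."""
    kept = [line for line in text.splitlines(keepends=True) if not _LINE.match(line)]
    return "".join(kept)
```

Applying the function to each file's contents and reviewing the diff before writing it back keeps the edit auditable.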

Acceptance criteria

  • grep -r 'CHAT_MAX_' docker/chat/ docker-compose.yml .env.example 2>/dev/null returns no matches
  • grep -r '_check_rate_limit\|_record_request\|_record_tokens' docker/chat/ returns no matches
  • Chat responds to POST /chat without rate-limit failure paths regardless of request volume (smoke test: send 100 messages in a tight loop, all return 200 or a Claude error, none return 429)
  • docker/chat/server.py has no references to per-user rate-limit state
  • shellcheck clean on any shell touched (none expected — this is Python + YAML only)
  • No broken imports in server.py after the deletion (e.g. time may be unused if only rate-limit code used it — verify)

Related

  • #711: original rate-limit implementation. Close as "superseded by #NNNN" once this lands.
  • Companion issue: move chat into edge container (#NNNN). Both simplifications come from the same architectural pass.
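
The smoke test in the acceptance criteria (100 messages in a tight loop, none returning 429) could be scripted roughly as follows. This is only a sketch: `post_chat` and `smoke_test` are hypothetical names, and the JSON payload shape, the absence of auth headers, and the timeout are assumptions that would need to match the real `/chat` handler.

```python
import json
import urllib.error
import urllib.request
from collections import Counter
from typing import Callable


def post_chat(url: str, message: str) -> int:
    """POST one message and return the HTTP status code.

    The {"message": ...} body is an assumed payload shape, not taken from server.py.
    """
    body = json.dumps({"message": message}).encode()
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=120) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses are raised as exceptions; keep only the status code.
        return err.code


def smoke_test(send: Callable[[int], int], count: int = 100) -> Counter:
    """Call send(i) count times and fail if any call returns 429."""
    statuses = Counter(send(i) for i in range(count))
    if statuses[429]:
        raise AssertionError(
            f"{statuses[429]} of {count} requests were rate-limited (429)"
        )
    return statuses
```

A run against a local instance might look like `smoke_test(lambda i: post_chat(chat_url, f"ping {i}"))`, where `chat_url` is whatever address the chat service is exposed on. Other non-200 statuses are tallied rather than treated as failures, which matches the "200 or a Claude error" wording of the criterion.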
disinto-admin added the backlog label 2026-04-20 15:18:27 +00:00
dev-qwen2 self-assigned this 2026-04-20 15:24:39 +00:00
dev-qwen2 added in-progress and removed backlog labels 2026-04-20 15:24:39 +00:00
dev-qwen2 removed their assignment 2026-04-20 16:41:52 +00:00
Reference: disinto-admin/disinto#1084