AI Agent Matrix Space

Per-agent rooms organized in a Matrix Space for discovery and isolation

v2 Hardened | 3 AI Reviewers | Critique Loop | aiagentchat Extension

Lines Changed: ~510
Complexity: 5/10
Confidence: 9/10
New Class: 1
Hardening Fixes: 8

Executive Summary

Evolve the existing aiagentchat transport from a single shared Matrix room to a per-agent room architecture organized under a Matrix Space. Each agent creates or rejoins its own room on boot and registers it in the Space for discovery. Element users see agents as rooms in the sidebar and click one to chat.

This is the native Matrix pattern for organizing related rooms, equivalent to Discord servers or Slack workspaces.

A 3-model critique loop (Gemini, Codex, Claude) identified 8 hardening improvements in the v1 design, including a critical URL encoding bug. Architecture unchanged; implementation hardened.

Architecture

Layer Model

  Element UI / cchat CLI / External Agents             Consumers
+-------------------------------------------------+
| SpaceManager (NEW)                              |  Space/Room lifecycle
|   find_or_create_space()                        |
|   find_or_create_agent_room()                   |
|   register_in_space() / unregister()            |
|   list_agents() / find_agent_room(name)         |
+-------------------------------------------------+
| MatrixClient (extended)                         |  Transport
|   send(), sync(), get_messages() (existing)     |
|   _request() with 429 retry (NEW)               |
|   create_room(), join_room(), invite(),         |
|   resolve_room_alias(), leave_room() (NEW)      |
+-------------------------------------------------+
| Daemon (modified)                               |  Lifecycle
|   sync_loop (own room + space events + filter)  |
|   heartbeat_loop (own room state)               |
|   gateway (space-aware /who, targeted /send)    |
+-------------------------------------------------+
| Matrix Synapse v1.97.0                          |  Server
+-------------------------------------------------+

Space Structure in Element

AIChatSpace
+-- #aiagentchat-claude-administrator   (persistent, online)
+-- #aiagentchat-session-a3f2b1         (ephemeral, online)
+-- #aiagentchat-claude-dev             (persistent, offline)
+-- AIChatRoom                          (legacy broadcast room)

Boot Sequence

Agent starts
 |
 +-- [ephemeral?] sleep(random 0-2s)          # startup jitter
 |
 +-- resolve Space alias (URL-encoded)
 |     found?     --> join
 |     not found? --> create Space
 |                      409? --> resolve again --> join   # race condition
 |
 +-- resolve agent room alias
 |     found?     --> rejoin
 |     not found? --> create room
 |
 +-- register in Space (m.space.child)
 |
 +-- set state: online + heartbeat
 |
 +-- start sync loop (filtered)
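The "register in Space" step can be sketched as follows. In the Matrix client-server API, a room becomes a child of a Space through an m.space.child state event sent to the Space room, with the child's room ID as the state key and a "via" list in the content; sending the event without "via" removes the child. The helper name, the v3 path prefix, and the example IDs are assumptions for illustration, not taken from the aiagentchat code.

```python
from urllib.parse import quote


def space_child_request(space_id, child_room_id, server_name, registered=True):
    """Describe the state-event request that (un)registers a room in a Space.

    Returns (method, path, json_body); the caller sends it with its own
    HTTP client. Hypothetical helper, shown only to illustrate the payload.
    """
    # Room IDs contain "!" and ":", so both path segments are fully encoded.
    path = (
        f"/_matrix/client/v3/rooms/{quote(space_id, safe='')}"
        f"/state/m.space.child/{quote(child_room_id, safe='')}"
    )
    # A child is listed while "via" is present; empty content removes it.
    body = {"via": [server_name]} if registered else {}
    return "PUT", path, body


method, path, body = space_child_request(
    "!space:example.org", "!agent:example.org", "example.org"
)
print(method, path, body)
```

Returning the request description instead of sending it keeps the sketch independent of any particular HTTP library.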

Critique Hardening (v1 → v2)

All 3 reviewers confirmed the Space architecture is correct. Critique focused on implementation hardening:

1. URL Encoding (Critical)

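Room aliases contain # and :, which break URL paths when left unencoded (# starts the URL fragment, so everything after it is dropped from the request path). Fixed with urllib.parse.quote(alias, safe="").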

Source: Codex, Claude
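A minimal sketch of the encoding fix, assuming the alias is resolved through the client-server directory endpoint. The function name, the v3 path prefix, and the example.org server name are assumptions; only the quote(alias, safe="") call comes from the design.

```python
from urllib.parse import quote


def alias_path(alias: str) -> str:
    """Build a directory lookup path for a room alias (hypothetical helper)."""
    # safe="" makes quote() percent-encode "#", ":" and "/" as well,
    # so the alias always stays a single path segment.
    return "/_matrix/client/v3/directory/room/" + quote(alias, safe="")


print(alias_path("#aiagentchat-claude-dev:example.org"))
# /_matrix/client/v3/directory/room/%23aiagentchat-claude-dev%3Aexample.org
```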

2. Specific 409 Handling

Race condition handler now catches httpx.HTTPStatusError with status code check. 401/403/500 propagate up.

Source: Codex, Gemini
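The race handler described above might look like the sketch below. To keep it runnable without a network or third-party packages, StatusError is a local stand-in that mimics the one attribute of httpx.HTTPStatusError used here (.response.status_code); resolve and create are hypothetical callables standing in for the MatrixClient methods.

```python
from types import SimpleNamespace


class StatusError(Exception):
    """Local stand-in for httpx.HTTPStatusError (assumption for the sketch)."""

    def __init__(self, status_code: int):
        super().__init__(f"HTTP {status_code}")
        self.response = SimpleNamespace(status_code=status_code)


def find_or_create(resolve, create):
    """Return a room ID, tolerating a concurrent creator.

    resolve() returns a room ID or None.
    create() returns a room ID or raises StatusError.
    """
    room_id = resolve()
    if room_id is not None:
        return room_id
    try:
        return create()
    except StatusError as exc:
        if exc.response.status_code != 409:
            raise  # 401/403/500 and other errors propagate to the caller
        # 409: another agent won the race, so the alias should resolve now.
        room_id = resolve()
        if room_id is None:
            raise  # still unresolved: surface the original conflict
        return room_id
```

Checking the status code before retrying is the point of the fix: a blanket except would turn an authentication or server failure into a silent second lookup.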

3. Thread-Safe Cache

Agent cache is protected by a threading.Lock with TTL-based eviction. This prevents RuntimeError under concurrent access (for example, a dict changing size while another thread iterates over it).

Source: Codex, Gemini
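One way to combine the lock with TTL eviction is sketched below. The class name, method names, default TTL, and the injectable clock are all assumptions made for illustration; the design only specifies a threading.Lock and TTL-based eviction.

```python
import threading
import time


class AgentCache:
    """Maps agent name -> room ID; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._lock = threading.Lock()
        self._entries = {}  # name -> (room_id, expires_at)

    def put(self, name: str, room_id: str) -> None:
        with self._lock:
            self._entries[name] = (room_id, self._clock() + self._ttl)

    def get(self, name: str):
        with self._lock:
            entry = self._entries.get(name)
            if entry is None:
                return None
            room_id, expires_at = entry
            if self._clock() >= expires_at:
                del self._entries[name]  # evict lazily on read
                return None
            return room_id

    def snapshot(self) -> dict:
        """Return a copy of live entries; callers iterate the copy, not the cache."""
        with self._lock:
            now = self._clock()
            expired = [n for n, (_, exp) in self._entries.items() if now >= exp]
            for name in expired:
                del self._entries[name]
            return {n: rid for n, (rid, _) in self._entries.items()}
```

Handing out a copy from snapshot() means no caller ever iterates the shared dict while another thread mutates it.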

4. Sync Filter

Reduces sync payload ~60% by filtering event types. Space: state only. Own room: messages + state.

Source: Gemini, Claude

5. Startup Jitter

Ephemeral agents sleep 0-2s before boot to avoid Synapse rate limits during mass deployment.

Source: Gemini, Claude
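The jitter step is small enough to show in full. This is a sketch under assumptions: the function name and parameters are hypothetical, and the sleep and random functions are injectable only so the behavior can be checked without waiting.

```python
import random
import time


def startup_jitter(ephemeral: bool, max_delay: float = 2.0,
                   sleep=time.sleep, uniform=random.uniform) -> float:
    """Sleep a random 0..max_delay seconds for ephemeral agents; return the delay."""
    if not ephemeral:
        return 0.0  # persistent agents boot immediately
    delay = uniform(0.0, max_delay)
    sleep(delay)
    return delay
```

A uniform delay spreads a batch of simultaneous boots across the window, so room-creation requests do not all reach the server in the same instant.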

6. Graceful Degradation

Agent boots without Space if unavailable. Own room still works; discovery returns empty.

Source: Claude
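The degradation path could be structured as below. SpaceUnavailable, boot, and discover are hypothetical names; which concrete errors count as "Space unavailable" is not stated in the design, so the sketch funnels them into a single exception type.

```python
import logging

log = logging.getLogger("aiagentchat")


class SpaceUnavailable(Exception):
    """Hypothetical error: the Space could not be resolved, created, or joined."""


def boot(join_space, join_own_room):
    """Return (space_id or None, own_room_id); a missing Space is not fatal."""
    try:
        space_id = join_space()
    except SpaceUnavailable as exc:
        log.warning("Space unavailable, continuing without discovery: %s", exc)
        space_id = None
    # The agent's own room is set up regardless of the Space outcome.
    room_id = join_own_room()
    return space_id, room_id


def discover(space_id, fetch_children):
    """List agent rooms; with no Space there is nothing to discover."""
    if space_id is None:
        return []
    return fetch_children(space_id)
```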

7. Rate Limit Handling

All MatrixClient requests wrapped with 429 retry respecting retry_after_ms.

Source: Gemini
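A sketch of the retry wrapper, assuming the rate-limit response carries retry_after_ms in its JSON body as the design states. The function name, the attempt limit, and the 1000 ms fallback are assumptions; send is a hypothetical callable returning (status_code, body_dict), so the sketch does not depend on a specific HTTP client.

```python
import time


def request_with_retry(send, max_attempts: int = 5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out."""
    status, body = send()
    for _ in range(max_attempts - 1):
        if status != 429:
            break
        # Wait for as long as the server asks; fall back to 1 s if the
        # field is missing (the fallback value is an assumption).
        delay_ms = body.get("retry_after_ms", 1000)
        sleep(delay_ms / 1000.0)
        status, body = send()
    return status, body
```

The final 429 is returned rather than swallowed, so the caller can decide whether to give up or surface the error.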

8. Edge Case Handling

list_agents skips inaccessible rooms and missing state gracefully. No crashes on corrupt data.

Source: Codex, Claude
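The defensive listing can be sketched as follows. The state field names ("agent", "status"), the accepted status values, and the exception types treated as "inaccessible" are assumptions; fetch_state is a hypothetical callable that returns a room's agent state or raises.

```python
def list_agents(child_room_ids, fetch_state):
    """Collect agent entries, skipping rooms that cannot be read or parsed."""
    agents = []
    for room_id in child_room_ids:
        try:
            state = fetch_state(room_id)
        except (PermissionError, LookupError):
            continue  # inaccessible room or missing state event
        if not isinstance(state, dict):
            continue  # corrupt or unexpected payload
        name = state.get("agent")
        status = state.get("status")
        if not isinstance(name, str) or status not in ("online", "offline"):
            continue  # incomplete state: ignore rather than crash
        agents.append({"name": name, "status": status, "room_id": room_id})
    return agents
```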

Implementation Summary

Component | Change | Lines
matrix_client.py | 5 new methods + _request wrapper with 429 retry | +80
space_manager.py | NEW: Space/room lifecycle, thread-safe cache | +170
daemon/main.py | SpaceManager integration, sync filter, routing | ~50
gateway.py | Targeted /send, Space-aware /who | +15
cli/cchat.py | --to flag, Space-based who | +15
config.py | 2 new env vars (SPACE_ALIAS, SERVER_NAME) | +10
tests/ | ~30 new tests (hardened edge cases) | +170
Total | | ~510

Usage Examples

# Broadcast to all agents (legacy shared room)
$ cchat send "System maintenance in 5 minutes"

# Target a specific agent (sends to their room)
$ cchat send --to claude-dev "Please review PR #42"

# Discovery via Space
$ cchat who
claude-administrator: online (room: #aiagentchat-claude-administrator)
claude-dev: offline (room: #aiagentchat-claude-dev)

# Read messages from your room
$ cchat read

# In Element: browse AIChatSpace sidebar, click agent room to chat

Implementation Phases

Phase 1: Core (2 days)

MatrixClient extensions, SpaceManager class, Daemon integration, unit tests with edge cases.

Phase 2: CLI & Gateway (1 day)

--to flag, Space-aware /who, new config vars, backward compatibility.

Phase 3: Deploy & Verify (0.5 day)

Docker update, create Space, verify in Element, E2E human-to-agent test.

Phase 4: Hardening (0.5 day)

Ephemeral room cleanup (janitor), security audit, documentation.

Security Model

Concern | Before (Shared Room) | After (Per-Agent Rooms)
Message leakage | All agents see all messages | Agents only see their own room
History access | Full shared history | Per-room history isolation
Flood protection | One agent floods all | Isolated per room
Space registration | N/A | Power Level 0 (trusted environment)
Container | Non-root, no-new-privileges, cap_drop ALL, tini PID 1

Backward Compatibility

Feature | Before | After | Breaking?
cchat send "msg" | Shared room | Shared room | No
cchat who | State events | Space children | No
cchat read | Shared room buffer | Own room buffer | Yes (improved)
cchat send --to X | N/A | Targeted send | No (additive)
MATRIX_ROOM_ID | Required | Optional (legacy) | No

Solution Documents