aiagent-matrix-space

Implementation Plan - Matrix Space Architecture for aiagentchat

Final Plan v2.0 | Hardened | 3 AI Plans + 2 Reviews | 2026-02-05

New LOC: 420
Complexity: 5/10
Confidence: 9/10
Duration: 2.5-3 days (parallel)
Tasks: 14
Hardening requirements: 8

Phase Timeline

Phase 1: Design (0.5 days)
  1.1 Architecture Review & ADR                  Architect             2h
  1.2 Security & Network Review                  Security              2h
  1.3 GitLab Board Setup                         PM                    1h

Phase 2: Implement (1.5 days)
  2.1 MatrixClient Extensions (~80 LOC)          Developer, Group A    6h
  2.2 SpaceManager Class (~170 LOC)              Developer, Group A    8h
  2.3 Test Scaffolding (~30 stubs)               QA, Group A           4h

Phase 3: Integrate (0.5 days)
  3.1 Daemon Integration (~60 LOC)               Developer             5h
  3.2 Gateway & CLI Updates (~30 LOC)            Developer             3h
  3.3 Configuration + Existing Test Updates      Developer             2h
  3.4 Complete Unit Tests (~30 tests)            QA                    6h

Phase 4: QA & Deploy (0.5 days)
  4.1 CI Pipeline & Validation                   QA, Group C           2h
  4.2 Production Deploy & Element Verify         Developer, Group C    3h
  4.3 Acceptance Testing (8 tests)               QA                    2h
  4.4 Documentation & Release v2.0.0             PM                    2h

8 Hardening Requirements

#1 URL-Encode Aliases [CRITICAL]
resolve_room_alias must percent-encode the # and : characters in the alias before placing it in the request path. Without encoding, all alias lookups fail.
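A minimal sketch of the encoding step, assuming the lookup goes through the Matrix client-server directory endpoint. The helper name alias_path is hypothetical; only the use of urllib.parse.quote with safe="" is the point here.

```python
from urllib.parse import quote


def alias_path(alias: str) -> str:
    """Build the directory lookup path for a room alias such as '#agents:example.org'.

    alias_path is a hypothetical helper name, not taken from the plan.
    """
    # safe="" makes quote() percent-encode every reserved character,
    # including '#', ':' and '/'. An unencoded '#' would be read as the
    # start of a URL fragment and never reach the server.
    return "/_matrix/client/v3/directory/room/" + quote(alias, safe="")


print(alias_path("#agents:example.org"))
# → /_matrix/client/v3/directory/room/%23agents%3Aexample.org
```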
#2 Catch 409 Specifically [HIGH]
The race-condition handler catches only HTTPStatusError with status 409, not a bare Exception.
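The handler could look roughly like the sketch below. To keep it runnable without third-party packages, HTTPStatusError here is a local stand-in that mirrors the response.status_code attribute of httpx.HTTPStatusError; the create and resolve callables and the function name are hypothetical.

```python
from types import SimpleNamespace


class HTTPStatusError(Exception):
    """Local stand-in for httpx.HTTPStatusError (assumption: the client uses httpx)."""

    def __init__(self, status_code: int):
        super().__init__(f"HTTP {status_code}")
        self.response = SimpleNamespace(status_code=status_code)


def create_or_resolve_space(create, resolve):
    """Create the Space; if another agent won the race (409), resolve the existing one.

    create and resolve are hypothetical callables that return a room ID.
    """
    try:
        return create()
    except HTTPStatusError as err:
        if err.response.status_code != 409:
            raise  # 403, 500 and friends are real failures and must propagate
        return resolve()
```

Any exception other than a 409 conflict still surfaces to the caller, so a bug or an outage is not silently treated as a lost race.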
#3 Thread-Safe Cache [HIGH]
The agent cache is guarded by a threading.Lock and evicts entries after a 1h TTL. This prevents RuntimeError when the cache is accessed concurrently.
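One way to combine the lock and the TTL is sketched below. The class name AgentCache, its method names, and the injectable clock are assumptions made for illustration and testing; only threading.Lock and the 1h TTL come from the requirement.

```python
import threading
import time


class AgentCache:
    """Dictionary-backed cache with a lock and per-entry expiry (hypothetical sketch)."""

    def __init__(self, ttl_seconds: float = 3600.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock                # injectable so tests can fake time
        self._lock = threading.Lock()
        self._entries = {}                 # agent_id -> (expires_at, value)

    def put(self, agent_id, value):
        with self._lock:
            self._entries[agent_id] = (self._clock() + self._ttl, value)

    def get(self, agent_id):
        with self._lock:
            entry = self._entries.get(agent_id)
            if entry is None:
                return None
            expires_at, value = entry
            if self._clock() >= expires_at:
                del self._entries[agent_id]   # lazy eviction on read
                return None
            return value

    def snapshot(self):
        """Evict expired entries, then return a copy that is safe to iterate."""
        with self._lock:
            now = self._clock()
            expired = [k for k, (exp, _) in self._entries.items() if now >= exp]
            for k in expired:
                del self._entries[k]
            return {k: v for k, (_, v) in self._entries.items()}
```

Returning a copy from snapshot means callers never iterate the live dictionary while another thread mutates it, which is the usual source of "dictionary changed size during iteration".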
#4 Sync Filter [MEDIUM]
An ad-hoc filter on /sync reduces the payload by ~60%. The client receives only message, state, and member events.
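As an illustration, the Matrix /sync endpoint accepts a filter either as a stored filter ID or as a JSON object encoded into the filter query parameter. The sketch below builds such an inline filter. The exact event types and the dropped categories are assumptions, since the real list belongs to the filter spec from Task 1.1; sync_params is a hypothetical helper name.

```python
import json
from urllib.parse import urlencode

# Assumed filter contents: keep message and member events in the timeline,
# leave room state unfiltered, and drop presence, account data and
# ephemeral events (typing notifications, read receipts).
SYNC_FILTER = {
    "presence": {"types": []},        # an empty types list matches no events
    "account_data": {"types": []},
    "room": {
        "ephemeral": {"types": []},
        "timeline": {"types": ["m.room.message", "m.room.member"]},
    },
}


def sync_params(since=None, timeout_ms=30000):
    """Return the query parameters for a filtered /sync call (hypothetical helper)."""
    params = {
        "filter": json.dumps(SYNC_FILTER, separators=(",", ":")),
        "timeout": str(timeout_ms),
    }
    if since:
        params["since"] = since
    return params


# urlencode percent-encodes the JSON so it is safe in a query string.
print("/_matrix/client/v3/sync?" + urlencode(sync_params(since="s72594_4483")))
```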
#5 Startup Jitter [MEDIUM]
Ephemeral agents sleep for a random 0-2s before any Space operation. This prevents 429 storms when many agents boot at once.
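The jitter itself is small enough to sketch in a few lines. The function name and the injectable sleep parameter are assumptions; the 0-2s window is from the requirement.

```python
import random
import time


def startup_jitter(max_seconds: float = 2.0, sleep=time.sleep) -> float:
    """Sleep for a uniformly random delay in [0, max_seconds] and return it.

    startup_jitter is a hypothetical name; sleep is injectable so tests
    do not have to wait.
    """
    delay = random.uniform(0.0, max_seconds)
    sleep(delay)
    return delay
```

Calling startup_jitter() once, right before the first Space request, spreads a mass boot over the window instead of sending every request in the same instant.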
#6 Graceful Degradation [MEDIUM]
The agent boots and operates even if the Space is unavailable: it sets space_id=None and runs in v1 mode.
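A rough sketch of the fallback, assuming the Space join is wrapped in a single callable. The Agent class, the join_space parameter, and the logger name are hypothetical.

```python
import logging

log = logging.getLogger("aiagentchat.space")  # hypothetical logger name


class Agent:
    """Minimal agent shell showing only the Space fallback (hypothetical sketch)."""

    def __init__(self, join_space):
        self._join_space = join_space   # callable returning the Space room ID
        self.space_id = None

    def boot(self) -> str:
        try:
            self.space_id = self._join_space()
        except Exception as err:
            # Broad on purpose at this one boundary: no Space failure,
            # whatever its type, may stop the agent from starting.
            log.warning("Space unavailable, continuing in v1 mode: %s", err)
            self.space_id = None
        return "v2" if self.space_id else "v1"
```

The broad except is limited to the boot boundary; the narrower 409 handling from requirement #2 still applies inside the Space operations themselves.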
#7 Rate Limit (429) [MEDIUM]
The _request wrapper retries after the delay given by retry_after_ms in the response body, with at most 3 attempts.
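The retry loop might be shaped like this. The send callable, which returns a (status, body) pair, and the 1s fallback for a missing retry_after_ms are assumptions; the 3-attempt cap and the retry_after_ms field are from the requirement.

```python
import time


def request_with_retry(send, max_attempts: int = 3, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send is a hypothetical callable returning (status_code, body_dict).
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status != 429 or attempt == max_attempts:
            return status, body
        # Honor the server's hint; the 1000 ms default is an assumption
        # for bodies that omit retry_after_ms.
        sleep(body.get("retry_after_ms", 1000) / 1000.0)
```

With three attempts the wrapper sleeps at most twice, and the final 429 is returned to the caller rather than swallowed.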
#8 list_agents Edge Cases [MEDIUM]
Skip rooms that return 403, have missing state, or have an empty via field. Never crash on inaccessible rooms.
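The three skip conditions can be sketched as below, assuming the Space children arrive as a mapping from room ID to m.space.child content and that agent state is fetched per room. The Forbidden exception, the fetch_agent_state callable, and the shape of the returned records are hypothetical.

```python
class Forbidden(Exception):
    """Stand-in for an HTTP 403 raised by the hypothetical state fetcher."""


def list_agents(children, fetch_agent_state):
    """Return agent records for every usable child room of the Space.

    children: room_id -> m.space.child event content.
    fetch_agent_state: hypothetical callable, room_id -> dict or None.
    """
    agents = []
    for room_id, child in children.items():
        if not child.get("via"):
            continue  # empty or missing via: the child link is not valid
        try:
            state = fetch_agent_state(room_id)
        except Forbidden:
            continue  # 403: room exists but is not accessible to us
        if not state:
            continue  # no agent state published in this room
        agents.append({"room_id": room_id, **state})
    return agents
```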

Dependency Graph

Phase 1 (Sequential)
  1.1 [Architect] ADR + Filter Spec + Permissions
   |-> 1.2 [Security] Security Review
        |-> 1.3 [PM] GitLab Board Setup

Phase 2 (Parallel Group A)
  2.1 [Dev-1] MatrixClient Extensions --+
  2.2 [Dev-2] SpaceManager Class      --+-- all depend on 1.1
  2.3 [QA]    Test Scaffolding        --+

Phase 3 (Sequential + Group B)
  3.1 [Dev] Daemon Integration     <-- depends on 2.1, 2.2
  3.2 [Dev] Gateway/CLI            <-- depends on 2.2
  3.3 [Dev] Config + Test Updates  <-- depends on 3.1
  3.4 [QA]  Complete Tests         <-- depends on 2.1, 2.2, 3.1, 3.2, 3.3

Phase 4 (Parallel Group C + Sequential)
  4.1 [QA]  CI Pipeline --+-- depend on 3.4
  4.2 [Dev] Deploy      --+
  4.3 [QA]  Acceptance  <-- depends on 4.1, 4.2
  4.4 [PM]  Release     <-- depends on 4.3
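To check the graph for cycles and see which tasks can run side by side, the dependencies can be fed to the standard-library graphlib module (Python 3.9+). This is only a sketch: the Phase 1 entries assume a strict 1.1 -> 1.2 -> 1.3 chain, which is one reading of "Sequential", and parallel_waves is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# task -> set of tasks it depends on, transcribed from the graph above.
DEPENDS_ON = {
    "1.1": set(),
    "1.2": {"1.1"},
    "1.3": {"1.2"},   # assumption: Phase 1 is a strict chain
    "2.1": {"1.1"},
    "2.2": {"1.1"},
    "2.3": {"1.1"},
    "3.1": {"2.1", "2.2"},
    "3.2": {"2.2"},
    "3.3": {"3.1"},
    "3.4": {"2.1", "2.2", "3.1", "3.2", "3.3"},
    "4.1": {"3.4"},
    "4.2": {"3.4"},
    "4.3": {"4.1", "4.2"},
    "4.4": {"4.3"},
}


def parallel_waves(graph):
    """Group tasks into waves; every task in a wave has all dependencies done."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()              # raises graphlib.CycleError on a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


for number, wave in enumerate(parallel_waves(DEPENDS_ON), start=1):
    print(number, wave)
```

The waves reflect dependencies only; they ignore who is available, so the real schedule still follows the phase boundaries in the timeline.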

Agent Utilization (48h total effort)

Role        Phases                       Total
PM          P1, P4                       3h
Architect   P1                           2h
Security    P1                           2h
Developer   P2 (14h), P3 (10h), P4       27h
QA          P2 (4h), P3 (6h), P4 (4h)    14h

Risk Register (13 items)

ID    Risk                          Prob     Impact     Mitigation
R1    429 rate limit storms         Medium   High       Startup jitter + _request retry
R2    Space creation race (409)     High     Medium     Specific 409 handling, resolve fallback
R3    Thread safety bugs            Medium   High       Lock + TTL + concurrent tests
R4    Config test breakage          Medium   Medium     Update tests in Task 3.3
R5    Orphaned rooms                Medium   Medium     Idempotent registration + heartbeat
R6    Backward compat break         Low      High       v2 heartbeats to both rooms
R9    URL encoding bug              Low      Critical   Unit test verifies encoding
R11   Daemon routing complexity     Medium   Medium     Per-source tests, clear routing
R12   Permission misconfiguration   Low      High       Security review validates model

Peer Review Feedback (Incorporated)

Success Metrics

Metric                     Target
Tests Passing              ~81 (51 existing + ~30 new)
Test Coverage              ≥85% for new code
Element UI                 Space visible with nested agent rooms
Bidirectional Messaging    Element ↔ Agent works
Targeted Send              cchat send --to agent works
Backward Compatibility     cchat send, cchat who unchanged
Heartbeat Lifecycle        Online → offline → online
8 Hardening Requirements   All implemented and tested
Deployment Downtime        <10 minutes