11-Stage Pipeline Deep Dive

What happens at each stage, what artifacts are produced, and where the handoffs occur

End-to-End Flow

This sequence diagram shows how an issue flows through the complete pipeline, including the feedback loops when problems are detected.

sequenceDiagram
    participant H as Human
    participant GL as GitLab
    participant CI as CI/CD Pipeline
    participant A as AI Agents
    participant KB as Kanban Board

    H->>GL: Create issue
    GL->>CI: Trigger pipeline (ISSUE_IID)
    CI->>A: Stage 1: Triage (PM)
    A->>KB: Apply status::triage + type labels
    CI->>A: Stage 2: Clarification (PM)

    alt Issue is vague
        A->>GL: Post clarification questions
        A->>KB: Add needs-clarification label
        Note over H,GL: Pipeline pauses - awaiting human
        H->>GL: Answer questions in comment
        GL->>CI: Re-trigger pipeline
    end

    A->>KB: Update status::specification
    Note over H,CI: MANUAL GATE: Human triggers Stage 3
    CI->>A: Stage 3: Specification (Architect)
    A->>GL: Commit spec.md artifact
    CI->>A: Stage 4: Spec-Checklist (QA)

    alt Checklist fails
        A->>GL: Post failed checks
        Note over CI,A: Loop back to clarification
    end

    A->>KB: Add spec-approved label
    CI->>A: Stage 5: Planning (Architect)
    A->>GL: Commit plan.md artifact
    CI->>A: Stage 6: Tasks (Developer)
    A->>GL: Commit tasks.md artifact
    CI->>A: Stage 7: Analysis (QA)
    A->>GL: Commit analysis.md + dependency graph
    A->>KB: Add ready-for-implementation label

    Note over H,CI: MANUAL GATE: Human triggers Stage 8
    CI->>A: Stage 8: Implementation (Developer)
    A->>GL: Create branch + MR
    CI->>A: Stage 9: Security (Security)
    A->>GL: Post security findings
    CI->>A: Stage 10: Testing (QA)
    A->>GL: Post test-report.md

    Note over H,CI: MANUAL GATE: Human triggers Stage 11
    CI->>A: Stage 11: Deployment (Developer)

    alt Deployment fails
        A->>A: Auto-rollback
        A->>KB: Add deployment-failed label
    end

    A->>KB: Update status::done

Stage-by-Stage Walkthrough

Verify Runner

~1 min Auto

A preflight smoke test that runs on every push or web trigger and verifies that the runner is functional before any real work begins. It precedes the 11 numbered stages shown in the diagram above.

Checks that git, curl, and jq are installed
Prints runner ID, job ID, pipeline ID for debugging
Enforces the "No AI in Runners" constraint: if which claude succeeds, the stage fails with exit code 1
Does NOT require ISSUE_IID - it's the only stage that runs standalone
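The checks above can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual script: the function name verify_runner is hypothetical, and treating a missing required tool as a failure is an assumption (the source only states the exit code for the claude check).

```python
import shutil

REQUIRED_TOOLS = ("git", "curl", "jq")  # tools the stage expects on the runner
FORBIDDEN_TOOLS = ("claude",)           # "No AI in Runners" constraint


def verify_runner(which=shutil.which):
    """Return (exit_code, messages). `which` is injectable so the check can be tested."""
    messages = []
    exit_code = 0
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            # Assumption: a missing required tool fails the stage.
            messages.append(f"missing required tool: {tool}")
            exit_code = 1
    for tool in FORBIDDEN_TOOLS:
        if which(tool) is not None:
            # From the source: finding claude on the runner fails with exit code 1.
            messages.append(f"forbidden tool present: {tool}")
            exit_code = 1
    return exit_code, messages
```

Calling verify_runner() with no arguments inspects the real PATH; passing a fake lookup function makes the logic testable without touching the runner.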

Why enforce "no AI in runners"?

Runners should be deterministic and predictable. Embedding AI directly in runners would make pipeline behavior non-reproducible, so AI is invoked through external service calls instead.

Triage

PM Agent ~5 min Auto

The PM agent fetches the issue from GitLab's API and auto-classifies it based on keyword detection in the title and description.

Fetches issue via GET /projects/{id}/issues/{iid}
Scans title + description for keywords: bug|fix|error → type::bug, feature|add|new → type::feature, enhance|improve → type::enhancement
Applies scoped labels: status::triage + detected type:: label
Uses resource_group: issue-$ISSUE_IID to prevent concurrent processing
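A minimal sketch of the keyword classification, assuming case-insensitive substring matching and a first-match-wins order of bug, feature, enhancement. The source gives the patterns but not the precedence or the behavior when nothing matches, so both are assumptions here, and the function names are hypothetical.

```python
import re

# Patterns are taken from the stage description; the check order is an assumption.
TYPE_PATTERNS = [
    ("type::bug", re.compile(r"bug|fix|error", re.IGNORECASE)),
    ("type::feature", re.compile(r"feature|add|new", re.IGNORECASE)),
    ("type::enhancement", re.compile(r"enhance|improve", re.IGNORECASE)),
]


def classify_issue(title, description):
    """Return the first matching type:: label, or None if no keyword is found."""
    text = f"{title}\n{description or ''}"
    for label, pattern in TYPE_PATTERNS:
        if pattern.search(text):
            return label
    return None


def triage_labels(title, description):
    """Labels to apply at triage: status::triage plus the detected type, if any."""
    labels = ["status::triage"]
    detected = classify_issue(title, description)
    if detected:
        labels.append(detected)
    return labels
```

Because the patterns have no word boundaries, a title containing "address" would match "add". Whether the real stage behaves this way is not stated in the source.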

Labels Applied

status::triage, type::feature | type::bug | type::enhancement

Clarification

PM Agent ~10 min Auto

The PM agent evaluates whether the issue has enough detail to write a specification. If not, it generates structured clarification questions and posts them as a comment.

Invokes Claude with a prompt containing issue title, description, and labels
Claude analyzes for: scope clarity, success criteria, technical dependencies, edge cases, priority
If well-specified: responds WELL_SPECIFIED, pipeline continues
If vague: posts 3-5 numbered questions as an issue comment
Adds/removes needs-clarification label based on result
Updates board: status::triage → status::clarification
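The decision logic above might look roughly like the following. The WELL_SPECIFIED sentinel and the label names come from the source; the reply format (questions as "1. ..." or "1) ..." lines) and both function names are assumptions made for illustration.

```python
import re


def interpret_clarification(response):
    """Return (needs_clarification, questions) parsed from the model's reply."""
    if "WELL_SPECIFIED" in response:
        return False, []
    # Assumption: questions arrive as numbered lines such as "1. ..." or "2) ...".
    questions = [
        match.group(1).strip()
        for match in re.finditer(r"^\s*\d+[.)]\s+(.+)$", response, re.MULTILINE)
    ]
    return True, questions


def updated_labels(labels, needs_clarification):
    """Move the board from status::triage to status::clarification and
    add or remove needs-clarification based on the result."""
    result = [
        label for label in labels
        if label not in ("needs-clarification", "status::triage")
    ]
    result.append("status::clarification")
    if needs_clarification:
        result.append("needs-clarification")
    return result
```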

The human feedback loop

When clarification is needed, the pipeline effectively pauses. A human must answer the questions by commenting on the issue, then re-trigger the pipeline. The clarification stage will re-evaluate and proceed if answers are sufficient.

Specification

Architect ~15 min Manual Gate

The first manual gate. A human must click "play" on this stage in GitLab. The Architect agent then generates a formal specification document.

Creates directory structure: specs/issue-{IID}/ with checklists/ and contracts/ subdirs
Uses spec-template.md as a starting point
Generates spec.md focusing on WHAT and WHY (not technical HOW)
Includes: overview, requirements, user stories, acceptance criteria, constraints, security considerations
Artifact retained for 30 days
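As an illustration of the directory layout described above, the sketch below creates specs/issue-{IID}/ with its two subdirectories and seeds spec.md from a template. The function names are hypothetical, and copying the template verbatim before generation is an assumption about how "starting point" is implemented.

```python
from pathlib import Path


def create_spec_dirs(repo_root, issue_iid):
    """Create specs/issue-{IID}/ with checklists/ and contracts/ subdirectories."""
    spec_dir = Path(repo_root) / "specs" / f"issue-{issue_iid}"
    for sub in ("checklists", "contracts"):
        (spec_dir / sub).mkdir(parents=True, exist_ok=True)
    return spec_dir


def seed_spec(spec_dir, template_path):
    """Copy the template to spec.md unless a spec already exists (assumed behavior)."""
    spec_path = Path(spec_dir) / "spec.md"
    if not spec_path.exists():
        content = Path(template_path).read_text(encoding="utf-8")
        spec_path.write_text(content, encoding="utf-8")
    return spec_path
```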

Artifacts Produced

specs/issue-{IID}/spec.md — Formal specification (WHAT + WHY)

Spec-Checklist

QA Agent ~5 min Auto

A quality gate that works like "unit tests for English": the QA agent validates the spec against a 6-point checklist before planning is allowed to begin.

Check 1: Overview section present
Check 2: Requirements section present
Check 3: User Stories section present
Check 4: Acceptance criteria defined
Check 5: Security considerations included
Check 6: No TBD items remaining (warning only; does not block)

If every required check passes, the spec-approved label is added. If any required check fails, the failed checks are posted to the issue and the pipeline loops back to clarification.
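A minimal sketch of the six checks, assuming the spec is Markdown and that each required section is detected by its heading. The source does not say how sections are detected, so the heading titles and the regex are assumptions, and run_checklist is a hypothetical name.

```python
import re

# Assumed heading titles for the five blocking checks.
REQUIRED_SECTIONS = [
    "Overview",
    "Requirements",
    "User Stories",
    "Acceptance Criteria",
    "Security Considerations",
]


def has_heading(spec_text, title):
    """True if a Markdown heading starting with `title` is present."""
    pattern = rf"^#+\s*{re.escape(title)}\b"
    return re.search(pattern, spec_text, re.IGNORECASE | re.MULTILINE) is not None


def run_checklist(spec_text):
    """Return (approved, per-check results, warnings)."""
    results = {title: has_heading(spec_text, title) for title in REQUIRED_SECTIONS}
    warnings = []
    if re.search(r"\bTBD\b", spec_text):
        # Check 6 is warning-only and never blocks approval.
        warnings.append("TBD items remaining")
    approved = all(results.values())
    return approved, results, warnings
```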

Artifacts Produced

specs/issue-{IID}/spec-checklist.md — Validation report with pass/fail for each check

Planning

Architect ~15 min Auto

The Architect now turns WHAT into HOW: it reads the approved spec.md and produces a detailed implementation plan.

Reads spec.md from the previous stage's artifacts
Generates plan.md with 3+ implementation tasks
Each task includes: assigned agent, effort estimate, dependencies, files to modify, step-by-step instructions
May include data model changes, API contracts, architecture decisions

Artifacts Produced

specs/issue-{IID}/plan.md — Implementation plan with tasks, dependencies, file changes

Task Generation

Developer ~10 min Auto

The Developer agent breaks the plan into 3-5 atomic, implementable tasks. Each task is small enough to complete in one session.

Reads plan.md from previous stage
Creates task table: ID, name, agent, status, dependencies
Each task has detailed description with acceptance criteria
Parallelizable tasks are marked with [P]
Adds ready-for-implementation label when complete

Artifacts Produced

specs/issue-{IID}/tasks.md — Task breakdown table with dependencies and criteria

Task Analysis

QA Agent ~10 min Auto

Another quality gate. The QA agent validates that the tasks are complete, properly scoped, and internally consistent.

Validates task count (3-5 expected)
Checks each task has: clear scope, acceptance criteria, dependencies
Generates ASCII dependency graph showing task ordering
Cross-references spec ↔ plan ↔ tasks for coverage gaps
Risk assessment: identifies high-risk tasks and missing edge cases
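The task-count check and dependency ordering can be sketched with the standard library's graphlib module (Python 3.9+). The input shape (a dict mapping each task ID to the IDs it depends on), the text rendering, and the function names are assumptions for illustration; the real stage's graph format is not shown in the source.

```python
from graphlib import CycleError, TopologicalSorter


def analyze_tasks(dependencies):
    """dependencies maps task ID -> list of task IDs it depends on.
    Returns (execution_order, problems)."""
    problems = []
    if not 3 <= len(dependencies) <= 5:
        problems.append(f"expected 3-5 tasks, found {len(dependencies)}")
    for task, deps in dependencies.items():
        for dep in deps:
            if dep not in dependencies:
                problems.append(f"{task} depends on unknown task {dep}")
    order = []
    try:
        # static_order() yields each task after all of its dependencies.
        order = list(TopologicalSorter(dependencies).static_order())
    except CycleError:
        problems.append("dependency cycle detected")
    return order, problems


def render_graph(dependencies):
    """Render one line per task listing what it waits on (a simple text view)."""
    lines = []
    for task in sorted(dependencies):
        deps = dependencies[task]
        if deps:
            lines.append(f"{task} <- {', '.join(deps)}")
        else:
            lines.append(f"{task} (no dependencies)")
    return "\n".join(lines)
```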

Artifacts Produced

specs/issue-{IID}/analysis.md — Task validation, dependency graph, risk assessment

Implementation

Developer ~30 min Manual Gate

The second manual gate. A human reviews the spec, plan, and tasks, then triggers implementation. The Developer agent then writes the actual code.

Creates feature branch: issue-{IID}-implementation
Invokes Claude with full context: spec + plan + tasks + analysis
Claude generates code changes and implementation suggestions
Saves detailed output to implementation.md
Commits and pushes the feature branch
Updates board label: status::implementation

Artifacts Produced

specs/issue-{IID}/implementation.md + feature branch + merge request

Security Review

Security ~10 min Auto

The Security agent scans the codebase and deployment configuration for vulnerabilities. Its behavior depends on which group the project belongs to.

Check 1: Hardcoded credentials (regex: password|secret|token|api_key)
Check 2: Privileged containers (privileged: true)
Check 3: Secrets path compliance (should use $HOME/projects/secrets/)
Check 4: eval() usage (code injection risk)
Check 5: Network exposure (open ports, public endpoints)

Administrators: Blocking mode

For the administrators group, security findings are blocking: the pipeline fails with exit code 1 if any critical issue is found, because infrastructure projects need higher security standards.

Developers: Advisory mode

For the developers group, security findings are advisory - the pipeline continues with allow_failure: true. Warnings are posted but don't block deployment.
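A rough sketch of the pattern-based checks and the group-dependent outcome follows. Only three of the five checks are shown, using the patterns quoted above. Treating every finding as critical, the line-by-line scan, and the names scan and blocks_pipeline are assumptions, not the agent's actual implementation.

```python
import re

# Patterns follow the checks listed above; the exact regexes are assumptions.
CHECKS = {
    "hardcoded-credentials": re.compile(r"password|secret|token|api_key", re.IGNORECASE),
    "privileged-container": re.compile(r"privileged:\s*true"),
    "eval-usage": re.compile(r"\beval\("),
}


def scan(files):
    """files maps path -> file text. Returns a list of (path, line_no, check) findings."""
    findings = []
    for path, text in files.items():
        for line_no, line in enumerate(text.splitlines(), start=1):
            for name, pattern in CHECKS.items():
                if pattern.search(line):
                    findings.append((path, line_no, name))
    return findings


def blocks_pipeline(group, findings):
    """Blocking for the administrators group, advisory for everyone else.
    Assumption: every finding counts as critical."""
    return group == "administrators" and bool(findings)
```

A naive credential regex like this one also flags harmless lines (for example, a variable named token_count), so a real scan would likely need an allowlist or further filtering.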

Testing

QA Agent ~20 min Auto

The QA agent auto-detects the project type and runs the appropriate test suite. No configuration is needed; the framework is inferred from marker files in the repository.

Node.js: detects package.json → runs npm test
Python: detects pytest.ini or tests/ → runs pytest tests/
Go: detects go.mod → runs go test ./...
Rust: detects Cargo.toml → runs cargo test
Shell: detects tests/*.sh → runs each test script

Generates test-report.md with results. Exits with code 1 if tests fail.
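The detection rules can be sketched as a lookup over marker files. The source does not say which rule wins when several match, and a bare tests/ directory is ambiguous between the Python and Shell rules. This sketch therefore checks tests/*.sh before falling back to pytest, which is an assumption, as are running scripts with sh and the function name.

```python
from pathlib import Path


def detect_test_commands(project_dir):
    """Return a list of argv lists to run, or [] if no project type is detected."""
    root = Path(project_dir)
    tests = root / "tests"
    if (root / "package.json").is_file():
        return [["npm", "test"]]
    if (root / "pytest.ini").is_file():
        return [["pytest", "tests/"]]
    if (root / "go.mod").is_file():
        return [["go", "test", "./..."]]
    if (root / "Cargo.toml").is_file():
        return [["cargo", "test"]]
    # Assumption: shell scripts take precedence over the bare tests/ fallback.
    scripts = sorted(tests.glob("*.sh")) if tests.is_dir() else []
    if scripts:
        return [["sh", str(script)] for script in scripts]
    if tests.is_dir():
        return [["pytest", "tests/"]]
    return []
```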

Artifacts Produced

specs/issue-{IID}/test-report.md — Test results with pass/fail counts

Deployment

Developer ~15 min Manual Gate

The final manual gate. A human confirms deployment after reviewing the security findings and test results.

Pre-deploy snapshot: Records the current commit hash, timestamp, and pipeline ID in the rollback/ directory
Execute: Runs deploy.sh or docker compose up -d
Health check: Waits 10 seconds, verifies containers are running
On success: Adds status::done, removes status::deployment
On failure: Automatically triggers rollback, adds deployment-failed label
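The control flow of the steps above, reduced to a sketch with the deploy, health-check, and rollback actions injected as callables. The function names, the snapshot field names, and treating an exception from the deploy step as a failure are assumptions; the label changes and the 10-second wait come from the source.

```python
import time


def make_snapshot(commit, pipeline_id, timestamp):
    """Data recorded before deploying (field names are assumptions)."""
    return {"commit": commit, "pipeline_id": pipeline_id, "timestamp": timestamp}


def deploy_with_rollback(deploy, is_healthy, rollback, wait_seconds=10, sleep=time.sleep):
    """Run deploy, wait, then health-check; roll back on failure.
    Returns the label changes to apply to the issue."""
    try:
        deploy()
        sleep(wait_seconds)  # give containers time to start
        healthy = is_healthy()
    except Exception:
        # Assumption: any error during deploy or health check counts as a failure.
        healthy = False
    if healthy:
        return {"add": ["status::done"], "remove": ["status::deployment"]}
    rollback()
    return {"add": ["deployment-failed"], "remove": []}
```

Injecting sleep keeps the sketch testable without actually waiting 10 seconds.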

Automatic Rollback

If deployment fails, the rollback script (scripts/rollback.sh) executes automatically:

Stops containers: docker compose down --remove-orphans
Checks out previous commit from the pre-deploy snapshot
Runs migrations/rollback.sql if it exists
Restarts containers: docker compose up -d
Waits 10s and verifies rollback succeeded
Logs everything to /tmp/rollback_*.log

Matrix Notifications

The pipeline sends color-coded notifications to #cicd-notifications:ai-servicers.com via a Matrix bot at key moments.

Pipeline Success (main branch)  →  GREEN   notification
Pipeline Failure (main branch)  →  RED     notification
Manual Gate Reached             →  ORANGE  notification

Notifications are sent using scripts/notify-matrix.sh, which loads the bot token from $HOME/projects/secrets/matrix.env and sends an HTML-formatted m.room.message.
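For illustration, the content of an HTML-formatted m.room.message could be built as follows. The msgtype, format, and formatted_body fields follow the Matrix client-server specification; the hex color values, the event keys, and the function name are assumptions, and the actual scripts/notify-matrix.sh may build its payload differently.

```python
import html

# Assumed hex values for the three notification colors.
COLORS = {
    "success": "#2e7d32",      # GREEN
    "failure": "#c62828",      # RED
    "manual_gate": "#ef6c00",  # ORANGE
}


def build_message(event, text):
    """Return the JSON-serializable content of an m.room.message event."""
    color = COLORS[event]
    safe_text = html.escape(text)  # avoid injecting markup from pipeline data
    return {
        "msgtype": "m.text",
        "body": text,  # plain-text fallback for clients that ignore HTML
        "format": "org.matrix.custom.html",
        "formatted_body": f'<font color="{color}">{safe_text}</font>',
    }
```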

Retry & Error Handling

The pipeline has built-in resilience for transient failures.

Retry policy: Max 2 retries on runner_system_failure or stuck_or_timeout_failure
GIT_STRATEGY: fetch (incremental, not full clone each time)
Default image: registry.gitlab.ai-servicers.com/administrators/cicd/cicd-runner:latest
Runner tags: cicd, docker (ensures jobs go to the right runner)