11-Stage Pipeline Deep Dive

What happens at each stage, what artifacts are produced, and where the handoffs occur

End-to-End Flow

This sequence diagram shows how an issue flows through the complete pipeline, including the feedback loops when problems are detected.

sequenceDiagram
    participant H as Human
    participant GL as GitLab
    participant CI as CI/CD Pipeline
    participant A as AI Agents
    participant KB as Kanban Board

    H->>GL: Create issue
    GL->>CI: Trigger pipeline (ISSUE_IID)
    CI->>A: Stage 1: Triage (PM)
    A->>KB: Apply status::triage + type labels
    CI->>A: Stage 2: Clarification (PM)

    alt Issue is vague
        A->>GL: Post clarification questions
        A->>KB: Add needs-clarification label
        Note over H,GL: Pipeline pauses - awaiting human
        H->>GL: Answer questions in comment
        GL->>CI: Re-trigger pipeline
    end

    A->>KB: Update status::specification
    Note over H,CI: MANUAL GATE: Human triggers Stage 3
    CI->>A: Stage 3: Specification (Architect)
    A->>GL: Commit spec.md artifact
    CI->>A: Stage 4: Spec-Checklist (QA)

    alt Checklist fails
        A->>GL: Post failed checks
        Note over CI,A: Loop back to clarification
    end

    A->>KB: Add spec-approved label
    CI->>A: Stage 5: Planning (Architect)
    A->>GL: Commit plan.md artifact
    CI->>A: Stage 6: Tasks (Developer)
    A->>GL: Commit tasks.md artifact
    CI->>A: Stage 7: Analysis (QA)
    A->>GL: Commit analysis.md + dependency graph
    A->>KB: Add ready-for-implementation label

    Note over H,CI: MANUAL GATE: Human triggers Stage 8
    CI->>A: Stage 8: Implementation (Developer)
    A->>GL: Create branch + MR
    CI->>A: Stage 9: Security (Security)
    A->>GL: Post security findings
    CI->>A: Stage 10: Testing (QA)
    A->>GL: Post test-report.md

    Note over H,CI: MANUAL GATE: Human triggers Stage 11
    CI->>A: Stage 11: Deployment (Developer)

    alt Deployment fails
        A->>A: Auto-rollback
        A->>KB: Add deployment-failed label
    end

    A->>KB: Update status::done
            

Stage-by-Stage Walkthrough

Stage 0: Verify Runner
~1 min · Auto

A smoke test that runs on every push or web trigger. Verifies the runner is functional before any real work begins.

  • Checks that git, curl, and jq are installed
  • Prints runner ID, job ID, pipeline ID for debugging
  • Enforces the "No AI in Runners" constraint: if the command which claude succeeds (a claude binary is on the PATH), the stage fails with exit code 1
  • Does NOT require ISSUE_IID - it's the only stage that runs standalone
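
The checks above can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual script: the function name verify_runner and the injectable which parameter are hypothetical, and treating a missing required tool as a failure is an assumption (the text only states the failure case for claude).

```python
import shutil

REQUIRED_TOOLS = ("git", "curl", "jq")   # tools the stage checks for
FORBIDDEN_TOOLS = ("claude",)            # "No AI in Runners" constraint


def verify_runner(which=shutil.which):
    """Return (exit_code, problems) for the runner smoke test.

    `which` is injectable so the check can be exercised without
    touching the real PATH.
    """
    problems = []
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"missing required tool: {tool}")
    for tool in FORBIDDEN_TOOLS:
        if which(tool) is not None:
            problems.append(f"forbidden tool present: {tool}")
    return (1 if problems else 0), problems
```

Calling verify_runner() with no arguments inspects the real PATH; a job wrapper would print the problems and exit with the returned code.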

Why enforce "no AI in runners"?

Runners should be deterministic and predictable. Embedding AI directly in runners would make pipeline behavior non-reproducible. AI is invoked as external service calls instead.

Stage 1: Triage
PM Agent · ~5 min · Auto

The PM agent fetches the issue from GitLab's API and auto-classifies it based on keyword detection in the title and description.

  • Fetches issue via GET /projects/{id}/issues/{iid}
  • Scans title + description for keywords: bug|fix|error = bug, feature|add|new = feature, enhance|improve = enhancement
  • Applies scoped labels: status::triage + detected type:: label
  • Uses resource_group: issue-$ISSUE_IID to prevent concurrent processing
Labels Applied
status::triage, type::feature | type::bug | type::enhancement
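
The keyword detection can be illustrated as follows. This is a sketch under stated assumptions: the order in which the patterns are tried (bug first), plain substring matching, and applying no type:: label when nothing matches are all guesses, and the function names are hypothetical.

```python
import re

# Keyword patterns as listed above; the evaluation order is an assumption.
TYPE_PATTERNS = [
    ("bug", re.compile(r"bug|fix|error", re.IGNORECASE)),
    ("feature", re.compile(r"feature|add|new", re.IGNORECASE)),
    ("enhancement", re.compile(r"enhance|improve", re.IGNORECASE)),
]


def classify_issue(title, description):
    """Return the detected type name, or None when no keyword matches."""
    text = f"{title}\n{description or ''}"
    for type_name, pattern in TYPE_PATTERNS:
        if pattern.search(text):
            return type_name
    return None


def triage_labels(title, description):
    """Build the scoped labels the triage stage would apply."""
    labels = ["status::triage"]
    detected = classify_issue(title, description)
    if detected:
        labels.append(f"type::{detected}")
    return labels
```

For example, triage_labels("Add dark mode", "") yields status::triage plus type::feature. Because the patterns are substrings, a title such as "Improve error message" would be classified as a bug under this ordering.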

Stage 2: Clarification
PM Agent · ~10 min · Auto

The PM agent evaluates whether the issue has enough detail to write a specification. If not, it generates structured clarification questions and posts them as a comment.

  • Invokes Claude with a prompt containing issue title, description, and labels
  • Claude analyzes for: scope clarity, success criteria, technical dependencies, edge cases, priority
  • If well-specified: responds WELL_SPECIFIED, pipeline continues
  • If vague: posts 3-5 numbered questions as an issue comment
  • Adds/removes needs-clarification label based on result
  • Updates board: status::triage → status::clarification

The human feedback loop

When clarification is needed, the pipeline effectively pauses. A human must answer the questions by commenting on the issue, then re-trigger the pipeline. The clarification stage will re-evaluate and proceed if answers are sufficient.
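
The decision step of this stage could look roughly like the sketch below. It assumes the model's reply either begins with WELL_SPECIFIED or contains numbered lines such as "1. ..."; the exact reply format and the function name parse_clarification are assumptions made for illustration.

```python
import re

WELL_SPECIFIED = "WELL_SPECIFIED"


def parse_clarification(response):
    """Interpret the model's reply for the clarification stage.

    Returns the decision, any numbered questions found, and the
    label changes to apply.
    """
    text = response.strip()
    if text.startswith(WELL_SPECIFIED):
        return {"well_specified": True, "questions": [],
                "add_labels": [], "remove_labels": ["needs-clarification"]}
    # Collect lines such as "1. ..." or "2) ..." as questions.
    questions = re.findall(r"^[ \t]*\d+[.)][ \t]+(.+?)[ \t]*$", text,
                           flags=re.MULTILINE)
    return {"well_specified": False, "questions": questions,
            "add_labels": ["needs-clarification"], "remove_labels": []}
```

On a re-triggered run the same function would be applied to the new reply, so the needs-clarification label is removed once the model reports WELL_SPECIFIED.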

Stage 3: Specification
Architect · ~15 min · Manual Gate

The first manual gate. A human must click "play" on this stage in GitLab. The Architect agent then generates a formal specification document.

  • Creates directory structure: specs/issue-{IID}/ with checklists/ and contracts/ subdirs
  • Uses spec-template.md as a starting point
  • Generates spec.md focusing on WHAT and WHY (not technical HOW)
  • Includes: overview, requirements, user stories, acceptance criteria, constraints, security considerations
  • Artifact retained for 30 days
Artifacts Produced
specs/issue-{IID}/spec.md — Formal specification (WHAT + WHY)
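
The directory layout can be sketched as below. This is an illustration only: the function name scaffold_spec_dir is hypothetical, and looking for spec-template.md at the repository root is an assumption (the text does not say where the template lives).

```python
from pathlib import Path
import shutil


def scaffold_spec_dir(repo_root, issue_iid, template="spec-template.md"):
    """Create specs/issue-{IID}/ with its subdirectories and seed spec.md.

    The template location (repo root) is an assumption for this sketch.
    """
    spec_dir = Path(repo_root) / "specs" / f"issue-{issue_iid}"
    for sub in ("checklists", "contracts"):
        (spec_dir / sub).mkdir(parents=True, exist_ok=True)
    spec_file = spec_dir / "spec.md"
    template_path = Path(repo_root) / template
    # Seed spec.md from the template only if it does not exist yet.
    if not spec_file.exists() and template_path.is_file():
        shutil.copyfile(template_path, spec_file)
    return spec_file
```

The returned path is where the Architect agent's generated specification would then be written.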

Stage 4: Spec-Checklist
QA Agent · ~5 min · Auto

Quality gate: "unit tests for English." The QA agent validates the spec against a 6-point checklist (five blocking checks and one warning) before allowing planning to begin.

  • Check 1: Overview section present
  • Check 2: Requirements section present
  • Check 3: User Stories section present
  • Check 4: Acceptance criteria defined
  • Check 5: Security considerations included
  • Check 6: TBD items remaining (warning only, doesn't block)

If all checks pass, the spec-approved label is added. If any required check fails, the pipeline loops back for revision.
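
A checklist of this kind might be implemented along these lines. The matching rules are assumptions: the sketch looks for Markdown headings for the first three checks and for a case-insensitive phrase for the next two, which may differ from how the QA agent actually evaluates a spec.

```python
import re

# (check name, pattern that must appear in the spec); matching rules are assumptions.
REQUIRED_CHECKS = [
    ("Overview section present", r"^#+\s*Overview\b"),
    ("Requirements section present", r"^#+\s*Requirements\b"),
    ("User Stories section present", r"^#+\s*User Stories\b"),
    ("Acceptance criteria defined", r"acceptance criteria"),
    ("Security considerations included", r"security considerations"),
]


def run_spec_checklist(spec_text):
    """Return (approved, results, warnings) for a spec document."""
    flags = re.IGNORECASE | re.MULTILINE
    results = [(name, re.search(pattern, spec_text, flags) is not None)
               for name, pattern in REQUIRED_CHECKS]
    warnings = []
    tbd_count = len(re.findall(r"\bTBD\b", spec_text))
    if tbd_count:
        # Check 6 only warns; it never blocks approval.
        warnings.append(f"{tbd_count} TBD item(s) remaining")
    approved = all(passed for _, passed in results)
    return approved, results, warnings
```

When approved is true the stage would add the spec-approved label; the per-check results map directly onto the pass/fail rows of the validation report.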

Artifacts Produced
specs/issue-{IID}/spec-checklist.md — Validation report with pass/fail for each check

Stage 5: Planning
Architect · ~15 min · Auto

Now the Architect transforms WHAT into HOW. It reads the approved spec.md and produces a detailed implementation plan.

  • Reads spec.md from the previous stage's artifacts
  • Generates plan.md with 3+ implementation tasks
  • Each task includes: assigned agent, effort estimate, dependencies, files to modify, step-by-step instructions
  • May include data model changes, API contracts, architecture decisions
Artifacts Produced
specs/issue-{IID}/plan.md — Implementation plan with tasks, dependencies, file changes

Stage 6: Task Generation
Developer · ~10 min · Auto

The Developer agent breaks the plan into 3-5 atomic, implementable tasks. Each task is small enough to complete in one session.

  • Reads plan.md from previous stage
  • Creates task table: ID, name, agent, status, dependencies
  • Each task has detailed description with acceptance criteria
  • Parallelizable tasks are marked with [P]
  • Adds ready-for-implementation label when complete
Artifacts Produced
specs/issue-{IID}/tasks.md — Task breakdown table with dependencies and criteria

Stage 7: Task Analysis
QA Agent · ~10 min · Auto

Another quality gate. The QA agent validates that the tasks are complete, properly scoped, and internally consistent.

  • Validates task count (3-5 expected)
  • Checks each task has: clear scope, acceptance criteria, dependencies
  • Generates ASCII dependency graph showing task ordering
  • Cross-references spec ↔ plan ↔ tasks for coverage gaps
  • Risk assessment: identifies high-risk tasks and missing edge cases
Artifacts Produced
specs/issue-{IID}/analysis.md — Task validation, dependency graph, risk assessment
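
The count check and dependency ordering can be sketched with the standard library's graphlib module. The input shape (a dict mapping each task ID to the IDs it depends on), the function names, and the graph rendering format are all assumptions made for illustration.

```python
from graphlib import CycleError, TopologicalSorter


def analyze_tasks(dependencies):
    """Validate a task set and return (problems, ordered_ids).

    `dependencies` maps each task ID to the IDs it depends on.
    """
    problems = []
    if not 3 <= len(dependencies) <= 5:
        problems.append(f"expected 3-5 tasks, found {len(dependencies)}")
    for task, deps in dependencies.items():
        for dep in deps:
            if dep not in dependencies:
                problems.append(f"{task} depends on unknown task {dep}")
    try:
        # Keep only known tasks; the sorter also yields unknown dependencies.
        order = [t for t in TopologicalSorter(dependencies).static_order()
                 if t in dependencies]
    except CycleError:
        problems.append("dependency cycle detected")
        order = []
    return problems, order


def render_graph(dependencies):
    """Render one 'dependency --> task' line per edge as a plain ASCII graph."""
    lines = []
    for task in sorted(dependencies):
        deps = sorted(dependencies[task])
        if not deps:
            lines.append(f"{task} (no dependencies)")
        for dep in deps:
            lines.append(f"{dep} --> {task}")
    return "\n".join(lines)
```

A topological order gives one valid execution sequence; tasks that share the same set of completed prerequisites are the ones that could be marked [P] for parallel work.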

Stage 8: Implementation
Developer · ~30 min · Manual Gate

The second manual gate. A human reviews the spec, plan, and tasks, then triggers implementation. The Developer agent writes the actual code.

  • Creates feature branch: issue-{IID}-implementation
  • Invokes Claude with full context: spec + plan + tasks + analysis
  • Claude generates code changes and implementation suggestions
  • Saves detailed output to implementation.md
  • Commits and pushes the feature branch
  • Updates board label: status::implementation
Artifacts Produced
specs/issue-{IID}/implementation.md + feature branch + merge request

Stage 9: Security Review
Security · ~10 min · Auto

The Security agent scans the codebase and deployment configuration for vulnerabilities. Findings are blocking for the administrators group and advisory for the developers group.

  • Check 1: Hardcoded credentials (regex: password|secret|token|api_key)
  • Check 2: Privileged containers (privileged: true)
  • Check 3: Secrets path compliance (should use $HOME/projects/secrets/)
  • Check 4: eval() usage (code injection risk)
  • Check 5: Network exposure (open ports, public endpoints)

Administrators: Blocking mode

For the administrators group, security findings are blocking - the pipeline fails with exit code 1 if any critical issue is found. Infrastructure projects need higher security standards.

Developers: Advisory mode

For the developers group, security findings are advisory - the pipeline continues with allow_failure: true. Warnings are posted but don't block deployment.
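
A simplified version of the scan and the two modes might look like this. It covers only three of the five checks, works line by line, and treats every finding as critical; those simplifications, the credential pattern's extra requirement of an assignment after the keyword, and the function names are assumptions of this sketch rather than the Security agent's actual rules.

```python
import re

# Three of the five checks, expressed as line-level patterns (an approximation).
CHECKS = [
    ("hardcoded-credentials",
     re.compile(r"(password|secret|token|api_key)\s*[:=]\s*\S+", re.IGNORECASE)),
    ("privileged-container", re.compile(r"privileged:\s*true")),
    ("eval-usage", re.compile(r"\beval\s*\(")),
]


def scan_text(name, text):
    """Return findings as (file name, line number, check id) tuples."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for check_id, pattern in CHECKS:
            if pattern.search(line):
                findings.append((name, number, check_id))
    return findings


def blocks_pipeline(findings, group):
    """Findings block only for the administrators group; otherwise advisory."""
    return bool(findings) and group == "administrators"
```

In blocking mode the job would exit with code 1 when blocks_pipeline returns true; in advisory mode the same findings would be posted as warnings and the pipeline would continue.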

Stage 10: Testing
QA Agent · ~20 min · Auto

The QA agent auto-detects the project type from marker files in the repository and runs the matching test suite. No per-project configuration is needed.

  • Node.js: detects package.json → runs npm test
  • Python: detects pytest.ini or tests/ → runs pytest tests/
  • Go: detects go.mod → runs go test ./...
  • Rust: detects Cargo.toml → runs cargo test
  • Shell: detects tests/*.sh → runs each test script

Generates test-report.md with results. Exits with code 1 if tests fail.
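
Marker-file detection of this kind can be sketched as follows. Two details are assumptions, not taken from the pipeline: the Python rule here requires pytest.ini or at least one tests/*.py file (so that a tests/ directory holding only shell scripts still reaches the shell rule), and shell scripts are run with sh. The function name is hypothetical.

```python
from pathlib import Path


def detect_test_commands(project_dir):
    """Return the test commands for the first project type detected.

    Rules are tried in the order listed above; the tie-break between
    the Python and shell rules is an assumption of this sketch.
    """
    root = Path(project_dir)
    tests = root / "tests"
    has_tests_dir = tests.is_dir()
    if (root / "package.json").is_file():
        return [["npm", "test"]]
    if (root / "pytest.ini").is_file() or (has_tests_dir and any(tests.glob("*.py"))):
        return [["pytest", "tests/"]]
    if (root / "go.mod").is_file():
        return [["go", "test", "./..."]]
    if (root / "Cargo.toml").is_file():
        return [["cargo", "test"]]
    scripts = sorted(tests.glob("*.sh")) if has_tests_dir else []
    # One command per shell test script, in sorted order.
    return [["sh", script.relative_to(root).as_posix()] for script in scripts]
```

An empty list means no known project type was found; a caller would run each returned command and exit with code 1 if any of them fails.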

Artifacts Produced
specs/issue-{IID}/test-report.md — Test results with pass/fail counts

Stage 11: Deployment
Developer · ~15 min · Manual Gate

The final manual gate. A human confirms deployment after reviewing security findings and test results.

  • Pre-deploy snapshot: Records current commit hash, timestamp, pipeline ID to rollback/ directory
  • Execute: Runs deploy.sh or docker compose up -d
  • Health check: Waits 10 seconds, verifies containers are running
  • On success: Adds status::done, removes status::deployment
  • On failure: Automatically triggers rollback, adds deployment-failed label
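
The control flow of these steps can be sketched as below. The three callables (run_deploy, is_healthy, rollback) are hypothetical hooks standing in for deploy.sh, the container check, and scripts/rollback.sh; the snapshot is kept in memory here rather than written to the rollback/ directory.

```python
import time
from datetime import datetime, timezone


def deploy(commit, pipeline_id, run_deploy, is_healthy, rollback,
           wait_seconds=10, sleep=time.sleep):
    """Run snapshot -> deploy -> health check, rolling back on failure.

    run_deploy, is_healthy and rollback are caller-supplied callables
    (hypothetical hooks for this sketch).
    """
    snapshot = {
        "commit": commit,
        "pipeline_id": pipeline_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    try:
        run_deploy()
        sleep(wait_seconds)          # give containers time to start
        healthy = is_healthy()
    except Exception:
        healthy = False              # any error counts as a failed deployment
    if healthy:
        return {"snapshot": snapshot,
                "add_labels": ["status::done"],
                "remove_labels": ["status::deployment"]}
    rollback(snapshot)
    return {"snapshot": snapshot,
            "add_labels": ["deployment-failed"],
            "remove_labels": []}
```

Taking the snapshot before anything else runs is what makes the rollback possible: the failure path always has a known-good commit to return to.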

Automatic Rollback

If deployment fails, the rollback script (scripts/rollback.sh) executes automatically:

  • Stops containers: docker compose down --remove-orphans
  • Checks out previous commit from the pre-deploy snapshot
  • Runs migrations/rollback.sql if it exists
  • Restarts containers: docker compose up -d
  • Waits 10s and verifies rollback succeeded
  • Logs everything to /tmp/rollback_*.log
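
The rollback sequence can be written down as an ordered plan. This sketch only builds the list of steps and runs nothing; using git checkout for the "checks out previous commit" step is an assumption, and because the text does not say how migrations/rollback.sql is executed, that step is left as a named placeholder.

```python
from pathlib import Path


def rollback_plan(snapshot_commit, repo_root="."):
    """Return the ordered rollback steps as (kind, detail) tuples."""
    steps = [
        ("command", ["docker", "compose", "down", "--remove-orphans"]),
        ("command", ["git", "checkout", snapshot_commit]),  # assumed command
    ]
    migration = Path(repo_root) / "migrations" / "rollback.sql"
    if migration.is_file():
        # How the SQL file is executed is not stated; left as a named step.
        steps.append(("run-sql", "migrations/rollback.sql"))
    steps += [
        ("command", ["docker", "compose", "up", "-d"]),
        ("wait", 10),
        ("verify", "containers running"),
    ]
    return steps
```

Keeping the plan as data makes it easy to log each step before it runs, which matches the script's habit of logging everything.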

Matrix Notifications

The pipeline sends color-coded notifications to #cicd-notifications:ai-servicers.com via a Matrix bot at key moments.

Pipeline Success (main branch)  →  GREEN   notification
Pipeline Failure (main branch)  →  RED     notification
Manual Gate Reached             →  ORANGE  notification

Notifications are sent using scripts/notify-matrix.sh, which loads the bot token from $HOME/projects/secrets/matrix.env and sends an HTML-formatted m.room.message.
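
The message content for such a notification could be built as in the sketch below. The field names follow the Matrix m.room.message format for HTML-formatted text; the hex color values, the event keys, and the function name are placeholders chosen for illustration, since the text only names the colors.

```python
import html

# Hex values are placeholders; the source only names the colors.
EVENT_COLORS = {
    "success": "#2e7d32",       # GREEN
    "failure": "#c62828",       # RED
    "manual_gate": "#ef6c00",   # ORANGE
}


def build_notification(event, message):
    """Build an HTML-formatted m.room.message content dict."""
    color = EVENT_COLORS[event]
    safe = html.escape(message)  # avoid injecting markup into the HTML body
    return {
        "msgtype": "m.text",
        "body": message,  # plain-text fallback
        "format": "org.matrix.custom.html",
        "formatted_body": f'<font color="{color}">{safe}</font>',
    }
```

The resulting dict would be sent as the JSON body of the room send request, authenticated with the bot token loaded from the secrets file.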

Retry & Error Handling

The pipeline has built-in resilience for transient failures.