Parallel Workloads & Multi-Tenant

How multiple issues run concurrently, resource group isolation, group-specific runners, and scaling

Resource Groups: The Parallelism Model

GitLab's resource groups are the foundation of how this pipeline handles concurrent work. Every job in the pipeline declares:

resource_group: issue-$ISSUE_IID

This one line creates a powerful isolation model.

What resource groups do

Different issues = parallel

Issue #5 and Issue #12 have different resource groups (issue-5 vs issue-12), so their stages can run simultaneously, even on the same runner. Ten issues can be in flight at once, provided the runner's concurrent setting allows that many jobs.

Same issue = sequential

Issue #5's specification can't start while its clarification is still running. Jobs for the same issue queue up and execute one at a time. This prevents race conditions on labels and artifacts.
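
The two rules above (different issues run in parallel, jobs of one issue run one at a time) can be illustrated with a small scheduling simulation. This is only a sketch of the behaviour described here, not GitLab's scheduler: the simulate function, the stage durations, and the concurrent parameter (standing in for the runner's job limit) are all hypothetical.

```python
from collections import deque
import heapq


def simulate(issue_jobs, concurrent):
    """Simulate resource-group scheduling.

    issue_jobs: dict mapping issue IID -> ordered list of (stage, minutes).
    concurrent: how many jobs the runner executes at once (assumed >= 1).
    Returns a list of (issue_iid, stage, start, end) in start order.
    """
    queues = {iid: deque(jobs) for iid, jobs in issue_jobs.items()}
    running = []      # min-heap of (end_time, issue_iid)
    busy = set()      # issues whose resource group is currently held
    timeline = []
    now = 0
    while any(queues.values()) or running:
        # Start every job that is allowed to start at the current time.
        for iid in sorted(queues):
            if len(running) >= concurrent:
                break                      # runner is full
            if iid in busy or not queues[iid]:
                continue                   # group locked, or nothing queued
            stage, minutes = queues[iid].popleft()
            heapq.heappush(running, (now + minutes, iid))
            busy.add(iid)
            timeline.append((iid, stage, now, now + minutes))
        # Jump to the next completion and release that issue's group.
        end, iid = heapq.heappop(running)
        now = end
        busy.discard(iid)
    return timeline


if __name__ == "__main__":
    jobs = {
        5: [("triage", 2), ("clarify", 3)],
        12: [("triage", 2), ("spec", 4)],
    }
    for row in simulate(jobs, concurrent=2):
        print(row)
```

With two runner slots, both triage jobs start at minute 0, while each issue's second stage waits for its own first stage to finish. With a single slot the same jobs simply run back to back.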

Visual: How parallel execution works

Time →→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→

Issue #5:  [triage] [clarify] [WAIT: spec] [spec] [checklist] ...
                                      ^
                                    human gate

Issue #12: [triage] [clarify] [spec-ok] [checklist] [plan] [tasks] ...
           (well-specified, no clarification needed)

Issue #23: [triage] [needs-clarif] [waiting for human...]

Issue #41: [triage] [clarify] [spec] [checklist] [plan] [tasks] [analysis] ...

green = automatic    blue = manual-triggered    amber = waiting

Key insight: issues move at different speeds

A well-specified bug fix might fly through all 11 stages in 90 minutes. A vague feature request might sit in clarification for days waiting for human answers. The pipeline handles both cases naturally - fast issues don't wait for slow ones.

Scaling Characteristics

Parallel issues: N
Jobs per issue: 1
Runner instances: 1
Max retries: 2

| Dimension | Current | Bottleneck | Scaling Path |
|---|---|---|---|
| Issues in parallel | Multiple (limited by runner) | Runner CPU/memory | Add more runners |
| Same issue concurrency | 1 (resource group lock) | By design | N/A - intentional constraint |
| Runner instances | 1 | Single runner architecture | Register additional runners with same tags |
| API rate limits | ~60 req/min (GitLab) | GitLab API throttle | Request caching, batch operations |
| AI token limits | ~100k tokens/min | Anthropic API throttle | Stagger requests, use smaller models for triage |
| Artifact storage | 30-day retention | Disk space | S3 backend, shorter retention for non-critical |
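
One way to stay under an API throttle like the ones listed above is a client-side token bucket. The sketch below is an assumption about how staggering could be done, not the pipeline's actual code: the TokenBucket class and the burst capacity of 5 are hypothetical, and the 60 requests per minute figure is taken from the table. The clock is passed in explicitly so the behaviour is easy to reason about.

```python
class TokenBucket:
    """Client-side throttle: refill at a steady rate, allow short bursts."""

    def __init__(self, rate_per_min, capacity, now=0.0):
        self.rate = rate_per_min / 60.0   # tokens added per second
        self.capacity = capacity          # maximum burst size (hypothetical)
        self.tokens = float(capacity)     # start with a full bucket
        self.last = now                   # time of the last refill, in seconds

    def try_acquire(self, now):
        """Return True if a request may be sent at time `now` (seconds)."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate_per_min=60, capacity=5)
    print([bucket.try_acquire(now=0.0) for _ in range(6)])  # burst, then refusal
    print(bucket.try_acquire(now=1.0))                      # one token refilled
```

A caller that gets False would wait or queue the request rather than send it, which spreads bursts from many parallel issues over time.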

Multi-Tenant Architecture

The pipeline supports multiple GitLab groups (tenants) with shared tooling but isolated execution environments. Currently: administrators and developers.

SHARED: Pipeline Library (administrators/cicd)
+---------------------------------------------------------------+
|  .gitlab-ci.yml              (main pipeline definition)       |
|  .gitlab-ci/templates/       (base job templates)             |
|  .gitlab-ci/groups/          (group-specific overrides)       |
|  scripts/                    (utility scripts)                |
|  specs/templates/            (spec templates)                 |
+---------------------------------------------------------------+
        |                                    |
        | include:rules                      | include:rules
        | if: namespace == administrators    | if: namespace == developers
        v                                    v
ADMIN CONTEXT                           DEV CONTEXT
+----------------------------+   +----------------------------+
| Security: BLOCKING         |   | Security: ADVISORY         |
| Staging: Manual approval   |   | Staging: Auto-deploy       |
| Runner: protected + locked |   | Runner: not protected      |
| Token: GITLAB_TOKEN_ADMIN  |   | Token: GITLAB_TOKEN_DEV    |
| Use case: Infrastructure   |   | Use case: Applications     |
+----------------------------+   +----------------------------+
        |                                    |
        v                                    v
+----------------------------+   +----------------------------+
| admin-runner               |   | dev-runner                 |
| Tags: [administrators]     |   | Tags: [developers]         |
| Volumes:                   |   | Volumes:                   |
|   /opt/shared/skills:ro    |   |   /opt/shared/skills:ro    |
+----------------------------+   +----------------------------+

How Group Selection Works

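When a project includes the shared pipeline, GitLab evaluates include:rules at pipeline creation time, and the project's group (its namespace, $CI_PROJECT_NAMESPACE) decides which group config is loaded. The rules below test a $CI_GROUP variable; that is not one of GitLab's predefined variables, so it must be supplied as a project- or group-level CI/CD variable, and an unset value falls back to the administrators config.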

# In the shared .gitlab-ci.yml:
include:
  - local: '.gitlab-ci/groups/administrators.yml'
    rules:
      - if: $CI_GROUP == "administrators"
      - if: $CI_GROUP == null    # Default if not specified

  - local: '.gitlab-ci/groups/developers.yml'
    rules:
      - if: $CI_GROUP == "developers"
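
To make the selection logic concrete, here is a small model of how the two rule sets above resolve for different values of CI_GROUP. It is a sketch that mirrors the snippet, not GitLab's rule engine; INCLUDE_RULES and selected_includes are hypothetical names, and None stands for an unset variable.

```python
# Each entry: (file to include, CI_GROUP values that make a rule match).
# None models the `$CI_GROUP == null` rule, i.e. the variable is unset.
INCLUDE_RULES = [
    (".gitlab-ci/groups/administrators.yml", ["administrators", None]),
    (".gitlab-ci/groups/developers.yml", ["developers"]),
]


def selected_includes(ci_group):
    """Return the group files whose rules match the given CI_GROUP value."""
    return [path for path, accepted in INCLUDE_RULES if ci_group in accepted]


if __name__ == "__main__":
    for value in (None, "administrators", "developers", "contractors"):
        print(repr(value), "->", selected_includes(value))
```

Note the last case: a group with no matching rule loads no group config at all, which is why adding a tenant also means adding an include entry for it.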

Why not variable interpolation?

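You might expect include: 'groups/${CI_GROUP}.yml' to work. GitLab expands only a limited set of variables in include paths (project, group, and instance CI/CD variables plus some predefined ones); variables defined in the YAML itself are not available, and a path that expands to a missing file fails pipeline creation. The include:rules pattern with fixed paths avoids both problems and keeps the list of valid groups explicit.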

Group Configuration Comparison

| Aspect | Administrators | Developers |
|---|---|---|
| Security scans | Blocking (pipeline fails) | Advisory (pipeline continues) |
| Staging deploy | Manual approval required | Auto-deploy on success |
| Production deploy | Manual (protected branch) | Manual (protected branch) |
| Runner protection | Protected + Locked | Not protected |
| API token scope | Full access (admin) | Read-only (scoped) |
| Typical projects | Infrastructure, core services | Applications, user-facing |
| Error tolerance | Zero tolerance for security | Warnings acceptable |

Runner Isolation

Runners are isolated using three mechanisms working together. Tags alone are insufficient - a misconfigured runner could pick up jobs from the wrong group.

1. Tags

Jobs specify tags: [administrators] or tags: [developers]. Runners only pick up jobs with matching tags. Both runners have run_untagged = false.

2. Protected Status

Admin runner is protected: true - it only runs jobs on protected refs (protected branches such as main, or protected tags). Dev runner is unprotected and can run jobs on any branch.

3. Locked Status

Admin runner is locked: true - it can't be enabled for projects other than the ones it is already assigned to. Dev runner is unlocked for flexibility.
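
The way the three mechanisms combine can be sketched as a pair of checks. This is a simplified model under stated assumptions, not GitLab's matching code: the Runner and Job classes, the enabled_projects field, and the project paths are hypothetical, and tag matching is modelled as "every job tag must be present on the runner".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Runner:
    name: str
    tags: frozenset
    run_untagged: bool
    protected: bool
    locked: bool
    enabled_projects: frozenset   # hypothetical: projects the runner serves


@dataclass(frozen=True)
class Job:
    project: str
    tags: frozenset
    ref_is_protected: bool


def can_pick(runner, job):
    """True if the runner may execute the job under all three layers."""
    if job.project not in runner.enabled_projects:
        return False                       # runner is not assigned here
    if job.tags:
        if not job.tags <= runner.tags:    # layer 1: tags must all match
            return False
    elif not runner.run_untagged:
        return False                       # untagged jobs are refused
    if runner.protected and not job.ref_is_protected:
        return False                       # layer 2: protected refs only
    return True


def can_enable_for(runner, project):
    """Layer 3: a locked runner cannot be enabled for further projects."""
    return not runner.locked or project in runner.enabled_projects


if __name__ == "__main__":
    admin = Runner("admin-runner", frozenset({"administrators"}), False,
                   True, True, frozenset({"administrators/cicd"}))
    job = Job("administrators/cicd", frozenset({"administrators"}), False)
    print(can_pick(admin, job))  # feature-branch job on a protected runner
```

The example job carries the right tag and comes from the right project, yet is still refused because it runs on an unprotected ref, which is the point of layering the checks.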

Runner Configuration Comparison

admin-runner                          dev-runner
+------------------------------+   +------------------------------+
| name = "admin-runner"        |   | name = "dev-runner"          |
| executor = "docker"          |   | executor = "docker"          |
| run_untagged = false         |   | run_untagged = false         |
| locked = true                |   | locked = false               |
| protected = true             |   | protected = false            |
| tags = [administrators]      |   | tags = [developers]          |
|                              |   |                              |
| [docker]                     |   | [docker]                     |
|   privileged = false         |   |   privileged = false         |
|   volumes = [                |   |   volumes = [                |
|     docker.sock,             |   |     docker.sock,             |
|     /opt/shared/skills:ro    |   |     /opt/shared/skills:ro    |
|   ]                          |   |   ]                          |
+------------------------------+   +------------------------------+

Shared Skills Architecture

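Both runners and both users need access to the same Claude Code skills. A symlink that points at a host path does not resolve inside a Docker container unless the target itself is mounted, so symlinks alone cannot serve the runners. The solution: a shared directory mounted into the containers as a read-only volume, with symlinks used only on the host.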

flowchart TB
    subgraph "Host Filesystem"
        A["/opt/shared/claude-skills/"]
    end

    subgraph "Docker Runners"
        B["admin-runner container"]
        C["dev-runner container"]
    end

    subgraph "User Home Dirs"
        D["/home/administrator/.claude/skills/"]
        E["/home/websurfinmurf/.claude/skills/"]
    end

    A -->|"Volume mount :ro"| B
    A -->|"Volume mount :ro"| C
    A -.->|"Symlink"| D
    A -.->|"Symlink"| E

    B --> F["/opt/skills/ inside container"]
    C --> G["/opt/skills/ inside container"]

    style A fill:#10b981,color:#fff
    style B fill:#ef4444,color:#fff
    style C fill:#3b82f6,color:#fff
            

Two access patterns, one source of truth

Runners access skills via Docker volume mount (/opt/shared/claude-skills:/opt/skills:ro). Users access the same skills via filesystem symlinks from their ~/.claude/skills/ directory. Updates to /opt/shared/claude-skills/ propagate to everyone.
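
Both access patterns can be expressed in a few lines. The sketch below is illustrative only: volume_spec and link_user_skills are hypothetical helpers, and the demo runs against temporary directories rather than the real /opt/shared/claude-skills path.

```python
import tempfile
from pathlib import Path


def volume_spec(host_dir, container_dir="/opt/skills"):
    """Build the read-only Docker volume string used by the runners."""
    return f"{host_dir}:{container_dir}:ro"


def link_user_skills(shared_dir, home_dir):
    """Point <home>/.claude/skills at the shared directory via a symlink."""
    link = Path(home_dir) / ".claude" / "skills"
    link.parent.mkdir(parents=True, exist_ok=True)
    if link.is_symlink():
        link.unlink()                      # replace a stale link
    link.symlink_to(shared_dir, target_is_directory=True)
    return link


if __name__ == "__main__":
    print(volume_spec("/opt/shared/claude-skills"))
    with tempfile.TemporaryDirectory() as tmp:
        shared = Path(tmp) / "shared"
        shared.mkdir()
        (shared / "demo.md").write_text("skill")
        link = link_user_skills(shared, Path(tmp) / "home")
        print((link / "demo.md").read_text())   # read through the symlink
```

If a real directory already exists at the link location, symlink_to raises FileExistsError; the sketch deliberately does not delete it, since that could destroy a user's local skills.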

Secrets Isolation

Each group has its own GitLab access token and CI variables. A compromised developer token cannot access administrator projects.

| Secret | Scope | Storage |
|---|---|---|
| GL_TOKEN | Per-group | GitLab CI variable (masked, protected) |
| GITLAB_TOKEN_ADMIN | Administrators only | Dashboard env var |
| GITLAB_TOKEN_DEV | Developers only | Dashboard env var |
| MATRIX_BOT_TOKEN | Shared (notifications) | GitLab CI variable (masked) |
| MATRIX_ROOM_ID | Shared (notifications) | GitLab CI variable |

No cross-group access

The admin token has access to administrators/* projects only. The dev token has access to developers/* projects only. Even if a developer's runner is compromised, it cannot access infrastructure project secrets or code.
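
On the dashboard side, the same separation can be enforced when a token is looked up. The following is a minimal sketch, assuming the dashboard resolves a token from the project's top-level group; token_for_project and the mapping are hypothetical, and only the two environment variable names come from the table above.

```python
import os

# Group -> environment variable holding that group's token.
TOKEN_ENV_BY_GROUP = {
    "administrators": "GITLAB_TOKEN_ADMIN",
    "developers": "GITLAB_TOKEN_DEV",
}


def token_for_project(project_path, env=None):
    """Return the token for the project's own group, never another group's."""
    env = os.environ if env is None else env
    group = project_path.split("/", 1)[0]      # "developers/app" -> "developers"
    var = TOKEN_ENV_BY_GROUP.get(group)
    if var is None:
        raise PermissionError(f"no token configured for group {group!r}")
    token = env.get(var)
    if not token:
        raise KeyError(f"{var} is not set")
    return token


if __name__ == "__main__":
    fake_env = {"GITLAB_TOKEN_ADMIN": "admin-secret", "GITLAB_TOKEN_DEV": "dev-secret"}
    print(token_for_project("developers/app", fake_env))
```

Because the lookup is keyed by the project's group, there is no code path that hands the admin token to a request for a developers/* project.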

Adding a New Group

The multi-tenant architecture is designed to scale. Adding a new group (e.g., contractors) requires these steps:

| # | Step | What To Do |
|---|---|---|
| 1 | GitLab | Create contractors group, add members |
| 2 | Keycloak | Create matching group, configure group mapper in JWT |
| 3 | Pipeline | Add .gitlab-ci/groups/contractors.yml with group-specific settings |
| 4 | Runner | Register new runner with contractors tag |
| 5 | Labels | Run scripts/replicate-labels.sh administrators contractors |
| 6 | Dashboard | Add contractors to allowed list + add GITLAB_TOKEN_CONTRACTORS |
| 7 | Secrets | Create group-specific GitLab access token |

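Minimal code changes needed

The only dashboard code that changes is the allowed array in extractGroup(). The pipeline needs a new group YAML file plus a matching include:rules entry in the shared .gitlab-ci.yml, because include paths are fixed rather than interpolated. Everything else is configuration.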
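
For illustration, the allowed-list check might look like the sketch below. This is a Python rendering under assumptions, not the dashboard's actual extractGroup(): the claim layout (a groups claim, with realm_access.roles used when it is missing) follows the Keycloak setup described in this document, and stripping a leading slash assumes the group mapper emits full group paths.

```python
ALLOWED = ["administrators", "developers"]


def extract_group(claims, allowed=ALLOWED):
    """Return the first allowed group found in the JWT claims, else None.

    Uses the `groups` claim when present and non-empty; otherwise falls
    back to `realm_access.roles`.
    """
    candidates = claims.get("groups") or claims.get("realm_access", {}).get("roles", [])
    for name in candidates:
        name = name.lstrip("/")       # "/developers" -> "developers"
        if name in allowed:
            return name
    return None


if __name__ == "__main__":
    print(extract_group({"groups": ["/developers"]}))
    # Adding a tenant is a one-item change to the allowed list:
    print(extract_group({"groups": ["/contractors"]}, ALLOWED + ["contractors"]))
```

A token carrying a group that is not on the list yields None, so an unknown tenant is rejected by default rather than silently mapped to an existing one.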

Observability Stack

Pipeline health is monitored through three channels that work together to provide complete visibility.

Grafana Dashboard

Visual metrics: pipeline duration, success rate, stage times, failure trends. URL: grafana.ai-servicers.com/d/cicd/

Promtail + Loki

Centralized logging. All container logs are auto-discovered by Promtail, shipped to Loki, and queryable in Grafana. No per-service configuration needed.

Matrix Notifications

Real-time alerts to #cicd-notifications. Color-coded: green (success), red (failure), orange (manual gate). Bot: @cicd-bot:ai-servicers.com

Pipeline Jobs  →  Container Logs  →  Promtail (auto-discovery)  →  Loki  →  Grafana
                                                                                     |
Pipeline Events  →  notify-matrix.sh  →  Matrix Bot  →  #cicd-notifications     |
                                                                                     v
                                                                           Unified Dashboard
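
The colour-coded notification can be sketched as a payload builder. This is not the contents of notify-matrix.sh, which this document does not show; build_matrix_message and the hex colours are hypothetical choices. The payload shape (msgtype, body, format, formatted_body) is the standard Matrix m.room.message content with HTML formatting.

```python
import html

# Illustrative colours for the three notification kinds.
COLORS = {
    "success": "#10b981",   # green
    "failure": "#ef4444",   # red
    "manual": "#f59e0b",    # orange, manual gate
}


def build_matrix_message(status, text):
    """Build the content of an m.room.message event for a pipeline event."""
    color = COLORS[status]            # KeyError for an unknown status
    safe = html.escape(text)          # keep job names from injecting HTML
    return {
        "msgtype": "m.text",
        "body": f"[{status}] {text}",             # plain-text fallback
        "format": "org.matrix.custom.html",
        "formatted_body": f'<font color="{color}">{safe}</font>',
    }


if __name__ == "__main__":
    print(build_matrix_message("failure", "deploy <prod> failed"))
```

Sending the payload (authenticating with MATRIX_BOT_TOKEN and targeting MATRIX_ROOM_ID) is left out on purpose; only the message construction is shown.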

Architecture Decision Records

Major design decisions are tracked as ADRs using log4brains and published to nginx. Current Phase 4 decisions:

| ID | Decision | Status |
|---|---|---|
| T4.1 | Parallel vs Sequential CI Jobs | Proposed |
| T4.2 | SAST Tool Selection | Proposed |
| T4.3 | Test Environment Strategy | Proposed |
| T4.4 | Deployment Rollout Strategy | Proposed |
| T4.5 | Rollback Triggers | Proposed |
| T4.6a | Pilot Scope Selection | Proposed |
| T4.6b | Pilot Success Metrics | Proposed |

ADRs auto-publish to nginx.ai-servicers.com/cicd/decisions/ when changes are pushed to the docs/adr/ directory. They can also be managed as DECISION cards on the GitLab board.

Risk Mitigations

| Risk | Impact | Mitigation |
|---|---|---|
| JWT missing groups claim | Dashboard access fails | Keycloak group mapper + fallback to realm_access.roles |
| Runner picks wrong jobs | Cross-group security breach | Tags + protected refs + locked runners (3 layers) |
| Label drift between groups | Inconsistent boards | Periodic replicate-labels.sh or shared template |
| Shared skills break | Both groups blocked | Git version control; test before merge to /opt/shared |
| Token leak | Cross-group access | Per-group tokens; 90-day rotation; masked CI variables |
| Deployment failure | Service downtime | Pre-deploy snapshot + automatic rollback + health checks |
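
The 90-day rotation mitigation is easy to check mechanically. The sketch below assumes the token's creation date is known from wherever tokens are inventoried; needs_rotation is a hypothetical helper, and only the 90-day threshold comes from the table above.

```python
from datetime import date, timedelta


def needs_rotation(created, today, max_age_days=90):
    """True once a token has reached the maximum allowed age."""
    return (today - created) >= timedelta(days=max_age_days)


if __name__ == "__main__":
    created = date(2024, 1, 1)
    print(needs_rotation(created, date(2024, 3, 30)))  # 89 days old
    print(needs_rotation(created, date(2024, 3, 31)))  # 90 days old
```

A scheduled job could run this per group token and post a reminder to the notifications room before the deadline is reached.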