
How We Made Our CI/CD Pipeline 10x Faster (From 62 Minutes to 6)

The four boring changes — parallel tests, Docker layer caching, Terraform, live metrics — that cut our CI/CD pipeline from over an hour to six minutes.

Ralph Duin · September 15, 2025 · 3 min read

<p>Continuous integration and continuous deployment (CI/CD) pipelines are the backbone of modern software delivery. When your pipeline lags, everything else lags with it: feedback loops, bug fixes, release cadence, team morale. We hit that wall hard. Here's how we cut ours from over an hour to six minutes.</p>
<h2>Step 1: Find the real bottleneck</h2>
<p>The first thing we did was stop guessing. We instrumented every stage and measured it. The results surprised us:</p>
<ul>
<li>Tests: 72% of total pipeline time</li>
<li>Build + image push: 19%</li>
<li>Infra provisioning: 6%</li>
<li>Everything else: 3%</li>
</ul>
<p>If you don't measure, you end up optimizing the wrong stage. We almost spent a week on Docker layer caching before realizing it would have saved us only ninety seconds on an hour-long pipeline.</p>
<h2>Step 2: Parallelize the test suite</h2>
<p>Our tests ran serially on a single runner. Sharding the suite across four parallel runners was the single biggest win: it cut testing time by 68%.</p>
<pre><code class="language-yaml">jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      # Let every shard finish even if another shard fails
      fail-fast: false
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - uses: actions/checkout@v4
      - uses: oven-sh/setup-bun@v1
      - run: bun install --frozen-lockfile
      # Each runner executes one quarter of the suite
      - run: bun test --shard=${{ matrix.shard }}/4</code></pre>
<p>We also deleted a dozen tests that were flaky or that exercised framework behavior instead of our own code. Fewer tests, faster signal.</p>
<h2>Step 3: Cache the Docker build layers</h2>
<p>Our Docker builds started from scratch every time. Adding BuildKit layer caching backed by a remote registry dropped image builds from four minutes to forty seconds:</p>
<pre><code class="language-yaml"># Registry cache export typically needs a BuildKit builder,
# for example one created by docker/setup-buildx-action earlier in the job
- uses: docker/build-push-action@v5
  with:
    context: .
    push: true
    tags: ${{ env.IMAGE }}:${{ github.sha }}
    # Reuse layers stored by the previous build
    cache-from: type=registry,ref=${{ env.IMAGE }}:buildcache
    # mode=max also caches layers from intermediate build stages
    cache-to: type=registry,ref=${{ env.IMAGE }}:buildcache,mode=max</code></pre>
<h2>Step 4: Provision infrastructure as code</h2>
<p>Environment provisioning used to require a human in the loop. We moved to Terraform so the pipeline could spin up a preview environment for every pull request:</p>
<pre><code class="language-hcl"># var.pr_number and var.image are expected to be passed in by the pipeline
resource "fly_app" "staging" {
  name = "app-staging-${var.pr_number}"
  org  = "personal"
}

resource "fly_machine" "api" {
  app    = fly_app.staging.name
  region = "ams"
  image  = var.image
}</code></pre>
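<p>One way a pipeline might drive a Terraform config like this is sketched below as a GitHub Actions job. This is an illustration, not our exact workflow: the <code>preview</code> job name, the <code>vars.IMAGE</code> variable, the <code>FLY_API_TOKEN</code> secret, and the per-PR workspace naming are all assumptions. It also assumes a remote state backend is already configured, so state survives between runs, and Terraform 1.4 or later for <code>-or-create</code>.</p>
<pre><code class="language-yaml">on:
  pull_request:
    types: [opened, synchronize, reopened, closed]

jobs:
  preview:
    runs-on: ubuntu-latest
    env:
      # Assumed secret name; the provider needs credentials from somewhere
      FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}
      # Terraform reads TF_VAR_* environment variables as input variables
      TF_VAR_pr_number: ${{ github.event.pull_request.number }}
      # Hypothetical repository variable holding the image name
      TF_VAR_image: ${{ vars.IMAGE }}:${{ github.sha }}
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init
      # One workspace per PR keeps each preview's state separate
      - run: terraform workspace select -or-create "pr-${{ github.event.pull_request.number }}"
      # Create or update the preview while the PR is open
      - if: github.event.action != 'closed'
        run: terraform apply -auto-approve
      # Tear it down when the PR is closed or merged
      - if: github.event.action == 'closed'
        run: terraform destroy -auto-approve</code></pre>
<p>Keeping apply and destroy in one job means both paths share the same variables and workspace, so the teardown targets exactly what the apply created.</p>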

<p>Every PR gets its own preview environment, destroyed automatically when the PR closes. Reviewers stop asking "can you deploy this somewhere I can click?"</p>
<h2>Step 5: Monitor so it stays fast</h2>
<p>Pipelines rot. We wired Prometheus to scrape pipeline metrics and built a Grafana dashboard that alerts us when any stage creeps past its budget. If tests drift back toward fifteen minutes, we know before it becomes a daily annoyance.</p>
<h2>The result</h2>
<p>From 62 minutes to 6 minutes. A 10x improvement, achieved through four boring changes: measure, parallelize, cache, automate. No magic. No new framework. Just ruthless attention to where the time was actually going.</p>
<p>The lesson: faster pipelines aren't a one-time project. They're a habit. Measure weekly, cut ruthlessly, and don't let your CI slow to a crawl while you're not looking.</p>
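<p>To make the stage budgets from Step 5 concrete, here is a rough sketch written as a Prometheus alerting rule rather than the Grafana alert we actually use. The metric name <code>ci_stage_duration_seconds</code>, its <code>stage</code> label, and the 600-second threshold are placeholders, not values from our setup; substitute whatever your pipeline exports and whatever budget you set.</p>
<pre><code class="language-yaml">groups:
  - name: ci-pipeline-budgets
    rules:
      - alert: TestStageOverBudget
        # Hypothetical gauge: duration of the most recent run, per stage.
        # Averaging over a day keeps one slow run from paging anyone.
        expr: avg_over_time(ci_stage_duration_seconds{stage="test"}[1d]) &gt; 600
        labels:
          severity: warning
        annotations:
          summary: "Test stage is over its time budget"</code></pre>
<p>One rule per stage, each with its own threshold, keeps the budgets explicit and reviewable alongside the rest of the pipeline config.</p>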