Specs.md: agile ceremony, rebuilt for agents

I went into a spec-driven framework expecting waterfall in an agile costume. What I found mapped almost exactly onto Scrum ceremony — intents, work items, runs — with one of the people in the room swapped for an agent.

Published: 21 June 2026

Figure: Diagram of the specs.md framework's three pluggable flows (Simple, FIRE, and AI-DLC), comparing agent counts, checkpoints, and best-fit use cases, above a row of compatible AI coding tools.

Caption: The specs.md framework offers three flows. This post is about the middle one, FIRE. Diagram from specs.md.

I left the AI Engineer conference in Melbourne feeling two things at once. Inspired, and a bit like I was falling behind. I'd just spoken about slop — about code that's cheap to produce and expensive to verify — and a good chunk of the room felt the same concerns I did. But there were a handful of people further along than me, already deep into frameworks I'd only heard about in passing.

I started looking at some of those. SpecKit from GitHub, AI-DLC from AWS, a few others. I'd heard from a few people who'd tried more than one that Specs.md was the lighter-weight option of the bunch, with less documentation and less ceremony to wade through. That was enough to make it where I started.

I went in a bit wary. My worry was that a framework like this would ask me to specify everything up front, and end up feeling like waterfall wearing an agile costume. I've never been someone who plans a thing to perfection and then executes flawlessly. I like a messier process, one with room to wander and find something I didn't expect on the way.

That's not really what happened. The framework let me think as I went rather than forcing me to front-load certainty I didn't have yet.

The closest comparison I keep landing on is Scrum. Not as a metaphor I'm reaching for, but because the actual mechanics (the things you create, the order you create them in) map onto agile ceremony almost exactly. The difference is that one of the people in the room is now an agent. I'm not sure that's the whole story yet, but it's the frame that's helped me make sense of what I was doing while I was doing it.

To actually test the framework, I needed something to build. I landed on a small daily puzzle game — sort actors into three groups based on the films they share — in the spirit of Wordle or Connections. You can play it here; there's a new puzzle every day for the next three months. The game isn't really the point of this post. What I noticed while building it is.

Installing it is the easy part

Installation is no different to adding Playwright or Vite to a project: a handful of files land in your repo, and those files are the agents and skills that drive the rest of the process.

Specs.md offers a few workflows. I picked FIRE, mostly because it sounded better than the alternatives. Once it's installed, you kick things off with a single skill — /specsmd-fire-planner — and describe what you want to build. I could almost end the post here, because from that point the framework more or less carries you through everything else.

It starts by asking a lot of questions: coding standards, tech stack, the non-functional stuff that's easy to skip over. Then it moves into building what it calls an intent. This is where the Scrum comparison started to feel less like an analogy and more like the actual mechanism underneath.

Intents are epics. Work items are stories. Runs are sprints.

An intent is your goal and your work breakdown in one place: the problem you're solving, who you're solving it for, your acceptance criteria, your constraints. It's most of what I've always wanted an epic to contain and rarely got, because writing all of that out by hand usually felt like overhead nobody had time for. Specs.md tracks the intent's progress too, so you can pick it up, put it down, and come back days later without losing the thread. You can see the intents from my game here.

The planner then breaks the intent into work items, closer to stories. Each one carries a description, acceptance criteria, and two things I don't remember Jira tickets ever capturing properly: a complexity rating, and whether the agent can run the task on autopilot or needs me in the loop.
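One way to picture that shape is as a pair of Python dataclasses. This is a sketch only: the class names, field names, and the low/medium/high complexity scale are illustrative assumptions, not the framework's actual file format, and the example puzzle items are made up for the illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    """Hypothetical flag: can the agent run this item unattended?"""
    AUTOPILOT = "autopilot"
    HUMAN_IN_LOOP = "human-in-loop"


@dataclass
class WorkItem:
    """Roughly a story: one slice of an intent."""
    title: str
    description: str
    acceptance_criteria: list[str]
    complexity: str  # assumed scale: "low", "medium", or "high"
    mode: Mode


@dataclass
class Intent:
    """Roughly an epic: the goal and its work breakdown in one place."""
    problem: str
    audience: str
    acceptance_criteria: list[str]
    constraints: list[str]
    work_items: list[WorkItem] = field(default_factory=list)

    def autopilot_items(self) -> list[WorkItem]:
        """Return the work items the agent could run without a human."""
        return [w for w in self.work_items if w.mode is Mode.AUTOPILOT]


# Made-up example data, loosely based on the puzzle game.
intent = Intent(
    problem="Players want a short daily puzzle about actors and films",
    audience="Casual puzzle players",
    acceptance_criteria=["A new puzzle is available every day"],
    constraints=["Runs entirely in the browser"],
    work_items=[
        WorkItem(
            title="Render the actor grid",
            description="Show the day's actors as selectable tiles",
            acceptance_criteria=["All actors for the day are visible"],
            complexity="low",
            mode=Mode.AUTOPILOT,
        ),
        WorkItem(
            title="Design the grouping interaction",
            description="Decide how players sort actors into groups",
            acceptance_criteria=["A player can form three groups"],
            complexity="high",
            mode=Mode.HUMAN_IN_LOOP,
        ),
    ],
)

print([w.title for w in intent.autopilot_items()])
```

The point of modelling the mode as its own field is that "which items can run unattended" becomes a simple filter rather than something you have to read out of the description.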

Working through this felt a lot like sprint planning and refinement, just compressed into minutes instead of days. I half-joked to myself that I wasn't sure what the point of Jira was, if this became how teams actually worked. I'm only half-joking now.

Then comes /specsmd-fire-builder, which executes a work item as a run. Each run lands as a sequentially numbered directory in your repo (run-001, run-002, and so on) containing the plan, a review report covering security and architectural integrity, a test report, and a walkthrough of the change. You can see an example run here.
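The numbering convention is simple enough to sketch. The snippet below is an assumption about how a tool could pick the next directory name from the ones that already exist; it is not taken from the framework's source, and the function names and the `runs_root` location are hypothetical.

```python
import re
from pathlib import Path

# Matches names like run-001 or run-042; anything else is ignored.
RUN_PATTERN = re.compile(r"^run-(\d{3,})$")


def next_run_name(existing: list[str]) -> str:
    """Return the next sequential run name given existing directory names."""
    numbers = []
    for name in existing:
        match = RUN_PATTERN.match(name)
        if match:
            numbers.append(int(match.group(1)))
    next_number = max(numbers, default=0) + 1
    return f"run-{next_number:03d}"


def next_run_dir(runs_root: Path) -> Path:
    """Hypothetical helper: compute the next run directory under runs_root."""
    names = []
    if runs_root.exists():
        names = [p.name for p in runs_root.iterdir() if p.is_dir()]
    return runs_root / next_run_name(names)
```

Taking the highest existing number, rather than counting directories, keeps the sequence stable even if an earlier run directory has been deleted.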

If a work item is a story, a run is something like what happens when an engineer picks up that ticket and works it through to done — except it happens in one pass, with a richer trail of what actually happened than most tickets ever get.

Letting go of the steering wheel

I started cautiously: scaffold something, review it, approve, move to the next thing. But as I got more comfortable, I started looking for ways to get out of the way a bit more. That autopilot flag on each work item seemed like the obvious next step, so I wired Claude up as a GitHub Action and started queuing autopilot tasks from my phone, closing my laptop, and coming back later to see what had landed.

It wasn't flawless. Dark mode broke in a couple of places. But instead of reworking the original intent to fix it, I just let the change through and raised a new intent to patch the bug afterwards. That's closer to how I've generally preferred to work, AI or not — keep moving towards the outcome and clean up the speedbumps along the way, rather than stalling everything chasing perfection on the first pass.

I had a working version of the game in a couple of hours, then spent a few more sessions over the following days iterating — reworking the interaction, adding an archive, expanding the puzzle types. What I ended up with is something I'm genuinely happy to come back to each day, the way I would a NYT puzzle.

Would I use it again?

Yes, easily. What I think sold me is that none of this asked me to learn a new way of thinking about work. Intents, work items, runs — I already know how to run that process, because I've been running versions of it for the better part of 25 years. Specs.md isn't asking me to abandon agile thinking for something stranger. It's swapping one of the people in the room for an agent, and keeping the ceremony that actually earns its keep.

That's the bit I think gets missed in a lot of the framework chatter right now. The interesting ones aren't reinventing how software gets planned and delivered. They're figuring out how to fit an agent into a process engineering teams already trust.