From vibe to spec

The methodologies fighting for attention (vibe coding, spec-driven development, structured prompt-driven development) keep getting framed as rival camps. In fact, they’re stages of one pipeline, and the real leverage is in the transition point between them.

A hand-drawn wireframe of a 'Projects' screen — filters, search, a list, sort and pagination — on the left, with a dashed arrow to a numbered specification on the right written as EARS requirements ('When … the system shall …'). An axis along the bottom runs from VIBE through a labelled midpoint, 'transition via builder's trial', to SPEC.

I’m having a lot of the same conversation right now. It usually starts with someone like a head of engineering, a CTO, or a tech lead, describing two things that don’t seem to fit together.

On the one hand, their developers are flying. Cursor, Claude Code, Replit Agent, pick the tool. Features that used to take a week are landing in an afternoon. The energy on the team is the best it’s been in years.

On the other hand, they’re opening pull requests they don’t recognise, finding patterns no one agreed to, and watching repos grow faster than anyone can audit. They can feel the debt accumulating even as the velocity climbs, and they don’t know what to do about it.

Then they ask me some version of the same question: do we lean into this, or do we clamp down?

Neither, I think. The framing is the problem.

The methodologies competing for attention right now, like Vibe Coding, Spec-Driven Development, Structured Prompt-Driven Development, EARS, and autonomous agentic pipelines, keep getting positioned as rival camps. Pick a side. Be a vibe team or be a spec team. I don’t think that’s the choice. I think they’re stages of the same pipeline, and the interesting question isn’t which one wins. It’s when each one applies.

Two poles, one spectrum

Strip the marketing away and the new methodologies cluster around two poles.

Vibe coding treats the developer as a creative director. You describe what you want, the model produces it, you react, you iterate. On a greenfield prototype it’s genuinely the best way to find out what you’re actually trying to build. Its failing is just as clear: as the codebase grows, context degrades, the model starts hallucinating dependencies, and you end up with a system no one fully understands.

Spec-driven development treats specifications as the source of truth. The work shifts from typing code to writing a precise spec (usually markdown, committed to git) that the model translates into implementation. It is predictable and auditable. Its failing is inertia: forcing a team to specify an integration before they know whether it’s even feasible kills exploration. A handful of frameworks have grown up around this pole, Spec Kit, OpenSpec and Kiro among them, all linked at the foot of the post.

Both failings are real. Both methodologies are right somewhere. The question is where.

Most arguments about methodology are really arguments about which stage of the feature lifecycle someone is picturing.

Features have a lifecycle

After having a number of these conversations, I’ve realised that a feature isn’t static. It moves through a lifecycle defined by two variables (uncertainty and risk), and those variables change as the feature matures.

Early on, a feature sits in high uncertainty, low risk. Nobody knows what the UI should feel like, whether the data model is right, or if the third-party API actually does what the docs claim. The impact is small because nothing is live. Velocity matters more than accuracy. This is where you should be vibing.

Later, the same feature becomes low uncertainty, high risk. The shape is final. Now it’s going to touch a production database, handle customer data, get called by other services. The impact is large. Accuracy matters more than velocity. This is where you need specs.

The vibe-coding evangelist is picturing a prototype. The spec-driven advocate is picturing a payments integration. They’re both right, and neither has the full picture.

The builder’s trial

The interesting work is at the transition point, what I’m going to call the builder’s trial. In maritime terms, it’s when a vessel goes from being tinkered with in dry dock to launching under real operational stress. For us, it’s the moment a feature stops being an experiment and starts becoming infrastructure. It’s when we’re getting ready to ship it.

The usual failure here is asking humans to write the spec after the fact. They won’t. The prototype works, the team is feeling the pressure of deadlines, and writing a retrospective specification feels like homework. So the prototype ships, the spec never gets written, and a year later it’s the thing nobody dares touch.

I think the cleaner move is to reverse the order and let the model write the spec. Feed the working prototype into an LLM and ask it to extract the system logic into a formal specification. EARS (Easy Approach to Requirements Syntax) is well suited to this, because it forces requirements into constrained templates that remove ambiguity:

  • When a user clicks export, the system shall generate a signed CSV.
  • If the auth token is expired, then the system shall return a 401.

The prompt to do this isn’t elaborate. Here’s an example:

You are a senior engineer graduating a feature from
prototype to production. Analyse the prototype below
and produce a version-controlled spec.md with four
sections:

1. Intent — what it does and why, in a few lines.
2. Functional requirements — strict EARS only:
     The system shall <response>.
     When <trigger>, the system shall <response>.
     While <state>, the system shall <response>.
     If <error>, then the system shall <response>.
3. Engineering constraints — data models, the
   architecture, file structure, operations, our
   conventions, and security safeguards.
4. Roadmap — sequential, testable steps, each with
   a definition of done and its test criteria.

Tech stack & conventions: <...>
Prototype / session log: <...>
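
Filling the two placeholders at the end of that prompt can be done mechanically. Here’s a minimal sketch, assuming the prototype is a directory of source files. The function names, the file suffixes and the shortened template are my own illustration, not part of any tool:

```python
from pathlib import Path

# Hypothetical, shortened stand-in for the prompt above. The two
# placeholders correspond to its last two lines.
PROMPT_TEMPLATE = """\
You are a senior engineer graduating a feature from
prototype to production. Analyse the prototype below
and produce a version-controlled spec.md.

Tech stack & conventions: {conventions}
Prototype / session log:
{prototype}
"""


def collect_sources(root, suffixes=(".py", ".ts", ".sql")):
    """Concatenate prototype files under root, each under a path header."""
    root = Path(root)
    parts = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything that is not source code.
        if path.is_file() and path.suffix in suffixes:
            rel = path.relative_to(root).as_posix()
            body = path.read_text(encoding="utf-8")
            parts.append(f"--- {rel} ---\n{body}")
    return "\n\n".join(parts)


def build_extraction_prompt(root, conventions):
    """Fill the template with the team's conventions and the prototype code."""
    return PROMPT_TEMPLATE.format(
        conventions=conventions,
        prototype=collect_sources(root),
    )
```

The path headers are there so the model can refer back to specific files when it writes the engineering constraints section.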

You now have a specification that describes what the prototype actually does, not what someone hoped it would do. A human reviews it, corrects what the model misread (much faster than writing it from scratch), and from that point forward the spec is the source of truth. The prototype code can be discarded or regenerated against it.
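
Part of that review can be automated. Here’s a rough sketch of a check that flags any functional requirement that doesn’t fit one of the four EARS templates from the prompt. The regular expressions are a simplification I’ve assumed for illustration (they insist on the literal phrase "the system shall" and treat "then" as optional), and the function names are hypothetical:

```python
import re

# One pattern per EARS template named in the prompt. These are a
# simplified, assumed encoding of the templates, not an official grammar.
EARS_PATTERNS = {
    "ubiquitous": re.compile(r"^The system shall .+\.$"),
    "event": re.compile(r"^When .+, the system shall .+\.$"),
    "state": re.compile(r"^While .+, the system shall .+\.$"),
    # "then" is treated as optional so both common spellings pass.
    "unwanted": re.compile(r"^If .+, (?:then )?the system shall .+\.$"),
}


def classify(requirement):
    """Return the name of the matching EARS template, or None."""
    text = requirement.strip()
    for name, pattern in EARS_PATTERNS.items():
        if pattern.match(text):
            return name
    return None


def lint(requirements):
    """Return the requirements that fit none of the templates."""
    return [r for r in requirements if classify(r) is None]
```

A check like this only catches shape, not meaning. It tells the reviewer which lines to rewrite into a template, and leaves them free to spend their attention on whether the requirement is actually true of the prototype.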

Don’t ask humans to write the spec after the prototype. Ask the model to extract it, and have humans correct it.

This is the move that makes the rest of the pipeline work. It removes the friction that usually keeps teams stuck in one mode or the other.

What this looks like in practice

Take a team building an export-to-CSV feature.

Week one is vibe coding. A product lead, a designer, and an engineer sit together and build a working prototype in an afternoon. They iterate on the UI, the file format, the edge cases users actually care about. By the end of the week they have something stakeholders can click through.

Week two is the builder’s trial. The team feeds the prototype to a model and generates an EARS-formatted specification. The engineer reviews it, corrects the bits the model misread, and commits it to the repo alongside a structured prompt that defines schemas, file layout, and the team’s linting conventions.

Week three is production. Background agents take the specification and the prompt scaffold, generate the production implementation and its test suite, and open a pull request. The engineer reviews it the way they’d review any other PR — except the spec is right there in the diff, so the review is about whether the spec is right, not whether the code matches some unwritten intention.
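
Keeping the spec in the diff can be enforced rather than hoped for. Here’s a minimal sketch of a pre-merge check, assuming a layout where each feature directory holds its own spec.md next to the code. That layout, the function name and the suffix list are my assumptions, and nested code directories would need a smarter lookup:

```python
from pathlib import PurePosixPath


def dirs_missing_spec(changed_files, spec_name="spec.md",
                      code_suffixes=(".py", ".ts")):
    """List directories where code changed but the spec beside it did not.

    changed_files is the list of repo-relative paths touched by a PR.
    """
    code_dirs = set()
    spec_dirs = set()
    for name in changed_files:
        path = PurePosixPath(name)
        if path.name == spec_name:
            spec_dirs.add(path.parent)
        elif path.suffix in code_suffixes:
            code_dirs.add(path.parent)
    # Directories with code changes and no accompanying spec change.
    return sorted(str(d) for d in code_dirs - spec_dirs)
```

Whether a flagged directory should block the merge or just prompt a question is a team decision. Not every code change alters behaviour the spec describes.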

The handoff translation errors that plague the traditional SDLC (business writes a ticket, designer misreads it, developer mis-implements it, QA catches it late) collapse, because the intent is captured directly at each stage in a form the next stage can consume.

A few honest caveats

This is still a hypothesis being tested for most teams. I’m watching it work in pieces: vibe-coded prototypes, spec extraction, agent-generated PRs. But I haven’t yet seen a team running the full pipeline end-to-end as comfortably as I’ve described it here. The bits are real. The choreography is still being learnt.

The builder’s trial is where it usually breaks. Teams I talk to are good at the vibe end and getting better at the agent end. The middle, the disciplined transition from prototype to spec, is where the investment is missing and where the pipeline collapses if you skip it.

Spec extraction isn’t magic. The model will get things wrong. It will miss implicit rules the prototype relied on. The review still matters, and it still takes judgement. The win is that you’re correcting a draft instead of starting with a blank page.

What it asks of engineering leaders

Two things.

First, stop treating these methodologies as identity markers. Your team doesn’t need to be a vibe-coding team or a spec-driven team. It needs to know which mode applies to which work, and to move between them deliberately. That sounds simple. In practice it’s the hardest cultural shift in any of this, because it requires giving up the comfort of a single way of working.

Second, invest in the builder’s trial. The transition from prototype to production is where AI-native development either pays off or collapses into debt. The tooling for this (spec extraction, prompt versioning, agent review workflows) is where the leverage is, and it’s the part most teams are underinvesting in because it’s less fun than either pole.

This is the conversation I’m having most weeks now, and the frame above is the one that’s been most useful in those rooms. If you’re working through this with your team, I’d love to hear what’s landing and what isn’t.

The organisations that figure this out will be the ones that stopped picking between methodologies and embraced both.

Further reading