Spec-Driven Development is Waterfall With a Fresh Coat of Paint

The AI tooling ecosystem keeps rediscovering waterfall. Here's why you should resist it.

When did you last finish a requirements document and think: yes, this is exactly what we'll build, and it'll still be accurate when we ship?

If you've been in software long enough, you already know the answer. The spec is wrong the moment you stop writing it. Software is discovery — you learn what the right solution is by building it, not by describing it in advance.

So why is the AI tooling industry trying so hard to bring specs back?

In this post you'll learn:

  • What spec-driven development is and which tools are pushing it
  • Why it's structurally identical to waterfall β€” just with better marketing
  • Why an AI-empowered team with agentic context engineering outships a spec-heavy team every time

💡 This is Part 6 in my Context Engineering series. Part 1 covered the four pillars. Part 2 introduced agent skills. Part 3 argued for lean over Scrum. Part 4 put skills to work with git worktrees. Part 5 showed why Markdown is the native format of agentic context.


πŸ” What is Spec-Driven Development?

The premise sounds reasonable: before asking AI to write code, write a detailed specification. The AI reads the spec, understands the intent, and generates an implementation that matches it. No vague prompts. No hallucinated architecture. Just a clear document that drives predictable output.

Several tools and frameworks have formalised this into a full workflow.

OpenSpec is probably the most talked-about right now. It installs via npm and works across Claude Code, Cursor, GitHub Copilot, and 30+ other tools. When you run /openspec:proposal, it generates a structured change proposal — requirements doc, implementation tasks, design decisions — before a single line of code is written. It markets itself as "lightweight" and "brownfield-first," and to be fair, it's more considered than a blank requirements doc. But the fundamental shape is the same: spec first, code second.


💧 It's Waterfall. Here's the Proof.

Let's put the two workflows side by side.

Classic waterfall:

  1. Requirements phase → document everything
  2. Design phase → spec everything
  3. Implementation phase → build to spec
  4. Testing phase → verify against spec
  5. Release

Spec-driven AI development:

  1. Discovery / inception → gather requirements
  2. Write spec (BA, PM, or AI-assisted) → document everything
  3. AI implements to spec
  4. Review and validate

Same shape. Different tools.

The spec-driven workflow also sneaks back in the roles that lean teams have been actively trying to eliminate:

  • πŸ“‹ Business Analysts to write the requirements
  • ✍️ Product Owners to sign off on the spec
  • πŸ›οΈ Architects to validate it's implementable
  • πŸ” Dedicated QA to verify against the spec

These are hand-off roles. Each one is a seam where context gets lost, where timelines stretch, and where "it wasn't in the spec" becomes the answer to every inconvenient question.

This isn't a new failure mode. Behaviour-Driven Development made the same bet nearly two decades ago — that BAs and product owners could write tests in Gherkin, the human-readable format behind tools like Cucumber, without deep engineering involvement. The promise was that non-technical stakeholders could define acceptance criteria directly. In practice, the Gherkin files were either written by engineers anyway, or they drifted from reality the moment the code changed. The handover assumption was the problem then, and it's the problem now.
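To make that failure mode concrete, here is a sketch of the kind of Gherkin scenario BDD expected non-engineers to own. Every name in it is invented for illustration:

```gherkin
Feature: User notification preferences

  Scenario: User disables email notifications
    Given a user "alice" with email notifications enabled
    When she sets the "email" preference to off
    Then no email notifications are sent to "alice"
```

Each Given/When/Then line only means something because an engineer binds it to step-definition code. When the code changes and the steps don't, the scenario keeps passing review while silently ceasing to describe reality.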

Modern testing has moved in the opposite direction. Shift-left testing brings quality thinking into the development loop rather than delegating it downstream. Test automation tooling — MCPs and CLIs like Playwright — lets engineers write and run tests as part of the build cycle, not as a separate gate. The assumption that non-technical people should own the test layer was always backwards. What works is engineers who think about quality from the start.

This is about more than methodology. It's about team composition. In the AI era, the answer isn't a better handover process — it's not needing one. What you need are AI-empowered engineers with strong engineering instincts and the systems thinking to own a problem end to end.

But there's a subtler problem beneath the process overhead: the methodology and the headcount justify each other in a loop that's very hard to break from the inside. You need BAs to write the specs. You need QAs to validate against them. You need sign-off ceremonies before work begins and acceptance cycles before it ships. Heavyweight spec processes don't just slow teams down — they create an organisational incentive where writing a good spec becomes more valued than shipping good software.

And here's the deeper problem: spec-driven development is solving the wrong bottleneck. Generating code was never the problem. Deciding what to build — and staying adaptable as that changes — is the hard part. A spec locks you into an answer before you've done the learning.

💡 The future is T-shaped engineers who can own the full problem. Not specialists producing hand-off artefacts.

And there's another problem: spec rot. The spec is a snapshot of what you thought you needed at the moment you wrote it. The moment you start building, reality diverges. You're now maintaining two artefacts — the spec and the code — and they drift apart from day one.

No feedback loop in the build means you don't discover what's wrong with your design until you've already built to it. That's exactly waterfall's fatal flaw — and spec-driven development inherits it wholesale.


🚨 Hacker News Called It Immediately

I'm not alone in this read. When marmelab's critique of spec-driven development hit the Hacker News front page, it drew 225 points and 191 comments — and the title said it all: Spec-Driven Development: The Waterfall Strikes Back.

The thread reactions were sharp:

"Managers appreciate SDD for the same reasons they favored waterfall initially." β€” constantcrying

"The distinction between waterfall and agile is merely how much spec you write before implementing β€” longer feedback loops increase risk of major mistakes." β€” pydry

"Really, we are doing waterfall, but with AI, now?" β€” multiple commenters

Engineers who tried Kiro — AWS's spec-driven agentic IDE — were equally blunt:

"Kiro generated massive task lists (12+ tasks with 4+ sub-tasks each) with unpredictable code deletion β€” a sledgehammer to crack a nut." β€” hatmanstack

"Kiro, your new corporate project manager." β€” iamsaitam (on Kiro treating a small bug fix as a full project management exercise)

When you show experienced engineers a spec-driven workflow, the reaction is consistent regardless of platform: we tried this already.


⚡ The Right Context is Lightweight, Not Heavyweight

Here's what I think people are actually reaching for when they advocate for spec-driven development: they want AI to have enough context to produce good output. That's a legitimate goal. The mistake is thinking a formal spec document is the right way to deliver it.

The Agile Manifesto articulated this argument in 2001:

"Working software over comprehensive documentation."

Nearly a quarter of a century later, spec-driven AI development is rediscovering the same mistake the Manifesto was written to correct.

What does good agentic context engineering actually look like? It's three things:

1. Context files — always-on institutional knowledge that lives in the repo, versioned with your code. CLAUDE.md covers project conventions, architecture decisions, gotchas, and commands. agents.md (covered in Part 5) gives the agent ground truth about your agent architecture — what each agent does, how they relate, known quirks. Neither of these is a spec. They describe what is, not what should be. Both are plain Markdown, readable by any agentic tool, owned entirely by your team.
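For a sense of scale, here is a minimal sketch of what such a context file might contain. Every project detail below is invented; the point is the shape — short, factual, describing what is:

```markdown
# CLAUDE.md

## Conventions
- TypeScript strict mode; no `any` without a comment explaining why
- API routes live in src/api/, one file per resource

## Architecture decisions
- Postgres via the shared connection pool; no new ORMs

## Gotchas
- Auth middleware must run before any /users route

## Commands
- npm test       (unit tests)
- npm run e2e    (browser suite)
```

A file like this stays useful precisely because it is cheap to keep honest: when a convention changes, you edit one line, not a spec.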

2. Plan mode — an interactive dialogue with the agent to scope the work before acting. Not a document. Not a hand-off artefact. A conversation. Here's what that looks like in practice:

You type: "Add user notification preferences to the API β€” get and set per-user settings."

Claude Code reads your codebase and responds in under a minute: "I can see your Express router in src/api/routes.js. I'll add GET /users/:id/preferences and PATCH /users/:id/preferences, extend the User model with a JSON preferences column, and add a migration. I'll reuse your existing auth middleware. Want to proceed, or adjust anything?"

You refine. You ship. The spec-driven equivalent of that exchange is a requirements document, a BA-facilitated discovery session, an architect sign-off, and a sprint of writing before anyone touches the code.
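To make that exchange concrete, here is a sketch of the kind of code the agent might scaffold from its plan. Everything in it (names, types, the merge semantics) is hypothetical, and the Express wiring is shown only in comments; it is an illustration of the shape, not a prescribed implementation:

```typescript
// Hypothetical core of the preferences feature, before Express wiring.
// All identifiers are invented for illustration.

type Preferences = Record<string, boolean | string | number>;

// Stand-in for the User model's JSON preferences column
const store = new Map<string, Preferences>();

function getPreferences(userId: string): Preferences {
  // Unknown users get an empty object, mirroring a nullable JSON column
  return store.get(userId) ?? {};
}

function setPreferences(userId: string, patch: Preferences): Preferences {
  // PATCH semantics: incoming fields override, existing fields are preserved
  const next = { ...getPreferences(userId), ...patch };
  store.set(userId, next);
  return next;
}

// The routes the agent proposed would wrap these, e.g.:
// router.get("/users/:id/preferences",
//   (req, res) => res.json(getPreferences(req.params.id)));
// router.patch("/users/:id/preferences",
//   (req, res) => res.json(setPreferences(req.params.id, req.body)));
```

The point of the plan-mode exchange is that this shape emerges from a minute of conversation grounded in the real codebase, not from a requirements document written weeks earlier.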

The reason plan mode works is the same reason context engineering works — you're doing the thinking that a spec tries to capture, but you're doing it in a living, adaptive way rather than freezing it in a document that will be out of date before the first PR is merged. The plan evolves with the codebase. The context evolves with understanding. The output is software, not documentation.

Plan mode produces a plan.md — a structured, readable artefact you can review, edit, and pass to a sub-agent. Before execution I run a principal engineer review: a sceptical sub-agent persona whose job is to find problems with the plan before a line of code is written. A fresh context window sees what your planning session normalised. This is the quality gate — lightweight, fast, and embedded in the loop rather than bolted on at the end as a separate role.
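One way to phrase that sceptical persona (the prompt below is an invented example, not a prescribed format):

```markdown
You are a principal engineer reviewing plan.md before any code is written.
Your job is to find problems, not to be agreeable.

- Challenge every assumption the plan makes about the existing codebase.
- Flag any step that touches auth, migrations, or public API contracts.
- List anything the plan defers without saying who picks it up.
- End with a verdict: "ship it", "revise", or "rethink", plus one sentence why.
```

Because the reviewer runs in a fresh context window, it hasn't absorbed the planning session's assumptions — which is exactly what makes it useful.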

OpenSpec's own FAQ raises this objection: "Plan mode is great for a single chat session — but what about longer work that spans multiple sessions?" Their answer is a spec. The actual answer is simpler: a progress.md file. The agent maintains it as work proceeds — completed steps, decisions made, what's next. At the start of the next session it reads its own notes and picks up where it left off. You don't need a platform, a structured workflow, or a vendor's artefact format. You need a text file and a habit.
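What that habit looks like midway through a piece of work — a hypothetical progress.md, with every detail invented for illustration:

```markdown
# progress.md

## Done
- Added preferences column and migration
- GET endpoint returns stored JSON, defaults to {}

## Decisions
- PATCH merges fields rather than replacing the whole object

## Next
- Input validation on PATCH; reuse existing auth middleware
- Update agents.md once the endpoint shape settles
```

Unlike a spec, this file is disposable: it records where the work is, not what the work must be, and it costs nothing to throw away when the feature ships.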

3. A prototype — build something real, show it to a stakeholder, learn. The fastest way to discover what you need to build is to build it.

| Spec-Driven Development | Plan Mode (Claude Code) |
| --- | --- |
| Written upfront, before building | Created interactively, just before acting |
| Formal document | Conversation-first, optionally captured |
| Requires dedicated roles (BA, PM, architect) | One T-shaped engineer |
| Locked in before learning | Revised as understanding improves |
| Spec → code (one-way) | Plan → code → refine → code (iterative) |

👥 The Team You Actually Need

In Part 3, I argued that Scrum's ceremony overhead doesn't survive contact with AI development speeds. Spec-driven development has the same problem, and it introduces an additional one: it justifies bloated team structures.

If your workflow requires someone to write the spec, someone to implement it, someone to verify it, and someone to maintain it as it drifts — you've described a four-person process for work one T-shaped engineer with an AI agent could handle in an afternoon. (A T-shaped engineer is someone with deep expertise in one area and enough breadth to own the problem end to end — from requirement clarity to production — without a handoff chain.)

Importantly, the BA and QA functions don't disappear in this model — they just stop being separate roles with separate artefacts and handoff ceremonies. A strong engineer with good AI tooling and a disciplined approach to context engineering can own requirement clarity as part of how they think about a problem, not as a separate phase before they're allowed to start. The translation between business need and technical implementation is a skill, not a job title. Engineers who develop that skill — and AI tooling makes it significantly easier — are more valuable than engineers who hand that responsibility to a BA and wait.

The same applies to quality. Quality is not a function — it's a standard. When it's owned by a specialist role, it becomes that role's problem to manage rather than every engineer's responsibility to maintain. When quality thinking is embedded in the agentic development loop rather than delegated to a gate at the end, failure rates don't increase. In many cases they improve, because the person closest to the code is also closest to the risk.

The Claude Code team at Anthropic is a well-documented example of this model working at scale. They built one of the most serious AI coding tools in the industry without traditional PRDs — using plan mode and rapid prototyping instead.

That's the team composition the AI era calls for:

  • πŸ”· Flat β€” minimal hierarchy, no specialist silos
  • πŸ”· T-shaped β€” deep in one area, broad enough to own the full stack
  • πŸ”· Outcome-oriented β€” shipping working software, not producing documents
  • πŸ”· AI-empowered β€” each engineer multiplied by an agent

In an interview with The Pragmatic Engineer, Boris Cherny — who led the development of Claude Code — put it plainly:

"There's just no way we could have shipped this if we started with static mocks and Figma or if we started with a PRD."

If the teams building the tools aren't writing PRDs, the argument that PRDs are necessary for serious product development becomes very hard to sustain. The question isn't whether this model scales — the question is whether your organisation has the hiring standards and culture to make it work.


The Bottom Line

Every generation of tooling rediscovers the same heavyweight process and gives it a new name. Waterfall became RUP became Scrum became SAFe. Now it's spec-driven development. The values the Agile movement fought for — working software over comprehensive documentation, responding to change over following a plan — are exactly what spec-driven development trades away.

You don't need a spec. You need good agentic context engineering principles — CLAUDE.md, plan mode, T-shaped engineers, prototypes. What you don't need:

  • ❌ A BA writing requirements before the team has touched the problem
  • ❌ A spec document to maintain in lockstep with your code
  • ❌ An inception workshop before a single line is written

Spec-driven development frameworks are solving a problem that better hiring, better tooling, and better engineering culture have already made obsolete. They're not a step forward β€” they're a formalisation of constraints that high-performing teams have already moved past.

The spec was never the source of truth. The software always was.