Vibe-Coding Doesn't Scale. Context-Driven Development Does.

An AI agent with no context writes plausible code that quietly breaks something three files away. Draft gives your assistant a spec, a plan, a live code graph, and a three-stage review — turning a fast guesser into a disciplined engineer. Free, local, open source.

The Bottom Line

The current generation of AI coding agents is astonishing at writing code and terrible at knowing whether it should. Drop a task into a fresh chat and the model does what it always does: it pattern-matches to something plausible, writes it confidently, and has no idea that the helper it just "simplified" is called from 40 other places. The code compiles. The demo works. Two weeks later something unrelated is on fire.

This isn't a model-quality problem — smarter models guess more convincingly, which is worse. It's a context problem. A senior engineer doesn't write better code because they type faster; they write better code because they carry a mental model of the system, work from a spec, plan before they touch anything, and review against intent. Draft gives your AI agent all four. It's an open-source Claude Code plugin that implements Context-Driven Development: a small, opinionated workflow that wraps every change in the structure a real engineer brings to the job.

Why "Just Prompt Better" Stops Working

Vibe-coding — describe it, accept the diff, move on — feels incredible for the first hundred lines. It falls apart on real codebases for three reasons that no prompt can fix:

  1. The agent can't see the blast radius. It reads the files you opened. It doesn't know what calls the function you're editing, what your change breaks downstream, or which test guards the invariant you're about to violate.
  2. Intent evaporates between turns. The "why" lives in your head and a Slack thread. Next session — or next teammate — starts from zero and re-derives it, usually wrong.
  3. Nothing checks the work against the goal. "It runs" is not "it's correct." Without a spec to grade against, review degrades to skimming a diff and hoping.

You can paper over each of these with longer prompts and more pasted context. That doesn't scale — it just moves the unscalable part into your fingers.

What Draft Actually Does

Draft replaces ad-hoc prompting with a workflow your agent runs every time, backed by durable files in your repo and a local knowledge graph. Four primary commands carry the whole loop:

1. /draft:init — give the agent a memory

Run once. Draft analyzes the repository and writes a draft/ directory: a product overview, tech stack, conventions, guardrails, and — in its richer mode — a navigable knowledge wiki of one concept per subsystem. It also stands up a live code graph via a 100% local engine (codebase-memory-mcp, 159 languages, no API key, nothing leaves your machine). From now on your agent loads context by reasoning over that map for relevance — navigating a page-index table of contents the way a senior engineer scans one — instead of running a vector similarity search or making you paste files. (Why relevance beats similarity gets its own deep dive: Similarity Isn't Relevance.)

2. /draft:plan — decide before you build

Turn a request into a spec (what and why, with acceptance criteria) and a plan (the ordered steps). For anything non-trivial, Draft can decompose the work into modules, draw the dependency graph, and catch a circular dependency before a line is written. The plan is a file in your repo — reviewable, editable, and still there next week.
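
To make the circular-dependency check concrete, here is a minimal sketch of how a cycle can be found in a planned module graph with a depth-first search. This is an illustration, not Draft's implementation: the `find_cycle` function, the `plan` dictionary, and the module names are all hypothetical.

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of module names, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {module: WHITE for module in deps}
    path: List[str] = []

    def visit(module: str) -> Optional[List[str]]:
        color[module] = GRAY
        path.append(module)
        for dep in deps.get(module, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so we closed a loop
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[module] = BLACK
        return None

    for module in list(deps):
        if color[module] == WHITE:
            found = visit(module)
            if found:
                return found
    return None


# Hypothetical plan: each module maps to the modules it depends on.
plan = {
    "api": ["auth", "ratelimit"],
    "auth": ["billing"],
    "billing": ["api"],  # closes the loop api -> auth -> billing -> api
    "ratelimit": [],
}
cycle = find_cycle(plan)
print(cycle)
```

The point of running a check like this at planning time is that the loop shows up as a named path through the modules, which is far cheaper to fix in a plan file than in merged code.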

3. /draft:implement — build against the plan, not the vibe

The agent executes the plan step by step, grounded in the spec and the graph. Before it edits a hotspot it can run an impact query — the real downstream files, tests, and configs a change touches — and adjust the approach accordingly. It writes tests, follows your conventions because they're written down, and leaves an impact record behind so the next change knows what this one touched.
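
An impact query is, at its core, a walk over reversed dependency edges. The sketch below shows the idea on a toy graph; it assumes the graph is available as a plain dictionary, which is a simplification. The `blast_radius` function and every file name here are hypothetical and do not reflect the actual interface of Draft or its graph engine.

```python
from collections import deque
from typing import Dict, List, Set


def blast_radius(depends_on: Dict[str, List[str]], changed: str) -> Set[str]:
    """Everything that directly or transitively depends on `changed`."""
    # Invert the edges: for each node, record who depends on it.
    dependents: Dict[str, List[str]] = {}
    for source, targets in depends_on.items():
        for target in targets:
            dependents.setdefault(target, []).append(source)

    impacted: Set[str] = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for dependent in dependents.get(node, []):
            if dependent not in impacted and dependent != changed:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted


# Hypothetical edges, read as "key depends on each value".
depends_on = {
    "api/handlers.py": ["utils/format.py"],
    "billing/invoice.py": ["utils/format.py"],
    "tests/test_invoice.py": ["billing/invoice.py"],
    "cli/report.py": ["billing/invoice.py"],
    "docs/build.py": [],
}
impacted = blast_radius(depends_on, "utils/format.py")
print(sorted(impacted))
```

Editing `utils/format.py` in this toy graph reaches two direct dependents and, through `billing/invoice.py`, a test and a CLI script that never appear in the diff. That second hop is what an agent reading only the open files cannot see.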

4. /draft:review — grade the work against intent

A three-stage review: automated validation (build, tests, lint), spec compliance (did it do what the spec asked?), and code-quality analysis — including a multi-dimension bug hunt that looks for the failure modes humans skim past. The output is findings tied to the spec, not a vibe-check on a diff.
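
As a rough mental model, the three stages can be pictured as functions that each return findings, run in order against the same change. The sketch below is a toy illustration under that assumption, not how Draft is built: the `Change` and `Finding` types, the stage functions, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Finding:
    stage: str
    message: str


@dataclass
class Change:
    checks_passed: Dict[str, bool]  # e.g. build, tests, lint
    criteria_met: Dict[str, bool]   # acceptance criterion -> satisfied?
    quality_notes: List[str]        # issues surfaced by a bug hunt


def automated_validation(change: Change) -> List[Finding]:
    return [Finding("validation", f"{name} failed")
            for name, ok in change.checks_passed.items() if not ok]


def spec_compliance(change: Change) -> List[Finding]:
    return [Finding("spec", f"unmet criterion: {criterion}")
            for criterion, met in change.criteria_met.items() if not met]


def code_quality(change: Change) -> List[Finding]:
    return [Finding("quality", note) for note in change.quality_notes]


def review(change: Change) -> List[Finding]:
    findings: List[Finding] = []
    for stage in (automated_validation, spec_compliance, code_quality):
        findings.extend(stage(change))
    return findings


# Hypothetical change: everything is green, yet the spec is not satisfied.
change = Change(
    checks_passed={"build": True, "tests": True, "lint": True},
    criteria_met={
        "returns 429 when the limit is exceeded": True,
        "limit is configurable per API key": False,
    },
    quality_notes=["counter is not reset when the window rolls over"],
)
findings = review(change)
for finding in findings:
    print(f"[{finding.stage}] {finding.message}")
```

The sample change passes build, tests, and lint and still produces two findings, which is the gap between "it runs" and "it's correct" that a spec-backed review is meant to close.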

That's the public surface most people need. Underneath sit routers (/draft:ops, /draft:docs, /draft:discover, /draft:jira) dispatching specialist skills — but you can ship real work knowing only the four above.

The Difference, Side by Side

Vibe-coding a change | The same change with Draft
Context is whatever you pasted this session | Context loaded from a persistent repo map + code graph
Retrieval ranks chunks by vector similarity | Retrieval reasons over a page index for relevance
"Why" lives in your head | "Why" lives in a spec, versioned with the code
Agent edits blind to callers | Impact query shows the blast radius first
Review = skim the diff | Review = build + spec compliance + bug hunt
Next session starts from zero | Next session inherits specs, plans, impact memory
Your code shipped to a SaaS for "context" | Everything local; nothing leaves your machine

Local by Construction

Most "AI understands your codebase" products improve context by shipping more of your code to a third party for indexing. Draft goes the other way. The context files are plain markdown in your repo. The knowledge graph is built and queried by a local engine installed alongside the plugin — no MCP server to host, no SaaS, no cloud, no API key, no leaked code. It works on an air-gapped machine. Your security team can read every byte of what Draft produces because it's all sitting in draft/, in git.

What This Means for the Three People Reading This

If you're an engineer: you stop being the context bus. You describe the change once, the agent plans it, builds it against that plan, and reviews it against the spec — and the next change starts where this one left off instead of from a blank prompt.

If you're an engineering leader: AI-assisted work becomes legible. Every change has a spec, a plan, and a review trail in the repo — the same artifacts you'd want from a human, now generated as a byproduct of the workflow rather than nagged into existence.

If you're in a regulated or air-gapped shop: there's nothing to procure and nothing to send out. No vector database, no embedding API, no external indexer. The whole thing is markdown, shell helpers, and a local graph engine.

The Larger Argument

The leap from "AI can write code" to "AI can do engineering" isn't a bigger model. It's the scaffolding around the model: a memory of the system, intent captured as a spec, a plan to work against, and a review that grades the result. Humans figured this out decades ago and called it engineering discipline. Draft's bet is simple — give the same discipline to the agent, keep it local, keep it free, and let the model do what it's genuinely great at inside guardrails that catch what it's genuinely bad at.

Try It

# Install Draft (Claude Code plugin)
npx @drafthq/draft install claude-code

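cd your-repo

# The /draft:* lines below are slash commands: run them inside a Claude Code session in this repo, not in the shell
/draft:init                 # build the repo map + local code graph (run once)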

# Then ship work, the disciplined way:
/draft:plan      "add rate limiting to the public API"   # spec + plan
/draft:implement                                          # build against the plan
/draft:review                                             # validate, check the spec, hunt bugs

It's free and open source. Point it at a repo you know well and watch the first /draft:review find something you'd have shipped.