r/SpecDrivenDevelopment

I built the same app with GitHub Spec Kit and then OpenSpec: Tutorial & Comparison

Been using spec-driven development for a few months and kept seeing Spec Kit vs. OpenSpec comparisons that felt like they were written by people who hadn't actually used both. So I just built the same app twice.

Same stack (Next.js, Drizzle, PostgreSQL), same feature set, different SDD framework each time. Tracked token costs, session times, where I got stuck, where the AI went sideways.

A few things surprised me, particularly around token economics and how differently each tool handles context.

Curious if others have done similar experiments or landed somewhere different. I've compiled 2 tutorials + my findings and opinionated conclusions in a blog post and a few associated repos - will post them in a comment below.

What's your take on Spec Kit vs. OpenSpec?

reddit.com
u/redditfroggie — 1 day ago

GitHub SpecKit + Agile

Has anyone here had any luck trying to implement speckit with a large team of developers all working on the same application?

Typically, we review our user stories during sprint planning, and then the developers choose and work on their user stories during the sprint. Pretty standard.

I feel like SDD isn’t meshing well with how we typically do things. I know we will have to adapt, but just curious if anyone else has faced this and what they’ve done to navigate it.

reddit.com
u/EmbarrassedHumor9295 — 2 days ago

Tried GitHub's spec-kit with Claude Code for 2 months — notes on what works and what doesn't

Been experimenting with Spec-Driven Development for a couple of months now, specifically GitHub's spec-kit toolkit with Claude Code as the agent. Wanted to share notes because I think this sub will have strong opinions on it, and frankly I'm still figuring parts of it out.

Quick definition for anyone who hasn't seen spec-kit: it's GitHub's official toolkit for what they call Spec-Driven Development. The philosophy is that the spec, not the prompt, becomes the source of truth. You write a versioned, reviewable spec; the agent generates code from it; any substantial change goes back to the spec first. Five phases: Constitution, Specify, Plan, Tasks, Implement. Repo: github.com/github/spec-kit

What's actually good:

- Agent-agnostic. Same spec works with Claude Code, Cursor, Codex, Gemini CLI, Copilot. I've literally generated initial code with Claude Code, then handed the spec to Cursor for test refactoring, and it picked up cleanly. The spec is the portable asset.

- Hard checkpoints between phases. You see the full proposed architecture (Plan phase) before a single line of code gets written. Catches bad arch decisions when they cost 5 minutes to fix instead of 5 hours.

- The Constitution file as quality gate. You define inviolable principles up front (test coverage minimums, dependency allowlists, perf budgets, typing strictness). Agent fails its own validation if it tries to violate them.

- Determinism improves a lot vs. raw prompting. The agent isn't filling in 30 implicit decisions on its own — they're in the spec. Re-running the implement phase produces much more consistent output across runs.

What annoys me:

- Drift is real. If you tweak code manually without updating the spec, things desync fast. spec-kit has some tooling for this but it's young.

- Heavy overhead for small changes. Bug fixes <50 LOC or trivial features make the 5-phase flow feel ceremonial. My current rule: only do full SDD for new modules or features touching 200+ LOC. Below that, just do it manually.

- Legacy migration is painful. Retrofitting SDD onto an existing 30k-LOC codebase without prior specs is months of work, not days. Haven't found a clean approach yet.

- Quality depends heavily on the agent. Claude Code (Sonnet/Opus 4.6+) handles it well. Smaller models struggle with the Plan phase — they generate plans that compile but don't reflect good architectural reasoning.

Practical setup I'm using now:

- spec-kit installed via: uv tool install --from git+https://github.com/github/spec-kit.git specify-cli (PSA: PyPI has typosquatters with similar names. Only the github/spec-kit repo is official.)

- Claude Code as primary agent. Have also tested with Cursor and Gemini CLI for cross-validation.

- SQLite for any local persistence needs in the project. Easy to spec, easy to validate, no cloud dependency to mock.

- A reusable constitution template I've extracted: strict typing, pytest coverage >80%, explicit dependency allowlist, no cloud services unless requirement explicitly demands it.
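A minimal sketch of how the constitution rules above could be checked mechanically. This is not spec-kit's own validation: the `Constitution` class, `check_constitution`, and all field names are hypothetical, and the coverage figure and dependency list are assumed to come from whatever tooling the project already runs.

```python
from dataclasses import dataclass, field


@dataclass
class Constitution:
    """Hypothetical container for the rules in the template; not a spec-kit type."""
    min_coverage: float = 80.0                 # coverage must be strictly above this
    allowed_dependencies: set = field(default_factory=set)
    allow_cloud_services: bool = False         # only True if a requirement demands it


def check_constitution(rules, coverage_percent, dependencies, uses_cloud):
    """Return a list of violations; an empty list means the gate passes."""
    violations = []
    if coverage_percent <= rules.min_coverage:
        violations.append(
            f"coverage {coverage_percent:.1f}% is not above {rules.min_coverage:.1f}%"
        )
    # Anything imported that is not explicitly allowed counts as a violation.
    for name in sorted(set(dependencies) - rules.allowed_dependencies):
        violations.append(f"dependency '{name}' is not on the allowlist")
    if uses_cloud and not rules.allow_cloud_services:
        violations.append("cloud service used without an explicit requirement")
    return violations
```

Returning a list of violations rather than raising on the first one lets the agent (or a CI job) report everything that needs fixing in one pass.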

Two questions for the sub:

  1. Has anyone gotten local models (Qwen, DeepSeek-Coder, GLM, Llama) to handle the Plan and Implement phases competently? My local-only experiments have been mixed — small models follow the format but architectural reasoning falls apart. Curious if anyone's found specific local models or prompt engineering tricks that fit spec-kit's phase structure.
  2. Anyone running SDD multi-agent (one model writes spec, another implements, a third audits)? Theoretically should improve quality through specialization but I haven't gotten it to be measurably better than single-agent in practice.

Curious if anyone has a setup that actually works.

u/jokiruiz — 5 days ago

I stopped organizing AI agents like an Agile team and built a Mafia instead

I built Gangsta Agents (GitHub), a skills framework for spec-driven AI development. I wanted to share it here because this community actually cares about the problem it solves.


Why a Mafia family, not an Agile team

A lot of SDD frameworks organize agents to mimic Agile teams — standups, sprints, backlogs, story points. I think that's the wrong metaphor. Agile was designed around human coordination costs. AI agents don't need standups. They need hierarchy, discipline, and enforced pipelines.

Gangsta Agents is inspired by the structure of a Mafia family. There's a Don (you) at the top who approves every phase gate. An Underboss decomposes work. Crew Leads orchestrate. Workers execute in parallel. No one freelances outside their role. No one skips a step.

A well-run Mob moves faster than a committee — and so do agents when you stop pretending they need Agile rituals.


The core idea

Every feature goes through a 6-phase pipeline called The Heist:

  1. Reconnaissance — gather intel before any design decisions
  2. The Grilling — adversarial debate (more on this below)
  3. The Sit-Down — produce a signed Contract spec; no code without one
  4. Resource Development — execution planning
  5. Execution (The Hit) — code runs against the spec, never the other way around
  6. The Delivery — wrap up and update institutional memory

Each phase is gated. The Don (you) approves before anything moves forward.
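The gating idea can be sketched in a few lines. This is an illustration only, not code from the Gangsta Agents repo: `run_heist`, `run_phase`, and `approve` are hypothetical names, with `approve` standing in for the Don's manual sign-off.

```python
from typing import Callable, List

# Phase names taken from the list above.
PHASES = [
    "Reconnaissance",
    "The Grilling",
    "The Sit-Down",
    "Resource Development",
    "Execution (The Hit)",
    "The Delivery",
]


def run_heist(
    run_phase: Callable[[str], str],
    approve: Callable[[str, str], bool],
) -> List[str]:
    """Run phases in order and stop at the first one that is not approved."""
    completed = []
    for phase in PHASES:
        output = run_phase(phase)          # agent work for this phase
        if not approve(phase, output):     # the gate: nothing moves on without sign-off
            break
        completed.append(phase)
    return completed
```

For example, `run_heist(agent_step, lambda phase, out: input(f"Approve {phase}? [y/n] ") == "y")` would prompt at every gate, where `agent_step` is whatever function drives the agent.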


What makes it different — The Grilling

Most spec-driven frameworks get you to write a spec before coding. Gangsta Agents adds a step most frameworks skip: adversarial validation before the spec is finalized.

The Grilling runs two agents in structured debate — a Proposer argues for the best approach from the Dossier, a Devil's Advocate attacks every assumption, identifies edge cases, and proposes alternatives. Multiple rounds run until positions stabilize, then the Grilling Conclusions are documented and approved before you ever write a Contract.

The result: weak assumptions get caught through debate, not production failures. I've found this is where single-perspective design silently goes wrong — no one challenges the obvious approach until it's too late.
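A rough sketch of the debate loop, assuming "positions stabilize" means the proposal stops changing or no objections remain. The function and parameter names are hypothetical; `propose` and `attack` would wrap calls to the two agents.

```python
from typing import Callable, List, Tuple


def run_grilling(
    propose: Callable[[List[str]], str],
    attack: Callable[[str], List[str]],
    max_rounds: int = 5,
) -> List[Tuple[int, str, List[str]]]:
    """Alternate Proposer and Devil's Advocate until the proposal stabilizes.

    propose(objections) returns a proposal that answers the last objections.
    attack(proposal) returns the objections the proposal still fails to answer.
    """
    transcript = []
    objections: List[str] = []
    previous = None
    for round_no in range(1, max_rounds + 1):
        proposal = propose(objections)
        objections = attack(proposal)
        transcript.append((round_no, proposal, objections))
        # Stable: nothing left to attack, or the Proposer has stopped moving.
        if not objections or proposal == previous:
            break
        previous = proposal
    return transcript
```

The returned transcript is the raw material for the documented conclusions; `max_rounds` is a safety cap so two stubborn agents cannot loop forever.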


What makes it different — The Ledger

AI sessions are stateless. Every session starts from zero unless you do something about it. Most frameworks don't.

The Ledger is persistent institutional memory stored in docs/gangsta/ in your project. It tracks:

  • Insights — successful patterns worth reusing
  • Fails — documented mistakes and why they happened
  • Constitution — project-specific rules that accumulate over time

The Ledger is updated at the end of every Heist. Any new session can read it and pick up where the last one left off — the agent knows the codebase's quirks, the gotchas, and the agreed-upon rules without you re-explaining them.

Without this, every session re-discovers the same things. Insights don't compound. Fails repeat.
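As an illustration of the pattern (not the project's actual file layout), a ledger can be as simple as append-only markdown files that each new session reads first. The directory `docs/gangsta/` comes from the description above; the per-section file names and both function names are assumptions.

```python
from pathlib import Path

LEDGER_DIR = Path("docs/gangsta")                  # directory named above
SECTIONS = ("insights", "fails", "constitution")   # file names are assumptions


def record(section: str, entry: str, root: Path = LEDGER_DIR) -> None:
    """Append one bullet to a ledger section at the end of a Heist."""
    if section not in SECTIONS:
        raise ValueError(f"unknown ledger section: {section}")
    root.mkdir(parents=True, exist_ok=True)
    with open(root / f"{section}.md", "a", encoding="utf-8") as fh:
        fh.write(f"- {entry}\n")


def load_ledger(root: Path = LEDGER_DIR) -> str:
    """Concatenate all existing sections into one block of session context."""
    parts = []
    for section in SECTIONS:
        path = root / f"{section}.md"
        if path.exists():
            parts.append(f"## {section}\n{path.read_text(encoding='utf-8')}")
    return "\n".join(parts)
```

Plain markdown keeps the memory reviewable in pull requests and readable by any agent, which fits the no-lock-in goal.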


Who it supports

Native integrations for Claude Code, GitHub Copilot, Gemini CLI, OpenCode, and Codex. Cursor supported via npx skills add. Skills are pure markdown files — no vendor lock-in, any agent can read them.


Happy to answer questions about the design decisions. The spec-is-law principle (Omerta rule #5: code contradicts spec → revise the spec first, never the reverse) was the hardest constraint to enforce in practice, and The Grilling emerged directly from watching LLMs confidently pick the wrong approach with no one to push back.

🔗 https://gangsta.page 🐙 https://github.com/kucherenko/gangsta

reddit.com
u/Affectionate-Blood92 — 6 days ago

Why Specification-Driven Development (SDD) is Not a Silver Bullet for AI-Assisted SDLC

When people with limited real-world experience show up declaring that “specs” are the silver bullet for software development, my mind immediately goes back to The Mythical Man-Month. The book was first published in 1975 and few people read it now, but it is still deeply relevant.

Fred Brooks wrote, “The complexity of software is an essential property, not an accidental one. Hence, descriptions of a software entity that abstract away its complexity often abstract away its essence.”

In other words, a specification that abstracts away the complexity also abstracts away the essence. In short, code remains the ultimate specification: a spec can never be as detailed as the code and is essentially a loose abstraction of it.

Good design and algorithmic ideas are crafted while coding.

More here: https://www.linkedin.com/pulse/why-specification-driven-development-sdd-silver-bullet-alex-punnen-6ybvc/

There is also a repo for testing the thesis experimentally: https://github.com/alexcpn/speckit_test

u/alexcpn — 7 days ago

In this video, I walk through a custom OpenSpec schema that formally captures Architectural Decision Records (ADRs) and preserves them in a persistent folder. This ensures that every new change proposal "reads" your previous tech choices (like moving from Server Side Rendering to a split frontend/backend) before suggesting new designs. Would love to hear your thoughts and feedback.

u/harikrishnan_83 — 8 days ago

Spec-Driven Development with OpenSpec and OpenCode

My OpenCode + OpenSpec setup for Spec-Driven Development with skills for git commit discipline, interviewing during proposal creation using grill-me, C4 Diagrams during design, Architectural Decision Records for durable technical choices and Custom OpenSpec Schema to bring this all together. Thanks.

youtu.be
u/harikrishnan_83 — 8 days ago

OpenSpec template — spec-driven dev for fork-and-go

GitHub repo:

https://github.com/arananet/openspec-template

Template I use for every new project. Core rule: every feature/bugfix needs a YAML spec (acceptance criteria + test plan) before code. Enforced by a pre-commit hook, a deterministic CI check, and an agentic spec-vs-code review.
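To make the "spec before code" rule concrete, here is a sketch of the kind of deterministic check a hook or CI job could run on an already-parsed spec. The template itself uses a bash CLI, so this is only an illustration in Python; the key names `acceptance_criteria` and `test_plan` and the function name are assumptions, and loading the YAML would need a parser such as PyYAML.

```python
from typing import List

# Assumed key names for the two parts every spec must contain.
REQUIRED_FIELDS = ("acceptance_criteria", "test_plan")


def validate_spec(spec: dict) -> List[str]:
    """Return a list of problems; an empty list means the spec is acceptable."""
    errors = []
    for key in REQUIRED_FIELDS:
        value = spec.get(key)
        # Each required section must be present and contain at least one item.
        if not isinstance(value, list) or not value:
            errors.append(f"'{key}' must be a non-empty list")
    return errors
```

A hook would call this for every changed spec file and exit non-zero if any list of errors is non-empty.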

Setup is one command (bash setup.sh).

When you open the fork in Claude Code, it reads CLAUDE.md, interviews you for project details, customizes the README, and scaffolds your first spec. Same instructions apply to Codex CLI and Copilot via AGENTS.md and .github/copilot-instructions.md.

What's in the box: CodeQL, gitleaks, dep-review, OSSF Scorecard, SBOM + cosign signing + SLSA provenance on releases, DCO, doc-drift check, lint stack, Dependabot auto-merge for patches, cost-capped AI workflows, optional CODEOWNER-gated issue auto-fix agent.

Local scripts/openspec CLI (pure bash) handles scaffold/check — no external dependency.

MIT, feedback welcome.

u/arananet — 14 days ago