AI Coding Tools Are Fast. Your Workflow Is Still Broken.
Here's what "AI-assisted development" looks like on most teams: someone drops a three-line Jira ticket. One dev reads it one way, another reads it differently, and the AI agent someone plugged in last week hallucinates a third interpretation. Three days later, you're on a call trying to figure out why nothing matches what anyone wanted.
The AI types fast. You still do all the thinking, debugging, and cleanup. Nobody wants to admit the "10x productivity" they tweeted about is just moving the mess around.
The problem isn't the AI. It's everything around it.
The Real Issue Is Structure, Not Speed
Most AI coding tools treat every interaction as a one-shot. You prompt, it generates, you deal with the consequences. The agent has zero context about your feature scope. It doesn't know what's done, what's left, or how the pieces connect. It won't test its own output. You become the project manager, QA engineer, and cleanup crew.
Worse, your agents wake up with amnesia every morning. The reasoning behind the code they wrote yesterday? Gone. The decision your team argued about for an hour? No idea. What another agent is building right now in a parallel session? Invisible. Multiple amnesiac agents working together aren't a team. They're strangers with a shared git repo.
The answer isn't switching tools. It's building the layer underneath them — specs, memory, coordination, versioning. The discipline we already apply to every other part of the stack but somehow forgot to apply to AI.
Open Spec: Think Before You Build
Most teams already have "specs" — sitting in a Notion page someone wrote six months ago, half-finished, quietly ignored. Open Spec is different not because it's a new tool, but because it forces a habit teams resist: think before you build.
Before anyone writes code, you define what you're building. What it does. What it doesn't do. What "done" looks like. Clearly enough that a dev who joined today could read it and immediately know what to do.
A real Open Spec is short. Take a "retry failed payments" feature:
Goal: auto-retry failed payments within 5 minutes.
Non-Goals: no UI changes.
API: one endpoint, idempotent.
Edge Cases: gateway timeout retries once, duplicates rejected.
Definition of Done: tests cover the edge cases, spec updated if behavior changes mid-sprint.
That's it. No 20-page doc. It lives at /specs/feature-name.md in the repo, updated with every PR. Hand it to five AI agents in parallel and they actually execute correctly — not because of magic, but because someone did the thinking upfront.
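To show how little guesswork a spec like that leaves, here is a minimal Python sketch of its edge cases. Everything named here (RetryService, GatewayTimeout, the charge callable, the idempotency key) is a hypothetical stand-in rather than a real gateway API, and the 5-minute scheduling window is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Set


class GatewayTimeout(Exception):
    """Hypothetical error raised when the payment gateway times out."""


@dataclass
class RetryService:
    # Hypothetical gateway call; assumed to raise GatewayTimeout on a timeout.
    charge: Callable[[str], None]
    seen_keys: Set[str] = field(default_factory=set)

    def retry_payment(self, payment_id: str, idempotency_key: str) -> str:
        # Edge case: duplicates rejected, so a repeated request never charges twice.
        if idempotency_key in self.seen_keys:
            return "duplicate_rejected"
        self.seen_keys.add(idempotency_key)

        # Edge case: a gateway timeout retries once, so at most two attempts.
        for _attempt in range(2):
            try:
                self.charge(payment_id)
                return "succeeded"
            except GatewayTimeout:
                continue
        return "failed"
```

Each branch maps to one line of the spec, which is also what makes the Definition of Done checkable: one test per edge case.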
The trap most teams fall into: they write a solid spec at the start, then mid-sprint a decision gets made in a Slack thread and never makes it back. Two weeks later the spec is lying. An agent reading a stale spec doesn't get confused — it executes confidently in the wrong direction. The new habit: when you touch the code, you touch the spec. Treat it like tests. Not optional. Part of done.
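One way to make that habit hard to skip is a small check in CI. The sketch below is only an illustration: it assumes code lives under src/ (an assumption; only the specs directory comes from the convention described earlier) and that your pipeline can hand it the list of files changed in a PR.

```python
from pathlib import PurePosixPath
from typing import Iterable


def spec_drift(changed_files: Iterable[str],
               code_dir: str = "src",
               spec_dir: str = "specs") -> bool:
    """Return True when code changed in a PR but no spec file did."""
    paths = [PurePosixPath(p) for p in changed_files]
    # A file counts as code or spec based on its top-level directory.
    code_changed = any(p.parts[:1] == (code_dir,) for p in paths)
    spec_changed = any(
        p.parts[:1] == (spec_dir,) and p.suffix == ".md" for p in paths
    )
    return code_changed and not spec_changed
```

A check like this cannot tell whether the spec edit is meaningful, only whether it happened. It works as a nudge during review, not as proof the spec is honest.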
Memory, Awareness, Coordination
Specs solve upfront thinking. But agents still need a brain underneath them — memory that persists, awareness of what other agents are doing, coordination that doesn't require babysitting every session.
Three open-source tools stack to provide exactly that. Dolt is a SQL database that works like Git: changes are grouped into commits, every commit has history, and you can branch, diff, and roll back to any point. Your code already has version control. Dolt gives your data the same treatment. Without it, agent memory is sand. With it, bedrock.
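To make "version control for data" concrete, here is a toy in-memory sketch of the idea. It is not Dolt's API (Dolt exposes versioning through SQL and a Git-style command line); VersionedMemory and its methods are invented purely for illustration.

```python
import copy
from typing import Any, Dict, List, Tuple


class VersionedMemory:
    """Toy key-value store with Git-like commits and rollback."""

    def __init__(self) -> None:
        self._working: Dict[str, Any] = {}
        self._commits: List[Tuple[str, Dict[str, Any]]] = []

    def set(self, key: str, value: Any) -> None:
        self._working[key] = value

    def get(self, key: str) -> Any:
        return self._working.get(key)

    def commit(self, message: str) -> int:
        # Snapshot the working state; the list index acts as the commit id.
        self._commits.append((message, copy.deepcopy(self._working)))
        return len(self._commits) - 1

    def log(self) -> List[str]:
        return [message for message, _snapshot in self._commits]

    def rollback(self, commit_id: int) -> None:
        # Restore the working state to exactly what was committed.
        self._working = copy.deepcopy(self._commits[commit_id][1])
```

The point of the sketch is the shape of the guarantee: anything an agent wrote can be inspected in history and undone, rather than silently overwritten.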
Beads sits on top. Every task, decision, and dependency lives there — structured, connected, permanently stored. When an agent starts a session, it reads Beads first and immediately knows the full picture. When it finishes, it writes back. The next agent picks up where the last one left off. Update one Bead and every agent sees it instantly.
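The read-first, write-back handoff can be sketched in a few lines of Python. This is a simplified model, not the Beads data format or CLI: the Bead fields and the BeadStore methods are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Bead:
    id: str
    title: str
    status: str = "open"  # "open" or "done" in this sketch
    depends_on: List[str] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)


class BeadStore:
    def __init__(self) -> None:
        self._beads: Dict[str, Bead] = {}

    def add(self, bead: Bead) -> None:
        self._beads[bead.id] = bead

    def get(self, bead_id: str) -> Bead:
        return self._beads[bead_id]

    def start_session(self) -> List[Bead]:
        # Session start: read everything that is not finished yet.
        return [b for b in self._beads.values() if b.status != "done"]

    def finish(self, bead_id: str, note: str) -> None:
        # Session end: write the outcome back for the next agent.
        bead = self._beads[bead_id]
        bead.status = "done"
        bead.notes.append(note)
```

The second agent never needs the first agent's chat history. The note and the status change are the handoff.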
Gas Town is the orchestrator. It reads the tasks and dependencies in Beads, figures out what can run in parallel, spins up your agents, and assigns each a specific task. As they work, progress flows back to Beads. Finished work merges cleanly into main. You stop managing chaos and start reviewing what shipped.
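"Figures out what can run in parallel" is, at its core, a dependency-graph problem. The sketch below shows one common way to do it, by grouping tasks into batches whose dependencies are already done. It is a generic illustration, not Gas Town's actual scheduler, and the task names in the usage example are made up.

```python
from typing import Dict, List, Set


def parallel_batches(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into batches; tasks within a batch can run in parallel.

    `deps` maps each task to the set of tasks it depends on.
    """
    remaining = {task: set(needs) for task, needs in deps.items()}
    done: Set[str] = set()
    batches: List[List[str]] = []
    while remaining:
        # A task is ready once all of its dependencies have completed.
        ready = sorted(t for t, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        batches.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return batches


plan = parallel_batches({
    "schema": set(),
    "endpoint": {"schema"},
    "retry_job": {"schema"},
    "tests": {"endpoint", "retry_job"},
})
# plan == [["schema"], ["endpoint", "retry_job"], ["tests"]]
```

Here two agents could take "endpoint" and "retry_job" at the same time, while "tests" waits for both. A cycle raises an error instead of stalling silently.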
The Tooling Catches Up
Two tools worth knowing show where this is heading.
Kiro is an AI IDE that doesn't start by writing code — it starts by thinking. Give it a feature requirement and it generates a spec first. Real requirements, structured design, task-by-task plan. Then it codes against that plan, one task at a time, testing as it goes. Every line ties back to a requirement. A built-in task tracker loops — plan, code, test, improve — until the feature is actually done. Not until it runs out of tokens. Done. Agent hooks fire on file save: tests run, docs update, quality checks kick in. The stuff you forget gets handled every time.
Tessl tackles a different problem: prompts themselves. On most teams, prompts aren't versioned, tested, or shared. One dev's agent writes clean code, another gets garbage from the same tool the same day, and the team shrugs. "AI, right?" No — that's a process problem. Tessl treats AI context like npm packages. The core unit is a Tile that bundles skills, docs, and rules. You install it, version it, share it. When something breaks, you roll back. It tests your context with and without the Tile, giving you actual numbers instead of vibes. Build. Test. Version. Share. A real feedback loop.
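The "with and without" comparison is easy to picture as a tiny evaluation harness. The sketch below is an assumption about the general shape of such a test, not Tessl's implementation or API: run_agent is a hypothetical callable standing in for whatever invokes your agent, and each case pairs a prompt with a check on the output.

```python
from typing import Callable, Dict, List, Tuple

# (prompt, check applied to the agent's output)
EvalCase = Tuple[str, Callable[[str], bool]]
# Hypothetical agent runner: (context, prompt) -> output text
AgentFn = Callable[[str, str], str]


def pass_rate(run_agent: AgentFn, context: str, cases: List[EvalCase]) -> float:
    if not cases:
        raise ValueError("need at least one eval case")
    passed = sum(1 for prompt, check in cases if check(run_agent(context, prompt)))
    return passed / len(cases)


def compare(run_agent: AgentFn, baseline: str, extra_context: str,
            cases: List[EvalCase]) -> Dict[str, float]:
    """Score the same cases with and without the extra context."""
    without = pass_rate(run_agent, baseline, cases)
    with_extra = pass_rate(run_agent, baseline + "\n" + extra_context, cases)
    return {"without": without, "with": with_extra, "delta": with_extra - without}
```

With a real agent you would run each case several times, since outputs vary between runs. The design point is simply that the same cases run against both contexts, so the difference is a number you can track across versions.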
The Mindset Shift
The teams winning with AI won't be the ones with the most tools or cleverest prompts. They'll be the ones who figured out one thing: AI agents are executors, not thinkers.
You still do the thinking. You write the spec, version your context, give agents real memory and shared awareness. Once that work is done — properly, openly, as a team — execution becomes embarrassingly fast.
AI doesn't fix unclear thinking. It scales it. Write the spec. Open it up. Keep it honest. Give your agents a brain. Ship.