Coding Consistently with Agents in 2026
I wrote in 2024, before “vibe coding” was a thing, about using AI to build software. It worked, in a limited way, but only just: you still really needed to know how to code.
In January 2026, the workflow looks completely different. Context windows are bigger, agents are reliable, tool calling has matured, and Claude Code superpowers tie it all together.
Here is how I now build and maintain a ~30,000-line codebase, derived from over 100,000 (!) lines of plans.
First, get a Claude Max subscription. Its limits are huge, and you can code massive apps for a set monthly fee. I think it's the best bang for the buck, at least right now.
Brainstorm
Before a single line of code is written, I plan. I launch Claude Code and start with the /brainstorm tool from superpowers, giving it a one-paragraph summary of what I want to build, emphasizing that it should be built in a modular way. The brainstorm tool asks detailed follow-up questions to add depth to the plan. This process also determines which components make up the 'app' overall. That is incredibly important, as each component has a complexity cap; I'll get into that later.
When brainstorming is complete, the tool generates a design file. I save it as PRD.md.
Now some architecting skill comes in. An app of any useful size can't possibly be fully captured in a single design file. The PRD gives a sense of the components and modules that are required, but I need to brainstorm each module in more detail. I relaunch Claude and prompt:
/brainstorming I want to brainstorm building [module] from @PRD.md, along with the initial app scaffolding.
The tool then focuses all of its attention on designing that single module rather than the app as a whole. Crucially, I know a module is below the complexity cap if its design file is no more than ~300 lines. This matters because Claude has limits on its output tokens: if a design gets too big, the plan generated from it degrades in detail toward the end.
This will end with a completed [module]-design.md file.
Plan
Once the design for the module is created, I scan it, make any adjustments, then move on to the /write-plan phase. This converts the design into a detailed implementation plan that the agent will follow. Again, I will know that the module is below the complexity cap if the implementation plan is no more than ~1,500 lines.
I now have a high-level design of the app (PRD.md), and a design and implementation plan for the app's scaffold and first module ([module]-design.md and [module]-implementation-plan.md).
I make note of any external APIs that might be necessary, and create a new CLAUDE.md file. In that file, I'll put URLs to API docs, or other rules I want the implementer to follow, for example, "NEVER use mock data, or fallbacks. If live data is not available, the app simply should not work."
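For illustration, a minimal CLAUDE.md along these lines might look like the sketch below. The section names, the placeholder URL, and the commit rule are my own illustrative choices, not prescribed by the workflow:

```markdown
# CLAUDE.md

## External API docs
- Payments API reference: https://example.com/payments/api-docs (placeholder)

## Rules for the implementer
- NEVER use mock data, or fallbacks. If live data is not available, the app
  simply should not work.
- Commit after each completed task, with a descriptive message.
```

Keeping this file short and declarative matters: it is injected into every session, so each line spends context-window budget.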
Implement
Now, finally, the coding begins. I switch to the /execute-plan tool, point it at @[module]-implementation-plan.md, and let Claude rip.
Claude builds incrementally, one task at a time, strictly bounded by the plan. It performs regular commits, and documents as it goes.
When it's finished, it should feel like Christmas morning, waking up to a gift waiting in the terminal: a bootstrapped app with its first module.
Depending on its size, I'll sometimes open a fresh Claude Code instance and use the /requesting-code-review superpower, along with the prompt "thoroughly review @[module]-implementation-plan.md. Ensure everything was coded and wired up." If a plan is very large, or one of many phases for a single module, sometimes things slip through the cracks. This little intermediate step addresses that.
After testing to make sure the module matches the vision, I go back to /brainstorm mode and work on the next module.
Once every module has (at least) a design file, I delete PRD.md, as the plans diverge from it enough that it's no longer accurate.
Clean up
When all module development is complete, there will inevitably be some redundancy and inconsistency across the modules. This happens even among the best human development teams. The app should run just fine, but it might reimplement the same function multiple times, or break too easily after future code changes. The clean-up step alleviates that quite a bit.
I open a fresh Claude instance and, module by module, submit the following prompt:
/brainstorming look at the XX module. identify areas of brittleness. or where npm packages could simplify. or where a refactor could improve things. or where code is no longer used. or where something does its own implementation but it already exists elsewhere. imagine being a senior engineer taking it over, and meeting their expectations.
This will identify issues big and small, and find areas to combine things. Once I do this across every module, I do a final pass across the whole app. (This wasn't possible until Opus 4.6 with its 1M token context window.)
/brainstorming look at this entire app. identify areas of brittleness. or where npm packages could simplify. or where a refactor could improve things. or where code is no longer used. or where something does its own implementation but it already exists elsewhere. or where there is no single source of truth. imagine being a senior engineer taking it over, and meeting their expectations.
At this point I should have a relatively clean, production-ready base of code.
If I feel that the app is "freezable," I'll ask Claude to review the app and write a thorough README.md.
Final thoughts
At the outset, my default stance is to not code anything. I rigorously consider whether the agent is going to "build for building's sake" or actually create something that's truly needed. If I kept prompting "find things that are missing and add them," I could easily end up with a note-taking app that is a million lines of code. So I've found it's important to reality-test everything that might be added, so the final product stays focused.
This process yields the highest-quality AI-generated code I've managed to produce. The code isn't perfect, and probably wouldn't pass the vibe-check of any senior engineer. But for end-users, it's a product that otherwise wouldn't exist, and that's enough.
On finding new tools
Most of these new coding tools are just clever system prompts chained together. The bulk of the value of superpowers rests in this single file. I'm also exploring incorporating Karpathy's brilliant autoresearch into my repos, and just installed superset. Plus, Claude's built-in plan mode itself has gotten so good, I wonder if I even need superpowers at all. All of that to say: the ground is shifting quickly beneath our feet, and the job to do is surf it.
On CLAUDE.md rules
While a CLAUDE.md should reflect the app being built, there is one consistent rule I add in all of my projects:
- this app is being used in a hospital system. Code quality directly impacts patient outcomes. Develop with the highest professional standards. [I put this in no matter what. It seems to yield higher-quality code.]
Written on Jan 5th, 2026