Work / Continual Coder
07 · Open source
Continual Coder
Open source · MIT licence · Creator & sole author · github.com/daz-williams/continual-coder ↗
A minimal, self-improving coding harness for LLM agents - a rapid-prototyping system in a box. An agent fixes failing tests in a repo, and a refiner rewrites the agent's own long-term memory from each trajectory, so it accumulates codebase-specific knowledge and gets smarter with every task.
MIT
licence
4
stacks scaffolded
1
CLI to drive it all
0
cloud dependencies
What it is
Per-task coding assistants are excellent - and static. They start every task from zero. Continual Coder adds the outer loop: after each task, a refiner distils what the agent learned from its own trajectory into persistent, on-disk memory, so the next task starts smarter. Reset-free in spirit - memory survives across tasks, runs, and restarts.
It's a small, readable implementation of the idea behind self-improving agent harnesses (skill libraries, self-rewriting systems), pointed at the one domain with a clean, incorruptible verifier: code that must pass tests. The agent proposes edits, the verifier runs the tests, the loop retries until green - then the learning gets written down.
Everything runs against your own model endpoint using the standard chat-completions API - built for locally hosted open-weights models, so nothing leaves your machine. Each app lives in its own container with its own workspace, memory, and metrics.
The cc CLI
A prototyping system in a box - idea to running app, one command at a time.
-
cc new --wizard - interview to spec
A short Q&A (hard-capped at eight questions) turns your idea into a written spec, a project skeleton, and a first set of tests that are expected to fail - they're the target the build loop drives toward.
-
cc run - the self-improving loop
The agent proposes file edits, the verifier runs the tests, it retries until green - then the refiner distils what it learned into that app's memory so the next task starts smarter.
-
cc serve · cc share · cc summary - run it, demo it, measure it
Start the app's dev server, share a gated tunnel-based demo link, and check the metrics that matter: is the learning actually compounding?
-
cc task - keep iterating
Queue the next feature as a task and run it. Every phase is re-runnable by hand, so a messy bootstrap never means starting over.
Why it matters
It's the distilled version of how I build: agentic loops with a hard verifier, memory that compounds, local-first infrastructure, and container isolation as a default rather than an afterthought. And because it's MIT-licensed and a deliberately small codebase, you can read the whole thing in a sitting - it's my working style, in public.
Stack