continuous-refactoring is an open-source tool that points AI agents at messy codebases and lets them clean up incrementally. The loop is simple: the agent proposes a small refactor, runs your test suite, and keeps the change only if the tests pass; if they fail, the change is thrown out. It supports models like gpt-5.3-codex-spark and claude-opus-4-6, and it's built for the grunt work that never earns a dedicated sprint. Killing dead branches. Simplifying helpers. Fixing names that stopped making sense months ago. The stuff that slowly makes every file a little worse to work in.
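The propose-validate-keep cycle can be sketched in a few lines. This is a toy model, not the tool's actual implementation: `FakeRepo`, `refactor_loop`, and the canned proposals are all hypothetical stand-ins, with the test suite reduced to a callable that says pass or fail.

```python
from dataclasses import dataclass, field

@dataclass
class FakeRepo:
    # Stand-in for a git working tree: an applied diff sits in `pending`
    # until it is either committed or reverted.
    committed: list = field(default_factory=list)
    pending: list = field(default_factory=list)

    def apply(self, diff):  self.pending.append(diff)
    def commit(self, diff): self.pending.remove(diff); self.committed.append(diff)
    def revert(self, diff): self.pending.remove(diff)

def refactor_loop(propose, repo, run_tests, max_iters=10):
    """One small change at a time; the test suite is the only arbiter."""
    kept = []
    for _ in range(max_iters):
        diff = propose()            # agent suggests a small refactor
        if diff is None:
            break                   # nothing left to clean up
        repo.apply(diff)
        if run_tests():
            repo.commit(diff)       # validation passed: keep it
            kept.append(diff)
        else:
            repo.revert(diff)       # tests failed: throw it out
    return kept

# Demonstration with canned proposals; the fake test suite
# rejects the second change.
proposals = iter(["rename helper", "delete live code", "drop dead branch"])
results = iter([True, False, True])
repo = FakeRepo()
kept = refactor_loop(lambda: next(proposals, None), repo, lambda: next(results))
# kept == ["rename helper", "drop dead branch"]
```

The point of the shape is that the agent never gets to argue with a red test run: a failing suite unconditionally reverts the diff, so the worst a bad proposal can do is waste one iteration.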
The real problem was taste. "Good refactor" means different things to different people. Some want comments stripped aggressively; others want boundary comments preserved. So the author built taste.md, a configuration file generated through an interactive interview where the agent asks you pointed questions about your preferences on error handling, abstraction, deletion versus preservation, and module boundaries. These preferences get injected into every refactoring prompt. The tool also supports versioned taste profiles, so you can upgrade your style guidelines without starting over.
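The source doesn't show what a generated taste.md actually contains, but given the interview topics it describes, a hypothetical profile might look like:

```markdown
# taste.md — refactoring preferences (hypothetical example)

## Comments
- Strip redundant inline comments; preserve comments at module boundaries.

## Error handling
- Prefer early returns over deeply nested conditionals.

## Deletion vs. preservation
- Delete dead code outright; never leave it commented out.

## Module boundaries
- Keep public interfaces stable; refactor internals freely.
```

Because the whole file is prepended to every refactoring prompt, the format only has to be legible to the model, so plain markdown headings and bullets are a natural fit.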
For work that's too big for a single pass, there's a migrations workflow. A classifier decides whether a target is "cohesive-cleanup" or "needs-plan." The migrations path runs through a six-stage planner: generate approaches, pick one, expand into phases, review, revise, do a final review. Each phase executes on its own branch and gets flagged for human review if the agent can't honestly verify readiness. The author deliberately kept migrations behind an explicit opt-in flag so the original cleanup loop wouldn't degrade as features got bolted on.
The cleanup loop doesn't know migrations exist.