Claude Code for legacy refactors: a safe workflow

Status: published as a field guide and marked untested by our desk. The steps and prompts follow standard refactoring practice, but validate them against your own codebase before relying on the output.

The task

You inherited a large, under-tested module and need to refactor it without changing behavior or shipping a regression. This workflow uses Claude Code to map the blast radius, lock current behavior behind characterization tests, and refactor in small, reviewable commits you can actually trust.

Before you start

Tools: Claude Code, a git repo with a clean working tree, your existing test runner.
Rule one: never let the agent change behavior and structure in the same commit. Tests first, refactor second, each in its own commit.
Work on a branch. Keep diffs small enough to review by eye.

The workflow

1. Map the blast radius before touching anything

Ask the agent to trace, not edit:

Don't change any code yet. Trace every caller of `OrderService.finalize()`
across this repo. List each call site (file + line), what it passes in, and
what it depends on from the return value or side effects. Flag any caller
with zero test coverage.

Read the output and sanity-check it against your own search, such as a grep for the method name or your IDE's find-usages. This map is what you refactor against.
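
For the independent search, a plain textual scan is often enough to catch call sites the agent missed. The sketch below is an illustration in Python, not part of the original workflow: `find_call_sites` is a hypothetical helper, and the `.py` suffix filter is an assumption you would change for your own language. It matches the method name textually, so it will also report unrelated `finalize` methods and will miss calls made through reflection or aliases.

```python
import re
from pathlib import Path


def find_call_sites(root, method="finalize", suffixes=(".py",)):
    """Return (path, line number, stripped source line) for each textual `.method(` call.

    Hypothetical helper: a text match, not a real call-graph analysis.
    """
    pattern = re.compile(r"\.\s*" + re.escape(method) + r"\s*\(")
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Compare the returned list with the agent's map line by line. Any call site that appears in only one of the two is worth reading by hand before the refactor starts.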

2. Pin current behavior with characterization tests

Before changing logic, capture what the code does today — bugs included:

Write characterization tests for `OrderService.finalize()` that assert its
CURRENT behavior, including the edge cases visible in the call sites above.
Don't fix anything you think is wrong — just lock in today's output so we
can prove the refactor changes nothing. Use <our test framework>.

Run them against the unmodified code. A green run here is the safety net for everything that follows; if a test fails at this point, correct the test, not the code.
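
As a rough picture of what such tests look like, here is a minimal sketch in Python using the standard `unittest` module. The `finalize` function and its order shape are hypothetical stand-ins invented for this example, not code from any real `OrderService`. Two of the tests deliberately pin behavior that looks questionable.

```python
import unittest


def finalize(order):
    """Hypothetical stand-in for the legacy method under test."""
    total = sum(item["price"] * item["qty"] for item in order["items"])
    if order.get("coupon") == "SAVE10":
        total = total - total * 0.10
    return {"total": round(total, 2), "status": "finalized"}


class FinalizeCharacterization(unittest.TestCase):
    """Pins what finalize() returns today, quirks included."""

    def test_coupon_reduces_total(self):
        order = {"items": [{"price": 50.0, "qty": 1}], "coupon": "SAVE10"}
        self.assertEqual(finalize(order), {"total": 45.0, "status": "finalized"})

    def test_lowercase_coupon_is_ignored(self):
        # Possibly a bug, but it is today's behavior, so it gets pinned.
        order = {"items": [{"price": 50.0, "qty": 1}], "coupon": "save10"}
        self.assertEqual(finalize(order), {"total": 50.0, "status": "finalized"})

    def test_empty_order_is_still_finalized(self):
        # Also suspicious, also pinned: the refactor must not change it.
        self.assertEqual(finalize({"items": []}), {"total": 0, "status": "finalized"})
```

Saved as a test module, this runs with `python -m unittest`. The comments marking suspicious behavior double as a to-do list for the later, separate bug-fix commits.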

3. Refactor in small, named steps

Give the agent one transformation at a time, not "clean this up":

With the characterization tests green, extract the discount calculation out
of `finalize()` into a pure function `computeDiscount(order)`. No behavior
change. Run the tests after the edit and show me the diff.

One concept per commit: extract function, rename, inline, split. Commit between each.
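
To make "extract function" concrete, the sketch below shows a before and after in Python. Everything in it is hypothetical: the order shape, the `SAVE10` coupon rule, and the `finalize_before` / `finalize_after` names exist only so both versions can sit side by side. `compute_discount` is the Python-style spelling of the `computeDiscount(order)` named in the prompt.

```python
def finalize_before(order):
    """Hypothetical legacy shape: discount logic tangled into finalize."""
    subtotal = sum(item["price"] * item["qty"] for item in order["items"])
    discount = 0.0
    if order.get("coupon") == "SAVE10":
        discount = subtotal * 0.10
    return {"total": round(subtotal - discount, 2), "status": "finalized"}


def compute_discount(order):
    """Pure function: reads only its argument, performs no I/O, mutates nothing."""
    subtotal = sum(item["price"] * item["qty"] for item in order["items"])
    if order.get("coupon") == "SAVE10":
        return subtotal * 0.10
    return 0.0


def finalize_after(order):
    """Same outputs as finalize_before, with the discount step extracted."""
    subtotal = sum(item["price"] * item["qty"] for item in order["items"])
    return {"total": round(subtotal - compute_discount(order), 2), "status": "finalized"}
```

The extracted function recomputes the subtotal rather than taking it as a parameter. That duplication is left in on purpose: removing it would be a second transformation, and belongs in its own commit.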

4. Verify nothing changed

After each step the characterization suite must still be green. If the agent "improved" behavior, that is a separate, intentional commit — never smuggled into a refactor.
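
When the old and new versions can both be called, a differential check is a useful complement to the test suite. The following is a minimal sketch under that assumption. `outcome`, `behavior_differences`, `old_total`, and `new_total` are all hypothetical names made up for the example. It treats a raised exception as part of the behavior, so a change from "raises" to "returns" counts as a difference.

```python
def outcome(func, arg):
    """Capture either the return value or the exception type and message."""
    try:
        return ("returned", func(arg))
    except Exception as exc:
        return ("raised", type(exc).__name__, str(exc))


def behavior_differences(old, new, inputs):
    """Return (input, old outcome, new outcome) for every input where they disagree."""
    differences = []
    for value in inputs:
        before, after = outcome(old, value), outcome(new, value)
        if before != after:
            differences.append((value, before, after))
    return differences


def old_total(order):
    # Hypothetical legacy code: raises KeyError when "items" is missing.
    return sum(item["price"] * item["qty"] for item in order["items"])


def new_total(order):
    # A smuggled-in "improvement": a missing "items" key now yields 0.
    return sum(item["price"] * item["qty"] for item in order.get("items", []))


if __name__ == "__main__":
    sample_orders = [{"items": [{"price": 5.0, "qty": 2}]}, {}]
    for difference in behavior_differences(old_total, new_total, sample_orders):
        print(difference)
```

An empty result is the goal for a refactor commit. Any entry in the list is a behavior change that should either be reverted or moved into its own intentional commit. The check assumes neither function mutates its input.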

5. Tighten the tests last

Once the structure is clean, replace the brittle characterization assertions with intention-revealing unit tests against the new, smaller functions.
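
The difference is mostly in naming and scope: each test states one rule and exercises one small function. Here is a sketch in Python with `unittest`, assuming a hypothetical `compute_discount` and a made-up `SAVE10` coupon rule. Neither comes from a real codebase.

```python
import unittest


def compute_discount(order):
    """Hypothetical extracted function: 10% off the subtotal for coupon SAVE10."""
    subtotal = sum(item["price"] * item["qty"] for item in order["items"])
    if order.get("coupon") == "SAVE10":
        return subtotal * 0.10
    return 0.0


class ComputeDiscountTest(unittest.TestCase):
    """Each test name states the rule it protects."""

    def test_save10_takes_ten_percent_off_the_subtotal(self):
        order = {"items": [{"price": 50.0, "qty": 2}], "coupon": "SAVE10"}
        self.assertAlmostEqual(compute_discount(order), 10.0)

    def test_order_without_coupon_gets_no_discount(self):
        order = {"items": [{"price": 50.0, "qty": 2}]}
        self.assertEqual(compute_discount(order), 0.0)

    def test_unknown_coupon_gets_no_discount(self):
        order = {"items": [{"price": 50.0, "qty": 2}], "coupon": "BOGUS"}
        self.assertEqual(compute_discount(order), 0.0)
```

A failure in a test like this points at a named rule, whereas a failing characterization assertion only says that some output changed.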

A reusable prompt

Keep this as a standing refactor prompt, for example in the project's CLAUDE.md, which Claude Code loads as project instructions:

You are refactoring legacy code. Rules:
1. Never change behavior and structure in the same commit.
2. Before any change, confirm the characterization tests are green.
3. Make the smallest reviewable change, then stop and show me the diff.
4. If you believe behavior is wrong, say so — do not fix it silently.

Gotchas and limits

The agent will confidently refactor code it only partially traced. Step 1 is non-negotiable; skipping it is how a silent regression lands in an untested caller.
Large files exceed the context the agent can use well. Point the agent at the specific function and its callers, not "the whole module."
Characterization tests can encode bugs as "correct." That's intended — preserve behavior first, fix bugs later as separate, reviewed changes.
Don't trust a green run you didn't watch. Have the agent show the test command and output, or run it yourself.
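
For the last point, running the suite yourself can be a few lines of script. This is a minimal sketch, assuming your tests start from a single command. `run_and_report` is a hypothetical helper, and the `unittest discover` command in the usage example is only a placeholder for whatever your test runner expects.

```python
import shlex
import subprocess
import sys


def run_and_report(command):
    """Run a test command and return the exact command line, exit code, and output."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return {
        "command": shlex.join(command),
        "exit_code": completed.returncode,
        "output": completed.stdout + completed.stderr,
        "green": completed.returncode == 0,
    }


if __name__ == "__main__":
    # Placeholder command: substitute your own test runner invocation.
    report = run_and_report([sys.executable, "-m", "unittest", "discover"])
    print("$", report["command"])
    print(report["output"])
    print("exit code:", report["exit_code"])
```

Treating the exit code as the verdict assumes the runner exits non-zero on failure, which is the usual convention but worth confirming for your tool. Printing the command next to the output makes it obvious when a filtered subset of tests was run rather than the full suite.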


Source: Agentic Daily