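Refactoring Legacy Code

Legacy code is every programmer's nightmare. It isn't code that is "badly written"; it is code whose reasons nobody can explain.

The original author left long ago, there is no documentation, the only comment says "TODO: fix this later" (where "later" was 2017), and tests are out of the question. You don't dare touch it, because you can't tell what chain reaction a change will set off; yet you have to touch it, because it stands in the way of the new feature you need to build.

This is the final dungeon of refactoring. The workflow below comes from the legacy-code refactoring case study in Lao Jin's 100,000-character tutorial, and it is the standard playbook for putting Claude Code to work in this situation.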

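Step 1: Explore first, then cut

The biggest risk in legacy code is the unknown: you don't know who calls a given function, who mutates a given global variable, or when and why some strange branch was added.

Don't spend the main context on exploration. If you have the main agent read through a 50,000-line legacy module, half of its context is gone. Delegate to the Explore subagent instead: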

delegate to the Explore subagent at "very thorough" level:
- map the structure of src/legacy/billing/ — every file, every public function
- for each function, list: callers (grep across repo), external dependencies, side effects (db writes, file IO, global mutations)
- identify the "scary" parts: functions with >100 lines, deeply nested conditionals, functions called from >5 places
return a structured report. don't modify anything

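"don't modify anything" is the red line for Explore. It has no write access to begin with, but stating it again makes sure it only reads, inspects, and summarizes.

This report is the map for everything you do afterwards. Without it, every refactoring is a gamble.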
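One item in the Explore prompt, "functions with >100 lines", can also be checked mechanically. The following is a minimal sketch using Python's standard `ast` module; the function name `find_long_functions` and the sample source are hypothetical, and it covers only Python files, so it would complement the subagent's report rather than replace it.

```python
import ast


def find_long_functions(source: str, max_lines: int = 100) -> list[tuple[str, int]]:
    """Return (function name, line count) for every function longer than max_lines."""
    tree = ast.parse(source)
    long_functions = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # lineno/end_lineno are 1-based and inclusive (Python 3.8+).
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                long_functions.append((node.name, length))
    # Longest first, so the scariest functions top the report.
    return sorted(long_functions, key=lambda item: item[1], reverse=True)


# Hypothetical legacy source: one tiny function and one oversized one.
sample = (
    "def tiny():\n    return 1\n\n"
    "def huge():\n" + "    x = 1\n" * 120 + "    return x\n"
)
report = find_long_functions(sample)
```

In practice you would read each file under the legacy module and pass its text to the function; the 100-line threshold simply mirrors the number used in the prompt.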

Step 2: Draw up a migration plan in Plan Mode

With the map in hand, the next step is to make a plan. Don't rush into changing code.

enter Plan Mode. based on the Explore report, design a migration plan for src/legacy/billing/:

1. identify the "seams" — places where the legacy code can be detached from the rest (interfaces, single-call-site functions, pure utilities)
2. for each seam, propose a replacement strategy: wrap, extract, rewrite, or leave
3. order the migrations by risk (lowest risk first) — pure utilities before business logic, leaf functions before central dispatchers
4. for each step, specify:
   - what changes
   - which tests must pass (existing + new ones to add before touching)
   - how to verify no behavior change (diff strategy, snapshot tests, parallel runs)
5. mark any step that requires behavior change as OUT OF SCOPE — flag, don't fix

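The last item, "OUT OF SCOPE", is the key. Legacy code hides countless behaviors that are "actually a bug, but everyone has gotten used to it". Change them, and downstream code will break in ways you didn't anticipate. During the refactoring phase, move code without changing it; leave the bugs for separate PRs after the refactoring is done.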
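The plan prompt lists "parallel runs" as one way to verify that behavior has not changed. As a rough sketch of the idea, assuming both versions can be called side by side on the same inputs: `legacy_total` and `new_total` below are hypothetical stand-ins, not functions from the billing module. Exceptions are compared too, since "raises TypeError on bad input" is also behavior that callers may rely on.

```python
def legacy_total(items):
    # Hypothetical legacy behavior: silently skips negative prices.
    total = 0
    for price in items:
        if price >= 0:
            total += price
    return total


def new_total(items):
    # Hypothetical refactored version that must match the legacy one.
    return sum(price for price in items if price >= 0)


def outcome(func, arg):
    """Capture either the return value or the type of exception raised."""
    try:
        return ("ok", func(arg))
    except Exception as exc:
        return ("error", type(exc).__name__)


def parallel_run(old, new, inputs):
    """Return every input on which old and new disagree."""
    mismatches = []
    for arg in inputs:
        old_result, new_result = outcome(old, arg), outcome(new, arg)
        if old_result != new_result:
            mismatches.append((arg, old_result, new_result))
    return mismatches


inputs = [[], [100], [100, -5, 250], [1, None]]
mismatches = parallel_run(legacy_total, new_total, inputs)
```

An empty `mismatches` list means the two versions agreed on every sampled input. It says nothing about inputs you did not sample, which is why the plan pairs this with characterization tests.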

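Once the plan is out, audit it. Check:

Do the steps really start from the lowest risk?
Does every step have tests protecting it?
Has any "while I'm at it" optimization been smuggled in?
Is the order sensible (dependencies first, callers after)?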
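The last audit question, dependencies before callers, can be checked against a dependency map if you have one from the Explore report. Below is a minimal sketch using the standard-library `graphlib` module (Python 3.9+); the step names and the `depends_on` map are hypothetical placeholders for whatever the report actually contains.

```python
from graphlib import TopologicalSorter

# Hypothetical plan data: each step maps to the steps that must migrate before it.
depends_on = {
    "date_utils": set(),
    "tax_rules": {"date_utils"},
    "invoice_builder": {"date_utils", "tax_rules"},
    "billing_dispatcher": {"invoice_builder"},
}


def check_plan_order(plan: list[str], depends_on: dict[str, set[str]]) -> list[str]:
    """Return one message per step that is scheduled before one of its dependencies."""
    position = {step: index for index, step in enumerate(plan)}
    violations = []
    for step in plan:
        for dep in sorted(depends_on.get(step, set())):
            if dep not in position or position[dep] > position[step]:
                violations.append(f"{step} is scheduled before its dependency {dep}")
    return violations


# One valid order, derived directly from the dependency map.
valid_order = list(TopologicalSorter(depends_on).static_order())

# A plan that tackles the central dispatcher first should be flagged.
bad_plan = ["billing_dispatcher", "date_utils", "tax_rules", "invoice_builder"]
problems = check_plan_order(bad_plan, depends_on)
```

This only checks ordering. Whether a step is truly low risk still needs human judgment during the audit.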

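Step 3: Replace in small steps, test every step

With the plan settled, start executing. The core principle: each step is small enough to be reverted on its own.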

execute step 1 of the migration plan: extract the date-formatting utility from src/legacy/billing/format.py into src/billing/utils/date.py.

before changing anything:
- write characterization tests for the current behavior of the function being extracted
- run them — they must pass on the legacy code

after extraction:
- the old function should delegate to the new one (keep the legacy entry point alive)
- all characterization tests must still pass
- all existing tests must still pass

commit with message: "refactor(billing): extract date formatting into utils (step 1/N)"

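This prompt breaks "one step" into five things: up-front protection, extraction, two-way compatibility, verification, and commit. Two of its details are the most valuable:

Characterization Tests

This is the core technique Michael Feathers teaches in "Working Effectively with Legacy Code": before touching legacy code, write tests for it that describe its behavior as it is now, not as it should be.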

write characterization tests for the function `calculate_discount` in src/legacy/billing/discount.py.
don't test what the function "should" do — test what it currently does.
include cases for: empty cart, single item, bulk order, expired coupon, negative quantity.
for each case, capture the actual current output as the assertion.

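Note the instruction to test what the function currently does rather than what it "should" do: this is the essence of a characterization test. Even if the function returns a strange result for negative numbers, you record that strange result, because some client may depend on it. If the output changes after refactoring, the test goes red immediately, and you know downstream code would break the same way.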
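To make the idea concrete, here is a minimal sketch of what such a test could look like. The `calculate_discount` body below is a hypothetical stand-in so that the example runs on its own; the real function in src/legacy/billing/discount.py may have a different signature and different rules. The point is only the shape: run the current code once, then pin whatever it returned as literals.

```python
def calculate_discount(quantity, unit_price_cents, coupon_expired=False):
    # Hypothetical stand-in for the legacy function, so the sketch is runnable.
    total = quantity * unit_price_cents
    if quantity >= 10 and not coupon_expired:
        total = total * 90 // 100  # bulk discount, in integer cents
    return total


# Outputs captured by running the current code once, then pinned as literals.
# The negative total is recorded as-is: it documents today's behavior,
# not the desired behavior.
PINNED = {
    (0, 999, False): 0,        # empty cart
    (1, 999, False): 999,      # single item
    (20, 999, False): 17982,   # bulk order
    (20, 999, True): 19980,    # expired coupon
    (-3, 999, False): -2997,   # negative quantity
}


def test_calculate_discount_characterization():
    for args, expected in PINNED.items():
        assert calculate_discount(*args) == expected, f"behavior changed for {args}"


test_calculate_discount_characterization()
```

The function is named `test_...` so a runner such as pytest would pick it up, but it uses plain assertions and no third-party imports.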

Keep the old code as a reference

"keep the legacy entry point alive": don't delete the old function; have it delegate to the new one:

def format_date(d):
    # DEPRECATED: use src.billing.utils.date.format_date instead
    # kept for backward compat — remove after all callers migrate
    from src.billing.utils.date import format_date as _new
    return _new(d)

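This way:

Old callers keep calling the old function, and behavior stays the same
New callers can use the new function directly
Both tracks run side by side during the refactoring, so whichever one breaks can be pinpointed right away
Once the last old caller has migrated, remove the old entry points in one go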
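One optional refinement, not part of the original snippet: the legacy wrapper can emit a `DeprecationWarning` so that remaining old callers show up in test output and logs while both tracks are live. The sketch below is self-contained; `new_format_date` and its ISO-style output format are assumptions made for illustration.

```python
import warnings
from datetime import date


def new_format_date(d: date) -> str:
    # Hypothetical new implementation (would live in src/billing/utils/date.py).
    return d.strftime("%Y-%m-%d")


def format_date(d: date) -> str:
    """Legacy entry point: returns the same result, plus a migration nudge."""
    warnings.warn(
        "format_date is deprecated; use src.billing.utils.date.format_date",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this wrapper
    )
    return new_format_date(d)


# Demonstration: the old entry point still works, and the warning is observable.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = format_date(date(2017, 3, 5))
```

Because the warning points at the caller, it can serve as a running to-do list of call sites that still need migrating.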

Step 4: Guard against regressions with a review subagent

After each batch of migrations, have a review subagent go over it:

delegate to review subagent:
- examine the last 5 refactor commits
- verify: behavior preserved? old entry points still delegate correctly?
- check: any caller of the old function I missed?
- check: any test that was silently deleted (not just modified)?
- flag: any place where "refactor" actually changed behavior

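"any test that was silently deleted" is a detail worth dwelling on: when refactoring, an AI will occasionally delete a failing test instead of fixing it. Quietly deleted tests are a prime source of regressions, so have the review subagent watch for exactly this.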
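Part of this check can be scripted as a backstop to the subagent. `git diff --name-status` prints one line per changed file, with a status letter, a tab, and the path; the sketch below filters that output for deleted test files. The function name `deleted_tests` and the `test_*.py` / `*_test.py` naming convention are assumptions, and the sample output is made up.

```python
def deleted_tests(name_status: str) -> list[str]:
    """Return test files marked as deleted (status D) in `git diff --name-status` output."""
    deleted = []
    for line in name_status.splitlines():
        parts = line.split("\t")
        if len(parts) >= 2 and parts[0] == "D":
            path = parts[1]
            filename = path.rsplit("/", 1)[-1]
            # Assumed naming convention for test files; adjust to the repo's own.
            if filename.startswith("test_") or filename.endswith("_test.py"):
                deleted.append(path)
    return deleted


# Hypothetical output for a range of refactor commits.
sample_output = (
    "M\tsrc/billing/utils/date.py\n"
    "D\ttests/billing/test_discount.py\n"
    "A\ttests/billing/test_date_utils.py\n"
    "D\tsrc/legacy/billing/old_helper.py\n"
)
flagged = deleted_tests(sample_output)
```

This catches only whole test files that disappeared. A test function removed from inside a surviving file would need a closer diff, which is where the review subagent earns its keep.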

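Step 5: Ship PRs in batches, not in one gulp

The biggest mistake in refactoring legacy code is a single PR that changes 50 files: reviewers can't get through it, you don't know what to roll back, and you can't isolate a problem when one shows up.

The right approach: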

this migration will be 5 separate PRs:
- PR1: characterization tests only (no code changes) — establishes the safety net
- PR2: extract pure utilities (lowest risk)
- PR3: extract leaf business logic
- PR4: replace central dispatchers
- PR5: remove legacy entry points (after all callers verified migrated)

each PR must independently pass tests and not break the build

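Each PR can be shipped, reviewed, and reverted independently. The last PR (removing the old entry points) waits until every caller has migrated, which you can verify with grep: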

grep the whole repo for any remaining reference to the legacy entry point. if zero, we're safe to delete. if any, list them — they need migrating first
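If you want the same check as a repeatable script, for example as a gate before the final PR, a minimal sketch might look like this. `find_references` and the symbol `legacy_format_date` are hypothetical names; the search covers only `.py` files and matches whole words.

```python
import re
import tempfile
from pathlib import Path


def find_references(root: Path, symbol: str) -> list[tuple[str, int]]:
    """Return (relative path, line number) for each line mentioning symbol as a whole word."""
    pattern = re.compile(rf"\b{re.escape(symbol)}\b")
    hits = []
    for path in sorted(root.rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path.relative_to(root).as_posix(), number))
    return hits


# Demonstration on a throwaway directory standing in for the repo.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "a.py").write_text(
        "from legacy import legacy_format_date\nprint(legacy_format_date(1))\n",
        encoding="utf-8",
    )
    (repo / "b.py").write_text("print('migrated')\n", encoding="utf-8")
    remaining = find_references(repo, "legacy_format_date")
```

A text search like this, grep included, cannot see dynamic references such as `getattr` lookups or names stored in config files, so an empty result is strong evidence rather than proof.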

A prompt template for tackling legacy code

Target: [path to the legacy module]

Phase 1 — Explore (delegate to the Explore subagent):
- map structure, callers, dependencies, side effects
- identify scary parts (>100 lines, >5 callers, deep nesting)
- don't modify

Phase 2 — Plan (Plan Mode):
- identify seams (interfaces, single-call-site, pure utils)
- order by risk (low → high)
- mark behavior changes as OUT OF SCOPE
- propose characterization tests for each step

Phase 3 — Execute (per step):
- write characterization tests for current behavior
- extract / wrap / rewrite per plan
- keep legacy entry point alive, delegating to new
- all tests green (old + new)
- commit: "refactor(scope): step N/M — <what>"

Phase 4 — Review (delegate to a review subagent):
- behavior preserved?
- legacy entry still delegating?
- any test silently deleted?
- any caller missed?

Phase 5 — Cleanup (separate PR, last):
- grep for remaining legacy references
- delete legacy entry points
- final test run

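Work through these five steps, and what you come away with is not just "code you can change" but "code you can keep changing". That is the real goal of refactoring legacy code.

Next stops

For a more basic refactoring workflow, read Code Refactoring Workflow.
For how to use subagents, revisit Subagents in Depth.
For how to use Plan Mode, read Plan Mode and Ultraplan.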