From 733 False Positives to 9 Safe Deletes with Claude Code

The codebase had been growing for months. Nobody deleted the old files.

HealthScoreService replaced by HealthBrain. ConversionRateService absorbed into the digest pipeline. The warnings system superseded by Incidents. Each time, the old code stayed, deployed on every push, referenced by nothing, just sitting there accumulating. 1,713 lines of technical debt with zero remaining value. I knew it was there. I didn’t know how much.

So I built a cleanup loop I could trust.

The insight

Dead code detection is noisy because most tools try to be right. They shouldn’t. The scanner’s job isn’t to deliver truth: it’s to narrow the search space. Don’t trust the tool’s output until you’ve verified it against the actual code path. That’s what the grep step is for.

I built a structural graph (using LayerView connected to Claude Code via MCP) that maps every file, class, and dependency in the codebase. Dead code candidates are nodes with zero inbound edges: classes that nothing references.

The first scan flagged 733 of them. Over 90% were false positives. That’s fine. The loop handles it.
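As a rough illustration of the "zero inbound edges" rule, here is a minimal sketch in shell. It assumes the graph has been exported to two plain-text files (a node list and a "source target" edge list); that export format, the `zero_inbound` function, and the `DashboardController` name are hypothetical and not part of LayerView.

```shell
#!/usr/bin/env bash
# Sketch: list graph nodes that nothing else references.
# nodes file: one node name per line (assumed format).
# edges file: "source target" per line (assumed format).
zero_inbound() {
  local nodes_file=$1 edges_file=$2
  awk -v edges="$edges_file" '
    BEGIN {
      # Record every node that is the target of an edge from a different
      # node, so a class that only references itself still counts as unused.
      while ((getline line < edges) > 0) {
        split(line, part, " ")
        if (part[1] != part[2]) referenced[part[2]] = 1
      }
    }
    # Print the nodes that never appeared as a target.
    !($1 in referenced)
  ' "$nodes_file"
}

# Demo with made-up data.
demo_dir=$(mktemp -d)
printf '%s\n' HealthBrain HealthScoreService DashboardController > "$demo_dir/nodes.txt"
printf '%s\n' 'DashboardController HealthBrain' \
              'HealthScoreService HealthScoreService' > "$demo_dir/edges.txt"
zero_inbound "$demo_dir/nodes.txt" "$demo_dir/edges.txt"
rm -rf "$demo_dir"
```

In the demo, both `HealthScoreService` and `DashboardController` are printed. The controller is a false positive of exactly the kind described here: the framework reaches it through routing, not through an explicit reference in the source.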


The loop

Structural scan → Narrow candidates → Verify cheaply (grep) → Delete confidently → Re-scan

Each pass through the loop improved the scanner and reduced false positives. Four versions over one session:

Version	What Changed	Candidates	Actually Dead	Accuracy
v1	Raw graph, no filtering	733	~9	~1%
v2	Framework-aware path exclusions	19	8	42%
v3	Vendor class bug, Blade static calls, FQN resolution	13	9	69%
v4	PHP static call extraction, bootstrap file scanning	9	9	100%
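The Accuracy column appears to be the share of flagged candidates that turned out to be dead. A small sketch of that arithmetic, with `accuracy` as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# accuracy CANDIDATES ACTUALLY_DEAD -> rounded percentage
accuracy() {
  awk -v candidates="$1" -v dead="$2" \
    'BEGIN { printf "%.0f%%\n", 100 * dead / candidates }'
}

accuracy 733 9   # v1: prints 1%
accuracy 19 8    # v2: prints 42%
accuracy 13 9    # v3: prints 69%
accuracy 9 9     # v4: prints 100%
```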

The raw graph had no concept of Laravel conventions. Artisan commands, service providers, middleware, Blade components, event listeners: all flagged as dead because nothing in the PHP source explicitly referenced them. Dependency counts were wildly inflated, with GenerateWarnings showing 53 dependents instead of 0.

Each iteration fixed a specific category of false positives. v2 added framework-aware filtering. v3 fixed vendor class resolution and added Blade static call detection. v4 added PHP static call extraction and bootstrap file scanning.
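A framework-aware path filter of the kind v2 introduced could look roughly like this. The directory list is an assumption based on Laravel's default project layout, not the scanner's actual rule set, and the file names in the demo are hypothetical.

```shell
#!/usr/bin/env bash
# Directories whose classes Laravel loads by convention rather than by an
# explicit reference in the source (assumed list, adjust per project).
FRAMEWORK_PATHS='^(app/Console/Commands|app/Providers|app/Http/Middleware|app/View/Components|app/Listeners)/'

# Read candidate paths on stdin and drop the ones under framework directories.
filter_framework_paths() {
  grep -Ev "$FRAMEWORK_PATHS"
}

# Demo with hypothetical paths: only the service survives the filter.
printf '%s\n' \
  app/Console/Commands/SendDigest.php \
  app/Providers/AppServiceProvider.php \
  app/Services/HealthScoreService.php \
  | filter_framework_paths
```

The trade-off is that a blanket directory exclusion can also hide a genuinely dead command or listener, which is one reason the later grep pass is still worth running.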

The hardest false positive to kill: ErrorReporter, the global error handler. Registered via a fully-qualified reference in bootstrap/app.php, a file the scanner originally ignored as a framework entry point. v3 still flagged it as dead. v4’s bootstrap file parser finally caught the registration and excluded it.
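To make the bootstrap fix concrete, here is a minimal sketch of pulling class names out of static references such as `\Foo\Bar::class` or `\Foo\Bar::method(`. It is a regex approximation, not the scanner's real parser, and the `static_class_refs` function, the `App\Support` namespace, and the sample registration line are all hypothetical.

```shell
#!/usr/bin/env bash
# Print the short names of classes referenced via `Name::` in the given PHP
# files, one per line, de-duplicated. Handles fully-qualified names by
# keeping only the part after the last backslash.
static_class_refs() {
  grep -ohE '[A-Za-z_\\][A-Za-z0-9_\\]*::' "$@" \
    | sed -E 's/::$//; s/.*\\//' \
    | grep -vxE 'self|static|parent' \
    | sort -u
}

# Demo: a stand-in for a bootstrap file (not the real bootstrap/app.php).
demo=$(mktemp)
cat > "$demo" <<'PHP'
<?php
\App\Support\ErrorReporter::register($app);
$handler = \App\Exceptions\Handler::class;
PHP
static_class_refs "$demo"
rm -f "$demo"
```

Any dead-code candidate whose name shows up in that output would be dropped from the list before verification.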

What the loop found

Nine files. All superseded, never cleaned up:

Warnings system (5 files): model, command, digest, two Blade views. Replaced by Incidents + SanityAlerts months ago. Explicitly marked @deprecated. Still deployed.
PageBaseline: an Eloquent model. A trait method happened to contain the word “PageBaseline” in its name, which made it look referenced. It wasn’t.
ConversionRateService: 230 lines of conversion math. The digest system had absorbed all of it.
AlertDataService: a rename alias for SanityAlerts that was never wired up.
HealthScoreService: replaced by HealthBrain. 269 lines of funnel-weighted scoring that nothing called.

None of these caused errors. All of them cluttered the codebase, confused search results, and got deployed to production on every push.

The verification trick

The graph narrows candidates. Grep confirms them.

Here’s what it looks like when Claude Code queries the structural graph. Take ErrorReporter, flagged as dead in v3:

mcp__layerview__callers({ project: "itbroke", node: "ErrorReporter" })

Response: 2 callers. Both self-references: the file defining the class and the class’s own method. No external imports. No controller calls it. No command references it. One outbound edge to config:mail.admin_address, and nothing reads it back. That’s the graph signature of dead code, which is exactly why v3 flagged it, even though the class was still registered in bootstrap/app.php, a file the graph hadn’t parsed yet.

Contrast that with AlertEngine, a class the graph flagged as a bottleneck:

mcp__layerview__callers({ project: "itbroke", node: "AlertEngine" })

Response: 38 callers across 4 subsystems: controllers, services, commands, Blade templates. Impact query at depth 2 shows 17 files affected. One looks safe to delete, pending verification. The other needs a migration plan. Same query, completely different answer.

My first instinct was to run queries like this for every candidate. It worked, and cost ~120,000 tokens.

Then I realized the same answer takes 3 commands:

Grep-based verification: batch grep all candidates, then disambiguate any with 2+ matches. ~3,000 tokens. Same answer.

The pattern:

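# Step 1: count the files that mention each candidate as a whole word.
# One hit (the defining file) = confirmed dead.
for class in Warning GenerateWarnings WarningsDigest PageBaseline \
  ConversionRateService AlertDataService HealthScoreService ErrorReporter; do
  # -w stops "Warning" from also matching "GenerateWarnings" or "WarningsDigest"
  count=$(grep -rlw --include="*.php" --exclude-dir=vendor "$class" . | wc -l)
  echo "$class: $count files"
done

# Step 2: Disambiguate anything with 2+ hits
# Is it a comment? A method name? A @deprecated reference? Or real usage?
# Replace "the/class/own/file" with the path of the file that defines the class.
grep -rnw "ErrorReporter" . --include="*.php" --exclude-dir=vendor | grep -v "the/class/own/file"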

Step 1 catches 80% of candidates instantly. Step 2 resolves the rest. Total cost: about 2.5% of the graph query approach (~3,000 tokens against ~120,000).

The bonus find

The structural graph doesn’t just find dead code. It flagged every /docs/* page as having abnormal fan-out: the docs navigation was pulling in the entire content parsing pipeline for every page load.

Before: 39.5 seconds. After: 0.35 seconds. A frontmatter-only scanner reads the YAML headers without markdown conversion. Same nav, 113x faster.

That performance bug had been there for weeks. No error logs. No crashes. Just 39.5 seconds of TTFB on every docs page, hidden behind Cloudflare’s edge cache until the cache expired.
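The idea behind a frontmatter-only read can be sketched in a few lines of shell. This is an illustration of the technique, not the site's actual scanner: the `frontmatter` and `frontmatter_value` helpers and the `title`/`order` keys are assumptions, and it only handles simple single-line `key: value` entries.

```shell
#!/usr/bin/env bash
# Print only the YAML frontmatter of a markdown file (the lines between the
# opening and closing "---"), and stop reading before the body.
frontmatter() {
  awk 'NR == 1 && $0 != "---" { exit }   # no frontmatter: print nothing
       NR == 1 { next }                   # skip the opening delimiter
       $0 == "---" { exit }               # closing delimiter: stop here
       { print }' "$1"
}

# Look up one simple key, for example the title shown in the nav.
frontmatter_value() {
  frontmatter "$1" | sed -n "s/^$2:[[:space:]]*//p" | head -n 1
}

# Demo with a made-up docs page.
doc=$(mktemp)
cat > "$doc" <<'MD'
---
title: Getting Started
order: 2
---
# Getting Started

Long body that the nav never needs to parse.
MD
frontmatter_value "$doc" title   # prints: Getting Started
rm -f "$doc"
```

Because the reader exits at the closing delimiter, the cost per page depends on the size of the header rather than the size of the document, which is where the speedup comes from.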

The takeaway

The scan takes 3.5 seconds. I added it to the deploy script: every push updates the structural map automatically. The loop runs on current state, not stale data. This is the kind of task that belongs in a script rather than a skill: fixed sequence, runs the same way every time, no judgment needed.

The codebase map doesn’t need to be perfect. It needs to be cheap to verify.

Even at 42% accuracy, v2 was already useful: it narrowed 733 candidates to 19, and a quick grep pass confirmed which 8 were real. The value isn’t in reaching 100%. It’s in the loop. Scan, verify, fix, deploy, repeat.

Nine files deleted. 1,713 lines gone. One 39.5-second page load fixed. Total hands-on time: 35 minutes.

This wasn’t really about MCP or any particular tool. It was about building a cleanup loop you can trust.

Related:

AI Change Control: The Pre-Commit Hook Framework: The prevention companion to this removal loop — same “cheap scan, narrow fast” discipline applied to catching drift before it ships
5 Structural Patterns That Survive Every Code Review: What cross-file analysis catches that file-by-file review misses
Custom Skills for Claude Code: The skills system that powers workflows like this one
Audit Before You Spread the Mess: Finding structural problems before they multiply