A grep script found 75 violations on its first run in a Laravel codebase I had been building myself for months. The CLAUDE.md for that project already forbade every single pattern it caught. By the time I was done, the count was 104 grep violations plus 4 bugs caught by an AST layer: 108 real issues in code I thought was clean.
This is the receipts post. Everything in the AI Change Control framework is theory until you run it against a real repo. So I ran it against this one, a Laravel 13 + Blade + Tailwind project, and wrote down exactly what fell out.
First run
I dropped .ai/policy.json, scripts/policy-check.py, and a pre-commit hook into the codebase, then ran the scanner in full-tree mode.
$ python3 scripts/policy-check.py --all
Policy check FAILED — 75 violation(s) across 85 file(s) [mode: all (no git)]
75 violations. In a codebase I’d written myself, with a tight CLAUDE.md that explicitly forbade the exact patterns being caught. The drift accumulates silently because no tool is watching. That is the whole reason to audit before you spread the mess.
The breakdown:
| Pattern | Count | What CLAUDE.md said |
|---|---|---|
| space-y-* / space-x-* in blade | 70 | “Use gap- for spacing in flex/grid — never space-x-* or space-y-” |
| bg-gray-* in hero-animation.blade.php | 4 | “Use semantic color names — not raw colors (text-gray-800)” |
| space-y-* in app.css @apply | 2 | Same as above. Even my own design system had drift |
Every violation was backed by a documented, codified rule that I had written, that the AI had been told about, that I had reviewed, that had been ignored anyway. You cannot prompt your way out of this. You cannot skill your way out of it either. The only thing that changes outcomes is machine enforcement. (For the why, see why Claude Code ignores your CLAUDE.md.)
The blog post almost banned the blog post
The first lesson came from the scanner eating its own tail. I was enriching the draft for the framework post when I ran the scanner for the first time. That draft contains code snippets showing banned patterns as examples. So dd(, dump(, console.log(, HACK, all matched inside the markdown file that was literally about banning them. The scanner caught 9 violations in the post itself.
The tool I was writing about tried to ban the post I was writing about it in.
Fix: add resources/content/ to forbidden_paths. Content isn’t code. The script scans code. One-line config change, problem gone.
Content is not code. Scan what you enforce, not what you write about enforcement.
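A minimal sketch of what that exclusion could look like inside a scanner. The post names the `forbidden_paths` key and the `resources/content/` entry; treating entries as directory prefixes, the `is_excluded` helper, and the sample file names are assumptions for illustration, not the actual `policy-check.py` code.

```python
from pathlib import PurePosixPath


def is_excluded(path: str, excluded_dirs: list[str]) -> bool:
    """Return True if `path` lives under any excluded directory."""
    parts = PurePosixPath(path).parts
    for directory in excluded_dirs:
        prefix = PurePosixPath(directory).parts
        # Compare whole path segments so "resources/content-tools/"
        # is not swallowed by the "resources/content/" exclusion.
        if parts[: len(prefix)] == prefix:
            return True
    return False


excluded = ["resources/content/"]
files = [
    "resources/content/ai-change-control.md",   # hypothetical post file
    "resources/views/home.blade.php",           # hypothetical view
    "resources/content-tools/helper.php",       # hypothetical lookalike dir
]
to_scan = [f for f in files if not is_excluded(f, excluded)]
print(to_scan)
```

Matching on path segments rather than raw string prefixes is a deliberate choice here: a plain `startswith` check without the trailing slash would also exclude sibling directories whose names merely begin the same way.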
A second near-miss was sloppy regex. I first wrote dd\( with no word boundary, so it matched add(, odd(, any word ending in dd(. Fix: use \bdd\(. Word boundaries turn grep from a dumb substring matcher into an actual pattern matcher. Real regex review takes about five minutes and saves hours of false-positive noise.
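The difference is easy to verify with Python's `re` module. This is a small sketch with made-up sample lines, not output from the real scanner: the unanchored pattern fires on any identifier ending in `dd` followed by a parenthesis, while the anchored one only fires on a standalone `dd(` call.

```python
import re

loose = re.compile(r"dd\(")      # no word boundary
strict = re.compile(r"\bdd\(")   # requires a word boundary before "dd"

samples = [
    "dd($user);",             # real debug call
    "    dd($request);",      # indented debug call
    "$cart->add($item);",     # innocent method ending in "dd("
]

for line in samples:
    print(f"{line!r}: loose={bool(loose.search(line))} "
          f"strict={bool(strict.search(line))}")
```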
The remediation
With the scanner working, I ran /remediation to clear the existing drift. Not a hand-waved “I’ll get to it eventually.” An actual pass through every violation, same session.
The pattern was mostly mechanical but required judgment:
Naive fix: sed 's/space-y-/gap-/g' across all blade files
Correct fix: for each violation, check the parent element. space-y-* works on any block parent. gap-* only works on flex/grid parents. Blind substitution breaks layouts where the parent is a plain div. Each fix required adding flex flex-col to the parent before swapping the class.
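The parent check can be sketched as a function over a single class attribute, since the space-* utility sits on the same element that has to become flex or grid. This is an illustration under stated assumptions, not the logic /remediation actually ran: `migrate_classes` is a hypothetical helper, it only handles plain numeric sizes and `px`, and it ignores responsive variants such as `md:space-y-4`, which still need a human look.

```python
import re

# Classes that already make the element a flex or grid container.
LAYOUT = {"flex", "inline-flex", "grid", "inline-grid"}
# Only plain sizes: space-y-4, space-x-0.5, space-y-px (not space-y-reverse).
SPACE = re.compile(r"^space-(x|y)-(\d+(?:\.\d+)?|px)$")


def migrate_classes(class_attr: str) -> str:
    """Swap space-x-*/space-y-* for gap-*, adding flex classes if needed."""
    tokens = class_attr.split()
    has_layout = any(t in LAYOUT for t in tokens)
    out = []
    for token in tokens:
        match = SPACE.match(token)
        if not match:
            out.append(token)
            continue
        axis, size = match.groups()
        if not has_layout:
            # gap-* does nothing on a plain block, so make it a flex container.
            out.extend(["flex", "flex-col"] if axis == "y" else ["flex"])
            has_layout = True
        out.append(f"gap-{size}")
    return " ".join(out)


print(migrate_classes("space-y-4"))
print(migrate_classes("flex flex-col space-y-4"))
```

Even with a helper like this, each rewritten element is worth eyeballing: turning a block into a flex column changes how children size and align, which is exactly the judgment the blind sed skips.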
The final report after remediation:
$ python3 scripts/policy-check.py --all
Policy check passed — 85 file(s) scanned [mode: all]
75 → 0. Across 18 files. In one focused pass. With a commit that itself passed through the pre-commit hook as proof the loop closed.
The commit that proved the loop
$ git commit -m "refactor: migrate space-y/space-x to gap, eliminate raw colors"
Policy check passed — 19 file(s) scanned [mode: staged]
[main 10fefb3] refactor: migrate space-y/space-x to gap, eliminate raw colors
19 files changed, 3822 insertions(+)
The hook ran. Scanned the staged files. Passed. The commit landed. The same loop will now run on every future commit in this repo, and in every other repo I set this up in. If you want to build that loop yourself, the pre-commit policy-check tutorial walks through the script step by step.
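For a sense of what "staged mode" involves, here is a rough sketch of collecting the staged file list with git. The function names are hypothetical and the real script may do this differently; the git invocation itself (`git diff --cached --name-only --diff-filter=ACM`) is standard and lists added, copied, and modified files in the index.

```python
import subprocess


def parse_name_only(output: str) -> list[str]:
    """Turn `git diff --name-only` output into a clean list of paths."""
    return [line for line in output.splitlines() if line.strip()]


def staged_files() -> list[str]:
    """Paths of added, copied, or modified files currently staged."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. outside a repository
    )
    return parse_name_only(result.stdout)
```

Filtering out deletions with `--diff-filter=ACM` keeps the scanner from trying to open files that no longer exist in the working tree.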
The deploy pipeline integration
The policy check is wired into deploy.sh Step 1 alongside PHP syntax, blade compilation, and route loading. It runs in strict mode by default, so any drift blocks deploy:
[1/8] Syntax & Policy Checks
Running AI policy check...
✓ AI policy check passed — no drift
Checking PHP syntax...
✓ PHP syntax OK
Checking Blade templates...
✓ Blade templates compile
Checking routes...
✓ Routes load (5 routes)
A bypass exists (POLICY_STRICT=0 ./deploy.sh) for emergencies, but the default is blocking. The whole point of machine enforcement is that it’s hard to ignore, so the off switch is intentionally awkward to use.
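One way the strict switch could be wired, as a sketch only: the `POLICY_STRICT` variable and its `0` value come from the post, while the `exit_code` helper and the rule that anything other than `0` means strict are assumptions about how such a check might behave.

```python
import os


def exit_code(violations: int, env=None) -> int:
    """Exit status for the policy check: nonzero blocks the deploy."""
    env = os.environ if env is None else env
    # Strict unless explicitly switched off with POLICY_STRICT=0.
    strict = env.get("POLICY_STRICT", "1") != "0"
    if violations == 0:
        return 0
    return 1 if strict else 0


print(exit_code(3, {}))                       # strict by default
print(exit_code(3, {"POLICY_STRICT": "0"}))   # emergency bypass
```

Defaulting to strict when the variable is unset means forgetting to configure anything fails closed, which matches the intent that the off switch takes deliberate effort.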
What I learned from actually doing it
A few things the theory didn’t prepare me for:
Hard-Won Lessons from the First Real Run
Do
- Content directories must be excluded. If you write about enforcement, the scanner will catch your examples. Exclude content/, docs/, posts/, whatever your content directory is.
- Word boundaries are not optional. dd\( matches add(. dump\( matches nothing useful. \b turns grep from “dumb substring matcher” into “actual pattern matcher.”
- Start strict on universal, loose on project. The universal bans (debug, generated paths) were instantly right. The project bans (space-y-*) caught 70 real violations on first run. I did not need to tune them down. I needed to let them bite.
- /remediation is the right follow-up, not /fix. Fixing 75 violations one-by-one with /fix would have been death by a thousand cuts. /remediation handled it as a pattern migration across 18 files in a single pass.
- The commit message IS the receipt. “75 → 0” in one commit is proof the framework works. Don’t hide it in a chore commit. Lead with the count in the message so future-you can find the moment enforcement started mattering.
- Flip strict mode AS SOON AS drift is zero. Warning mode is a transition state. The longer you stay in it, the less it enforces. The moment the first policy-check --all passes, flip the default to strict. I did it in the same session. No regrets.
The canon enforcement chapter: 29 → 0
After clearing the 75 style violations, I ran a /think assessment on the original goal and realized something important. The framework had caught hygiene and style drift, but the bugs I had originally been complaining about were a different class: AI hardcoding values that should come from canon, AI bypassing canonical entry points, AI reinventing instead of reusing.
Grade against the original concerns at that checkpoint:
| Original concern | Caught? |
|---|---|
| Debug leftovers, Tailwind drift, raw DB access | ✓ |
| AI hardcoding a canonical value | ✗ |
| AI using a non-canonical function when canonical exists | ✗ |
| AI bypassing a service layer with one-off logic | ✗ |
The fix was a canon inventory pass on this repo. Thirty minutes of grep to identify the actual canonical primitives (Projects::*, MarkdownContent service, named routes in routes/web.php, blade components) and find where the codebase was bypassing them.
What I found: Projects::* and MarkdownContent were being used correctly everywhere. The only real canon bypass was hardcoded route paths, 29 of them scattered across views, breadcrumbs, nav arrays, and service pages.
Examples of what was being hardcoded:
<x-button href="/contact">Get in touch</x-button>
<x-link href="/learn">Browse articles</x-link>
<a href="/projects/itbroke-dev">itbroke.dev</a>
Every one of these should have been using route('contact'), route('learn'), route('projects.show', ['slug' => 'itbroke-dev']). The named routes existed. Nobody was using them. AI (and I) had been hardcoding paths every time a new link got added.
The new rule I added to the resources/views/ override:
"banned_usages": [
"[\"']/contact[\"']",
"[\"']/learn[\"']",
"[\"']/projects[\"']",
"[\"']/offers[\"']",
"[\"']/changelog[\"']",
"[\"']/services/mvp-partner[\"']",
"[\"']/services/ads-partner[\"']",
"[\"']/services/strategy-partner[\"']",
"[\"']/projects/[a-z0-9-]+[\"']"
]
The regex form ["']/path["'] catches both HTML attribute syntax (href="/contact") and PHP array syntax ('href' => '/contact') in one pattern. Dynamic routes like href="/learn/{{ $pillar }}" don’t match, because the closing quote no longer sits directly after the path: the extra /{{ $pillar }} segment breaks the pattern.
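A quick way to sanity-check patterns like these before committing them is to run them against representative lines. The sketch below uses three of the patterns from the rule and sample lines taken from this post; the `violations` helper is hypothetical and stands in for whatever the scanner does internally.

```python
import re

# Three of the banned_usages patterns, as Python raw strings.
BANNED = [
    r"""["']/contact["']""",
    r"""["']/learn["']""",
    r"""["']/projects/[a-z0-9-]+["']""",
]
compiled = [re.compile(p) for p in BANNED]


def violations(line: str) -> list[str]:
    """Return the banned patterns that match this line."""
    return [p.pattern for p in compiled if p.search(line)]


samples = {
    '<x-button href="/contact">Get in touch</x-button>': True,   # HTML attribute
    "'href' => '/contact'": True,                                 # PHP array
    '<a href="/projects/itbroke-dev">itbroke.dev</a>': True,     # hardcoded slug
    '<a href="/learn/{{ $pillar }}">': False,                     # dynamic route
    """<x-button :href="route('contact')">""": False,             # named route
}

for line, expected in samples.items():
    hit = bool(violations(line))
    print(f"{'FLAG' if hit else 'ok  '} {line}")
    assert hit == expected
```

Keeping a handful of must-match and must-not-match lines next to each new rule is cheap insurance against the kind of false positives that make people switch a scanner off.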
First run after adding the rules:
$ python3 scripts/policy-check.py --all
Policy check FAILED — 29 violation(s) across 85 file(s) [mode: all]
29 hardcoded canonical route strings. In 10 files. Caught by the same grep-based scanner, no new infrastructure needed.
Remediation was mechanical: href="/contact" became :href="route('contact')" (colon-prefix for Blade component bindings), 'href' => '/contact' became 'href' => route('contact') (PHP expression in array), hardcoded project slugs became {{ route('projects.show', ['slug' => 'itbroke-dev']) }}.
Final run:
$ python3 scripts/policy-check.py --all
Policy check passed — 85 file(s) scanned [mode: all]
29 → 0. Across 10 files. And this time the violations weren’t style. They were the exact class of bug I had originally been complaining about: hardcoded values bypassing canonical primitives that already existed in the codebase.
The four-commit arc at that point:
6f89b50 feat(policy): enforce named routes in blade views + remediate 29 hardcoded paths
a94be89 chore: flip POLICY_STRICT to on by default + add remediation receipts
10fefb3 refactor: migrate space-y/space-x to gap, eliminate raw colors
0753725 chore: add AI policy check scaffolding
Reading that history bottom-to-top (chronological): scaffold → clear hygiene → lock strict → enforce canon. The framework is now load-bearing for the bug class that sparked the whole thing.
Total drift cleared across all four commits: 104 real violations in a codebase I had been writing myself, with a tight CLAUDE.md, using all the skills I had built. None of that caught it. The grep-based policy check caught all of it.
The framework doesn’t need to be smart. It needs to be present.
Update: what happened when I added the AST layer
Earlier I had claimed that grep-based checks give you 80% of the value at 5% of the effort, and AST-based checks give you the last 20% at 10x the cost. After the grep layer had been running clean for a few commits, I went back and installed the AST layer, Larastan (PHPStan for Laravel) at level 5, to see if the 20% claim held up on this exact codebase.
It did. In a very specific way.
Install + configure + first run took about 6 minutes. The first run reported 4 errors, zero false positives. Every error was a real bug in code that grep could not see:
| File | Error class | What grep could not see |
|---|---|---|
| GenerateFeaturedImages.php:73 | Dead code (!$dryRun always true) | Control flow: the dry-run branch continues 20 lines earlier, so the later negated check is unreachable |
| OgImageController.php:23 | Deprecated implicit nullable (string $slug = null) | PHP’s type system: would become a hard error on PHP 9 |
| MarkdownContent.php:420 (renderVersus) | Dead null-coalesce (preg_match_all never returns a missing offset 0) | Return type semantics of the stdlib |
| MarkdownContent.php:482 (renderCompare) | Same dead ?? [] as above | Same |
None of these would have been caught by prompt tuning, CLAUDE.md rules, skill refinement, or any amount of regex polishing. They’re the exact class of bug the AST layer is built to catch: semantics, not syntax.
Fixing all four took about 4 minutes. Total elapsed time from install to clean second run to committed deploy: ~12 minutes. The 20% claim was right. AST tooling is more expensive to set up than grep, but the cost is measured in minutes, not weeks. Once it’s clean, it stays clean, and any future drift that touches types, null, or signatures fails the preflight check the same way grep failures do.
Grep catches drift. AST catches semantics. You need both, and the sequencing matters.
The lesson worth keeping: do grep first. Installing PHPStan before the grep layer is stable would have meant triaging hundreds of low-signal style warnings alongside the 4 real bugs, and the real bugs would have been buried. By the time I added PHPStan, the hygiene drift was already gone, so every PHPStan finding was automatically high-signal. The layers support each other. Run them in the wrong order and each layer makes the other harder.
The updated deploy preflight now runs both layers in Step 1:
[1/8] Syntax & Policy Checks
Running PHPStan (level 5)...
✓ PHPStan passed — no type errors
Running AI policy check...
✓ AI policy check passed — no drift
Checking PHP syntax...
✓ PHP syntax OK
Checking Blade templates...
✓ Blade templates compile
Checking routes...
✓ Routes load (5 routes)
PHPStan adds about 3-5 seconds. The grep check runs in under one second. Total Step 1 time stays under 15 seconds including blade compile and route load. Cheap, fast, load-bearing.
The eight-commit arc
04b5339 feat(phpstan): install Larastan at level 5 + fix 4 real bugs it caught
3bb1767 feat(toc): H2-only on desktop + fade mask for overflow handling
1b0780f feat(content): dynamic sidebar TOC + AI-ready prompt block + compare fix
f792d37 content: publish AI Change Control framework post
6f89b50 feat(policy): enforce named routes in blade views + remediate 29 hardcoded paths
a94be89 chore: flip POLICY_STRICT to on by default + add remediation receipts
10fefb3 refactor: migrate space-y/space-x to gap, eliminate raw colors
0753725 chore: add AI policy check scaffolding
Reading bottom-to-top (chronological): scaffold → clear hygiene → lock strict → enforce canon → publish the spec → polish UX → refine TOC → install AST layer. Each commit is a receipt for one claim the framework makes. The whole arc took a few focused sessions, not weeks.
Total drift cleared: 104 grep violations + 4 AST bugs = 108 real issues in a codebase I built myself and thought was clean. None of them would have been caught without the enforcement layer. All of them shipped silently before it was installed.
The full scaffolding (.ai/policy.json, scripts/policy-check.py, the pre-commit hook, and the deploy integration) is open source: github.com/spp-ben/ai-change-control. Clone it, drop it into your own repo, and watch what it finds on the first run.
Related:
- AI Change Control: The Framework: the full spec these receipts prove out
- Why Claude Code Ignores Your CLAUDE.md: why documented rules don’t survive contact with a model
- Build the Pre-Commit Hook Yourself: the step-by-step tutorial for the policy check