How To De-Slop A Codebase Ruined By AI (with one skill)

Matt Pocock · 11m 19s

Decision Card

Effort: A 30–60 minute session — install Matt Pocock’s improve-codebase-architecture skill into your .claude directory, point it at one real (imperfect) codebase, turn off auto mode, and walk through the human-in-the-loop grilling for a single deepening candidate.

Honest take: The skill is a finder, not a fixer. Pocock is explicit that it hands the judgment back to you (“the general”), so the headline “de-slop with one skill” oversells it. The real leverage is the shared vocabulary (modules, interfaces, seams, depth) plus a disciplined interview, and none of that is new: it is Ousterhout and Feathers repackaged for an agent. Also note that the “41.5k stars” figure in the video is already stale: the repo passed 100k by May 2026.

Concrete next steps:

  • Read the source skill in the repo and skim its glossary (~15 min): github.com/mattpocock/skills
  • Run it on one codebase with auto mode OFF, pick ONE candidate, and answer the grilling questions yourself instead of saying “choose your recommended answers” (~45 min)
  • Before refactoring legacy code, add tests at the seams first — read Feathers on seams: informit.com seams excerpt
  • Skip if your codebase is small, well-tested, and you already think in terms of deep modules — you’ll get more from reading A Philosophy of Software Design directly than from running an agent over it.
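
One way to read “add tests at the seams first” is to pin down what the code does today before touching it. The sketch below uses a characterization test, a related technique from Feathers's Working Effectively with Legacy Code, rather than anything the skill itself generates. The function `legacyPriceLabel`, the `pinned` table, and `characterize` are hypothetical names invented for illustration.

```typescript
// Hypothetical legacy function that we want to refactor later.
function legacyPriceLabel(cents: number, currency: string): string {
  let prefix: string;
  if (currency === "USD") {
    prefix = "$";
  } else if (currency === "EUR") {
    prefix = "€";
  } else {
    prefix = currency + " ";
  }
  return prefix + (cents / 100).toFixed(2);
}

// Characterization table: inputs paired with the output the code produces
// today, quirks included. The goal is to detect change, not to judge it.
const pinned: Array<[number, string, string]> = [
  [1999, "USD", "$19.99"],
  [500, "EUR", "€5.00"],
  [5, "GBP", "GBP 0.05"],
  [-250, "USD", "$-2.50"], // odd sign placement, but it is current behaviour
];

// Returns one message per mismatch; an empty array means behaviour is unchanged.
function characterize(): string[] {
  const failures: string[] = [];
  for (const [cents, currency, expected] of pinned) {
    const actual = legacyPriceLabel(cents, currency);
    if (actual !== expected) {
      failures.push(`${cents} ${currency}: expected ${expected}, got ${actual}`);
    }
  }
  return failures;
}
```

Run `characterize()` before and after each refactoring step. Any new failure means the refactor changed observable behaviour at that boundary, which is the call you then have to make yourself.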

TL;DR

Matt Pocock argues AI has accelerated “software entropy,” turning codebases into balls of mud, and offers his improve-codebase-architecture skill as a cure that scouts for “deepening opportunities” using Ousterhout’s deep-module vocabulary. The catch: the skill only surfaces candidates and grills you — the human must make every strategic call, so it’s a thinking aid wrapped around classic software-design fundamentals, not an autopilot.

Key Points

  • AI hasn’t made code cheap so much as accelerated software entropy — codebases now decay faster because changes that ignore the whole system snowball into a “ball of mud.” 00:08
  • A prior video covered prevention via deep modules; this one covers the cure — rescuing a codebase that feels beyond repair. 00:45
  • He recently added a shared glossary of terminology to the skill, arguing a shared vocabulary with the AI lets you be far more precise in requests. 01:14
  • Core primitives: a module is a unit of the app, its interface is everything a caller must know to use it, and its implementation is what’s inside. 02:08
  • Deep modules (much hidden behind a simple interface) beat shallow modules (complex interface, little behind it) — from Ousterhout’s A Philosophy of Software Design. 02:45
  • Seams are the gaps between modules where interfaces live — the natural place to add mocks for unit and integration testing. 03:43
  • An adapter (borrowed from hexagonal architecture) is a concrete module satisfying an interface at a seam — e.g. a real clock in production, a fake clock in tests. 04:14
  • The two payoffs of deep modules: locality for maintainers (changes/bugs concentrate in one place) and leverage for users (more capability per unit of interface learned). 04:51
  • Demo on his real React Router + effect.ts repo (~1,500 commits): the skill explored the codebase, then surfaced six deepening opportunities, including a concept with two parallel implementations and an untested seam. 06:28
  • Crucial caveat: this is not an AFK (away-from-keyboard) skill. Agents are good “tactical programmers,” but you must be the “strategic programmer” making the long-term judgment calls. Run it every couple of days on fast-moving codebases. 09:32
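
The module, interface, and depth vocabulary above can be made concrete with a small TypeScript sketch. It is only an illustration under assumed names (`stripPunctuation`, `splitWords`, `tally`, and `wordFrequencies` are all hypothetical and do not come from the video or the skill). The shallow version makes the caller learn three functions and their order. The deep version hides the same pipeline behind one call.

```typescript
// --- Shallow: three entry points the caller must learn and sequence ---
function stripPunctuation(text: string): string {
  return text.replace(/[^a-z0-9\s]/gi, "");
}

function splitWords(text: string): string[] {
  return text.toLowerCase().split(/\s+/).filter((word) => word.length > 0);
}

function tally(words: string[]): Record<string, number> {
  // A prototype-less object, so words like "constructor" are counted safely.
  const counts: Record<string, number> = Object.create(null);
  for (const word of words) {
    counts[word] = (counts[word] ?? 0) + 1;
  }
  return counts;
}

// Every caller repeats this chain and must get the order right.
const shallowResult = tally(splitWords(stripPunctuation("The cat, the hat!")));

// --- Deep: one small interface in front of the same implementation ---
// In a real module, only this function would be exported.
function wordFrequencies(text: string): Record<string, number> {
  return tally(splitWords(stripPunctuation(text)));
}

const deepResult = wordFrequencies("The cat, the hat!");
```

Both payoffs show up even at this size. Locality: a change to how words are split happens in one place. Leverage: callers learn one function instead of three.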

Notable Quotes

“What’s happening is that AI has simply accelerated software entropy. In other words, codebases are falling apart faster than they ever have before.” 00:08

“A deep module hides lots of implementation behind a relatively simple interface. A shallow module has a complex interface and kind of not much implementation actually behind it.” 02:45

“I think of agents as really, really good tactical programmers… But they need someone on the level above them who is the strategic programmer. And that’s what this skill does.” 09:32

Verified Claims

Claim: The deep-module / shallow-module distinction comes from John Ousterhout’s A Philosophy of Software Design. 02:56

Claim: “Adapter” is a term from hexagonal architecture. 04:14

Claim: Seams are where you do unit/integration testing, e.g. by inserting a mock. 03:43
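
The clock example from the video (a real clock in production, a fake clock in tests) can be sketched as follows. This is a minimal illustration, not code from Pocock's repo: `Clock`, `systemClock`, `fakeClock`, and `isExpired` are hypothetical names chosen for the example.

```typescript
// The seam: an interface that the rest of the code depends on.
interface Clock {
  now(): Date;
}

// Production adapter: reads the real system time.
const systemClock: Clock = {
  now: () => new Date(),
};

// Test adapter: always returns the same, caller-chosen instant.
function fakeClock(iso: string): Clock {
  return { now: () => new Date(iso) };
}

// Code under test knows only the Clock interface, never which adapter it got.
function isExpired(expiresAt: Date, clock: Clock): boolean {
  return clock.now().getTime() >= expiresAt.getTime();
}

// In a test, time is fully controlled through the seam.
const expiry = new Date("2024-01-01T00:00:00Z");
const beforeExpiry = isExpired(expiry, fakeClock("2023-12-31T23:59:59Z"));
const afterExpiry = isExpired(expiry, fakeClock("2024-01-01T00:00:01Z"));
```

Production code would pass `systemClock` to the same function. Because the swap happens at the seam, `isExpired` needs no changes to become testable.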

Claim: Libraries like TanStack Query are good deep modules — lots of complexity behind a super-simple interface. 03:21

  • Sources: TanStack Query — Overview
  • Verdict: Confirmed — the docs emphasize a simple promise-based interface with no global state to manage, while remaining “configurable down to each observer instance,” matching the deep-module definition.

Claim: His skill is part of a GitHub skills repo “currently sitting at 41.5k stars.” 01:08

Claim: The demo repo is a React Router application that uses effect.ts under the hood. 05:43

  • Sources: Effect — official site, Effect-TS/effect on GitHub
  • Verdict: Inconclusive on the specific repo / Confirmed on the tech — both React Router and Effect (effect.ts) are real, production-grade TypeScript libraries; the private “course video manager” repo itself can’t be independently inspected.

Follow-up Questions

  1. Does running improve-codebase-architecture every couple of days actually reduce measurable complexity (e.g. coupling, change-amplification metrics) over time, or does it mostly surface candidates the team never acts on?
  2. How well does an LLM identify true “deepening opportunities” versus producing plausible-but-shallow refactors that increase churn — and what failure modes appear when a user just says “choose your recommended answers”?
  3. For genuinely legacy/untested codebases, what’s the most reliable AI-assisted workflow to build the test “harness” before refactoring, given the chicken-and-egg problem that you need seams to test but need tests to safely create seams?
