Goldfish

Heuristics + AI: Why Deterministic Rules Come First

The cheapest, most accurate part of Swordfish doesn't use AI at all. That's not an accident.

Matt Yonkovit

When we started building Swordfish, the obvious move was to point the biggest available model at a codebase and say “find everything that won’t work in Postgres.” It’s 2026. The models are good. Why write rules like it’s 2010?

We tried it. It works, kind of, and it’s a terrible default. What nobody tells you about throwing an entire legacy codebase at an LLM and asking it to find migration problems: it’s expensive, it’s non-reproducible, and it’s least reliable on exactly the stuff that’s easiest to catch with a thirty-character regex. You burn a fortune in tokens to have a probabilistic system maybe-notice that ROWNUM exists, when ROWNUM is a known keyword that a deterministic rule finds every single time, instantly, for free, offline. That’s using a Formula 1 car to deliver pizza. It’ll do it. You shouldn’t be proud of it.

So we built it the other way around. Deterministic first, AI second, and AI only where it actually earns its seat.

The funnel, and why the order matters

Swordfish runs four tiers, and they run in this order on purpose:

  1. 716 deterministic rules (700 declarative patterns, 16 code-level rules) sweep first. These are the known knowns: CONNECT BY, ROWNUM, NVL, AUTO_INCREMENT, T-SQL TOP, the empty-string trap. Patterns we’ve seen a thousand times and can describe exactly.
  2. Call-site detection and string extraction find SQL hiding in application code across seven languages, including the dynamic stuff a schema tool walks right past.
  3. A targeted LLM pass looks at the ambiguous cases the first two tiers flagged but couldn’t resolve.
  4. A full LLM sweep goes hunting for the long tail: the weird, the novel, the things nobody wrote a rule for.
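To make tier 2 concrete, here is a minimal sketch of call-site detection for a single language. It is not Swordfish’s implementation: the set of method names, the function name extract_sql_strings, and the choice to parse Python source with the standard ast module are all assumptions made for illustration. It pulls the literal string fragments out of the first argument of likely database calls and drops the dynamic parts.

```python
import ast

# Assumption: these method names mark database call sites. A real tool would
# carry a much larger, per-language list.
SQL_CALL_NAMES = {"execute", "executemany"}

def extract_sql_strings(source: str) -> list[tuple[int, str]]:
    """Return (line, sql_text) for string literals passed to likely DB calls."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call) or not node.args:
            continue
        func = node.func
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        if name not in SQL_CALL_NAMES:
            continue
        # Gather every string constant inside the first argument, in source
        # order, so concatenated fragments are kept and variables are skipped.
        constants = [
            n for n in ast.walk(node.args[0])
            if isinstance(n, ast.Constant) and isinstance(n.value, str)
        ]
        constants.sort(key=lambda n: (n.lineno, n.col_offset))
        if constants:
            found.append((node.lineno, "".join(n.value for n in constants)))
    return sorted(found)

sample = '''
cur.execute("SELECT * FROM emp WHERE ROWNUM <= 5")
cur.execute("SELECT NVL(name, 'n/a') FROM " + table)
print("not sql")
'''

for line, sql in extract_sql_strings(sample):
    print(line, sql)
```

The extracted strings can then be fed to the same deterministic rules as plain SQL files, which is how a ROWNUM buried in application code gets caught without a model.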

Results get deduplicated and corroborated across tiers, so the same problem doesn’t show up four times wearing four hats. By the time the expensive, probabilistic tier runs, most of the work is already done by the cheap, deterministic one. The LLM isn’t doing the bulk of the labor. It’s doing the interesting part of the labor, which is a completely different and much better use of it.
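One way to picture the dedup-and-corroborate step is the sketch below. The Finding shape, the tier labels, and the merge key of (file, line, construct) are assumptions for illustration, not Swordfish’s actual data model.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    construct: str  # e.g. "ROWNUM"
    tier: str       # hypothetical labels: "rules", "callsite", "llm-targeted", "llm-sweep"

def corroborate(findings: list[Finding]) -> list[dict]:
    """Merge findings that share (file, line, construct) and record which tiers agree."""
    merged = defaultdict(set)
    for f in findings:
        merged[(f.file, f.line, f.construct)].add(f.tier)
    return [
        {"file": file, "line": line, "construct": construct, "tiers": sorted(tiers)}
        for (file, line, construct), tiers in sorted(merged.items())
    ]

raw = [
    Finding("a.sql", 3, "ROWNUM", "rules"),
    Finding("a.sql", 3, "ROWNUM", "llm-sweep"),
    Finding("b.py", 10, "NVL", "callsite"),
]
for item in corroborate(raw):
    print(item)
```

A finding reported by more than one tier shows up once, with the list of tiers that agree on it attached.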

Four reasons deterministic wins the first pass

I’m a database guy who’s spent the last few years deep in AI infrastructure, so believe me when I say this isn’t anti-AI grumpiness. It’s about using each tool for what it’s genuinely good at. Rules win the first pass for four concrete reasons.

Money. A real migration codebase can run to millions of lines. Feeding all of it through a frontier model, repeatedly, as you re-scan during the project, is a budget line that’ll get your migration cancelled. A regex costs effectively nothing and you can run it a hundred times a day. The economics aren’t close, and the economics decide whether the project survives contact with finance.

Determinism. Run the deterministic tier twice on the same code and you get the identical answer. Every time. That sounds boring until you’ve tried to manage a migration where the “list of things to fix” changes between runs because the model felt different on Tuesday. You can’t build a project plan, a CI gate, or a burndown chart on a number that wobbles. Reproducibility isn’t a nice-to-have here. It’s the thing that makes the assessment trustworthy enough to plan against.
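Reproducibility is also something you can check mechanically. The sketch below hashes a findings list in a canonical order so that two runs over the same code can be compared with a single string. The function name findings_fingerprint and the dictionary shape of a finding are hypothetical.

```python
import hashlib
import json

def findings_fingerprint(findings: list[dict]) -> str:
    """Hash findings in a canonical order so identical results give identical digests."""
    canonical = sorted(json.dumps(f, sort_keys=True) for f in findings)
    return hashlib.sha256("\n".join(canonical).encode("utf-8")).hexdigest()

run_1 = [
    {"file": "a.sql", "line": 3, "rule": "ROWNUM"},
    {"file": "b.sql", "line": 9, "rule": "NVL"},
]
# Same findings, discovered in a different order on a second run.
run_2 = list(reversed(run_1))

print(findings_fingerprint(run_1) == findings_fingerprint(run_2))
```

A CI gate could store the digest from the last accepted run and fail the build when it changes without a matching code change.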

Precision where precision is free. A model can hallucinate that your code uses CONNECT BY when it doesn’t, or miss it when it does, because it’s reasoning probabilistically. A rule that matches CONNECT BY finds exactly the lines with CONNECT BY, no more, no less. For the known patterns, deterministic detection is simply more accurate than the smartest model, and it comes with a citation: here’s the file, here’s the line, here’s the rule that fired. No “trust me.”
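Here is a minimal sketch of what “a rule with a citation” can look like. The rule ids, the patterns, and the Citation type are invented for illustration and are not taken from Swordfish’s rule set. A plain regex like this will also match inside comments and string literals, so treat it as the idea rather than a finished rule.

```python
import re
from typing import NamedTuple

class Citation(NamedTuple):
    file: str
    line: int
    rule_id: str
    text: str

# Hypothetical rule ids and patterns, for illustration only.
RULES = {
    "ORA-CONNECT-BY": re.compile(r"\bCONNECT\s+BY\b", re.IGNORECASE),
    "ORA-NVL": re.compile(r"\bNVL\s*\(", re.IGNORECASE),
}

def scan(file: str, sql: str) -> list[Citation]:
    """Return one citation per (line, rule) match: file, line number, rule id, source text."""
    hits = []
    for number, line in enumerate(sql.splitlines(), start=1):
        for rule_id, pattern in RULES.items():
            if pattern.search(line):
                hits.append(Citation(file, number, rule_id, line.strip()))
    return hits

query = "SELECT NVL(mgr, 0)\nFROM emp\nCONNECT BY PRIOR empno = mgr"
for hit in scan("emp.sql", query):
    print(f"{hit.file}:{hit.line} [{hit.rule_id}] {hit.text}")
```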

It runs in the dark. Rules need no network, no API key, no provider. They run fully offline, which matters enormously when the thing you’re scanning is a bank’s proprietary core or a healthcare system nobody’s allowed to ship to an external API. The deterministic tier delivers the bulk of the value air-gapped, before a single byte leaves the building.

So what is the AI actually for?

This is where the “right-sized AI” thing I keep banging on about comes in. The LLM isn’t there to do what a rule can do cheaper. It’s there to do what a rule can’t do at all:

  • Read a 600-line dynamically-assembled query nobody could write a pattern for, and tell you what dialect it’s even in.
  • Look at a flagged construct in context and judge whether it’s actually a problem here or a false positive.
  • Explain why a behavioral trap matters in plain language a developer can act on, instead of just pointing at a line.
  • Draft a candidate rewrite for the human to review and edit.

That’s judgment, synthesis, and language. That’s the stuff models are genuinely, almost magically good at. Spending them there instead of on find-the-keyword is the difference between a tool that’s smart and a tool that’s just expensive.

And this is the part I want to land: rules have a ceiling. They catch the known patterns brilliantly and they cannot reason about the novel ones, full stop. That ceiling is exactly where we hand off to AI. It was never heuristics versus AI. It’s heuristics then AI, each doing the job the other is bad at. The regex doesn’t get tired and the model doesn’t get bored, and between them you cover ground neither could cover alone.

If you’re building anything that analyzes code at scale right now, that’s the pattern I’d steal: do the cheap, certain, reproducible thing first, exhaust it, and only then spend your expensive probabilistic budget on the residue that actually needs a brain. Your accuracy goes up, your bill goes down, and you can finally tell someone exactly why a finding exists.
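The shape of that pattern fits in a few lines. This is a sketch under assumptions: the KNOWN pattern is a toy stand-in for a real rule set, and ask_model is a placeholder for whatever model client you use, not a real API. Only statements that no rule resolves are sent to the model.

```python
import re
from typing import Callable

# Toy stand-in for a deterministic rule set.
KNOWN = re.compile(r"\b(ROWNUM|NVL|CONNECT\s+BY)\b", re.IGNORECASE)

def assess(statements: list[str], ask_model: Callable[[str], str]) -> dict[str, list]:
    """Resolve what the rules can; send only the unresolved residue to the model."""
    by_rule, by_model = [], []
    for stmt in statements:
        match = KNOWN.search(stmt)
        if match:
            by_rule.append((stmt, match.group(1).upper()))
        else:
            # The expensive, probabilistic call happens only for the residue.
            by_model.append((stmt, ask_model(stmt)))
    return {"rules": by_rule, "model": by_model}

def fake_model(stmt: str) -> str:
    # Placeholder that stands in for a real LLM call.
    return "needs review"

report = assess(
    ["SELECT * FROM t WHERE rownum < 5", "SELECT weird_vendor_fn(x) FROM t"],
    fake_model,
)
print(report)
```

In this toy version a statement that matches a rule never reaches the model. A fuller funnel would still let a later sweep look at everything, as the four tiers described earlier do.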

Next time someone pitches you “AI finds everything,” ask them what it costs to run twice, whether it gives the same answer both times, and whether it can show you the line. Then ask why they’re not using a regex for the regex-shaped problems. I’d love to hear the answer.


Swordfish is an open-source (Apache-2.0) assessment harness for migrating Oracle, MySQL, SQL Server, Sybase, and DB2 to PostgreSQL. It shows you what’s in your codebase and what needs to change, and it hands scoped tasks to the copilot you already use. Source: github.com/EnterpriseDB/swordfish-migrations