Goldfish

opinion

Swordfish: A Harness for Your Copilot, Not Migration-in-a-Box

We're not trying to replace your LLM. We're trying to make it stop guessing.

Matt Yonkovit

Every few weeks another product shows up promising to migrate your database “automatically, end to end, powered by AI.” Push a button, get a modernized codebase, pop the champagne. And every developer who has actually shipped a migration reads that and thinks the same thing: sure you will.

We thought it too. So when we sat down to build Swordfish, the first decision we made was what not to build. We are not building migration-in-a-box. We’re not trying to out-LLM the LLMs. We watched the last post’s problem (the inferred knowledge, the behavioral traps, the SQL nobody can find) and concluded that a black box claiming to handle all of it is either lying or about to corrupt someone’s revenue table. Probably both.

Here’s the bet we made instead: your copilot is already good. Claude, GPT, your local coder model, Cursor, whatever you’ve got — they’re genuinely capable at the mechanical work of translating a stored procedure or rewriting a query. The thing they’re bad at is the thing from the last post: they can’t see the whole codebase, they don’t know your business, and they guess confidently when they should stop and ask. They’re missing context and ground truth, not horsepower.

So we built the context and ground truth. Swordfish is a harness — it wraps around the tools you already use and feeds them exactly what they’ve been missing. It shows you what’s in your codebase, tells you what needs to change and why, lets you edit those recommendations, and then hands your copilot a scoped, contextual task instead of “here’s 800,000 lines, good luck.” It’s a helper, not an autopilot. It makes the migration more efficient. It does not make it zero work, and anyone who promises you zero work is selling you the bug, not the feature.

What the harness actually does

Strip away the marketing and it’s four moves: assess, categorize, recommend, hand off. You stay in the driver’s seat for all four.

1. It finds the work, including the work you can’t. This is the four-tier discovery funnel, and it’s the part I’m proudest of because it directly attacks the “you can’t migrate what you can’t find” problem:

  • Tier one: 716 deterministic rules (700 declarative patterns plus 16 code-level rules) sweep the codebase for the known stuff: CONNECT BY, ROWNUM, AUTO_INCREMENT, T-SQL TOP, the patterns we’ve seen a thousand times. Fast, precise, runs offline, costs nothing.
  • Tier two: call-site detection (93 database-access signatures across seven languages) plus string extraction finds SQL hiding in your application code: the embedded queries, the concatenated dynamic SQL, the stuff a schema tool walks right past.
  • Tiers three and four: then, and only then, we bring in the LLM. First a targeted pass over the ambiguous cases, then a full sweep for the long tail nobody wrote a rule for. Results get deduplicated and corroborated across tiers, so you’re not drowning in the same finding reported four ways.

The point of the funnel is restraint. Cheap deterministic methods do the bulk; the LLM is reserved for where it actually adds value. Your token bill thanks you. So does your trust in the output.
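To make the funnel idea concrete, here is a minimal sketch of the first two tiers plus the "only send the leftovers to the LLM" step. It is not Swordfish's code: the four regexes, the `embedded-sql` heuristic, and the names `RULES`, `SQL_LITERAL`, `Finding`, `scan`, and `llm_queue` are all assumptions made up for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical tier-one rules: four regexes standing in for a real rule set.
RULES = {
    "oracle-connect-by": re.compile(r"\bCONNECT\s+BY\b", re.IGNORECASE),
    "oracle-rownum": re.compile(r"\bROWNUM\b", re.IGNORECASE),
    "mysql-auto-increment": re.compile(r"\bAUTO_INCREMENT\b", re.IGNORECASE),
    "tsql-top": re.compile(r"\bSELECT\s+TOP\s+\d+", re.IGNORECASE),
}

# Hypothetical tier-two heuristic: a quoted string literal that starts with a SQL verb.
SQL_LITERAL = re.compile(
    r"""(["'])\s*(?:SELECT|INSERT|UPDATE|DELETE)\b.*?\1""", re.IGNORECASE
)


@dataclass(frozen=True)
class Finding:
    path: str
    line: int
    rule: str


def scan(path, text):
    """Run the cheap deterministic checks; the set removes duplicate findings."""
    findings = set()
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in RULES.items():
            if pattern.search(line):
                findings.add(Finding(path, lineno, rule))
        if SQL_LITERAL.search(line):
            findings.add(Finding(path, lineno, "embedded-sql"))
    return sorted(findings, key=lambda f: (f.path, f.line, f.rule))


def llm_queue(findings):
    """Locations where SQL was found but no deterministic rule explained it."""
    rules_by_location = {}
    for f in findings:
        rules_by_location.setdefault((f.path, f.line), set()).add(f.rule)
    return sorted(
        loc for loc, rules in rules_by_location.items() if rules == {"embedded-sql"}
    )


sample = '''q = "SELECT id FROM orders WHERE ROWNUM <= 10"
name = "Alice"
r = "SELECT status FROM accounts WHERE " + clause
'''

found = scan("app.py", sample)
for f in found:
    print(f.path, f.line, f.rule)
print("send to LLM:", llm_queue(found))
```

In this toy run, line 1 is fully explained by a rule, so it never reaches a model. Only the concatenated dynamic SQL on line 3 lands in the LLM queue, which is the restraint the funnel is after.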

2. It tells you what’s a real problem, and how scared to be. A raw list of “here are 12,000 things that are different” is useless. Swordfish categorizes every finding by severity and effort, separates the clean ports from the rewrites, and, critically, flags the behavioral traps: the “compiles fine, runs wrong” cases from the last post. The empty-string-is-NULL stuff. The case-sensitivity stuff. The LEN()-ignores-trailing-spaces stuff. The migrations that pass every test and then lie to you in week three. A schema diff will never warn you about those. We maintain a whole catalog of them precisely because they’re the ones that hurt.
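If "compiles fine, runs wrong" sounds abstract, here is a small Python model of two of those traps. The functions only imitate the documented semantics (Oracle treating a zero-length VARCHAR2 as NULL, SQL Server's LEN() not counting trailing spaces, PostgreSQL's length() counting them for text values). The function names and sample rows are invented for illustration and have nothing to do with Swordfish's catalog.

```python
def oracle_is_null(value):
    # Oracle treats a zero-length VARCHAR2 string as NULL.
    return value is None or value == ""


def postgres_is_null(value):
    # PostgreSQL keeps '' and NULL distinct.
    return value is None


def sqlserver_len(value):
    # SQL Server LEN() does not count trailing spaces.
    return len(value.rstrip(" "))


def postgres_length(value):
    # PostgreSQL length() on a text value counts trailing spaces.
    return len(value)


# Invented sample data: the same rows, queried with "WHERE middle_name IS NULL".
rows = [
    {"id": 1, "middle_name": None},
    {"id": 2, "middle_name": ""},
    {"id": 3, "middle_name": "Lee"},
]

oracle_ids = [r["id"] for r in rows if oracle_is_null(r["middle_name"])]
postgres_ids = [r["id"] for r in rows if postgres_is_null(r["middle_name"])]

print("Oracle rows:    ", oracle_ids)
print("PostgreSQL rows:", postgres_ids)
print("LEN('AB  ') on SQL Server:", sqlserver_len("AB  "))
print("length('AB  ') on PostgreSQL:", postgres_length("AB  "))
```

The query text is identical on both sides and nothing errors. The row counts and lengths simply differ, which is why a schema diff has no way to see it.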

3. The recommendations are suggestions, and you can argue with them. Every finding comes with a recommended change and, where the LLM is involved, a confidence rating backed by multiple models cross-checking each other (more on why we make the models argue in a later post). But (and this matters) you can edit any of it. Tweak the suggestion, reject it, adjust the prompt that’ll be sent to your agent. Swordfish has opinions; it does not have the final word. You do. It assesses and recommends. It does not decide.
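One way to picture "it recommends, you decide" is a recommendation record that the reviewer can edit or reject before anything is turned into a prompt. This is a rough sketch under my own assumptions: the `Recommendation` fields, the status values, and the prompt wording are hypothetical, not Swordfish's data model.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Recommendation:
    # All field names here are hypothetical.
    finding_id: str
    path: str
    line: int
    severity: str
    effort: str
    suggestion: str
    status: str = "proposed"  # "proposed", "edited", or "rejected"


def edit(rec, suggestion):
    """Return a copy carrying the reviewer's wording; the original is untouched."""
    return replace(rec, suggestion=suggestion, status="edited")


def reject(rec):
    return replace(rec, status="rejected")


def render_prompt(rec, context):
    """Build a scoped task for a copilot, or None if the reviewer rejected it."""
    if rec.status == "rejected":
        return None
    return "\n".join(
        [
            f"File: {rec.path} (line {rec.line})",
            f"Severity: {rec.severity}, effort: {rec.effort}",
            f"Requested change: {rec.suggestion}",
            "Surrounding code:",
            context,
        ]
    )


rec = Recommendation(
    finding_id="F-1",
    path="billing/report.sql",
    line=42,
    severity="high",
    effort="low",
    suggestion="Replace ROWNUM <= 10 with LIMIT 10.",
)
edited = edit(rec, "Replace ROWNUM <= 10 with ORDER BY created_at LIMIT 10.")
print(render_prompt(edited, "SELECT * FROM invoices WHERE ROWNUM <= 10"))
```

The human edit happens before the prompt exists, so the copilot only ever sees the version you signed off on.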

4. It hands off to the copilot you already trust. When you’re ready to actually change code, Swordfish doesn’t lock you into some proprietary agent. You pick:

  • Export a prompt pack. Bundle the findings plus surrounding context into ready-to-paste prompts for Cursor, Claude Code, GitHub Copilot, Cline, whatever’s in your editor.
  • Direct LLM mode. Point it at your provider (or your local model; we’re provider-agnostic and air-gap-safe by default) and let it draft rewrites in place.
  • Permission-gated agent mode. Run the Claude Agent SDK in-process where every single tool call is gated through our permission UI, so the agent can’t touch a file you didn’t approve.
  • Sandboxed copilot CLIs. Invoke claude, codex, or aider as subprocesses with a fixed, locked-down argument list. (We deliberately strip the “yolo” / auto-approve flags. On purpose. Always.)
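For the subprocess option, the general shape of a "fixed, locked-down argument list" could look something like the sketch below. Treat every specific as an assumption: the entries in `BLOCKED_FLAGS` are placeholder names (the real auto-approve flags differ per tool and version), `FIXED_ARGS` holds only the bare command names, and how the prompt actually reaches each CLI is deliberately left out.

```python
import subprocess

# Placeholder flag names for illustration only; check each tool's own
# documentation for the real auto-approve switches.
BLOCKED_FLAGS = frozenset({"--yolo", "--auto-approve"})

# Hypothetical fixed prefixes: just the executable names from the list above.
FIXED_ARGS = {
    "claude": ("claude",),
    "codex": ("codex",),
    "aider": ("aider",),
}


def build_argv(tool, configured_flags=()):
    """Return the argument vector for a known tool, minus any blocked flag."""
    if tool not in FIXED_ARGS:
        raise ValueError(f"unsupported tool: {tool}")
    safe = [
        flag
        for flag in configured_flags
        # Compare on the flag name so "--auto-approve=true" is caught too.
        if flag.split("=", 1)[0] not in BLOCKED_FLAGS
    ]
    return [*FIXED_ARGS[tool], *safe]


def run_tool(tool, workdir, configured_flags=()):
    """Launch the CLI with a list argv (no shell), so nothing gets re-parsed."""
    argv = build_argv(tool, configured_flags)
    return subprocess.run(argv, cwd=workdir, check=False)


print(build_argv("aider", ["--yolo", "--example-flag", "--auto-approve=true"]))
```

Passing a list to `subprocess.run` rather than a shell string means a stray flag in a config file cannot smuggle extra commands in, and the filter runs before the process ever starts.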

The part that should make you trust it: it never touches your source

I want to be loud about this one because it’s where most “AI modernization” tools quietly terrify me.

Swordfish does not modify your source tree. Ever. Every rewrite, whether from the direct LLM, the agent, or a subprocess copilot, lands in a .new sibling file or a configured mirror directory. You diff it. You review it. You decide what merges. The original code sits there untouched the entire time, which means the worst case for a bad suggestion is “you read a diff and said no,” not “an agent rewrote 200 files at 2am and now git blame is a crime scene.”
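The "never touch the source" rule is simple enough to sketch. The snippet below is an illustration under assumptions, not the project's implementation: the function names are made up, and appending `.new` to the full file name (so `q.sql` becomes `q.sql.new`) is my guess at what a sibling file could look like.

```python
import difflib
from pathlib import Path


def output_path(source, project_root, mirror_root=None):
    """Pick where a rewrite lands: a .new sibling, or the same path under a mirror."""
    source = Path(source)
    if mirror_root is None:
        return source.with_name(source.name + ".new")
    return Path(mirror_root) / source.relative_to(project_root)


def stage_rewrite(source, new_text, project_root, mirror_root=None):
    """Write the proposed rewrite next to (never over) the original file."""
    target = output_path(source, project_root, mirror_root)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(new_text, encoding="utf-8")
    return target


def review_diff(source, staged):
    """Unified diff between the untouched original and the staged rewrite."""
    old = Path(source).read_text(encoding="utf-8").splitlines(keepends=True)
    new = Path(staged).read_text(encoding="utf-8").splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(old, new, fromfile=str(source), tofile=str(staged))
    )
```

A reviewer would call `stage_rewrite(...)`, read the output of `review_diff(...)`, and copy the staged file over the original only after deciding the change is right. Until then the source tree is byte-for-byte what it was.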

That’s not a limitation we apologize for. It’s the whole philosophy. The human stays in the loop because the human is the only thing in this pipeline that knows what status = 3 means.

“But a real product would just do it for me”

I hear this, and I get the instinct. Push-button is seductive. But run the math the way I run it.

The pitch for full automation is “90% automated, you only touch 10%.” Sounds great until you ask the only question that matters: which 10%? Because in a migration, the last 10% is the inferred knowledge and the behavioral traps: the exact stuff that breaks production and the exact stuff an automated tool is worst at. And the killer: with a black box, you don’t know which findings are in the safe 90% and which are in the dangerous 10%. You can’t tell the confident-and-correct from the confident-and-wrong. So you end up re-reviewing everything anyway, except now you’re reverse-engineering a machine’s guesses instead of working from a map. That’s slower, not faster.

A great assessment that makes a developer three times faster and keeps them in control of the dangerous parts beats a black box that’s right 85% of the time and won’t tell you which 85%. Every time. I’d rather give you a sharp tool and a clear picture than a magic wand that occasionally sets the building on fire.

That’s the trade Swordfish makes. We’re not the migration. We’re the thing that makes you (and the AI tools you already have) way better at doing the migration. Helper, not hero.

Try the boring version first

Here’s my challenge, and it costs you nothing. Don’t take my word for any of this. Spin up the assessment against a real repo — cd ops && docker compose up -d, point it at your codebase, let it run. Don’t migrate anything. Just look at the map.

Look at how much SQL it found that you forgot was there. Look at the behavioral traps it flagged that would’ve passed every test you have. Look at one finding, edit the recommendation, and export a prompt for your copilot.

Then decide whether you’d rather start your migration from that — a clear, editable, contextual picture — or from a button that promises the moon and hands you a codebase full of fluent guesses.

I already know which one I’d pick. But I’ve been doing this for twenty years and I’m biased, so go see for yourself.


Swordfish is an open-source (Apache-2.0) assessment harness for migrating Oracle, MySQL, SQL Server, Sybase, and DB2 to PostgreSQL — it shows you what’s in your codebase, what needs to change, and hands scoped tasks to the copilot you already use. Source: github.com/EnterpriseDB/swordfish-migrations