Goldfish

deep-dive

The Schema Is the Easy Part: A Field Guide to Behavioral Traps

A catalog of the migrations that pass every test, ship clean, and quietly corrupt your data three weeks later.

Matt Yonkovit · 5 min read

The worst migration bug I ever watched unfold didn’t throw an error. It didn’t fail a test. The cutover went fine, the smoke tests were green, everyone went home. Three weeks later, finance noticed a report was off by a couple percent. It took days to trace, because nothing was broken — the system was running exactly as written. The problem was that “as written” meant something subtly different on the new database, and the difference was invisible until it had quietly compounded across thousands of rows.

That’s the category I want to talk about, because it’s the one that humbles people. Syntax errors are loud and easy. A CONNECT BY that doesn’t parse stops you cold, you fix it, you move on. The dangerous bugs are the ones where the code ports perfectly — compiles, runs, returns results — and the results are wrong. I call them behavioral traps, and a real chunk of why Swordfish exists is to drag them into the light before they reach production.

Why they’re invisible

A schema converter sees structure. It reads a VARCHAR2(50) and tells you it’s a varchar(50), and it’s right. What it can’t see is behavior — what the engine does at runtime with the values in that column. And behavior is where databases secretly, stubbornly disagree.

The reason these slip through every safety net you have:

  • Your tests probably don’t cover them. You wrote tests for your business logic, not for “does this database treat the empty string as NULL.” Nobody writes that test, because on the old database it was never a question.
  • They’re data-dependent. The bug only fires when a specific value shows up — an empty string, a trailing space, a number that divides unevenly. Your test fixtures might never contain that value. Production absolutely does.
  • They compound silently. One wrong comparison doesn’t crash; it just quietly includes or excludes a row. Multiply by a few million rows and a few weeks and you get a number that’s off, with no stack trace pointing at why.

The field guide

Here’s the catalog I’d hand a team starting a migration, drawn from the trap families Swordfish maintains in its behavioral-difference catalog. Read these and you’ll recognize at least one from a project you’ve shipped.

Oracle: the empty string is NULL. Oracle stores '' as NULL; PostgreSQL keeps them distinct. Every WHERE col = '', every NVL that was quietly catching empty strings, every “is this field blank” check changes meaning. This is the single highest-impact Oracle trap, and it produces zero syntax errors.

MySQL: case-insensitive by default. MySQL’s default collation says 'Matt' = 'matt'. PostgreSQL says they’re different strings. Your login lookup, your uniqueness check, your dedup job all assumed the MySQL behavior. After migration, you get duplicate accounts and “user not found” for people who type their email in the wrong case.

MySQL: zero dates and loose typing. '0000-00-00' is a real stored value in MySQL and a hard error in PostgreSQL. And MySQL will happily compare '5' = 5; PostgreSQL won’t. Code that leaned on either behavior changes — sometimes by erroring (the lucky case), sometimes by silently matching different rows.

SQL Server: LEN() ignores trailing spaces. LEN('abc ') is 3 in SQL Server and 6 with PostgreSQL’s length(). Any validation or comparison built on string length shifts. Same family as the Oracle empty-string trap: a function that “obviously” does the same thing, doesn’t.

DB2: CHAR(n) blank-padded comparison. DB2 compares fixed-width CHAR columns ignoring trailing-space differences, so 'abc' and 'abc ' match. Carry that to PostgreSQL without care and joins and lookups on CHAR columns start quietly missing rows.

DB2: DECIMAL division scale. DB2 and PostgreSQL apply different precision rules to DECIMAL division, so a money calculation can land on a different number of decimal places — a rounding difference that compounds in exactly the financial code where you least want surprises.

Sybase: ISNULL truncation and + overloading. Sybase’s ISNULL coerces and can truncate to the first argument’s width, and Sybase overloads + for both string concatenation and arithmetic. Get the + wrong in translation and you either error out or silently add numbers that were meant to be joined.

Notice the pattern. None of these is exotic. They’re all “this everyday operation has a different opinion on the new engine,” and they all hide behind code that looks completely fine.

What to actually do about them

Two things, and neither is “trust the tool to fix it,” because the tool can’t — the correct behavior depends on what your code meant, and only you know that.

First, find them before you migrate. This is the assessment job: Swordfish flags every behavioral-trap site as a distinct, high-priority finding category, separate from the syntax conversions, precisely because they need different handling. You don’t want these buried in a list of 4,000 mechanical changes. You want them pulled out, labeled “this will pass your tests and still be wrong,” and put in front of a human.

Second, test for behavior, not compilation. The reason these survive is that “it runs” got mistaken for “it’s correct.” For each flagged trap, write the test that the original team never did: feed it the empty string, the trailing space, the uneven division, the mixed-case duplicate. Make the difference visible in CI instead of in a finance report three weeks after cutover.

The schema is the easy part. It’s documented, it’s deterministic, tools have handled it for twenty years. The behavior is where migrations actually go wrong, and it goes wrong quietly. Respect that, hunt the traps on purpose, and you avoid the specific kind of bug that doesn’t page you — it just slowly makes you wrong.

Next: why, given all this, we deliberately refuse to apply any of the fixes for you.


Swordfish is an open-source (Apache-2.0) assessment harness for migrating Oracle, MySQL, SQL Server, Sybase, and DB2 to PostgreSQL — it shows you what’s in your codebase, what needs to change, and hands scoped tasks to the copilot you already use. Source: github.com/EnterpriseDB/swordfish-migrations