opinion
Heuristics vs LLMs: It's a Division of Labor, Not a Cage Match
It is not heuristics versus LLMs — it is a division of labor. Heuristics own the facts, the LLM owns the judgment, and a cheap check keeps it honest.
There are two loud camps in every “should we use AI for this?” meeting, and they’re both answering the wrong question.
Camp one: rules are safer. Deterministic, auditable, no hallucinations, ship it. Camp two: throw the model at everything. Why hand-code logic in 2026 when a frontier model can just figure it out? They argue past each other for forty minutes, somebody books a follow-up, nothing happens.
I’ve sat in that meeting more times than I want to admit, and the deal is this: it’s not heuristics versus LLMs. That framing is a cage match nobody needs. The real question is a division of labor — which engine should own which job — and most AI features get the answer exactly backwards.
Two engines, two very different temperaments
A heuristic is encoded expertise. “If shared_buffers is under 25% of RAM, flag it.” It’s a rule a human wrote down. It costs nothing to run, returns the same answer every time, you can read exactly why it fired, and it will never, ever make up a number. Its weakness is the flip side of its strength: it’s rigid. It can’t say “well, that high connection count is fine because you’re running a pooler.” It doesn’t do nuance, it doesn’t synthesize across twenty findings, and it can’t write you a paragraph a human actually wants to read.
An LLM is the opposite temperament. Give it twenty findings and it’ll tell you the two that actually matter and why, in plain language, accounting for context a rule could never encode. That’s real synthesis, and it’s genuinely hard to do any other way. The catch: it’s probabilistic. Ask it to recite a fact and it’ll occasionally invent one, with total confidence, in the same voice it uses for the things it got right.
So you’ve got a cheap, reliable engine that’s rigid, and an expensive, flexible engine that lies sometimes. Stated that plainly, the assignment writes itself.
Right-size the job to the engine
I’ve been beating this “right-sized AI” drum for a while: not everything needs the biggest model, and a lot of things don’t need a model at all. This is the same idea, one level down. Heuristics own the facts. The LLM owns the judgment.
You do not need a billion-parameter transformer to compare two numbers. Comparing 128MB to 25% of 512MB is a rule. Encode it. It’s free, it’s instant, and it cannot hallucinate the comparison. Reserve the model for the thing only it can do: “these three findings compound into a connection-storm risk under your write pattern, and you should fix the memory one first.” No rule writes that sentence. That’s what you’re paying the LLM for.
The vibe-coded default gets this backwards. The reflexive move is to dump all your raw data at the model and say “analyze this,” prose and all. Now you’ve handed the fact-reciting job to the engine that sometimes lies about facts. I watched a tool do exactly this and confidently report shared_buffers at 256MB when the real value was 32MB. The model didn’t read the gauge. It pattern-matched a plausible number, because that’s what probabilistic engines do when you ask them to be a database of record. Wrong engine, wrong job.
The best systems don’t choose — they compose
The mistake hiding inside the cage-match framing is that you have to pick one. You don’t. The strongest pattern uses both, each for its job, with the cheap one keeping the expensive one honest.
The clearest example I’ve built is a two-part judge for an AI database tool. When the model makes a claim — “shared_buffers is 32MB, too low” — part one is a pure heuristic: does the number it cited match the number we actually collected? That check is free, deterministic, and runs in microseconds. If the model fabricated the value, a string comparison catches it before a single extra token is spent. Only the claims that survive that cheap filter go to part two: a vote among multiple models on the judgment calls a rule can’t settle. Heuristic for the facts, LLM for the opinion, heuristic verifying the LLM. Both engines, doing what each is actually good at.
Notice what that buys you. The expensive, fallible engine never gets to be the system of record for a fact. It gets to do synthesis, and even then its synthesis is grounded in values a deterministic check already confirmed. You get the LLM’s flexibility without betting your output’s correctness on its worst habit.
This generalizes past databases
Swap the domain and the shape holds:
- RAG / Q&A: let the model synthesize the answer, but use a deterministic check to confirm the cited passage actually contains the claim. Don’t trust the model to remember whether its own citation is real.
- Document extraction: the model pulls fields from a messy invoice; a regex or format check validates the ones that have a knowable shape (dates, totals, IDs). Heuristic catches the hallucinated total.
- Code review bots: the model proposes comments; a deterministic pass confirms the line it’s complaining about actually exists in the diff and actually does what the model claims.
Every one of these is the same division of labor. The model does the part that needs judgment. A cheap, boring check owns the part that has a right answer.
So, do both have a place?
Yes — and pretending otherwise is how you end up with either a rigid tool nobody finds useful or a confident one nobody can trust. The skeptics are right that you shouldn’t let a probabilistic model be your source of truth. The maximalists are right that rules alone can’t do synthesis. They’re both describing half of a system that works.
The move, next time you’re building an AI feature: list every job it does, and for each one ask “does this have a knowable right answer?” If yes, that’s a heuristic — write the rule, it’s cheaper and it can’t lie. If it genuinely needs judgment, that’s the LLM’s job, and you should still verify whatever facts it leans on. Use the expensive engine for less than your instinct says, and verify the rest.
That’s not anti-AI. It’s just using the right tool for the job, which is the oldest engineering advice there is. The whole worked example — judge, grounding, the works — is open source if you want to steal the pattern.
— The HOSS