Knowledge, Logic, and Constraints

June 13, 2026 · 11 min read

A bank’s loan-approval system has 12,000 rules. Some come from regulation, some from internal policy, some from credit-risk models. Every loan decision must be defensible to an auditor and a customer. The team is asked: should we replace this with an LLM? The answer is no, and not because LLMs are bad. It’s because the correct answer to “is this customer eligible for this loan, given current rules” is a deduction, not an opinion.

In the previous post we covered classical AI’s search algorithms, the part that asks “how do I get from here to there?” This post covers the part that asks “what follows from what I know?”

This is the symbolic AI of the 1980s, the field that gave us expert systems and Prolog and the long winter that followed. Most of it failed as a way to build “intelligence.” Almost all of it, in transformed shape, is still in production today, doing useful work that a transformer can’t do.

Two ways to know things

Symbolic AI starts with a separation between facts and rules.

Before any notation, here’s the whole idea in plain English. A fact is something you state outright: Alice is Bob’s parent. Bob is Charlie’s parent. A rule is a recipe for making new facts from old ones: anyone who is the parent of a child’s parent is that child’s grandparent. A query is a question you put to the system: who are Charlie’s grandparents?

The system answers by lining the rule up against the facts. It needs a parent of a parent of Charlie. Bob is Charlie’s parent; Alice is Bob’s parent; so Alice is Charlie’s grandparent. Nobody typed that answer in. It follows from what was typed in, and the system can show its working: which rule fired, matched against which facts. That auditability is why this style of reasoning still runs loan approvals.

Everything in this post is that trick, scaled up: more facts, more rules, cleverer ways of matching them. The notation that follows is nothing more than a compact way of writing facts, rules, and queries down.

Facts describe the state of the world: parent(alice, bob), temperature(sensor_3, 78), status(account_2847, "frozen"). Rules describe relationships: if parent(X, Y) and parent(Y, Z) then grandparent(X, Z). Given a knowledge base of facts and rules, you can derive new facts by applying the rules.
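To make the derivation step concrete, here is a minimal Python sketch of the grandparent rule applied to the two parent facts. The tuple encoding and the function name derive_grandparents are illustrative choices, not part of any particular engine; the point is that each derived fact comes back with the facts that justify it.

```python
# Facts are plain tuples: (predicate, subject, object).
facts = {
    ("parent", "alice", "bob"),
    ("parent", "bob", "charlie"),
}


def derive_grandparents(facts):
    """Apply: if parent(X, Y) and parent(Y, Z) then grandparent(X, Z).

    Returns a dict mapping each derived fact to the two facts that justify it.
    """
    parents = [f for f in facts if f[0] == "parent"]
    derived = {}
    for (_, x, y1) in parents:
        for (_, y2, z) in parents:
            if y1 == y2:  # the shared variable Y must match in both facts
                derived[("grandparent", x, z)] = (("parent", x, y1), ("parent", y2, z))
    return derived


proofs = derive_grandparents(facts)
for fact, support in proofs.items():
    print(fact, "because", support)
```

The justification attached to each derived fact is the "show its working" part: an auditor can see exactly which stored facts produced the conclusion.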

This is deduction: from “all humans are mortal” and “Socrates is a human,” conclude “Socrates is mortal.” It feels primitive in 2026 (of course you can do that), and that’s the point. A mechanism for guaranteed-correct derivation from knowledge is one of the most useful things in computer science. It just doesn’t get the headlines that probabilistic methods get.

The two main flavours of logic in classical AI:

Propositional logic. Statements are atomic (“it is raining,” “the door is open”) and combined with AND/OR/NOT/IF. No variables, no quantifiers. Decidable but limited.
First-order logic. Adds variables, quantifiers (“for all X,” “there exists Y”), and predicates with arguments. Vastly more expressive. Undecidable in general, but restricted fragments are decidable and often tractable.

Almost all production logic-based AI uses a restricted fragment of first-order logic, because the unrestricted version lets you write down problems no algorithm can ever solve.
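“Decidable” for propositional logic has a very plain meaning: with n atomic statements there are only 2^n truth assignments, so a procedure can check them all and is guaranteed to stop. The sketch below does exactly that to test whether a conclusion follows from premises. The proposition names raining and wet are made up for illustration, and real tools use far smarter procedures than full enumeration.

```python
from itertools import product


def entails(premises, conclusion, variables):
    """True if every assignment satisfying all premises also satisfies the conclusion."""
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(p(env) for p in premises) and not conclusion(env):
            return False  # found a counterexample
    return True


# Hypothetical propositions: "raining" and "wet" (the ground is wet).
variables = ["raining", "wet"]
rule = lambda e: (not e["raining"]) or e["wet"]  # IF raining THEN wet
fact = lambda e: e["raining"]                    # raining
goal = lambda e: e["wet"]

print(entails([rule, fact], goal, variables))  # rule plus fact
print(entails([rule], goal, variables))        # the rule on its own
```

With both the rule and the fact, the conclusion is entailed; with the rule alone it is not, because the assignment where it is neither raining nor wet is a counterexample.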

Prolog and the Datalog descendants

Prolog (1972) was the moment logic became programming. A Prolog program is a knowledge base; running the program means asking a query and letting the engine search for proofs. The famous example:

parent(alice, bob).
parent(bob, charlie).
grandparent(X, Z) :- parent(X, Y), parent(Y, Z).

?- grandparent(alice, Who).
Who = charlie.

Prolog had a moment in academic AI. It still ships, mostly behind the scenes: in type inference engines, some natural-language interfaces, and some constraint solvers built on top of it. SWI-Prolog and SICStus are the main implementations.

Datalog is Prolog’s well-behaved cousin. It’s a restricted subset (no function symbols, no negation in early variants) chosen specifically to be decidable and efficient. Any Datalog query against a finite fact base is guaranteed to terminate, and for a fixed set of rules it runs in time polynomial in the size of the fact base.

Datalog has had a renaissance:

Soufflé, a high-performance Datalog engine used for static program analysis (finding bugs and security issues in code).
Logica (Google), a Datalog dialect that compiles to SQL and runs on BigQuery.
Differential Datalog (VMware), incremental Datalog for live network analysis.
Datomic, a database that exposes Datalog as the query language.

The pattern that pays off: Datalog is the correct tool when your problem is “I have facts and rules, I want to ask which other facts follow.” Network reachability analysis, access-control reasoning, code analysis, recursive queries over relational data. None of these are LLM problems. All of them are Datalog problems.
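Network reachability is the classic recursive Datalog query, and the termination guarantee is easy to see in a small sketch. The Python below is a naive bottom-up evaluation under assumed names (link, reach, and the four-node network are all hypothetical): it keeps applying the two rules until no new facts appear. Production engines such as Soufflé use much more efficient evaluation strategies than this loop.

```python
def reachable(links):
    """Naive bottom-up evaluation of two Datalog-style rules:

        reach(X, Y) :- link(X, Y).
        reach(X, Z) :- reach(X, Y), link(Y, Z).

    Repeats until no new facts appear (a fixpoint).
    """
    reach = set(links)
    while True:
        new = {(x, z) for (x, y) in reach for (y2, z) in links if y == y2}
        if new <= reach:  # nothing new was derived, so we are done
            return reach
        reach |= new


# Hypothetical network: a -> b -> c -> a (a cycle), plus c -> d.
links = {("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")}
reach = reachable(links)
print(("a", "d") in reach)  # a can reach d through b and c
print(("d", "a") in reach)  # d has no outgoing links
```

The cycle in the example does not cause an infinite loop: there are only finitely many possible pairs over a finite set of nodes, so the set of derived facts must stop growing.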

Description Logic and ontologies

A specific corner of logic worth knowing about: Description Logic (DL), the formal underpinning of OWL (Web Ontology Language) and the Semantic Web stack.

DL is a family of decidable fragments of first-order logic specifically designed for representing concepts and the relationships between them. It’s the technology behind:

Medical ontologies. SNOMED CT (350,000 medical concepts with formal definitions), used in healthcare records worldwide.
Biological ontologies. Gene Ontology, Protein Ontology, and many others, used to integrate biological databases.
Industry ontologies in finance, manufacturing, and aerospace.
Knowledge graphs. Wikidata, Google’s Knowledge Graph, enterprise knowledge graphs at most large companies.

DL reasoners (HermiT, Pellet, FaCT++) can answer questions like “is concept A a subclass of concept B?”, “is this combination of facts consistent?”, “what concepts are equivalent?”, and do it provably. When a hospital system needs to reason “this patient has a condition that is a kind of cardiovascular disease, and they’re on a medication contraindicated for cardiovascular disease,” the reasoning is a DL inference, not an LLM call.
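The hospital example can be sketched in a few lines of Python, with heavy caveats. The concept names, the drug name, and both tables below are invented for illustration, and the code only follows explicitly stated “is a kind of” links. A real DL reasoner also infers subclass relationships from formal concept definitions, which this sketch does not attempt.

```python
# Hypothetical "is a kind of" axioms: child concept -> parent concepts.
subclass_of = {
    "atrial_fibrillation": {"arrhythmia"},
    "arrhythmia": {"cardiovascular_disease"},
    "cardiovascular_disease": {"disease"},
}

# Hypothetical contraindication table: drug -> concept it must not be combined with.
contraindicated_for = {"drug_x": "cardiovascular_disease"}


def is_subclass(concept, ancestor):
    """True if `concept` is `ancestor` or reaches it through is-a links."""
    stack, seen = [concept], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return False


def conflicts(conditions, medications):
    """List (condition, drug) pairs where the condition falls under a contraindicated concept."""
    return [
        (c, m)
        for c in conditions
        for m in medications
        if m in contraindicated_for and is_subclass(c, contraindicated_for[m])
    ]


print(conflicts(["atrial_fibrillation"], ["drug_x"]))
```

Nobody recorded that atrial fibrillation conflicts with the drug. The conflict follows from the hierarchy, which is the kind of inference the paragraph above describes.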

SAT and SMT solvers: logic that scales

A different lineage starts with the Boolean Satisfiability Problem (SAT): given a Boolean formula, find an assignment of true/false to its variables that makes it true. SAT is famously NP-complete: no polynomial-time algorithm is known.

And yet modern SAT solvers (MiniSat, Glucose, Kissat, CaDiCaL) routinely solve industrial instances with millions of variables in seconds. They use a combination of unit propagation, conflict-driven clause learning, and aggressive heuristics that took thirty years of research to perfect. The result is one of the most useful tools in computer science, used in:

Hardware verification. Virtually every modern chip design is checked with SAT-based tools before it ships.
Software verification. Memory safety, concurrency bugs, security properties.
Software-defined networking. Verifying that firewall rules and routing don’t allow forbidden traffic.
Automated theorem proving.
Cryptanalysis.
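For a feel of what happens inside a SAT solver, here is a small Python sketch of the classic DPLL procedure with unit propagation. It deliberately leaves out conflict-driven clause learning and the branching heuristics that make modern solvers fast, so treat it as the textbook skeleton rather than how MiniSat or Kissat are implemented. Clauses use the common integer encoding: 3 means x3 is true, -3 means x3 is false.

```python
def unit_propagate(clauses, assignment):
    """Repeatedly assign literals forced by single-literal clauses.

    Returns the simplified clause list, or None if a clause became empty (conflict).
    """
    clauses = [list(c) for c in clauses]
    while True:
        if any(not c for c in clauses):
            return None  # an empty clause cannot be satisfied
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses
        assignment[abs(unit)] = unit > 0
        # Drop satisfied clauses; remove the falsified literal from the rest.
        clauses = [[lit for lit in c if lit != -unit] for c in clauses if unit not in c]


def dpll(clauses, assignment=None):
    """Return a satisfying assignment {variable: bool}, or None if unsatisfiable.

    Variables that never needed a value are left out of the result.
    """
    assignment = dict(assignment or {})
    clauses = unit_propagate(clauses, assignment)
    if clauses is None:
        return None
    if not clauses:
        return assignment
    literal = clauses[0][0]  # naive branching choice; real solvers use heuristics
    for choice in (literal, -literal):
        result = dpll(clauses + [[choice]], assignment)
        if result is not None:
            return result
    return None


# (x1 OR x2) AND (NOT x1 OR x3) AND (NOT x2 OR NOT x3)
formula = [[1, 2], [-1, 3], [-2, -3]]
print(dpll(formula))
print(dpll([[1], [-1]]))  # x1 AND NOT x1 has no solution
```

Each branch adds a one-literal clause, and unit propagation then does most of the work by chasing the consequences of that choice.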

SMT solvers (Satisfiability Modulo Theories) extend SAT with theories: linear arithmetic, bit-vectors, arrays, strings, floating-point. Z3, CVC5, and Yices are the standard tools. Where SAT can answer “is this propositional formula satisfiable?”, SMT can answer “is there an integer x and a string s such that x > 0 and length(s) == x and s contains ‘foo’?”

SMT solvers are the engine behind:

Symbolic execution in security tooling.
Verification-oriented type systems (F*’s proofs, Dafny’s contracts, Liquid Haskell’s refinement types all discharge their obligations to an SMT solver).
Test-case generation in tools like KLEE.
Constraint-based program generation.

These are not learning systems. They reason. They’re called when an answer needs to be correct, not plausible.

Production rule engines

We touched on these in Rules, Grammars, and Regex, but they deserve a deeper look here. A production rule engine is a system that runs forward-chaining inference: take the facts, fire rules whose conditions match, derive new facts, repeat.

The dominant industrial tools:

Drools (Java/JBoss), the most-used open-source rule engine in enterprise software. Insurance underwriting, claims processing, fraud detection, benefit eligibility.
IBM ODM (Operational Decision Manager), enterprise rule platform with a business-user authoring environment.
CLIPS (C Language Integrated Production System). NASA-developed expert system shell, still in use.
Apache Jena Rules for semantic-web reasoning over RDF data.

These are not academic curiosities. They run claims-adjudication systems at major insurers (where every claim is scored against thousands of rules), benefits-eligibility logic at government agencies (where regulations change every legislative session and the rules need to be authored by domain experts), and pricing logic in financial services.

The reason they persist: rule engines let domain experts, not programmers, maintain the rules. A senior actuary or compliance officer can read and edit a rule like “IF policy_type == ‘life’ AND age > 65 AND smoker THEN apply_loading(‘senior_smoker’).” That’s not something a fine-tuned transformer can offer.
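The forward-chaining loop itself is short. The Python sketch below encodes the senior-smoker rule quoted above plus one invented follow-on rule (medical_review is hypothetical, added only to show one rule’s output triggering another). Real engines such as Drools use a dedicated rule language and efficient pattern matching instead of a plain loop, so this is the idea, not their implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], None]


def apply_loading(label):
    """Build an action that records a premium loading on the case."""
    return lambda case: case["loadings"].add(label)


rules = [
    # Hypothetical follow-on rule: depends on a fact another rule derives.
    Rule(
        "medical_review",
        lambda c: "senior_smoker" in c["loadings"],
        lambda c: c["flags"].add("needs_medical_review"),
    ),
    # The rule quoted in the text.
    Rule(
        "senior_smoker",
        lambda c: c["policy_type"] == "life" and c["age"] > 65 and c["smoker"],
        apply_loading("senior_smoker"),
    ),
]


def run(case, rules):
    """Fire each matching rule once, looping until no unfired rule matches."""
    fired = []
    progress = True
    while progress:
        progress = False
        for rule in rules:
            if rule.name not in fired and rule.condition(case):
                rule.action(case)
                fired.append(rule.name)
                progress = True
    return fired


case = {"policy_type": "life", "age": 70, "smoker": True, "loadings": set(), "flags": set()}
fired = run(case, rules)
print(fired)          # the audit trail: which rules fired, in order
print(case["flags"])
```

The list of fired rules is the explanation an auditor asks for, and because the rules are data, changing policy means editing a rule, not rewriting the loop.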

Knowledge graphs and reasoning

A knowledge graph is a database of entities and the relationships between them. They’ve become a standard tool for:

Search. Google’s Knowledge Graph is the reason “who founded Microsoft” gets you a card with Bill Gates’s photo.
Recommendation. Connecting “users who liked X” to “X is a kind of Y” to “other Y items.”
Customer 360. Resolving entities across systems: this customer in Salesforce is the same as that account in SAP, which is the same as those tickets in Zendesk.
Fraud detection. Mapping the network of accounts, devices, and transactions to spot patterns.

The reasoning over knowledge graphs is a mix of:

Graph traversal. Find paths between nodes (like Cypher queries in Neo4j or Gremlin queries).
Logic-based inference. Apply rules to derive implicit relationships (Datalog or DL).
Embedding-based methods. Represent entities as vectors and predict missing edges (TransE, RotatE, and their descendants).

A modern knowledge-graph stack often combines all three: traverse for explicit relationships, reason for derivable ones, embed for likely-but-unstated ones. This is the closest thing classical AI has to a “stack” the way RAG is a stack for text.
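The traversal layer is the easiest of the three to show. Below is a minimal Python sketch that stores a graph as (subject, relation, object) triples and finds the shortest chain of relationships between two entities with breadth-first search. The account and device names are invented, in the spirit of the fraud-detection case; a graph database would express the same question as a path query.

```python
from collections import deque

# Hypothetical triples: (subject, relation, object).
triples = [
    ("acct_1", "uses_device", "device_9"),
    ("acct_2", "uses_device", "device_9"),
    ("acct_2", "sent_money_to", "acct_3"),
]


def neighbours(node):
    """Yield (relation, other_node) for edges touching `node`, in either direction."""
    for s, r, o in triples:
        if s == node:
            yield r, o
        elif o == node:
            yield f"inverse_{r}", s


def shortest_path(start, goal):
    """Breadth-first search; returns the list of hops, or None if unconnected."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, nxt in neighbours(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, relation, nxt)]))
    return None


path = shortest_path("acct_1", "acct_3")
for hop in path:
    print(hop)
```

The path found links two accounts that never transacted directly: they share a device with a third account. That is an explicit relationship recovered by traversal alone, before any inference or embedding is involved.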

Where logic-based AI fails

Symbolic AI failed at being “general intelligence” for two big reasons.

Brittleness is one. A logic-based system handles exactly the cases its rules cover. When the world produces an input outside the rule set, the system has no graceful fallback. The rules either fire or they don’t. There’s no “kind of” or “probably.”

Knowledge acquisition is the other. Building a knowledge base of any size requires extracting structured rules from human experts, and humans are bad at articulating their tacit knowledge. The 1980s expert-system projects ran aground here. Maintaining a 12,000-rule knowledge base is a job for a team of engineers, not a one-person side project.

The lesson the 1980s taught the field: logic is great for the parts of a problem that are crisp, and useless for the parts that aren’t. The modern answer is hybrid: use logic where you can write down the rules, and machine learning where you can’t.

A decision table

If your task is...	Reach for...
Apply thousands of editable business rules to each input	A production rule engine (Drools, IBM ODM)
Recursive query over relational data	Datalog (Soufflé, Logica, Datomic)
Reason about concepts in a controlled vocabulary	Description Logic + a DL reasoner (HermiT, Pellet)
Verify that hardware or software meets a property	A SAT or SMT solver (Z3, CVC5, MiniSat)
Generate code or test cases that satisfy constraints	An SMT solver as a constraint engine
Resolve entities across systems	A knowledge graph + graph traversal + similarity matching
Adjudicate insurance claims against policy	A production rule engine, with policy authored by underwriters
Find security bugs by reasoning about possible inputs	Symbolic execution + SMT
Answer freeform natural-language questions	An LLM, possibly with retrieval against a knowledge graph

The hybrid that works: neurosymbolic AI

A live research line: combine symbolic logic with neural networks. The pattern goes by various names (neurosymbolic AI, LLM tool use over knowledge graphs, logic-augmented language models), and the practical applications are growing.

The shape, in current production:

The LLM converts a natural-language question into a structured query.
The structured query runs against a logic-based system: a knowledge graph, a Datalog engine, a SAT solver.
The result is converted back into natural language by the LLM.

This is what’s happening when you ask Claude or ChatGPT a math question and it writes Python code to solve it. The LLM is the natural-language interface; the logic is in the Python (or Wolfram, or SymPy, or the SQL the LLM writes against your database).
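The three steps can be sketched end to end in Python, with one big assumption stated up front: both “LLM” functions below are hard-coded stand-ins (fake_llm_to_sql and fake_llm_to_text are hypothetical names), not calls to any real model. Only the middle step, the SQL query against an in-memory SQLite database, does real work, which is the point of the pattern.

```python
import sqlite3


def fake_llm_to_sql(question):
    """Stand-in for the LLM step: a hard-coded lookup, NOT a real model call."""
    canned = {
        "Who are Charlie's grandparents?": (
            "SELECT gp.parent FROM parent AS p "
            "JOIN parent AS gp ON gp.child = p.parent "
            "WHERE p.child = 'charlie'"
        )
    }
    return canned[question]


def fake_llm_to_text(question, rows):
    """Stand-in for the LLM turning query rows back into prose."""
    names = ", ".join(r[0].capitalize() for r in rows) or "no one on record"
    return f"{question} -> {names}"


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE parent (parent TEXT, child TEXT)")
db.executemany("INSERT INTO parent VALUES (?, ?)", [("alice", "bob"), ("bob", "charlie")])

question = "Who are Charlie's grandparents?"
sql = fake_llm_to_sql(question)            # step 1: natural language -> structured query
rows = db.execute(sql).fetchall()          # step 2: the symbolic system computes the answer
answer = fake_llm_to_text(question, rows)  # step 3: result -> natural language
print(answer)
```

If the translation in step 1 is wrong, the answer is wrong, but it is wrong in an inspectable way: the query is there to read, and the database did exactly what the query said.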

The pattern is still maturing, but the trajectory is clear: LLMs are becoming better natural-language interfaces to symbolic systems, not better symbolic systems themselves. The reasoning lives in the symbolic layer, and the LLM translates between human and that layer.

Symbolic AI failed at the goal of “general intelligence” the 1980s set for it, and the winter that followed gave logic-based methods a permanent reputation problem. The actual outcome is more interesting than the headlines made it look. Datalog has had a renaissance: it runs static program analysis in Soufflé and recursive queries inside Datomic. SAT and SMT solvers routinely solve problems an LLM cannot touch: provable correctness for hardware, type checking for languages with sharp edges, symbolic execution for security tooling. Production rule engines like Drools let actuaries and underwriters maintain the rules an insurance business runs on without going through a developer. Description Logic reasons over SNOMED’s three hundred and fifty thousand medical concepts every day. Knowledge graphs combine traversal, inference, and embeddings into a stack that runs Google search and most enterprise customer-360 systems.

The shape of the future is clearer now than it was in either of the AI winters. Symbolic systems do the reasoning that has to be correct; LLMs do the natural-language interface between humans and that reasoning. Ask Claude a math question and it writes Python. Ask it a database question and it writes SQL. The hybrid is the answer, and the reasoning still lives in the symbolic layer.

These posts are LLM-aided. Backbone, original writing, and structure by Craig. Research and editing by Craig + LLM. Proof-reading by Craig.