Search and Planning

June 06, 2026 · 9 min read

You open the map app. You type an address. Forty milliseconds later it shows you a path through 3.4 million road segments, optimal in time, accounting for current traffic. There’s no neural network involved. There’s no learning. There’s an algorithm from the 1960s, running on a graph, doing what it has always done.

In the previous five posts we covered AI for problems where the input is text. Classification, retrieval, generation, the lot. This post leaves text behind. Most of what the textbooks call classical AI, the kind in Russell and Norvig’s Artificial Intelligence: A Modern Approach, isn’t about understanding language. It’s about searching through possibilities.

Search algorithms run more production AI than transformers do. They route your packets, plan your warehouse robot’s path, schedule your CI build, find the chess move, plan your delivery route. They’ve been quietly working since the 1960s and they’re not getting replaced.

This post walks the family. Like Before the Transformer but for problem-solving instead of language modelling.

The setup: states, actions, goals

Most search problems share a structure:

States. Configurations of the world. The position of the chess pieces. The location of the delivery truck. The contents of the warehouse robot’s basket.
Actions. Things you can do that change one state into another. Move a piece. Drive to the next intersection. Pick up an item.
A goal. A state (or set of states) you want to reach. Checkmate. The customer’s address. All packages delivered.
A cost. Often there’s a cost on each action, distance, time, fuel, whatever, and you want the cheapest path to the goal.

The search problem is: find a sequence of actions that gets you from the starting state to a goal state, ideally cheaply, ideally fast.

That’s it. That’s the framing. Once a problem is in this shape, decades of algorithms apply.

Uninformed search

These are the algorithms you can write in fifty lines because they don’t know anything about the problem, they just systematically explore.

Breadth-first search (BFS) explores layer by layer, finding the path with the fewest actions. Use it for: small graphs, “fewest-moves” puzzles, finding the nearest matching node. The classic example: solve the 8-puzzle in the minimum number of slides.

Depth-first search (DFS) goes as deep as possible before backtracking. Use it for: exploring trees, generating permutations, anything where you need a memory-light traversal. Classic example: enumerate all possible game positions.

Iterative deepening (IDS) combines both: do DFS to depth 1, then to depth 2, then to depth 3, and so on. Memory of DFS, completeness of BFS. Used in chess engines for depth-limited search.

Uniform-cost search is BFS with weighted edges, explore in order of cumulative cost rather than in order of depth. Equivalent to Dijkstra’s algorithm, which you’ve probably implemented at some point.

These are the workhorses. They’re old, they’re simple, and they show up everywhere.

Informed search: A* and friends

The big jump happens when you have a heuristic, a function that estimates how far each state is from the goal. With a heuristic, you don’t explore blindly. You explore in order of “most promising next.”

A* (Hart, Nilsson, and Raphael, 1968) is the algorithm. It expands states in order of g(n) + h(n), where g is the cost to reach the state and h is the heuristic estimate of cost to the goal. If h never overestimates the true cost (an “admissible” heuristic), A* is guaranteed to find the optimal path.

A* runs:

Your map app, with a heuristic of straight-line distance to the destination.
Pathfinding in games, where the units need to walk around walls efficiently.
Robot path planning, both in factories and in self-driving cars.
Puzzle solving, with heuristics like “number of misplaced tiles” for the 15-puzzle.

A* works because a good heuristic can collapse the search space dramatically, from “explore everything” to “explore only what’s plausible.”

When A* won’t fit in memory, you have variants: IDA* (iterative-deepening A*), SMA* (memory-bounded A*), D* (dynamic A* for changing environments). These are all in the toolbox for production pathfinding.

Local search and metaheuristics

When the search space is too big to enumerate, you give up on optimality and try to find a good answer rather than the best one. This is local search.

Hill climbing starts from a random state and moves to the best neighbour. Simple, fast, gets stuck in local optima. Good enough for many problems.

Simulated annealing (Kirkpatrick, Gelatt, and Vecchi, 1983) hill-climbs but occasionally accepts a worse move (more often early, less often later). The “annealing” comes from metallurgy, cooling slowly to find a better global structure. Workhorse for layout problems, scheduling, and combinatorial optimisation.

Genetic algorithms maintain a population of candidate solutions, combine pairs (“crossover”), perturb them (“mutation”), and select the fittest to breed. Used for design-space exploration, hyperparameter tuning before Bayesian methods, and antenna design (NASA has flown a genetic-algorithm-designed antenna).

Tabu search keeps a list of recently-visited states and refuses to revisit them, forcing the search to explore new territory.

These are not the prestige algorithms of the field, but they’re the practical answer for “I have a giant combinatorial problem and I need a reasonable solution by Friday.”

Adversarial search: games

When you’re playing against an opponent, single-agent search isn’t enough, you need to anticipate what they’ll do. Enter adversarial search.

Minimax is the basic game-tree search: assume both players play optimally, and pick the move that maximises your worst-case outcome. The tree branches at every move, with you maximising at your turns and the opponent minimising at theirs.

Alpha-beta pruning is the optimisation that makes minimax practical. By tracking the best score the maximiser is assured of (alpha) and the best the minimiser is assured of (beta), large parts of the search tree can be pruned without affecting the result. A well-tuned alpha-beta search can go many plies deeper than naive minimax in the same time.

This is the algorithmic core of:

Chess engines (Stockfish, the strongest classical chess program, is alpha-beta with extensive engineering).
Checkers, Go (pre-AlphaGo), Othello, and most other deterministic two-player games.
Monte Carlo tree search (MCTS), which is what AlphaGo and AlphaZero used, a different strategy, but still adversarial search at heart.

Even the deep-learning-based game systems use search. AlphaGo combined a neural network for evaluation with MCTS for search. Stockfish 16+ has a neural network evaluation but still does alpha-beta search through the tree. The search is the engine; the neural net is the heuristic.

Constraint satisfaction problems

Slightly different shape: you have a set of variables, each with a domain of possible values, and a set of constraints between them. You want an assignment of values to variables that satisfies all constraints.

Examples:

Sudoku. Variables are cells, domains are 1-9, constraints are row/column/box uniqueness.
Map colouring. Variables are regions, domains are colours, constraints are “adjacent regions different.”
Class scheduling. Variables are courses, domains are time slots, constraints are room/teacher/student conflicts.
Configuration. Variables are component choices, domains are products, constraints are compatibility.

The classical algorithm is backtracking with constraint propagation: pick a variable, try a value, propagate the implications, recurse, backtrack on failure. The cleverness is in the propagation, arc consistency, forward checking, unit propagation, which prunes the search space dramatically.

Industrial CSP solvers (Google OR-Tools, Choco, MiniZinc) handle problems with millions of variables and constraints. They run:

Hospital staff scheduling.
Aircraft and crew scheduling.
Hardware verification.
Network configuration.
Supply-chain optimisation.

There’s no learning involved. Just very-well-engineered search through structured spaces.

Planning

Planning is the version of search where the action descriptions are more abstract. Instead of a graph of states, you have:

A description of the world in terms of facts (the box is at location A, the gripper is empty, the door is open).
A library of actions, each described by preconditions (what must be true to execute) and effects (what becomes true / false after executing).
A goal expressed in terms of facts (the box is at location B).

The classical algorithm here is STRIPS (Stanford Research Institute Problem Solver, 1971) and its descendants. Modern planners (Fast Downward, LAMA, ENHSP) can handle much richer planning problems with continuous variables, time, and resources.

Planning runs:

Robot task planning. “Put the cup on the shelf” decomposed into a sequence of low-level actions.
Logistics and delivery routing in complex domains.
Game AI for non-player-character behaviour, particularly Goal-Oriented Action Planning (GOAP) in commercial game engines.
Spacecraft autonomy. NASA’s Remote Agent on Deep Space 1 was a planner.

Planning is less famous than minimax or A*, but in domains where you need to reason about long action sequences, it’s the correct tool.

A decision table

If your task is...	Reach for...
Shortest route between points on a graph	Dijkstra (no heuristic) or A* (with one)
Pathfinding for a unit in a 2D/3D world	A* on a navigation grid or mesh
A two-player perfect-information game	Alpha-beta or MCTS, with a learned or hand-crafted evaluation
Solving a puzzle (Sudoku, n-queens, scheduling)	A CSP solver (OR-Tools, MiniZinc)
Optimising a hard combinatorial problem	Simulated annealing or genetic algorithm if exact methods are infeasible
Sequencing actions for a robot or workflow	A planner (PDDL + Fast Downward, or GOAP for games)
Routing many vehicles to many destinations	A vehicle-routing solver (built on CSP / mixed-integer programming)
Build-system dependency resolution	Topological sort + Dijkstra or DAG-aware scheduler

Search vs ML: when each wins

Search and machine learning solve different shapes of problem.

Search wins when:

The state space and action space are well-defined.
Optimality (or near-optimality) is the goal, not “good enough.”
Problems are deterministic and the rules are knowable.
You can write a heuristic that estimates progress.

ML wins when:

The state space is fuzzy or perceptual (images, raw text).
“Good enough” is fine and you can’t define optimal.
The rules are statistical, not deterministic.
You have lots of examples to learn from.

The most powerful systems combine both. AlphaZero is search guided by a learned heuristic. Self-driving cars use planning over a perception layer that’s deep learning. Modern logistics systems use ML to predict demand and search to plan delivery.

Most of the AI in the textbooks before deep learning was some flavour of search. Those textbooks weren’t wrong; they were a generation early, and most of the algorithms they covered are still in production. A* still finds the route in the map app. Alpha-beta still drives the strongest classical chess engines, and even AlphaZero is search guided by a learned evaluator rather than a pure neural play. CSP solvers schedule the hospital, the airline, and the supply chain. STRIPS-descended planners sequence robot actions and ran the autonomy on Deep Space 1. When exact search doesn’t fit, simulated annealing and genetic algorithms produce respectable answers by Friday.

These posts are LLM-aided. Backbone, original writing, and structure by Craig. Research and editing by Craig + LLM. Proof-reading by Craig.