In this post, we'll look at how to teach computers to solve puzzles. Specifically, we'll take a simple puzzle that can be expressed as a boolean constraint satisfaction problem, write a simple constraint solver (a SAT solver), and see how our algorithm, when augmented with a few optimizations, forms the basis of modern SAT solvers. The only requisite knowledge is a basic understanding of algorithms and the ability to read Haskell code.

In future posts, we'll extend our puzzle solving abilities beyond boolean constraint satisfaction by writing an SMT solver; after that, we'll look at how these puzzle solving algorithms can be used for software verification.

This is an exported IHaskell notebook, and you can download the original to play around with the code.

## Boolean Constraint Satisfaction

Let's start by jumping in with an example of a constraint satisfaction problem. Suppose that you need to go grocery shopping, and need to visit three stores: Costco, Home Depot, and Walmart. Costco is open in the morning and evening, Home Depot is open in the evening only, and Walmart is open in the morning only. You can only be in one place at a time, and shopping at a given store takes up the entire morning or evening. Can you go to all three stores in a day?

To a human, it is intuitively obvious that the answer is no. Since Home Depot and Walmart each offer us only one time option (evening and morning, respectively), we have to go there at those times. However, this leaves no time for a Costco trip, so it's evident that this "puzzle" has no solution.

Now suppose instead of three stores, you were given three thousand (each with its own schedule), and instead of two times, you were given all the hours of a day? At this point, the problem becomes intractable for a human. Luckily, though, crunching numbers and analyzing thousands of different options are what computers excel at.

So, how would you encode the above problem in a way that a computer could understand?

Computers are built upon boolean algebra, operating on true and false values. Thus, a natural way to encode our problem is to try to rewrite it as an expression involving boolean variables, which can either be true or false. For example, using the example of three stores and two times, let's make six variables:

- $C_e$: Whether we go to Costco in the evening.
- $C_m$: Whether we go to Costco in the morning.
- $H_e$: Whether we go to Home Depot in the evening.
- $H_m$: Whether we go to Home Depot in the morning.
- $W_e$: Whether we go to Walmart in the evening.
- $W_m$: Whether we go to Walmart in the morning.

Each of these variables is true (or 1) if we visit the store at the corresponding time, and false otherwise. Next, we form some constraints on these variables, and express them in a unified form we could feed to a computer.

First, we know that we can only be in one place at a given time. For example, if we are at Costco in the morning (that is, $C_m = 1$), then we cannot be at Home Depot or Walmart in the morning (and thus $H_m = W_m = 0$). Using $\wedge$ for "and", $\vee$ for "or", and $\overline{A}$ to indicate "not $A$", we can express that constraint as $C_m \wedge \overline{H_m} \wedge \overline{W_m}$. Note that this constraint *assumes* that $C_m = 1$; $C_m$ must be true in order to satisfy that constraint.

Of course, we know that at a given time, we could go to Costco, Home Depot, *or* Walmart, so $C_m$ doesn't have to be true. Thus, the constraint that we only go to one place in the morning can be represented as

$$(C_m \wedge \overline{H_m} \wedge \overline{W_m}) \vee (\overline{C_m} \wedge {H_m} \wedge \overline{W_m}) \vee (\overline{C_m} \wedge \overline{H_m} \wedge {W_m})$$

Similarly, the constraint that we only go to one place in the evening is

$$(C_e \wedge \overline{H_e} \wedge \overline{W_e}) \vee (\overline{C_e} \wedge {H_e} \wedge \overline{W_e}) \vee (\overline{C_e} \wedge \overline{H_e} \wedge {W_e})$$

Next, we need a constraint that we go to Costco in either the morning or evening, which we can represent as $C_m \vee C_e$: either we go to Costco in the morning, or in the evening. We have similar constraints for Walmart and Home Depot, yielding the following constraint to represent that we must go to each store:

$$(C_m \vee C_e) \wedge (H_m \vee H_e) \wedge (W_m \vee W_e)$$

Thus, the full set of constraints for our problem is

$$\begin{align*} (C_m \vee C_e) \wedge (H_m \vee H_e) \wedge (W_m \vee W_e) &\wedge\\ \left[(C_m \wedge \overline{H_m} \wedge \overline{W_m}) \vee (\overline{C_m} \wedge {H_m} \wedge \overline{W_m}) \vee (\overline{C_m} \wedge \overline{H_m} \wedge {W_m})\right] &\wedge\\ \left[(C_e \wedge \overline{H_e} \wedge \overline{W_e}) \vee (\overline{C_e} \wedge {H_e} \wedge \overline{W_e}) \vee (\overline{C_e} \wedge \overline{H_e} \wedge {W_e})\right] & \end{align*}$$

To find out whether we can complete our shopping trip, we must find a set of true or false values for all our boolean variables such that the constraints are **satisfied**. This type of problem is known as the boolean satisfiability problem, often abbreviated to just "SAT". A program that finds solutions to these problems is known as a SAT solver.
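Before building an actual solver, we can sanity-check the constraints above by brute force over plain Haskell booleans. This is a throwaway check (not part of the solver we develop below), directly transcribing the constraints:

```
-- Brute-force check: try all 2^6 assignments of the six variables
-- and see whether any satisfies all the constraints.
shoppingPossible :: Bool
shoppingPossible =
  or [ constraintsHold cm ce hm he wm we
     | cm <- bools, ce <- bools, hm <- bools
     , he <- bools, wm <- bools, we <- bools ]
  where
    bools = [False, True]
    -- Exactly one of the three stores at a given time.
    onePlace c h w = (c && not h && not w)
                  || (not c && h && not w)
                  || (not c && not h && w)
    constraintsHold cm ce hm he wm we =
      (cm || ce) && (hm || he) && (wm || we) -- visit every store
      && onePlace cm hm wm                   -- one place in the morning
      && onePlace ce he we                   -- one place in the evening
```

Evaluating `shoppingPossible` yields `False`: no assignment satisfies every constraint, matching our intuition. Of course, enumerating all $2^N$ assignments like this only works for tiny problems; the rest of this post builds something more principled.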

To represent such boolean expressions in Haskell, we start by defining a simple data type for them:

```
data Expr = Var Char
          | And Expr Expr
          | Or Expr Expr
          | Not Expr
          | Const Bool
  deriving (Show, Eq)
```

Note the `Var Char` constructor: in this simple expression data type, we use `Char` to represent variables. We also allow for `Const` constructors to represent concrete true or false values; this is particularly useful for intermediate computations.

The most basic algorithm for SAT solving is a backtracking search. This search has the following steps:

- Find a variable in the constraint expression that *hasn't* been assigned (a free variable).
- Guess a value for this free variable.
- Replace all occurrences of the free variable with the guessed value.
- Simplify the expression. If the expression simplifies to true (one), then the values we've assigned work, and any variables that are unassigned do not matter. If the expression simplifies to false (zero), then undo the last assignment, and assign the opposite value.

First, we implement step 1: finding free variables with `freeVariable :: Expr -> Maybe Char`. Since not all expressions have free variables, we return a `Maybe` value, and we can use `<|>` from the `Alternative` typeclass to choose the first `Just` we encounter.

```
import Control.Applicative ((<|>))
-- Return the first free variable in the boolean expression.
-- If there are no free variables (it is Const), return Nothing.
freeVariable :: Expr -> Maybe Char
freeVariable (Const _) = Nothing
freeVariable (Var v) = Just v
freeVariable (Not e) = freeVariable e
freeVariable (Or x y) = freeVariable x <|> freeVariable y
freeVariable (And x y) = freeVariable x <|> freeVariable y
```

Next, we implement step 2: replacing free variables with a constant true or false guess value. At this point, we introduce `Const` constructors into our expression tree, replacing the appropriate `Var` constructors with them.

```
-- Guess a value for a variable. This replaces all
-- the appropriate Var constructors with a Const constructor.
guessVariable :: Char -> Bool -> Expr -> Expr
guessVariable var val e =
  case e of
    Var v -> if v == var
             then Const val
             else Var v
    Not e' -> Not (guess e')
    Or x y -> Or (guess x) (guess y)
    And x y -> And (guess x) (guess y)
    Const b -> Const b
  where
    guess = guessVariable var val
```

Now that we've introduced `Const` constructors, our expression is no longer simplified. For example, we can now have expressions such as $x \wedge 0$ or $y \vee 1$, which should be simplified down to $0$ and $1$, respectively.

Thus, we introduce `simplify`, which returns either a `Const` constructor or a simplified expression; if the result is not a `Const` constructor, it guarantees that there are no `Const` constructors further down the `Expr` tree. (Note that we could encode this invariant in the type system! The best way to do this is left as an exercise to the reader.)

```
simplify :: Expr -> Expr
simplify (Const b) = Const b
simplify (Var v) = Var v
simplify (Not e) =
  case simplify e of
    Const b -> Const (not b)
    e' -> Not e'
simplify (Or x y) =
  -- Get rid of False values, which are irrelevant.
  let es = filter (/= Const False) [simplify x, simplify y]
  in
    -- If True is in a branch, the entire expression is True.
    if Const True `elem` es
    then Const True
    else
      case es of
        -- If all the values were False, this 'or' is unsatisfied.
        [] -> Const False
        [e] -> e
        [e1, e2] -> Or e1 e2
-- Dual to the simplify (Or x y) definition.
simplify (And x y) =
  let es = filter (/= Const True) [simplify x, simplify y]
  in
    if Const False `elem` es
    then Const False
    else
      case es of
        [] -> Const True
        [e] -> e
        [e1, e2] -> And e1 e2
```

We now have all the building blocks of the backtracking algorithm described above. We can implement it as a depth first search, using recursion for backtracking:

```
-- Extract the boolean from the Const constructor.
unConst :: Expr -> Bool
unConst (Const b) = b
unConst _ = error "Not Const"

satisfiable :: Expr -> Bool
satisfiable expr =
  case freeVariable expr of
    Nothing -> unConst expr
    Just v ->
      -- We have a variable to guess.
      -- Construct the two guesses.
      -- Return whether either one of them works.
      let trueGuess = simplify (guessVariable v True expr)
          falseGuess = simplify (guessVariable v False expr)
      in satisfiable trueGuess || satisfiable falseGuess
```

We can test this backtracking search on a few trivial examples:

```
(satisfiable (Const True), satisfiable (Const False)) -- (True,False)
```

```
satisfiable $ And (Var 'x') (Not (Var 'x')) -- False
```

```
satisfiable $ Or (Var 'x') (Not (Var 'x')) -- True
```

We can also use this to solve our earlier puzzle. For generality, we'll write the code to solve this puzzle for any set of stores and times, but only define a small number of them:

```
-- Stores we need to go to.
data Store = Walmart | HomeDepot | Costco deriving (Show, Eq)
-- Times the stores could be open.
data Time = Morning | Evening deriving (Show, Eq)
```

Now that we have a data type to represent stores and times, let's encode (as a function) when each store is available. In a real application, this would be fetched from a database, or some other information source:

```
-- Times each store is open.
availability :: Store -> [Time]
availability Walmart = [Morning]
availability HomeDepot = [Evening]
availability Costco = [Morning, Evening]
```

Finally, create a set of constraints to satisfy. We'll first have to encode the variables ($C_m$, $C_e$, $H_m$, and so on) in our expression format, after which we can encode all our constraints:

```
-- A variable to represent visiting a store at a time.
variable :: Store -> Time -> Expr
variable Walmart Morning = Var 'a'
variable Walmart Evening = Var 'b'
variable HomeDepot Morning = Var 'c'
variable HomeDepot Evening = Var 'd'
variable Costco Morning = Var 'e'
variable Costco Evening = Var 'f'

-- Create a constraint requiring us to visit a given store
-- on *one* of the provided times. In our puzzle, the provided
-- times are all the available ones.
visitConstraint :: [Time] -> Store -> Expr
visitConstraint times store =
  foldl1 Or $ map (variable store) times

-- A constraint requiring us to visit all the stores.
visitAllConstraint :: [Store] -> [Time] -> Expr
visitAllConstraint stores times =
  foldl1 And $ map (visitConstraint times) stores

-- mapMaybe :: (a -> Maybe b) -> [a] -> [b]
-- Apply a function and keep all the Just results.
import Data.Maybe (mapMaybe)

-- Generate a constraint enforcing the fact that
-- we only visit stores when they are available.
storeConstraint :: [Time] -> Store -> Expr
storeConstraint allTimes store =
  case mapMaybe timeConstraint allTimes of
    [] -> Const True -- We can go to this store any time.
    cs -> foldl1 And cs
  where
    -- Return Nothing if we can visit the store
    -- at the time. Otherwise return a constraint
    -- that must be satisfied which ensures that the
    -- store is not visited at this time.
    timeConstraint :: Time -> Maybe Expr
    timeConstraint time
      | time `elem` availability store = Nothing
      | otherwise = Just (Not (variable store time))

-- (\\) is set difference on lists.
import Data.List ((\\))

-- Constraints for each time, indicating that we cannot
-- be in multiple places during one time.
timeConstraint :: [Store] -> Time -> Expr
timeConstraint allStores time =
  foldl1 Or $ map chooseStore allStores
  where
    -- Generate a constraint requiring us to be in a given store at a time,
    -- and no other stores at that same time.
    chooseStore store =
      And (variable store time)
          (foldl1 And (map (Not . flip variable time) (allStores \\ [store])))
```

With our main logic written, we can now formulate our full set of constraints:

```
-- All available stores and times in the puzzle.
stores = [Walmart, HomeDepot, Costco]
times = [Morning, Evening]

constraints :: Expr
constraints =
  foldl1 And $
    [visitAllConstraint stores times] ++
    map (storeConstraint times) stores ++
    map (timeConstraint stores) times

-- Look at the expression tree.
-- It's quite huge!
constraints
```

Can this be satisfied? As our human intuition immediately told us, no, it cannot:

```
satisfiable constraints
```

But now we know how to derive that result algorithmically, with a very simple SAT solver!

## Improving our Algorithm

Our simple backtracking algorithm, while it works, can be pretty slow. It's a brute-force solution to the problem: we simply search through all possible assignments of values to variables, and check whether any of them work. At worst, when given an expression with $N$ variables in it, backtracking search has to check $2^N$ different assignments; this happens if every single assignment is initially guessed incorrectly.

Surprisingly enough, it's possible that asymptotically better algorithms don't exist, as restricted forms of SAT (such as 3-SAT) are known to be NP-complete. The formal definition of NP-completeness involves a bit of math and complexity theory. Roughly speaking, NP problems are those whose solutions we can check (but not necessarily find) in polynomial time, and NP-complete problems are problems that, if we could solve them in polynomial time, would let us solve any NP problem in polynomial time. Computer scientists currently do not know whether P is equal to NP; that is, whether all NP problems can be solved in polynomial time. Most people believe that isn't the case, and that NP-complete problems *cannot* be solved efficiently (in polynomial time). Since 3-SAT is NP-complete, this means it probably can't be solved in polynomial time, and so quite possibly the exponential asymptotics of backtracking search are the best we can do.

However, even if asymptotically better algorithms cannot exist, that doesn't mean we can't have algorithms that are better in practice. One such algorithm, known as the DPLL algorithm (short for Davis–Putnam–Logemann–Loveland, after its creators), is effectively a set of heuristics that optimize our simple backtracking search. Although the DPLL algorithm was invented decades ago (in the early 1960s), many modern SAT solvers still use DPLL as their base, adding many algorithmic tricks, data structures, and low-level optimizations to achieve high performance.

### Conjunctive Normal Form (CNF)

The DPLL algorithm, unlike our backtracking search, requires that the input expressions be of a particular form, known as conjunctive normal form, or CNF. An expression is in CNF if it consists of a conjunction of clauses, each of which is a disjunction of literals. In other words, we have a set of clauses; each clause consists of variables and negated variables which are all or-ed together; all the clauses are then and-ed together. For example, the expressions $A \wedge B \wedge (C \vee \overline A)$ and $(A \vee B) \wedge \overline A$ are in conjunctive normal form (CNF), but the expressions $\overline{\overline{A}} \vee B$ and $\overline{A \vee B}$ are not.
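To make the definition concrete, here is a small predicate checking whether an expression is in CNF. (This is just an illustration, not something the solver below uses; the `Expr` type from earlier is repeated so the snippet stands alone.)

```
-- The Expr type from earlier, repeated for a standalone snippet.
data Expr = Var Char
          | And Expr Expr
          | Or Expr Expr
          | Not Expr
          | Const Bool
  deriving (Show, Eq)

-- An expression is in CNF if it is a conjunction of
-- disjunctions of literals (variables or negated variables).
isCNF :: Expr -> Bool
isCNF = conjunction
  where
    conjunction (And x y) = conjunction x && conjunction y
    conjunction e = disjunction e
    disjunction (Or x y) = disjunction x && disjunction y
    disjunction e = literal e
    literal (Var _) = True
    literal (Not (Var _)) = True
    literal _ = False
```

For example, `isCNF (And (Var 'a') (Or (Var 'b') (Not (Var 'a'))))` is `True`, while `isCNF (Not (Or (Var 'a') (Var 'b')))` is `False`.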

It may seem at first that an algorithm that *only* works on such a restricted set of expressions is useless. Our backtracking search works on *any* boolean expression! What's the point of DPLL if it requires something in CNF?

It turns out that any boolean expression can be *converted* to CNF fairly efficiently. If we apply De Morgan's laws and distribute "or"s over "ands" repeatedly, we eventually get an expression in CNF. Once we have our expression in CNF, we can feed it to DPLL.

De Morgan's laws allow you to distribute negations over conjunctions (ands) and disjunctions (ors):

- $\overline{A \wedge B} = \overline{A} \vee \overline{B}$
- $\overline{A \vee B} = \overline{A} \wedge \overline{B}$

Applying these repeatedly ensures that the only things inside a negation are literals like $A$ or $B$, and not compound expressions like $A \wedge B$.

Distributing disjunctions over conjunctions allows you to float the conjunctions to the outer layer, which we require for CNF: $$A \vee (X \wedge Y) = (A \vee X) \wedge (A \vee Y)$$
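Since each law involves only a handful of boolean variables, we can verify all of them exhaustively with a quick truth-table check:

```
-- Check De Morgan's laws and the distribution law
-- over every combination of boolean inputs.
lawsHold :: Bool
lawsHold = and
  [ not (a && b) == (not a || not b)               -- De Morgan over "and"
      && not (a || b) == (not a && not b)          -- De Morgan over "or"
      && (a || (x && y)) == ((a || x) && (a || y)) -- distribution
  | a <- bools, b <- bools, x <- bools, y <- bools ]
  where bools = [False, True]
```

`lawsHold` evaluates to `True`.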

Before discussing DPLL, let's go ahead and implement that conversion. We begin with two helper functions: one that applies De Morgan's laws (and removes double negations), and one that distributes an "or" over an "and":

```
-- Get rid of negations on everything but
-- literals by applying De Morgan's laws
-- and removing double negations.
fixNegations :: Expr -> Expr
fixNegations expr =
  case expr of
    -- Remove double negatives.
    Not (Not x) -> fixNegations x
    -- De Morgan's laws.
    Not (And x y) -> Or (fixNegations $ Not x) (fixNegations $ Not y)
    Not (Or x y) -> And (fixNegations $ Not x) (fixNegations $ Not y)
    -- Deal with constants.
    Not (Const b) -> Const (not b)
    -- Recurse on sub-terms.
    Not x -> Not (fixNegations x)
    And x y -> And (fixNegations x) (fixNegations y)
    Or x y -> Or (fixNegations x) (fixNegations y)
    x -> x

-- Attempt to distribute Or over And.
-- For example, A or (B and C) becomes (A or B) and (A or C).
distribute :: Expr -> Expr
distribute expr =
  case expr of
    -- Distribute over And in either position.
    Or x (And y z) ->
      And (Or (distribute x) (distribute y))
          (Or (distribute x) (distribute z))
    Or (And y z) x ->
      And (Or (distribute x) (distribute y))
          (Or (distribute x) (distribute z))
    -- Recurse on sub-terms.
    Or x y -> Or (distribute x) (distribute y)
    And x y -> And (distribute x) (distribute y)
    Not x -> Not (distribute x)
    x -> x
```

Using these two helpers, we can now convert any expression to CNF by applying the two helpers repeatedly. As soon as they stop changing anything, we should have converged to CNF.

In this implementation, we check for convergence by comparing the result against the input, which requires traversing the entire tree an extra time. We could do this much more efficiently by returning whether or not the expression was modified, but we won't do so here, for the sake of clarity:

```
-- Convert an expression to CNF.
-- This guarantees that the output is CNF.
cnf :: Expr -> Expr
cnf expr =
  if updated == expr
  then expr
  else cnf updated
  where
    updated = distribute (fixNegations expr)
```

This can produce quite huge results. For example, take a look at the expression $A \vee \overline{B \wedge (A \vee C)}$ when converted to CNF:

```
cnf $ Or (Var 'a') (Not (And (Var 'b') (Or (Var 'a') (Var 'c'))))
```

The result is a rather lengthy $(A \vee \overline{B} \vee \overline{A}) \wedge (A \vee \overline{B} \vee \overline{C})$, which is indeed in CNF.

### The DPLL Algorithm

The DPLL algorithm adds two shortcuts to our backtracking search, and these two shortcuts require that the input be in CNF.

First of all, it includes a **literal elimination** step. If some literal (like $A$) is seen in only one polarity (that is, it always appears as $A$ or always as $\overline{A}$), we can immediately determine the truth value of that literal. If it appears only in positive polarity, it can be set to true, and if it appears only in negative polarity, it can be set to false. For example, in the expression $(A \vee B) \wedge (A \vee \overline{C})$, we can immediately assign true to $A$, since $A$ occurs in positive polarity only.

Second, it includes a **unit propagation** step. If there exists a clause in the top-level conjunction that consists of only one literal, we can immediately assign a value to that literal. For example, in the expression $A \wedge (B \vee \overline{A})$, we immediately know that $A$ must be true, since it occurs alone.

#### Literal Elimination

To demonstrate, let's implement incredibly simple variants of these two steps. We'll start with literal elimination, in which we build a list of all literals and then check whether each one occurs in only one polarity:

```
import Data.Set (Set)
import qualified Data.Set as Set

-- Get all the literals in an expression.
literals :: Expr -> Set Char
literals (Var v) = Set.singleton v
literals (Not e) = literals e
literals (And x y) = Set.union (literals x) (literals y)
literals (Or x y) = Set.union (literals x) (literals y)
literals _ = Set.empty

-- Literal polarity. Positive means the literal
-- is never negated; negative means it always is.
data Polarity = Positive | Negative | Mixed deriving (Show, Eq)

-- Determine whether a literal has a polarity.
-- Return Nothing if the literal doesn't appear in the expression.
literalPolarity :: Expr -> Char -> Maybe Polarity
-- Literal in positive polarity.
literalPolarity (Var v) v'
  | v == v' = Just Positive
  | otherwise = Nothing
-- Literal in negative polarity.
literalPolarity (Not (Var v)) v'
  | v == v' = Just Negative
  | otherwise = Nothing
-- Combine polarities of sub-expressions.
literalPolarity e v =
  case e of
    And x y -> combinePolarities [x, y]
    Or x y -> combinePolarities [x, y]
    Not x -> error $ "Not in CNF: negation of a non-literal: " ++ show x
    Const _ -> Nothing
  where
    combinePolarities es =
      let polarities = mapMaybe (flip literalPolarity v) es
      in case polarities of
           [] -> Nothing
           ps -> if all (== Positive) ps
                 then Just Positive
                 else if all (== Negative) ps
                      then Just Negative
                      else Just Mixed

-- catMaybes :: [Maybe a] -> [a]
-- Extract only the Just values of a list.
import Data.Maybe (catMaybes)

literalElimination :: Expr -> Expr
literalElimination e =
  let ls = Set.toList (literals e)
      ps = map (literalPolarity e) ls

      -- Find assignments we can make.
      extractPolarized :: Char -> Maybe Polarity -> Maybe (Char, Bool)
      extractPolarized v (Just Positive) = Just (v, True)
      extractPolarized v (Just Negative) = Just (v, False)
      extractPolarized _ _ = Nothing

      -- Find *all* possible assignments.
      assignments :: [(Char, Bool)]
      assignments = catMaybes $ zipWith extractPolarized ls ps

      -- Apply all the assignments.
      replacers :: [Expr -> Expr]
      replacers = map (uncurry guessVariable) assignments

      replaceAll :: Expr -> Expr
      replaceAll = foldl (.) id replacers
  in replaceAll e
```

We can verify that it works by testing it on the expression $(A \vee B) \wedge (A \vee \overline{B})$:

```
literalElimination $ And (Or (Var 'a') (Var 'b')) (Or (Var 'a') (Not (Var 'b')))
```

You'll note that literal elimination left $B$ alone, since it appeared in both positive and negative polarity, but replaced occurrences of $A$ with an assignment of true, since all $A$s occurred in positive polarity.

#### Unit Propagation

Next, we'll implement the other DPLL heuristic, unit propagation, in which we find single-literal clauses and assign them a value. First, we'll determine which of the clauses are unit clauses (have only one literal):

```
-- Given a clause, determine whether it is a unit clause.
-- If it is, then return the variable name and the polarity;
-- if it isn't, return Nothing.
unitClause :: Expr -> Maybe (Char, Bool)
unitClause (Var v) = Just (v, True)
unitClause (Not (Var v)) = Just (v, False)
unitClause _ = Nothing
-- Reduce an expression into a list of clauses.
-- This has to traverse a tree of And constructors
-- and create a list out of them.
clauses :: Expr -> [Expr]
clauses (And x y) = clauses x ++ clauses y
clauses expr = [expr]
-- Extract all unit clauses from a CNF expression.
allUnitClauses :: Expr -> [(Char, Bool)]
allUnitClauses = mapMaybe unitClause . clauses
```

Then, we effect our replacement for all the unit clauses, just like we did in literal elimination:

```
unitPropagation :: Expr -> Expr
unitPropagation expr = replaceAll expr
  where
    assignments :: [(Char, Bool)]
    assignments = allUnitClauses expr

    replaceAll :: Expr -> Expr
    replaceAll = foldl (.) id (map (uncurry guessVariable) assignments)
```

Testing it on our earlier example of $A \wedge (B \vee \overline{A})$ yields what we expect:

```
unitPropagation $ And (Var 'a') (Or (Var 'b') (Not (Var 'a')))
```

Using unit propagation and literal elimination steps, we can augment our backtracking search to implement the DPLL algorithm. Note how the majority of the code looks identical to what we had before:

```
-- The only important thing that changed is the
-- addition of the 'where' clause.
satisfiableDPLL :: Expr -> Bool
satisfiableDPLL expr =
  case freeVariable expr' of
    Nothing -> unConst $ simplify expr'
    Just v ->
      -- Guess on the simplified expression, so the work done by
      -- literal elimination and unit propagation isn't discarded.
      let trueGuess = simplify (guessVariable v True expr')
          falseGuess = simplify (guessVariable v False expr')
      in satisfiableDPLL trueGuess || satisfiableDPLL falseGuess
  where
    -- Apply our backtracking search *after* literal elimination
    -- and unit propagation have been applied!
    expr' = literalElimination $ cnf $ unitPropagation expr
```

Luckily, we get the same result on our puzzle:

```
satisfiableDPLL constraints
```

At this point, you should have a good understanding of backtracking search for SAT solving, as well as DPLL, which is the basis for all further optimizations and heuristics that must come together for a high performance SAT solver. As a disclaimer, note that the code above is *not* meant for performance; it makes a number of assumptions and uses data structures that are not optimal if you want a fast SAT solver.

In the next section, we'll discuss what tricks can be used to make DPLL perform extraordinarily quickly, allowing modern SAT solvers to handle millions of variables and clauses in seconds. We won't focus on code any more, as high-performance solver code quickly becomes too unwieldy for a blog post.

## Further Heuristics and Optimizations

When discussing DPLL and backtracking, we've left a few key steps of the algorithm only partially specified. For example:

- If there are multiple free variables available in an expression, how do we choose which one to branch on?
- How do we decide whether we should guess true or false for a variable?
- What data structures should we use for storing clauses, and how do we detect unit clauses and polarized literals?

Accelerating SAT solving relies on answering these and other questions in clever ways. As mentioned before, answering these questions differently can't improve the *asymptotic* complexity of our algorithms, but it can improve the real-world efficiency of our solvers.

Deciding on the next branching literal can be done in a number of ways. In our implementation, we simply used the first literal we encountered. This is not necessarily the best solution – one can imagine a situation in which the first encountered literal happens to conflict with the last chosen literal, and so if the wrong path is chosen, half the search space must be looked at before the error can be fixed.

There exist a number of heuristics which have been used to compute branching literals. For example, you can find the literal that is mentioned in the most clauses. By doing so, you either reduce the search space dramatically (by reducing the number of clauses and maybe variables), or you quickly reach a conflict, which means you waste less time searching through that part of the solution space. While this heuristic is pretty good, it can be fairly expensive to keep track of how many times a literal is used, especially once you have thousands or millions of variables!
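As a rough sketch of the counting idea (hypothetical code, not part of the solver above; the `Expr` type is repeated so the snippet runs standalone), we could tally occurrences and branch on the most frequent variable:

```
import Data.List (maximumBy)
import Data.Ord (comparing)
import qualified Data.Map.Strict as Map

-- The Expr type from earlier, repeated for a standalone snippet.
data Expr = Var Char
          | And Expr Expr
          | Or Expr Expr
          | Not Expr
          | Const Bool
  deriving (Show, Eq)

-- Count how many times each variable occurs in an expression.
occurrences :: Expr -> Map.Map Char Int
occurrences (Var v) = Map.singleton v 1
occurrences (Not e) = occurrences e
occurrences (And x y) = Map.unionWith (+) (occurrences x) (occurrences y)
occurrences (Or x y) = Map.unionWith (+) (occurrences x) (occurrences y)
occurrences (Const _) = Map.empty

-- Branch on the variable that occurs most often,
-- rather than the first one we happen to find.
mostFrequentVariable :: Expr -> Maybe Char
mostFrequentVariable e =
  case Map.toList (occurrences e) of
    [] -> Nothing
    counts -> Just (fst (maximumBy (comparing snd) counts))
```

A real solver would maintain these counts incrementally rather than recomputing them at every decision point, which is exactly the bookkeeping cost mentioned above.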

Heuristics can also exist for the initial guess on a variable. For example, if a variable occurs more times in the negative than in the positive polarity, it may be better to guess a value of false for it before guessing true. Once again, the value of the heuristic has to be weighed against the cost of computing it.

One last optimization worth mentioning is known as **clause learning**. Clause learning is based on two ideas. First of all, when we encounter a conflict, we can deduce which decisions caused that conflict; from that, we can learn a new clause forbidding those decisions. Second, we can track which variable assignments imply other assignments, and when a conflict is reached, instead of backtracking one level higher, we can *backjump* as many levels up as necessary to fix the source of the conflict. Together, these two things allow quickly pruning large regions of the search space, dramatically increasing the number of clauses and variables SAT solvers can handle. Many modern SAT solvers are based on DPLL augmented with clause learning, yielding an algorithm known as Conflict-Driven Clause Learning (CDCL).