Propositional Logic
Improving Boolean Satisfiability Algorithms
Introduction
A propositional formula $\Phi$ in Conjunctive Normal Form (CNF) is a conjunction of clauses $\omega_j$, with $j \in \{1, \dots, m\}$, each clause being a disjunction of literals and each literal being either a positive ($x_i$) or a negative ($\lnot x_i$) propositional variable, with $i \in \{1, \dots, n\}$. By denoting with $[\lnot]$ the possible presence of $\lnot$, we can formally define $\Phi$ as:

$$\Phi = \bigwedge_{j=1, \dots, m} \bigg( \bigvee_{i \in \omega_j} [\lnot] x_i \bigg)$$

The Boolean Satisfiability Problem (SAT) consists in determining whether there exists a truth assignment in $\{0, 1\}$ (or equivalently in $\{True, False\}$) for the variables in $\Phi$.
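To make the definitions concrete, here is a minimal, self-contained sketch (independent of the solver code below, using a hypothetical integer-literal encoding): a CNF formula is a set of clauses, each clause a set of integers (a positive integer for $x_i$, a negative one for $\lnot x_i$), and satisfiability can be decided, albeit in exponential time, by enumerating all $2^n$ assignments.

from itertools import product

def brute_force_sat(clauses, n):
    """Try all 2^n truth assignments; a clause is satisfied when it
    contains a literal whose variable is assigned the matching polarity."""
    for assignment in product([False, True], repeat=n):
        if all(any(assignment[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return {i + 1: v for i, v in enumerate(assignment)}  # a model
    return None  # unsatisfiable

# (x1 | ~x2) & (x2 | x3) & (~x1 | ~x3) is satisfiable, e.g. with x1 = x2 = False, x3 = True
print(brute_force_sat([{1, -2}, {2, 3}, {-1, -3}], 3))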
DPLL with Branching Heuristics
The Davis-Putnam-Logemann-Loveland (DPLL) algorithm is a complete (it will answer SAT if a solution exists) and sound (it will not answer SAT for an unsatisfiable formula) procedure that combines backtracking search and deduction to decide the satisfiability of propositional logic formulas in CNF. At each search step a variable and a propositional value are selected for branching purposes. With each branching step, one of two values, 0 or 1, can be assigned to a variable, and branching corresponds to assigning the chosen value to the chosen variable. Afterwards, the logical consequences of each branching step are evaluated. Each time an unsatisfied clause (i.e. a conflict) is identified, backtracking is executed. Backtracking corresponds to undoing branching steps until an unflipped branch is reached. When both values have been assigned to the selected variable at a branching step, backtracking will undo this branching step. If both values have already been considered for the first branching step, and backtracking undoes this first branching step, then the CNF formula can be declared unsatisfiable. This kind of backtracking is called chronological backtracking.
Essentially, DPLL is a backtracking depth-first search through partial truth assignments which uses a splitting rule to replace the original problem with two smaller subproblems, whereas the original Davis-Putnam procedure uses a variable elimination rule which replaces the original problem with one larger subproblem. Over the years, many heuristics have been proposed for choosing the splitting variable (i.e. which variable should be assigned a truth value next).
Search algorithms based on a predetermined order of search are called static algorithms, whereas those that select the next variable at runtime are called dynamic. The first SAT search algorithm, the Davis-Putnam procedure, is a static algorithm. Static search algorithms are usually very slow in practice and for this reason perform worse than dynamic search algorithms. However, dynamic search algorithms are much harder to design, since they require a heuristic for choosing the order of search at runtime. The fundamental element of such a heuristic is a branching strategy for selecting the next branching literal: it must not require a lot of time to compute and yet it must provide powerful insight into the problem instance.
Two basic heuristics are applied to this algorithm with the potential of cutting the search space in half. These are the pure literal rule and the unit clause rule.
the pure literal rule is applied whenever a variable appears with a single polarity in all the unsatisfied clauses; in this case, assigning the truth value that satisfies all the involved clauses is highly effective in the search;
the unit clause rule is applied whenever some variable occurs in the current formula in a clause of length 1; in this case, the literal is assigned the truth value that satisfies the respective clause. The iterative application of the unit rule is commonly referred to as Boolean Constraint Propagation (BCP), sketched below on the integer-literal encoding used above.
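This is a minimal BCP sketch (a hypothetical helper, not part of the solver code below): repeatedly pick a unit clause, assign its literal, simplify the clause set, and stop at a fixpoint or at a conflict.

def boolean_constraint_propagation(clauses):
    """Iterated application of the unit rule on sets of integer literals.
    Returns (residual clauses, implied literals), or None on a conflict."""
    clauses, implied = [set(c) for c in clauses], []
    while True:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses, implied  # fixpoint: no unit clause left
        l = next(iter(unit))
        implied.append(l)
        # clauses containing l are now satisfied; ~l is deleted from the others
        clauses = [c - {-l} for c in clauses if l not in c]
        if any(not c for c in clauses):
            return None  # an empty clause signals a conflict

# propagating x1 in (x1) & (~x1 | x2) & (~x2 | x3) implies x2 and then x3
print(boolean_constraint_propagation([{1}, {-1, 2}, {-2, 3}]))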
def dpll_satisfiable(s, branching_heuristic=no_branching_heuristic):
    """Check satisfiability of a propositional sentence.
    This differs from the book code in two ways: (1) it returns a model
    rather than True when it succeeds; this is more useful. (2) The
    function find_pure_symbol is passed a list of unknown clauses, rather
    than a list of all clauses and the model; this is more efficient.
    >>> dpll_satisfiable(A |'<=>'| B) == {A: True, B: True}
    True
    """
    return dpll(conjuncts(to_cnf(s)), prop_symbols(s), {}, branching_heuristic)
def dpll(clauses, symbols, model, branching_heuristic=no_branching_heuristic):
    """See if the clauses are true in a partial model."""
    unknown_clauses = []  # clauses with an unknown truth value
    for c in clauses:
        val = pl_true(c, model)
        if val is False:
            return False
        if val is None:
            unknown_clauses.append(c)
    if not unknown_clauses:
        return model
    P, value = find_pure_symbol(symbols, unknown_clauses)
    if P:
        return dpll(clauses, remove_all(P, symbols), extend(model, P, value), branching_heuristic)
    P, value = find_unit_clause(clauses, model)
    if P:
        return dpll(clauses, remove_all(P, symbols), extend(model, P, value), branching_heuristic)
    P, value = branching_heuristic(symbols, unknown_clauses)
    return (dpll(clauses, remove_all(P, symbols), extend(model, P, value), branching_heuristic) or
            dpll(clauses, remove_all(P, symbols), extend(model, P, not value), branching_heuristic))
Each of the branching heuristics below is applied only after the pure literal and the unit clause heuristics have failed to select a splitting variable.
MOMs
MOMs heuristics are simple, efficient and easy to implement. The goal of these heuristics is to prefer the literal having the Maximum number of Occurrences in the Minimum length clauses. Intuitively, the literals belonging to the minimum length clauses are the most constrained literals in the formula. Branching on them will maximize the effect of BCP and the likelihood of hitting a dead end early in the search tree (for unsatisfiable problems). Conversely, in the case of satisfiable formulas, branching on a highly constrained variable early in the tree will also increase the likelihood of a correct assignment of the remaining open literals. The main disadvantage of MOMs heuristics is that their effectiveness highly depends on the problem instance. It is easy to see that the ideal setting for these heuristics is considering the unsatisfied binary clauses.
def min_clauses(clauses):
    min_len = min(map(lambda c: len(c.args), clauses), default=2)
    return filter(lambda c: len(c.args) == (min_len if min_len > 1 else 2), clauses)
from collections import Counter  # used by all the counting heuristics below

def moms(symbols, clauses):
    """
    MOMS (Maximum Occurrence in clauses of Minimum Size) heuristic
    Returns the literal with the most occurrences in all clauses of minimum size
    """
    scores = Counter(l for c in min_clauses(clauses) for l in prop_symbols(c))
    return max(symbols, key=lambda symbol: scores[symbol]), True
Over the years, many types of MOMs heuristics have been proposed.
MOMSf chooses the variable $x$ maximizing the function:

$$[f(x) + f(\lnot x)] \cdot 2^k + f(x) \cdot f(\lnot x)$$

where $f(x)$ is the number of occurrences of $x$ in the smallest unknown clauses and $k$ is a parameter.
def momsf(symbols, clauses, k=0):
    """
    MOMS alternative heuristic
    If f(x) is the number of occurrences of the variable x in clauses with minimum size,
    we choose the variable maximizing [f(x) + f(-x)] * 2^k + f(x) * f(-x)
    Returns x if f(x) >= f(-x) otherwise -x
    """
    scores = Counter(l for c in min_clauses(clauses) for l in disjuncts(c))
    P = max(symbols,
            key=lambda symbol: (scores[symbol] + scores[~symbol]) * pow(2, k) + scores[symbol] * scores[~symbol])
    return P, True if scores[P] >= scores[~P] else False
Freeman’s POSIT
def posit(symbols, clauses):
    """
    Freeman's POSIT version of MOMs
    Counts the positive x and negative x for each variable x in clauses with minimum size
    Returns x if f(x) >= f(-x) otherwise -x
    """
    scores = Counter(l for c in min_clauses(clauses) for l in disjuncts(c))
    P = max(symbols, key=lambda symbol: scores[symbol] + scores[~symbol])
    return P, True if scores[P] >= scores[~P] else False
Zabih and McAllester’s
def zm(symbols, clauses):
    """
    Zabih and McAllester's version of MOMs
    Counts the negative occurrences only of each variable x in clauses with minimum size
    """
    scores = Counter(l for c in min_clauses(clauses) for l in disjuncts(c) if l.op == '~')
    return max(symbols, key=lambda symbol: scores[~symbol]), True
DLIS & DLCS
Literal count heuristics count the number of unresolved clauses in which a given variable $x$ appears as a positive literal, $C_P$, and as a negative literal, $C_N$. These two numbers can either be considered individually or combined.
The Dynamic Largest Individual Sum (DLIS) heuristic considers the values $C_P$ and $C_N$ separately: it selects the variable with the largest individual value and assigns to it value true if $C_P \geq C_N$, value false otherwise.
def dlis(symbols, clauses):
    """
    DLIS (Dynamic Largest Individual Sum) heuristic
    Choose the variable and value that satisfies the maximum number of unsatisfied clauses
    Like DLCS but we only consider the literal (thus Cp and Cn are individual)
    """
    scores = Counter(l for c in clauses for l in disjuncts(c))
    P = max(symbols, key=lambda symbol: scores[symbol])
    return P, True if scores[P] >= scores[~P] else False
Dynamic Largest Combined Sum (DLCS) considers the values $C_P$ and $C_N$ combined: it selects the variable with the largest sum $C_P + C_N$ and assigns to it value true if $C_P \geq C_N$, value false otherwise.
def dlcs(symbols, clauses):
    """
    DLCS (Dynamic Largest Combined Sum) heuristic
    Cp the number of clauses containing literal x
    Cn the number of clauses containing literal -x
    Here we select the variable maximizing Cp + Cn
    Returns x if Cp >= Cn otherwise -x
    """
    scores = Counter(l for c in clauses for l in disjuncts(c))
    P = max(symbols, key=lambda symbol: scores[symbol] + scores[~symbol])
    return P, True if scores[P] >= scores[~P] else False
JW & JW2
Two branching heuristics were proposed by Jeroslow and Wang.
The one-sided Jeroslow and Wang's heuristic computes:

$$J(l) = \sum_{l \in \omega \,\wedge\, \omega \in \Phi} 2^{-|\omega|}$$

and selects the assignment that satisfies the literal with the largest value $J(l)$.
def jw(symbols, clauses):
    """
    Jeroslow-Wang heuristic
    For each literal compute J(l) = \sum{l in clause c} 2^{-|c|}
    Return the literal maximizing J
    """
    scores = Counter()
    for c in clauses:
        for l in prop_symbols(c):
            scores[l] += pow(2, -len(c.args))
    return max(symbols, key=lambda symbol: scores[symbol]), True
The two-sided Jeroslow and Wang's heuristic identifies the variable $x$ with the largest sum $J(x) + J(\lnot x)$, and assigns to $x$ value true if $J(x) \geq J(\lnot x)$, and value false otherwise.
def jw2(symbols, clauses):
    """
    Two Sided Jeroslow-Wang heuristic
    Computes J(l) counting also the negation of l: J(x) + J(-x)
    Returns x if J(x) >= J(-x) otherwise -x
    """
    scores = Counter()
    for c in clauses:
        for l in disjuncts(c):
            scores[l] += pow(2, -len(c.args))
    P = max(symbols, key=lambda symbol: scores[symbol] + scores[~symbol])
    return P, True if scores[P] >= scores[~P] else False
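Any of these heuristics can be plugged into dpll_satisfiable as the branching strategy. A short usage sketch, assuming the aima-python logic utilities (expr and the helpers used above) are importable:

A, B, C = expr('A'), expr('B'), expr('C')
s = (A | B) & (~A | C) & (~B | ~C)
print(dpll_satisfiable(s, branching_heuristic=moms))
print(dpll_satisfiable(s, branching_heuristic=jw2))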
CDCL with 1UIP Learning Scheme, 2WL Lazy Data Structure, VSIDS Branching Heuristic & Restarts
The Conflict-Driven Clause Learning (CDCL) solver is an evolution of the DPLL algorithm that involves a number of additional key techniques:
non-chronological backtracking or backjumping;
learning new clauses from conflicts during search by exploiting its structure;
using lazy data structures for storing clauses;
branching heuristics with low computational overhead and which receive feedback from search;
periodically restarting search.
The first difference between a DPLL solver and a CDCL solver is the introduction of non-chronological backtracking, or backjumping, when a conflict is identified. This requires an iterative implementation of the algorithm, because backtracking more than one level is only possible if the backtrack stack is managed explicitly.
import heapq
import itertools
from collections import Counter, defaultdict

import networkx as nx

def cdcl_satisfiable(s, vsids_decay=0.95, restart_strategy=no_restart):
    """
    >>> cdcl_satisfiable(A |'<=>'| B) == {A: True, B: True}
    True
    """
    clauses = TwoWLClauseDatabase(conjuncts(to_cnf(s)))
    symbols = prop_symbols(s)
    scores = Counter()
    G = nx.DiGraph()
    model = {}
    dl = 0
    conflicts = 0
    restarts = 1
    sum_lbd = 0
    queue_lbd = []
    while True:
        conflict = unit_propagation(clauses, symbols, model, G, dl)
        if conflict:
            if dl == 0:
                return False
            conflicts += 1
            dl, learn, lbd = conflict_analysis(G, dl)
            queue_lbd.append(lbd)
            sum_lbd += lbd
            backjump(symbols, model, G, dl)
            clauses.add(learn, model)
            scores.update(l for l in disjuncts(learn))
            for symbol in scores:
                scores[symbol] *= vsids_decay
            if restart_strategy(conflicts, restarts, queue_lbd, sum_lbd):
                backjump(symbols, model, G)
                queue_lbd.clear()
                restarts += 1
        else:
            if not symbols:
                return model
            dl += 1
            assign_decision_literal(symbols, model, scores, G, dl)
Clause Learning with 1UIP Scheme
The second important difference between a DPLL solver and a CDCL solver is that the information about a conflict is reused by learning: when a conflicting clause is found, the solver derives a new clause from the conflict and adds it to the clause database.
Whenever a conflict is identified due to unit propagation, a conflict analysis procedure is invoked. As a result, one or more new clauses are learnt, and a backtracking decision level is computed. The conflict analysis procedure analyzes the structure of unit propagation and decides which literals to include in the learnt clause. The decision levels associated with assigned variables define a partial order of the variables. Starting from a given unsatisfied clause (represented in the implication graph by the conflict vertex, K in the code below), the conflict analysis procedure visits variables implied at the most recent decision level (i.e. the current largest decision level), identifies the antecedents of visited variables, and keeps from the antecedents the literals assigned at decision levels lower than the most recent decision level. The clause learning procedure used in CDCL can be defined by a sequence of selective resolution operations that at each step yields a new temporary clause. This process is repeated until the most recent decision variable is visited.
The structure of implied assignments induced by unit propagation is a key aspect of the clause learning procedure. Moreover, this structure was further exploited with Unique Implication Points (UIPs). A UIP is a dominator in the implication graph and represents an alternative decision assignment at the current decision level that results in the same conflict. The main motivation for identifying UIPs is to reduce the size of learnt clauses. Clause learning could potentially stop at any UIP, and it is quite straightforward to conclude that the clause learnt at the first UIP has clear advantages: considering the largest decision level of the literals of the clause learnt at each UIP, the clause learnt at the first UIP is guaranteed to contain the smallest one, which guarantees the highest backtrack jump in the search tree.
def conflict_analysis(G, dl):
    conflict_clause = next(G[p]['K']['antecedent'] for p in G.pred['K'])
    P = next(node for node in G.nodes() - 'K' if G.nodes[node]['dl'] == dl and G.in_degree(node) == 0)
    first_uip = nx.immediate_dominators(G, P)['K']
    G.remove_node('K')
    conflict_side = nx.descendants(G, first_uip)
    while True:
        for l in prop_symbols(conflict_clause).intersection(conflict_side):
            antecedent = next(G[p][l]['antecedent'] for p in G.pred[l])
            conflict_clause = pl_binary_resolution(conflict_clause, antecedent)
        # the literal block distance is calculated by taking the decision levels from variables of all
        # literals in the clause, and counting how many different decision levels were in this set
        lbd = [G.nodes[l]['dl'] for l in prop_symbols(conflict_clause)]
        if lbd.count(dl) == 1 and first_uip in prop_symbols(conflict_clause):
            return 0 if len(lbd) == 1 else heapq.nlargest(2, lbd)[-1], conflict_clause, len(set(lbd))
def pl_binary_resolution(ci, cj):
    for di in disjuncts(ci):
        for dj in disjuncts(cj):
            if di == ~dj or ~di == dj:
                return pl_binary_resolution(associate('|', remove_all(di, disjuncts(ci))),
                                            associate('|', remove_all(dj, disjuncts(cj))))
    return associate('|', unique(disjuncts(ci) + disjuncts(cj)))
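For instance, resolving $(A \lor B)$ against $(\lnot B \lor C)$ on the complementary pair $B$, $\lnot B$ yields $(A \lor C)$; a quick check, again assuming the aima-python expr constructor:

A, B, C = expr('A'), expr('B'), expr('C')
print(pl_binary_resolution(A | B, ~B | C))  # (A | C)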
def backjump(symbols, model, G, dl=0):
    delete = {node for node in G.nodes() if G.nodes[node]['dl'] > dl}
    G.remove_nodes_from(delete)
    for node in delete:
        del model[node]
    symbols |= delete
2WL Lazy Data Structure
Implementation issues for SAT solvers include the design of suitable data structures for storing clauses. The implemented data structures dictate the way BCP is carried out and have a significant impact on the run time performance of the SAT solver. Recent state-of-the-art SAT solvers are characterized by using very efficient data structures, intended to reduce the CPU time required at each node of the search tree. Conversely, traditional SAT data structures are accurate, meaning that it is possible to know exactly the value of each literal in each clause. The most recent SAT data structures, which are not accurate and are therefore called lazy, include the watched literals used in Chaff.
The Chaff SAT solver introduced the Two Watched Literals (2WL) lazy data structure: only two literals of each clause are watched, and a clause needs to be inspected only when one of its watched literals is assigned value false, since until then the clause can be neither unit nor unsatisfied.
def unit_propagation(clauses, symbols, model, G, dl):

    def check(c):
        if not model or clauses.get_first_watched(c) == clauses.get_second_watched(c):
            return True
        w1, _ = inspect_literal(clauses.get_first_watched(c))
        if w1 in model:
            return c in (clauses.get_neg_watched(w1) if model[w1] else clauses.get_pos_watched(w1))
        w2, _ = inspect_literal(clauses.get_second_watched(c))
        if w2 in model:
            return c in (clauses.get_neg_watched(w2) if model[w2] else clauses.get_pos_watched(w2))

    def unit_clause(watching):
        w, p = inspect_literal(watching)
        G.add_node(w, val=p, dl=dl)
        G.add_edges_from(zip(prop_symbols(c) - {w}, itertools.cycle([w])), antecedent=c)
        symbols.remove(w)
        model[w] = p

    def conflict_clause(c):
        G.add_edges_from(zip(prop_symbols(c), itertools.cycle('K')), antecedent=c)

    while True:
        bcp = False
        for c in filter(check, clauses.get_clauses()):
            # we need only visit each clause when one of its two watched literals is assigned to 0
            # because, until this happens, we can guarantee that there cannot be more than n-2
            # literals in the clause assigned to 0
            first_watched = pl_true(clauses.get_first_watched(c), model)
            second_watched = pl_true(clauses.get_second_watched(c), model)
            if first_watched is None and clauses.get_first_watched(c) == clauses.get_second_watched(c):
                unit_clause(clauses.get_first_watched(c))
                bcp = True
                break
            elif first_watched is False and second_watched is not True:
                if clauses.update_second_watched(c, model):
                    bcp = True
                else:
                    # if the only literal with a non-zero value is the other watched literal then
                    if second_watched is None:  # if it is free, then the clause is a unit clause
                        unit_clause(clauses.get_second_watched(c))
                        bcp = True
                        break
                    else:  # else (it is False) the clause is a conflict clause
                        conflict_clause(c)
                        return True
            elif second_watched is False and first_watched is not True:
                if clauses.update_first_watched(c, model):
                    bcp = True
                else:
                    # if the only literal with a non-zero value is the other watched literal then
                    if first_watched is None:  # if it is free, then the clause is a unit clause
                        unit_clause(clauses.get_first_watched(c))
                        bcp = True
                        break
                    else:  # else (it is False) the clause is a conflict clause
                        conflict_clause(c)
                        return True
        if not bcp:
            return False
class TwoWLClauseDatabase:

    def __init__(self, clauses):
        self.__twl = {}
        self.__watch_list = defaultdict(lambda: [set(), set()])
        for c in clauses:
            self.add(c, None)

    def get_clauses(self):
        return self.__twl.keys()

    def set_first_watched(self, clause, new_watching):
        if len(clause.args) > 2:
            self.__twl[clause][0] = new_watching

    def set_second_watched(self, clause, new_watching):
        if len(clause.args) > 2:
            self.__twl[clause][1] = new_watching

    def get_first_watched(self, clause):
        if len(clause.args) == 2:
            return clause.args[0]
        if len(clause.args) > 2:
            return self.__twl[clause][0]
        return clause

    def get_second_watched(self, clause):
        if len(clause.args) == 2:
            return clause.args[-1]
        if len(clause.args) > 2:
            return self.__twl[clause][1]
        return clause

    def get_pos_watched(self, l):
        return self.__watch_list[l][0]

    def get_neg_watched(self, l):
        return self.__watch_list[l][1]

    def add(self, clause, model):
        self.__twl[clause] = self.__assign_watching_literals(clause, model)
        w1, p1 = inspect_literal(self.get_first_watched(clause))
        w2, p2 = inspect_literal(self.get_second_watched(clause))
        self.__watch_list[w1][0].add(clause) if p1 else self.__watch_list[w1][1].add(clause)
        if w1 != w2:
            self.__watch_list[w2][0].add(clause) if p2 else self.__watch_list[w2][1].add(clause)

    def remove(self, clause):
        w1, p1 = inspect_literal(self.get_first_watched(clause))
        w2, p2 = inspect_literal(self.get_second_watched(clause))
        del self.__twl[clause]
        self.__watch_list[w1][0].discard(clause) if p1 else self.__watch_list[w1][1].discard(clause)
        if w1 != w2:
            self.__watch_list[w2][0].discard(clause) if p2 else self.__watch_list[w2][1].discard(clause)

    def update_first_watched(self, clause, model):
        # if a non-zero literal different from the other watched literal is found
        found, new_watching = self.__find_new_watching_literal(clause, self.get_first_watched(clause), model)
        if found:  # then it will replace the watched literal
            w, p = inspect_literal(self.get_second_watched(clause))
            self.__watch_list[w][0].remove(clause) if p else self.__watch_list[w][1].remove(clause)
            self.set_second_watched(clause, new_watching)
            w, p = inspect_literal(new_watching)
            self.__watch_list[w][0].add(clause) if p else self.__watch_list[w][1].add(clause)
            return True

    def update_second_watched(self, clause, model):
        # if a non-zero literal different from the other watched literal is found
        found, new_watching = self.__find_new_watching_literal(clause, self.get_second_watched(clause), model)
        if found:  # then it will replace the watched literal
            w, p = inspect_literal(self.get_first_watched(clause))
            self.__watch_list[w][0].remove(clause) if p else self.__watch_list[w][1].remove(clause)
            self.set_first_watched(clause, new_watching)
            w, p = inspect_literal(new_watching)
            self.__watch_list[w][0].add(clause) if p else self.__watch_list[w][1].add(clause)
            return True

    def __find_new_watching_literal(self, clause, other_watched, model):
        # if a non-zero literal different from the other watched literal is found
        if len(clause.args) > 2:
            for l in disjuncts(clause):
                if l != other_watched and pl_true(l, model) is not False:
                    # then it is returned
                    return True, l
        return False, None

    def __assign_watching_literals(self, clause, model=None):
        if len(clause.args) > 2:
            if model is None or not model:
                return [clause.args[0], clause.args[-1]]
            else:
                return [next(l for l in disjuncts(clause) if pl_true(l, model) is None),
                        next(l for l in disjuncts(clause) if pl_true(l, model) is False)]
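A quick illustration of the database (assuming the aima-python utilities as before): with no model yet, the first and the last literal of each clause longer than two are watched, while both literals of a binary clause are implicitly watched.

A, B, C = expr('A'), expr('B'), expr('C')
s = (A | B | C) & (~A | ~B)
db = TwoWLClauseDatabase(conjuncts(to_cnf(s)))
for c in db.get_clauses():
    print(c, '-> watching:', db.get_first_watched(c), db.get_second_watched(c))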
VSIDS Branching Heuristic
The early branching heuristics made use of all the information available from the data structures, namely the number of satisfied, unsatisfied and unassigned literals. These heuristics are updated during the search and also take into account the clauses that are learnt.
More recently, a different kind of variable selection heuristic, referred to as Variable State Independent Decaying Sum (VSIDS), has been proposed by the authors of Chaff. VSIDS keeps a score for each literal: when a new clause is learnt, the score of each of its literals is incremented, and all scores are periodically divided by a constant (decayed), so that literals involved in recent conflicts weigh more. At each decision, the free variable with the highest score is selected; since the scores do not depend on the current variable state, they are very cheap to maintain.
def assign_decision_literal(symbols, model, scores, G, dl):
    P = max(symbols, key=lambda symbol: scores[symbol] + scores[~symbol])
    value = True if scores[P] >= scores[~P] else False
    symbols.remove(P)
    model[P] = value
    G.add_node(P, val=value, dl=dl)
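The essence of the scoring scheme used above can be isolated in a few lines (a self-contained sketch, not the solver code): bump the score of every literal of each learnt clause, then decay all scores, so that literals involved in recent conflicts dominate the choice.

from collections import Counter

scores = Counter()

def on_learn(learnt_clause_literals, decay=0.95):
    """Additive bump for the literals of a freshly learnt clause, then multiplicative decay."""
    scores.update(learnt_clause_literals)
    for l in scores:
        scores[l] *= decay

on_learn(['A', '~B'])
on_learn(['~B', 'C'])
print(scores.most_common(1))  # ~B dominates: it occurred in both recent conflicts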
Restarts
Solving NP-complete problems, such as SAT, naturally leads to heavy-tailed run times. To deal with this, SAT solvers frequently restart their search to avoid runs that take disproportionately long. Restarting here means that the solver unsets all variables and starts the search again using a different variable assignment order.
While at first glance it might seem that restarts should be rare and become rarer as the solving has been going on for longer, so that the SAT solver can actually finish solving the problem, the trend has been towards more aggressive (frequent) restarts.
The reason why frequent restarts help solve problems faster is that, while the solver does forget all current variable assignments, it keeps some information: it keeps the learnt clauses, effectively sampling the search space, and it keeps the last assigned truth value of each variable, assigning it again the next time the variable is picked (a technique known as phase saving).
Luby
In this strategy, the number of conflicts between two restarts is based on the Luby sequence. The Luby restart sequence is interesting in that it was proven to be an optimal restart strategy for randomized search algorithms where the runs do not share information. While this is not true for SAT solving, since learnt clauses are carried across restarts, Luby restarts have nevertheless been quite successful in practice.
The exact description of Luby restarts is that the $i$-th restart happens after $u \cdot luby(i)$ conflicts, where $u$ is a constant (the unit in the code below) and $luby(i)$ is defined as:

$$luby(i) = \begin{cases} 2^{k-1} & \text{if } i = 2^{k}-1 \\ luby(i-2^{k-1}+1) & \text{if } 2^{k-1} \leq i < 2^{k}-1 \end{cases}$$

A less exact but more intuitive description of the Luby sequence is that all numbers in it are powers of two, and after a number is seen for the second time, the next number is twice as big. The following are the first 16 numbers in the sequence:

$$(1, 1, 2, 1, 1, 2, 4, 1, 1, 2, 1, 1, 2, 4, 8, 1, \dots)$$
From the above, we can see that this restart strategy tends towards frequent restarts, but some runs are kept running for much longer, and there is no upper limit on the longest possible time between two restarts.
def luby(conflicts, restarts, queue_lbd, sum_lbd, unit=512):
    # in the state of the art, tested with unit values 1, 2, 4, 6, 8, 12, 16, 32, 64, 128, 256 and 512
    def _luby(i):
        k = 1
        while True:
            if i == (1 << k) - 1:
                return 1 << (k - 1)
            elif (1 << (k - 1)) <= i < (1 << k) - 1:
                return _luby(i - (1 << (k - 1)) + 1)
            k += 1

    return unit * _luby(restarts) == len(queue_lbd)
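As a sanity check, the nested _luby function reproduces the sequence quoted above; an equivalent standalone version (a hypothetical helper) makes this easy to verify:

def luby_sequence(i):
    """The i-th Luby number (1-indexed), following the recursive definition above."""
    k = 1
    while True:
        if i == (1 << k) - 1:
            return 1 << (k - 1)
        if (1 << (k - 1)) <= i < (1 << k) - 1:
            return luby_sequence(i - (1 << (k - 1)) + 1)
        k += 1

print([luby_sequence(i) for i in range(1, 17)])
# [1, 1, 2, 1, 1, 2, 4, 1, 1, 2, 1, 1, 2, 4, 8, 1]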
Glucose
Glucose restarts were popularized by the Glucose solver, and they are an extremely aggressive, dynamic restart strategy. The idea behind them is that, instead of waiting for a fixed number of conflicts, the solver restarts when the last couple of learnt clauses are, on average, bad.
A bit more precisely, if there were at least $X$ conflicts (and thus $X$ learnt clauses) since the last restart, and the average Literal Block Distance (LBD, a criterion to evaluate the quality of learnt clauses: the number of distinct decision levels among the variables of a clause, the lower the better) of the last $X$ learnt clauses, multiplied by $K$, exceeds the average LBD of all clauses learnt since the beginning of the search, it is time for another restart.
def glucose(conflicts, restarts, queue_lbd, sum_lbd, x=100, k=0.7):
    # in the state of the art, tested with (x, k) as (50, 0.8) and (100, 0.7)
    # if there were at least x conflicts since the last restart, and the average LBD of the last
    # x learnt clauses was at least k times higher than the average LBD of all learnt clauses
    return len(queue_lbd) >= x and sum(queue_lbd) / len(queue_lbd) * k > sum_lbd / conflicts
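Either strategy can be passed to cdcl_satisfiable as restart_strategy, using functools.partial to change the constants. A usage sketch, assuming the aima-python expr constructor as before:

from functools import partial

A, B, C = expr('A'), expr('B'), expr('C')
s = (A | B) & (~A | C) & (~B | ~C)
print(cdcl_satisfiable(s, restart_strategy=luby))
print(cdcl_satisfiable(s, restart_strategy=partial(glucose, x=50, k=0.8)))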