MCSAT in Rust

Download as pdf or txt
Download as pdf or txt
You are on page 1of 55

IT 22 005

Examensarbete 30 hp
Februari 2022

Prototyping an mcSAT-based SMT


solver in Rust

Dennis Örnberg

Institutionen för informationsteknologi


Department of Information Technology
Abstract
Prototyping an mcSAT-based SMT solver in Rust

Dennis Örnberg

Teknisk- naturvetenskaplig fakultet


UTH-enheten Satisfiability modulo theories, or SMT, is the decision problem of determining
whether a set of formulas is satisfiable or not, given one or more background
Besöksadress: theories. The model-constructing satisfiability calculus, or mcSAT, is a framework
Ångströmlaboratoriet
Lägerhyddsvägen 1 used for solving SMT problems. It is an alternative to the framework DPLL(T) which
Hus 4, Plan 0 is commonly used for many state-of-the-art SMT solvers today, and it has shown
promising results in a number of different prototypes.
Postadress:
Box 536
751 21 Uppsala Today there are many SMT solvers available that are developed using C/C++.
Another language that has increased in popularity is Rust, which has a focus on safety
Telefon: and performance. The language is designed to be memory-safe, and the compiler can
018 – 471 30 03 analyze and eliminate many memory-safety issues at compile time. There are no
Telefax: known SMT solvers written in Rust today.
018 – 471 30 00
The goal of this thesis was to define an architecture for an mcSAT-based SMT solver
Hemsida: that can be implemented using Rust, and to create a simple prototype with some of
http://www.teknat.uu.se/student
the key parts of the architecture. A big focus is on figuring out what works well with
Rust and whether there are any pitfalls with how the language works. An architecture
specifically for Rust is introduced, along with an early prototype using this
architecture and a preliminary evaluation. The prototype works for simple examples
using a Boolean theory. Not all of the rules of mcSAT have been implemented, and
both the architecture and the prototype are in the early stages of development with
room for improvements. The result shows that Rust is suitable for writing this type of
software, even though there are a few pitfalls that developers should be aware of.

Handledare: Amanda Stjerna


Ämnesgranskare: Philipp Rümmer
Examinator: Mats Daniels
IT 22 005
Tryckt av: Reprocentralen ITC
Acknowledgement

I would like to thank my two supervisors Amanda Stjerna and Philipp Rümmer at
Uppsala University for making this thesis possible. Without your expertise in SMT,
mcSAT and Rust, this project would not be nearly as comprehensive as it now turned
out to be. Thank you for always being available for questions and for all the great advice
during the last year.
Contents

1 Introduction 1
1.1 Satisfiability modulo theories . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 DPLL(T) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 Model-constructing satisfiability calculus . . . . . . . . . . . . . . . . . . 2
1.4 Rust . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.5 Goal and structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.6 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

2 Background 5
2.1 Model-constructing satisfiability calculus . . . . . . . . . . . . . . . . . . 5
2.1.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 Rust . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.1 Ownership and borrowing . . . . . . . . . . . . . . . . . . . . . . 11
2.2.2 Traits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.3 Dynamic dispatch . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.4 Lifetimes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.5 Additional features . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3 Hash consing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

3 Existing implementations 15
3.1 CVC4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.1.1 Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.1.2 Trail . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.1.3 Plugins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2 mcBV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

4 Architecture 18
4.1 Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.2 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.3 Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.3.1 Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.3.2 Constant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.3.3 Variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.3.4 Term . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
4.3.5 Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
4.3.6 Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.3.7 Predicate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.3.8 Literal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
4.3.9 Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
4.4 Alternative architectures . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.4.1 Expression enum . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.4.2 Trait objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.5 Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.5.1 Trail . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.5.2 TrailElement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.5.3 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.6 Theory & solver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.6.1 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.6.2 BooleanTheory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.6.3 Solver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.6.4 State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

5 Implementation 33
5.1 Dependencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.2 Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
5.3 Hash consing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.4 Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.5 Theory & solver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.5.1 Solver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

6 Evaluation 41
6.1 Performance and correctness . . . . . . . . . . . . . . . . . . . . . . . . 41
6.1.1 Z3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
6.2 The Rust experience . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
6.2.1 The good parts . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
6.2.2 Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.3 Possible improvements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

7 Conclusions and Future Work 47

Bibliography 48
Chapter 1

Introduction

1.1 Satisfiability modulo theories


Satisfiability modulo theories, or SMT, is the decision problem of determining whether
a set of formulas is satisfiable or not given one or more background theories. These for-
mulas are normally expressed using first-order logic, which is a type of formal language
often used to describe problems in areas like computer science, mathematics and philos-
ophy. It refers to logic that uses quantified variables and is di↵erent from propositional
logic which only deals with propositions that can be either true or false. As an example,
the statement “there exists a person with the name John” is formulated using first-order
logic. It can be formulated the following way:
9p.(person(p) ^ name(p) = john)
The symbol person(p) is an example of a predicate which is true if p is a person, and
false otherwise. The symbol name(p) is a function where the output is the name of the
person p, and the symbol john is a constant that represents the name “John”. The
name of the person p is then compared to the name “John” using equality. The ability
to express problems using first-order logic makes it easier to build software that can
reason about these problems.
SMT is computationally hard and there are no known deterministic algorithms that
can solve an arbitrary instance in polynomial time. However, there are software ap-
plications called SMT solvers that can solve many instances in practice. SMT solvers
are often used in formal verification where you prove or disprove the correctness of a
program.
Theories are a central aspect of SMT. In order for an SMT solver to solve instances
for a specific theory, the solver must support that particular theory. Some SMT solvers
have support for several theories while others specialize in efficiently solving instances
for one particular theory. Worth noting is that not all theories are decidable, an example
of this is non-linear integer arithmetic.
SMT shares substructure with the Boolean satisfiability problem, or SAT. The main
di↵erence is that SMT can reason about first-order logic, while SAT is about satisfying
Boolean constraints. In this sense, SMT can be seen as an extension of SAT, or as a
generalization of SAT into the first-order domain [1].

1.2 DPLL(T)
There are di↵erent frameworks used for solving SMT problems, one that is commonly
used is called DPLL(T). It is an extension of the algorithm known as DPLL (Davis-

1
Putnam-Logemann-Loveland) that has been used for solving SAT problems for decades.
While DPLL is a search algorithm able to decide the satisfiability of propositional logic
formulas, DPLL(T) takes that a step further and also allows formulas using first-order
logic.
The most substantial di↵erence between DPLL and DPLL(T) is that the latter in-
stantiates a theory solver for some theory T and then relies on that for theory-specific
functionality. The DPLL procedure still handles things that are not theory-specific like
propagations and clause learning, among other things [2]. Clause learning is the pro-
cess of learning conflicts that might occur during the search and then using them to
backtrack in an efficient manner. This is a powerful technique that is the cornerstone
of another popular algorithm for solving SAT problems called Conflict Driven Clause
Learning, or CDCL.
If a problem turns out to be unsatisfiable, the theory solver must produce a reason
for why it is unsatisfiable. DPLL(T) is incremental and allows for backtracking in the
case that the solver makes a decision about some literal that turns out to be wrong. A
literal is an atomic formula that is either negated or not.
The algorithm behind DPLL(T) transforms an SMT problem into a SAT problem
that it then solves, using a decision procedure to check if the solution works under the
specified theory. This means that the framework separates the Boolean symbols from
the theory that is used. It is possible to combine the use of a normal SAT solver for the
DPLL-part and a theory solver for anything that is theory-specific.

1.3 Model-constructing satisfiability calculus


The model-constructing satisfiability calculus, or mcSAT, was introduced as an alterna-
tive to DPLL(T) in 2013 [1]. Instead of transforming the problem into a SAT problem
like DPLL(T), mcSAT allows for direct model construction and conflict resolution with-
out the need for a separate decision procedure. In this case, a model works as a partial
solution that describes what values the di↵erent variables have been assigned. Model-
construction means that the framework will incrementally build up a model in order to
satisfy the problem instance using searching and propagation. Conflict resolution is the
process of resolving conflicts as they appear, this is usually done by performing back-
tracking of some sort. A benefit from direct model construction is that the knowledge
from the theory can be used directly in the procedure. A paper from 2016 explored
the theory of fixed-length bit-vectors with mcSAT, another paper from 2017 presented
a method for solving nonlinear integer constraints that relies on mcSAT, both with
promising results [3, 4].

1.4 Rust
Rust is a programming language with a focus on safety and performance that was first
introduced in 2010. The language has increased in popularity lately, and it is often
compared to other high-performance languages like C/C++ and some consider Rust to
be a modern replacement of those languages. Rust applications should in theory be well
suited for embedded systems and performance-critical services and it can easily integrate
with other languages. One of the more unique features of Rust is its ownership concept,
which enables the language to make memory-safety guarantees. The ownership system
uses a set of rules that the programmer must adhere to, and these rules are enforced and
checked by the compiler at compile time. This enables the program to automatically
allocate and free memory without the need for the programmer to do so explicitly,
and without the need for a garbage collector. However, this approach is di↵erent from

2
how many other languages handle memory management and it can take some time
for developers to get used to it. Another advantage of these rules is that it simplifies
concurrent programming, since the compiler will turn many of the common concurrency
errors into compile time errors [5]. The high performance combined with the memory-
safety guarantees and potential for concurrency makes Rust especially interesting in the
context of scientific computing.
Ryan Levick from Microsoft claimed that Rust seemed to be the best currently
available alternative to C and C++ in 2019 [6]. One reason for this is because of the
memory-safety guarantees, and for the fact that around 70% of all of their common
vulnerabilities and exposures are related to memory-safety. They also claim that if
more of their software was written in Rust then most, if not all, of these issues would
disappear.

1.5 Goal and structure


Today there are zero known existing SMT solvers written in Rust. Most of the commonly
used SMT solvers are written in C/C++, including the solvers Z3, CVC4, OpenSMT
and Yices 2. A big reason for using C/C++ for these types of applications is the high
performance of those languages, together with the tight control that they o↵er in terms
of memory usage. A solver can technically be implemented in any general-purpose
language, but some will o↵er better performance than others.
The goals of this project are listed below:
1. Define an architecture/framework for an mcSAT-based SMT solver in Rust.
2. Create a simple prototype of some of the key components of a solver using Rust.
3. Make a preliminary evaluation of the prototype.

The architecture/framework should be designed so that it is able to handle arbitrary


theories and the focus will be on making sure it is easy to understand, extend and
modify.
The prototype consists of some, but not all, parts of the mcSAT-framework. The goal
is not to build a complete and fully functional SMT solver with competitive performance,
but rather to investigate how suitable Rust is for building this prototype, how flexible
the language is and whether there are any pitfalls. The fact that Rust handles memory
management di↵erently compared to other programming languages can make it hard
to implement a solver in the same way that it would be implemented in for example
C++. The prototype can be evaluated by showing that it works for a few example
instances using a specific theory, and by analyzing how flexible it is regarding the use
and implementation of di↵erent theories.
The structure of this report is as follows: A few of the existing implementations
of mcSAT are explored and their implementation details are explained briefly. The
framework mcSAT and the language Rust are explained in more detail including some
examples of how they work. The design of the architecture is explained and motivated,
and the implementation is described in more detail. Lastly, the resulting framework
and the prototype are evaluated and a few of the improvements that can be made are
listed. This thesis is written from the perspective of someone who does not have any
prior knowledge or experience with either the framework mcSAT or the language Rust.

3
1.6 Results
The prototype was evaluated by implementing a simple Boolean theory and then con-
structing a few instances, both satisfiable and unsatisfiable. The prototype works cor-
rectly for every instance except one, and uses the necessary rules of mcSAT in a correct
way. Not all of the rules are implemented in this prototype, only the ones that are
necessary for the simple examples to work.
The prototype is also evaluated by running the same instances in the solver Z3 and
then comparing the output and the run time. Since these instances are small, it is hard
to evaluate how well the prototype can perform for larger instances.
The chosen architecture is valid and functional, but the features of Rust steered the
design into being slightly less flexible than first imagined in regards to di↵erent theories.
The prototype is intended to be a first version, and there is room for improvements in
many areas.

4
Chapter 2

Background

The following chapter will go into detail about how the model-constructing satisfiability
calculus works, along with an example of an unsatisfiable instance and how it can be
solved. The language Rust is introduced along with a common technique for reducing
memory usage called hash consing.

2.1 Model-constructing satisfiability calculus


The model-constructing satisfiability calculus consists of a procedure that can be de-
scribed as an abstract transition system. A state is represented as a pair hM, Ci, where
M is a sequence of elements, also called a trail, and C is a set of clauses. In this report,
a clause consists of a finite collection of literals. A literal is an atomic formula that may
or may not be negated. A clause is a disjunction of literals, where the whole clause
is true if at least one literal in the clause is true. An example would be the clause
x = 1 _ x = 2 _ x = 3, where the whole clause evaluates to true if x is equal to either 1,
2 or 3. The trail M is the core of the procedure, it contains trail elements that describe
the decisions and propagations that have taken place up to the current state in order to
satisfy the clauses C.
There are three di↵erent types of trail elements: decided literals, propagated literals
and model assignments. A decided literal is a literal L that is assumed to be true,
for example: L = (x > 0). A propagated literal is a literal L that is implied in the
current state by the clause C, denoted as C ! L. For example, if we already know
that x = 1 and we have the clause (x = 0 _ y = 5), then that implies that the following
propagated literal holds true: (x = 0 _ y = 5) ! (y = 5). For both of these types of
trail elements, we say that L 2 M . A model assignment is an assignment of some value
↵ to an uninterpreted first-order symbol x, denoted as x 7! ↵. An example of a model
assignment could be z 7! 3. Decided literals and model assignments are referred to as
decisions.
It is possible to construct a first-order interpretation, denoted as v[M ], from the
model assignments on a trail M . The interpretation v[M ](t) of some term t is then
either the value of that term under the interpretation, or Undef if the term cannot be
evaluated. In this report, a term is either a constant, a variable or a function that takes
n number of terms, where n > 0. As an example, if we have a trail M that looks like the
following: M = x > 0, x 7! 1, y 7! 0, z = 0 , then the first-order interpretation would
look like this: v[M ] = [x 7! 1, y 7! 0]. v[M ](x) would be equal to 1 and v[M ](z) would
be equal to Undef.
Two functions are defined, valueB and valueT , these are used to evaluate literals

5
using the content in the trail. They are defined as follows:
8
>
<True L2M
valueB (L, M ) = False ¬L 2 M (2.1)
>
:
Undef otherwise
8
>
<True v[M ](L) = True
valueT (L, M ) = False v[M ](L) = False (2.2)
>
:
Undef otherwise
The function valueB interprets the literal based on the decided literals and propagated
literals in the trail, also called the Boolean assignment, while valueT looks at the first-
order interpretation of the literal. Using these two functions, two predicates can be
defined: consistent(M ) and complete(M ).
(
True 8L 2 M , valueT (L, M ) 6= False
consistent(M ) = (2.3)
False otherwise
(
True 8L 2 M , valueT (L, M ) = True
complete(M ) = (2.4)
False otherwise
A trail is consistent when the first-order interpretations are not in conflict with any of
the Boolean assignments, and a trail is complete when each Boolean assignment has a
corresponding first-order interpretation that also evaluates to True. A trail not being
consistent implies that the trail is also infeasible, meaning that there is a set of literals in
M that are not satisfiable according to the first-order interpretation. As an example, the
trail M = x > 0, x 7! 0 is neither consistent nor complete since the model assignment
for x is in conflict with the decided literal. The trail M = x > 0 is consistent but not
complete since there is no model assignment for the variable x; the function valueT will
return Undef for x. The trail M = x > 0, x 7! 1 is both consistent and complete since
the decided literal and the model assignment are not in conflict.
One invariant for mcSAT is that the trail must always be in a consistent state. As
long as the trail is consistent, the value of a literal can be defined using the following
function: (
valueB (L, M ) valueB (L, M ) 6= Undef
value(L, M ) = (2.5)
valueT (L, M ) otherwise
If a trail is consistent, the values of valueB and valueT will not disagree for any of the
literals. The value of a literal can therefore first be evaluated using valueB , and if it
returns Undef then the result of valueT can be used instead.
There are a total of 15 di↵erent rules described in the paper that introduced mcSAT
[1]. The rules will not be formally introduced in this report, but a high-level explanation
of how the calculus works will be provided.

DECIDE RESOLVE T-PROPAGATE


PROPAGATE CONSUME T-DECIDE

CONFLICT BACKJUMP T-CONFLICT


SAT UNSAT T-CONSUME
FORGET LEARN T-BACKJUMP-DECIDE

6
These rules are used for transitioning between di↵erent states by either searching or
conflict resolution. For example, if a search rule is applied that assigns some value to
a symbol, then a new trail element that explains this assignment will be added to the
trail. If a conflict occurs and the procedure needs to perform backtracking, one or more
trail elements will be removed from the trail, starting with the element that was added
last. When conflicts occur, a conflict clause that explains what the problem is will be
added to the problem. In order for the problem to be satisfiable, the conflict clause must
evaluate to True. As an example, if we have made a decision that a variable x should
be assigned the value 1 but we also have a clause in the original problem that says that
x > 1, then x > 1 would be a valid conflict clause. In this case, the conflict might be
solvable by backtracking and undoing the assignment that x is 1.
In order for the procedure to work, a theory must be defined. A theory can define
the constants, predicates and functions that can be used. An example of a theory is
a Boolean theory where the only constants are True and False, the only predicate is
equality and where there are no functions. Another example is a theory where the
constants can be any integer, the predicates are < and >, and the functions are + and
. The theory should also contain a heuristic that is responsible for making decisions
and assigning values to variables.
To ensure that the procedure always terminates, all literals used in the procedure
must come from a finite set of literals called the basis B. This basis should at least contain
all the literals from the example instance, both negated and not. As an example, if we
have a Boolean theory and the instance x = True _ y = True, then the finite basis
B could consist of the literals x = True, ¬(x = True), y = True and ¬(y = True).
Note that in this theory ¬(x = True) is equivalent to x = False. There are no other
variables than x and y in this example, and they cannot have any other values than
True or False. It is also allowed to have literals with equalities between variables, for
example x = y. A basis is mostly a theoretical component of a solver, and it can be
hard to define in practice.
Another important responsibility of the theory is the ability to generate explanations,
which are clauses that can be used to describe theory-specific propagations and conflicts.
All the literals in the explanation must come from the finite basis B, the literals must
all evaluate to False on the current trail, and the explanation must also be a valid
formula. A theory-specific propagation can be made when there is a literal L with an
undefined value in the current trail, and when the trail would be infeasible if ¬L would
be added to the trail. In this case, an explanation clause E should be created to explain
the propagated literal E ! L that is added to the trail.
A theory-specific conflict will happen if the current trail is infeasible, in that case
an explanation that explains why the trail is infeasible should be created and added as
a conflict clause. An example of this could be if a trail contains the literals x > 0 and
x < 0, the explanation could then be ¬(x > 0) _ ¬(x < 0).
The rules that start with a T are theory-specific rules while the other ones are more
general rules that can be used regardless of the underlying theory. In many cases there
can be multiple rules that are applicable and it is not always obvious how these rules
should be prioritized. Since mcSAT is a framework, it is up to the user of the framework
to figure out how to apply these rules in an efficient manner and to find heuristics to
get the best possible performance. It is possible to define new rules as long as they do
not violate the invariants that the trail must be consistent and that none of the initial
clauses of the problem are removed.

7
2.1.1 Example
Consider the following example with the two Boolean variables x and y, and the values
True and False:

C = (x = True _ y = True) ^ (x = True _ y = False)


^ (x = False _ y = True) ^ (x = False _ y = False)

There are no possible assignments to x and y that satisfies all of the clauses. Following
is an example of how mcSAT can be used to find out that the example is unsatisfiable,
where T represents True and F represents False:

8
h , Ci

1. T-DECIDE
h x 7! T , Ci

2. PROPAGATE
h x 7! T, (x = F _ y = T) ! y = T , Ci

3. PROPAGATE
h x 7! T, (x = F _ y = T) ! y = T, (x = F _ y = F) ! y = F , Ci

4. T-CONFLICT
h x 7! T, (x = F _ y = T) ! y = T, (x = F _ y = F) ! y = F , Ci
` ¬(y = T) _ ¬(y = F)

5. RESOLVE
h x 7! T, (x = F _ y = T) ! y = T , Ci ` ¬(y = T) _ x = F

6. RESOLVE
h x 7! T , Ci ` x = F

7. BACKJUMP
h x = F ! x = F , Ci

8. T-DECIDE
h x = F ! x = F, x 7! F , Ci

9. PROPAGATE
h x = F ! x = F, x 7! F, (x = T _ y = T) ! y = T , Ci

10. PROPAGATE
h x = F ! x = F, x 7! F, (x = T _ y = T) ! y = T, (x = T _ y = F) ! y = F , Ci

11. T-CONFLICT
h x = F ! x = F, x 7! F, (x = T _ y = T) ! y = T, (x = T _ y = F) ! y = F , Ci
` ¬(y = T) _ ¬(y = F)

12. RESOLVE
h x = F ! x = F, x 7! F, (x = T _ y = T) ! y = T , Ci ` ¬(y = T) _ x = T

13. RESOLVE
h x = F ! x = F, x 7! F , Ci ` x = T

14. BACKJUMP
h x = F ! x = F, x = T ! x = T , Ci

15. T-CONFLICT
h x = F ! x = F, x = T ! x = T , Ci ` ¬(x = F) _ ¬(x = T)

16. RESOLVE
h x = F ! x = F , Ci ` ¬(x = F)

17. RESOLVE
h , Ci ` F

18. UNSAT

9
We must start by making a decision about either of the two variables, in this case we
can assign the value True to x by using the rule T-DECIDE. This rule takes any of the
variables in the set of clauses that does not currently have a value, and assigns a value
to it so that the trail remains consistent. This will add the model assignment x 7! True
to the trail.
After this we can apply the rule PROPAGATE which will perform Boolean constraint
propagation. In this case, we have the clause x = False _ y = True where x = False
evaluates to False on the current trail and y = True evaluates to Undef. In order for
this clause to evaluate to True, we can propagate that y = True. This will add the
propagated literal (x = False _ y = True) ! y = True to the trail.
It is now possible to use the same rule again and propagate that (x = False _ y =
False) ! y = False. We then end up with a trail containing two literals saying
that y = True and y = False. This is infeasible which means that we can apply the
next rule which is T-CONFLICT, this rule will produce an explanation of the infeasibility
and set it as the conflict clause. In this case, the explanation produced is the clause
¬(y = True) _ ¬(y = False).
In order to solve the conflict, we must use any of the conflict resolution rules. In
this case we can use the rule RESOLVE which will produce a new conflict clause that is
implied by the current conflict clause and the latest trail element, (x = False _ y =
False) ! y = False. The negation of y = False is in the conflict clause, which means
that the latest trail element can be removed from the trail and ¬(y = False) can be
removed from the conflict clause. However, we must make sure to add x = False to the
conflict clause, since that was a literal in the trail element that was removed.
We can use the same rule again to remove the trail element (x = False_y = True) !
y = True, after this the trail will contain only the model assignment x 7! True and the
conflict clause will simply say that x = False. We must somehow undo the decision
we made in the beginning of the example that x is True, because that will never lead
to the problem being satisfied. We can do this by applying the rule BACKJUMP, which
will exit the conflict resolution state by removing at least one decision from the trail
and replacing it with a propagated literal. The rule can be applied if at least one
decision and all later trail elements are removed from the trail, and all the literals in
the conflict clause evaluate to False on that trail except for one literal which evaluates
to Undef. This means that the conflict clause will imply the literal that evaluated to
Undef. In this case, we will end up with a trail containing only the propagated literal
x = False ! x = False and no conflict clause.
After this, we will repeat the same rules in the same order as before but with slightly
di↵erent values. T-DECIDE will add the model assignment x 7! False, PROPAGATE will
then add two propagated literals to the trail saying that y = True and y = False.
T-CONFLICT will produce the same explanation as before, ¬(y = True) _ ¬(y = False),
and RESOLVE will remove the propagated literals and change the conflict clause to x =
True. BACKJUMP will exit the conflict resolution by removing the model assignment
x 7! False and replacing it with the propagated literal x = True ! x = True.
At this stage, we will have a trail saying that x = False and x = True which,
again, is infeasible. The rule T-CONFLICT is applied which will add the explanation
¬(x = False) _ ¬(x = True) as a conflict clause. We can once again use the rule
RESOLVE twice to remove both of the trail elements, the conflict clause will then simply
be False. This means that there is nothing that can be done to get rid of the conflict
clause. Finally, the rule UNSAT is used to say that the problem is unsatisfiable.
This is not the only way to reach UNSAT for this particular problem. Instead of
applying the rule PROPAGATE at step 3, we could also apply the rules T-DECIDE, CONFLICT
and BACKJUMP, in that order. T-DECIDE could assign the value True to y based on the
previous propagation, but we would then end up with a conflict. We would apply

10
CONFLICT which would add the conflict clause x = False _ y = False. This clause must
be from the set of clauses that are described in the problem, and it must evaluate to
False on the current trail. BACKJUMP can be applied to undo the latest model assignment
and add the propagated literal (x = False _ y = False) ! y = False to the trail. This
would put us in exactly the same state as we were in after step 3 in the first example.
Another option is to start by applying the rule DECIDE which would take any of the
literals in the problem and just add it to the trail.
It is easy to make the mistake of thinking that the literal x = ↵ is exactly the same
thing as the model assignment x 7! ↵. The di↵erence is that literals do not necessarily
say exactly which value a variable must have, while a model assignment is a direct
assignment of some value to a variable. Another way of thinking of it is that x = ↵
implies that “x should be equal to ↵”, while x 7! ↵ says that “x has the value ↵”.

2.2 Rust
2.2.1 Ownership and borrowing
The ownership rules in Rust are rather simple in theory: each value must have exactly
one variable (or place) that works as an owner, and when the owner goes out of scope,
the values that it owns will also go out of scope. These rules ensure that each owner will
automatically clean up all the data that it owns when it goes out of scope. An example
of this that may be confusing to newcomers is the following code snippet:
let s1 = String::from("Hello");
let s2 = s1;
println!("{} world!", s1);

Without knowledge about how Rust works, it is easy to assume that the program will
output “Hello world!”, but instead it does not compile at all. The compiler shows the
error “borrow of moved value: s1” which indicates that the reference held by the variable
s1 is somehow invalid. String-objects are allocated on the heap, they are stored as a
vector of bytes and they are mutable and can change size. The variable s1 has a pointer
to the string on the heap, but when s2 is created, it takes over the ownership of that
string and that invalidates s1. Using Rust’s terminology, we say that s1 will move into
s2. The modified example below will compile and print “Hello world!”, since s2 will not
borrow the value s1 when it is cloned:
let s1 = String::from("Hello");
let s2 = s1.clone();
println!("{} world!", s1);

Another common example of a similar quirk is when a function is called with an owned
object as an argument. If the object is passed directly to the function, then the function
will take ownership of that object. This means that the ownership moves from the caller
to the function, and the caller can no longer use the variable that held the reference
to the object in the first place. Sometimes this is the desired behavior, but it is also
possible to instead pass the reference to the object as an argument to the function. This
will not a↵ect the ownership of the object, and this is the desired behavior if the object
should be passed around to multiple di↵erent functions one after another.
There are three types of semantics in Rust: Copy, Borrow, and Move. Copy means
that a copy of the object is created, Borrow means that the object is borrowed and later
returned, and Move means that the ownership of the object is transferred. A key detail
about the concepts of ownership and borrowing is that they are accomplished through

11
zero-cost abstractions, meaning that there is no penalty when it comes to performance
or memory consumption. All of the analysis is done at compile time, and if the rules
are not followed then the project will simply not build at all.

2.2.2 Traits
Rust does not have inheritance in the same sense as many other languages. In Rust, you
cannot let a struct inherit anything from another struct. This is where another feature
of the language comes in: Traits. A trait is similar to what is often called an interface
in object-oriented languages, it describes some behavior that can then be implemented
by other types. A trait can contain function prototypes, with or without default imple-
mentations, that can then be implemented by all the types that are conforming to this
trait. If a function in a trait does not have a default implementation, then that function
must be implemented by each type that implements this trait. As the name implies,
a trait typically describes what a type can do rather than what it is. Examples of
built-in traits in Rust are Clone which implements cloning functionality and Send which
indicates that a value is safe to send between threads. The compiler can automatically
generate implementations for some of the built-in traits. A key di↵erence between traits
and how interfaces work in many other languages is that it is possible to implement
traits for types that the user did not define.

2.2.3 Dynamic dispatch


An important feature in many programming languages is polymorphism, which allows
the user to define a common interface for several di↵erent data types. Rust allows
dynamic dispatch with something that is called trait objects. Dynamic dispatch is when
a program can dynamically select the implementation of a polymorphic function at run
time. A trait object is essentially an object that points to a concrete implementation
that implements the given trait, along with a table containing the trait methods.
As an example, we can define a trait called Shape that defines a function that returns
the area of the shape. Each object that implements this trait must also implement this
function. We could then have two structs, Square and Rectangle, which both implement
this trait. A trait object of type Shape can then refer to an instance of either of these
two types. This makes it possible to pass around trait objects of type Shape without
having to explicitly say which type of shape it is.
A key detail about trait objects is that the compiler cannot determine the size of
them at compile time, which in turn means that the compiler will put some restrictions
on what the developer is allowed to do with them. For example, a Square can have
just one variable that specifies the length of all of its sides while a Rectangle would
need to have two variables. Because of this, a Rectangle would occupy more space in
memory. Thus if there is a function that takes any trait object of type Shape as an
argument, the compiler cannot determine how large this type is at compile time. It is
very common to wrap trait objects inside a pointer type since pointers will always have
a known size. There are a few di↵erent types of pointers in Rust, a common one is Box
which is used for heap allocations, or Arc which is a thread-safe pointer which provides
shared ownership with automatic reference counting.

2.2.4 Lifetimes
A common issue that can occur in many programming languages are dangling pointers,
which is when you have pointers that do not point at valid data. This can happen
if there are two actors who share the same reference to some data, and one of them
removes that data before the other one tries to access it.

12
Rust is solving this by introducing a concept called lifetimes, which describes for how
long a reference is going to be valid. The compiler can often figure out the lifetimes of
references automatically, this is called lifetime elision, but sometimes the developer has
to explicitly add lifetime annotations. The important thing is that all references have
lifetimes and they are all either implicit or explicit. Below is an example of how to add
an explicit lifetime to a reference inside a struct:

struct Square< a> {


side: & a f64,
}

This is a square that contains a reference to a double precision floating point type which
describes the length of all of its sides. Here we are saying that Square has a lifetime
that we call a, and this lifetime is also used for side. By doing this, we now have a
guarantee that all squares that are still in scope will also contain references that are in
scope, there is no way that a square can outlive its references. Without these explicit
lifetime annotations, the compiler will not know for how long the references in this
example can be valid and the code will not compile.

2.2.5 Additional features


Rust contains both a safe and an unsafe programming language. Safe Rust is considered
the ”true” Rust programming language where, for example, null pointers are not per-
mitted and data races are guaranteed to be absent. Unsafe Rust is just like Safe Rust
except it allows the programmer to do additional things where safety cannot be guar-
anteed. Depending on what the language is used for, this can sometimes be necessary.
For the purpose of this thesis, only Safe Rust will be considered.
Rust also features a package manager called Cargo which enables users to both
download third-party libraries, compile packages and also distribute packages to the
Rust community’s package registry.

2.3 Hash consing


An instance of an SMT-problem can consist of many di↵erent clauses, and it is possible
to have the same literals in several of these clauses. One thing to consider for an
implementation of an SMT solver is that you want to avoid storing the same literal
several times. If there is a literal that exists in several di↵erent clauses, a good solution
could be to store this literal in only one location in the memory and then letting the
clauses have references to that literal.
One way of addressing this issue is to use hash consing, a technique that can be used
to share functional data that are structurally equal [7]. A typical implementation is to
have a global hash table that stores the values that have been allocated, and to look in
this hash table every time a value is to be created. If it is already in the hash table, then
there is no need to store it again. This assures that each value will be stored at most
once. Finding a value in a hash table has an average time complexity of O(1) unless
hash collisions occur.
Hash consing is not always ideal. In order to check whether a literal is already stored
in the hash table, the literal must first be constructed. Hash consing will not prevent the
unnecessary construction of literals, but it can reduce the memory usage. Additionally,
there will always be some overhead from storing these literals and from checking the
hash table every time new data is to be created.

13
If the same literals are used in many di↵erent clauses, it could make sense to use
hash consing. But if every literal is unique, it would be a waste of time to always
have to check the hash table for a duplicate. SMT problems can often have the same
literals in many di↵erent clauses, which means that hash consing can be a useful tool
for minimizing the memory usage for large problem instances.
There is a crate named hashconsing written in Rust which implements the func-
tionality for hash consing values of the same type. A data type HConsed is used as
a wrapper for the actual data and it consists of two things: a unique identifier and a
pointer to the actual data that is allocated on the heap. The library guarantees that
two data values are identical if their unique identifiers match, and this comparison is
done in constant time.
A factory, which is a class that creates objects, is used for constructing the data
that should be hash consed. This is explained further in the Implementation-chapter.
Structs and enums can be hash consed as long as they implement the traits Hash,
Clone, PartialEq and Eq, and those implementations can automatically be derived by
the compiler in many cases.

14
Chapter 3

Existing implementations

In this chapter a few existing implementations of the mcSAT-framework will be listed


and described.

3.1 CVC4
Jovanović, De Moura and Barrett created an implementation of mcSAT for the CVC4-
solver [8]. CVC4 is an open source theorem prover for SMT problems, and mcSAT
was implemented as an independent engine within that solver. The associated paper
describes how the implementation works and shows the experimental results of a few
benchmarks compared to other solvers.
The implementation is written in C++ and is openly available on GitHub. The
main algorithm is described in the paper, the solver is searching for a solution to the
problem while there are still variables that do not yet have an assigned value. If a conflict
occurs, the conflicting clause is analyzed and the algorithm either performs backtracking
if possible, or it returns UNSAT if the conflict clause is empty.
The code for the implementation mostly uses the same terminology as the first paper
about mcSAT. There are trails, clauses, literals and variables, among other things. A
literal contains one variable and a Boolean value indicating whether it is negated or
not, and each clause stores a number of literals in an array. The main loop is located
in solver.cpp, which is the top-module of the implementation. The most important
components of the solver is the clause database, the variable database, the trail and the
reasoning plugins.

3.1.1 Databases
There are two databases in the implementation, clause db and variable db which
stores clauses and variables respectively. The clause database contains both the clauses
from the input and any clauses that are learned during execution. It is possible to fetch
clauses from the database using clause references, or CRefs. A CRef has a reference to
the clause database that it belongs to, and it is possible to compare CRefs with each
other either by value or by reference. CRefs do not make any assumptions on how the
clauses are structured, which is a form of encapsulation that allows other parts of the
solver to use them without having to know exactly how clauses work under the hood.
It is also possible to have multiple clause databases in a so called clause farm. The
solver-class has one clause database for the input clauses only and a clause farm for all
the clauses. The clause database does not handle memory automatically. When adding
new clauses to the clause database, the memory needs to be explicitly allocated with the

15
correct size, and the current size and capacity of the memory must also be known. Both
databases have methods for performing garbage collection which will free up memory
that is not used anymore.

3.1.2 Trail
Instead of having three types of trail elements like in the original paper, this imple-
mentation has four: Boolean decisions, semantic decisions, clausal propagations and
semantic propagations. A Boolean decision is a literal L that is assumed to be true,
while a semantic decision is a non-Boolean variable x that is assigned some value ↵.
A clausal propagation is a literal L that is derived to be true by a clause C through
Boolean constraint propagation, while a semantic propagation marks a literal L that
can be evaluated to true. The solver-class has two assertions where it checks if a trail
element is a semantic decision, apart from that the solver-class does not know anything
about the other three types as they are only used in the trail.
They also introduced the concept of levels for the di↵erent decision elements. The
level of a decision element or a clausal propagation indicates how many decisions that
were made in the trail up to and including the element itself. The level of a semantic
propagations represents the level of the highest semantic decision used for evaluating
the literal L. This level is used for the backtracking functionality, when a conflict occurs
it is simple to remove the elements from the trail that have a level above some decided
threshold. This functionality is used extensively in the solver-class.
The trail has public methods for getting the elements from the trail, the size of the
trail (at a given decision level) and functions for evaluating literals and variables. The
trail also has a function for checking whether it is consistent or not, the result of this
function is used as an assertion in some parts of the code. There is a vector that contains
the propagated literals that are inconsistent, and checking if the trail is consistent is done
by simply checking whether this list is empty. This list is maintained incrementally when
new propagations are added and when decisions are removed, meaning that it is very
efficient to check if the trail is consistent at any time during execution.

3.1.3 Plugins
Plugins are used to extend the functionality of the implementation. In this implemen-
tation there are three plugins available that can reason about Boolean structures, linear
real arithmetic and uninterpreted function symbols. Plugins can for example perform
propagation, detect conflicts or select values for variables.
There are two di↵erent plugin features, CAN PROPAGATE and CAN DECIDE VALUES,
these can be added to plugins to show that they have the functionality for propagating
and making decisions. There are also five notification types that the plugins can reg-
ister for which enables them to handle assertions, restarts, conflicts, conflict resolution
and backjumps. This allows for specialized plugins that only handles a subset of these
features and notifications. The solver will dispatch notifications to the plugins during
execution, and the solver-class does not need to know exactly how the plugins work or
even how many of them there are.

There are short comments in the code that explain what the di↵erent parts are doing
without going into too much detail. Most functions are documented with only one
sentence. Since the implementation is written for CVC4, you need to be able to build the
solver with all of its dependencies which can prove to be tricky. Some of the dependencies
are GNU Compiler Collection (gcc/g++), Boost C++ Libraries and GNU Multiple
Precision Arithmetic Library.

16
3.2 mcBV
Zeljić, Wintersteiger and Rümmer presented an mcSAT calculus for fixed-length bit-
vectors along with an implementation called mcBV written in F# [3]. A fixed-length
bit-vector is an array data structure consisting of a fixed number of bits, a datatype
commonly used in SMT. While the CVC4-implementation of mcSAT is more generalized,
this implementation is made specifically for bit-vectors. The implementation is openly
available on GitHub under the MIT license.
The implementation uses the Z3 parser and preprocessor to prepare the data before it
starts searching for a solution. The main loop resides under Rules, and it loops while the
current state is “not solved”. If there is a conflict then it tries to resolve it, otherwise it
continues searching. Rules do not have any knowledge about how propagation, search or
conflict resolution works, it only knows when to apply them. DecisionRules implements
functions for making Boolean- or theory-decisions, and Rules is the only module that
uses it.
Literals are bit-vectors with defined lengths, and if the length of a bit-vector is zero
then it is treated as a Boolean. The type Literal is just an integer, and a clause is just
a list of literals. There is a function for evaluating whether two clauses are equal to each
other, this is done by first sorting the literals in the clauses and then doing a pairwise
comparison.
State contains many things, such as whether the solver is searching or not, databases
for clauses and variables and functions for pushing and popping trail elements, among
other things. It is tightly coupled with many other parts of the solver.
F# is a functional-first general purpose language that also allows for imperative and
object-oriented code. Since the language has a functional focus, the syntax is a bit spe-
cial compared to the other big object-oriented languages. The language is not as widely
used as for example C/C++ or Java.

There are very few comments in the code that explains what the di↵erent parts are
doing, which can make it difficult to analyze the implementation. Some code snippets
are commented out and there are many todo-annotations in the code that are hard to
interpret. While the implementation in CVC4 is more of a reference implementation,
this can be seen as more of a prototype for bit-vectors specifically with the purpose of
showing that it has good enough performance.

17
Chapter 4

Architecture

The following chapter explains the architecture of the solver. The requirements are
explained along with the approach of designing the solver. This includes explanations
of how clauses and literals can be represented, how the trail works and how a theory
can be implemented, among other things. Alternative design choices are also explored
and discussed.

4.1 Requirements
There are some parts that an mcSAT-based SMT solver must have. There must be a
way to represent clauses and literals, and there must be a way to represent a trail and
all possible trail elements. Without these, it will not be possible to represent di↵erent
states and to transition between them.
In order to implement the di↵erent rules of mcSAT, the design must have some
additional features. There must be a way to evaluate literals and clauses based on the
contents of the trail. It should be possible to check if the trail is consistent, complete,
or infeasible. In order to implement the theory-specific rules, there must also be a
component that describes the theory that is used for the solver. This is a critical
part of the design, since one of the main goals of this project is to make the design
flexible regarding the use of di↵erent theories. There should also be a way to generate
explanations, which is directly tied to the given theory that the solver is using. Another
important aspect of the solver is to use the right data structures, since this can greatly
impact the performance.
The design is supposed to contain everything that is needed to run the mcSAT-
procedure for some given example, but not much else. This is intended to be a reference-
implementation, meaning that it only has the functionality that is discussed by the
authors of mcSAT. A complete SMT solver can contain many additional things, such as
advanced search heuristics or the functionality to parse user input directly.

4.2 Approach
There are di↵erent ways of approaching the design choices that need to be made. For
example, one could do a top-down approach of designing this system where the high-level
components are formulated and implemented first and the low-level details are refined
later. The opposite of this is a bottom-up approach where the low-level details are given
more attention in the early stages and the design builds up from there.

18
A good thing about the bottom-up approach is that it allows the designer to consider
some of the details about the implementation early. If the designer does not have a lot
of previous knowledge about the domain or the language for the implementation, it can
be easy to make assumptions about the language that are wrong. This is especially
important when designing a solver to be written in Rust, since the language has some
constraints that an unexperienced developer might not know about until it is too late.
Some design choices can work well for some languages and worse for others.
Another option is to do a combination of the two and alternate between doing a top-
down and a bottom-up approach. This can be done by building a partially functional
solver and then modify and extend it iteratively to fulfill all the requirements. This is
also known as an agile approach [9]. For this project, the agile approach appears to
be the best option given the fact that there are no known previous designs for SMT
solvers specifically for Rust, and for the fact that the creator of the design and the
implementation does not have any prior experience with the language.
The design and the implementation in this report are tightly intertwined. As soon
as any major design decisions are made, they are also implemented and tested to see if
the design choices are feasible or not. A common saying in software development is that
“it is good to fail early”, meaning that it can be difficult and time-consuming to change
the design of something late in the project and it is therefore a good thing to find out
early if something does not work as expected.
As an example, if we want to use hash consing to prevent having duplicate literals,
we have to make sure that literals can be hash consed. If literals are implemented as
trait objects that can be di↵erent types, then it becomes hard to use hash consing for
those objects in Rust. That might not be an issue at all in some other languages, so
it is important to take into consideration that this design is specifically for Rust. This
particular example will be explained in further detail later in the report.

19
20
Figure 4.1: A diagram showing the architecture.
4.3 Expressions
We can again use the following example to demonstrate how the solver is designed:

C = (x = True _ y = True) ^ (x = True _ y = False)


^ (x = False _ y = True) ^ (x = False _ y = False)

We have two di↵erent types of terms in this example: we have the two variables x and
y, and also the constants True and False. Constants and variables should be able to
have the same types of values, the di↵erence is that a constant will always have the same
value while a variable can change value, or not have a value at all. We need a common
way to represent these two types of terms and the values that they can have. At the
very bottom of the design we have an enum called Value together with two structs called
Constant and Variable.

4.3.1 Value
A value can be part of a constant or assigned to a variable, it is also the result of
evaluating a function which is something that will be explained later in the report. In
this design a value can be either True, False or a 32-bit signed integer. In theory
there are many di↵erent types that can be used as values, but these will suffice for the
prototype.

4.3.2 Constant
A constant is a type of term that has exactly one value. It will always have this value
and it can never be changed, as the name implies. It also has an evaluation-function
that just returns the value of the constant.

4.3.3 Variable
A variable is a term that has an identifier stored as a String. The identifier can be
thought of as the name of the variable, for example x1. As opposed to constants,

21
variables do not have a value directly associated with them, and they do not have an
evaluation-function. The value of a variable is instead stored in a hash table in the
model, which is a part of the solver that is explained later in the report. This model
can then be used for evaluating any of the variables in the problem instance.

We want a common way to represent all types of terms, no matter if they are constants
or variables. We also want a way to represent and apply di↵erent functions to these
terms. This is done by introducing two enums called Term and Function.

4.3.4 Term
A term is an object that is either a constant, a variable or a function application. Terms
have an evaluation-function and the result will always be a value if the term can be
evaluated, or None if it cannot be evaluated yet. If a term is a function, then it should
have a list of other terms that the function should be applied to.
For clarification, in the figure where it says Constant(Constant), the first Constant
is the name of the enum member while the second Constant refers to the type Constant
that was introduced earlier. It is not necessary for the name of the member to be the
same as the type it contains.
The importance of having a common object for both constants and variables becomes
more apparent as we move on to the implementation details later. It also makes it
possible to use hash consing for all the terms and thereby save memory.

4.3.5 Function
In this design, a function is a symbol capable of producing an output depending on
the operator and the values of the terms in the function. It has an evaluation-function
which takes a list of values, performs the operation on these values and then returns
the result as a new value. In this architecture only two operators have been considered,
Plus and Minus. Worth noting is that right now these two functions only make sense
if they are applied to values of type integer, since you cannot add or subtract Boolean
values. When adding new theories to the implementation, new function cases can be
introduced to the Function-enumeration, along with an implementation of how they
should be evaluated.

22
As an example, a problem instance could have a clause saying that x + 5 = 10. This
is where functions come in, in order to get the value of x + 5 we will have to evaluate the
function Plus with the variable x and the constant 5. This becomes a simple task since
we already have ways to evaluate constants and variables. At this moment there are
no restrictions on how many operands we can have, it is possible to construct a Plus-
function with more than 2 operands. Although this makes it easier to construct longer
functions, it also technically allows for Minus-functions with more than 2 operands which
can be hard to interpret since the order of the operands matter.

We now have a common way to represent values and also apply functions to them.
Another key ingredient that we need is the ability to have predicates which allows us
to formulate comparisons between terms. This is done by introducing two more enums:
Formula and Predicate.

4.3.6 Formula
While terms are symbolic expressions that evaluate to objects, formulas define how these
terms relate to each other. Formulas can also be evaluated, the result will be either true
or false if the formula can be evaluated, or None if it cannot be evaluated yet. In this
design, there are three di↵erent types of formulas: true, false and predicate.
A true-formula will always evaluate to true while a false-formula will always evaluate
to false. Having true and false as formulas like this is not strictly necessary, since it is
possible to construct formulas that are always true or false even without them. For
example, the formula 1 = 1 which compares two identical constants can always be
evaluated and it will always be true, while 1 = 0 will always be false. Sometimes it
is necessary to construct clauses that are just true or false, for example when dealing
with conflict clauses. For this reason, having these simple formulas representing true
and false can make it easier to build the prototype of this solver. It is possible to use
hash consing for formulas as well as for terms.

4.3.7 Predicate
A predicate is a symbol that can be either True or False, depending on its arguments. In
this architecture there are five predicates defined: <, , >, and =. In order to evaluate

23
whether a predicate is true or not, all the terms that appear in the predicate must be
evaluated first. If there are any terms that cannot be evaluated yet, the predicate will
evaluate to None.
An alternative design would be to allow the evaluation of predicates even if all the
terms are not known. For example, if there is a predicate 1 + x x > 0 where x is
an unknown integer, then the predicate will always be true since we can just simplify
the expression into 1 > 0. However, this approach would be more complicated to im-
plement and it is not certain that it would yield better performance. As an example,
the predicate 1 + x x > 0 can be created by letting the left side of the inequality be
a Plus-function where the first operand is 1 and the second operand is itself a Minus-
function with the variable x as its two operands: Plus(1, Minus(x, x)). The right side
of the predicate is then the constant term 0.

The design that we have so far is fairly versatile in what it allows us to construct. We can
now construct the formulas that we need for our example, like x = True or y = False,
but we still do not have a way to represent disjunctions and whole clauses. There are
two more things we need before we can move on to how the solver and the trail should
work: two structs called Literal and Clause.

4.3.8 Literal
A literal is two things: a formula and a Boolean value which indicates whether the
formula is negated or not. Literals have two methods, one which takes a literal and
negates it by flipping the negation value, and one which will evaluate the literal using
the model. The evaluation will result in an optional Boolean, meaning that it will return
either True if the literal evaluates to true, False if it evaluates to false, or None if it
cannot be evaluated yet.
An important thing to consider here is that there are now multiple ways to represent
the same mathematical expression. As an example, the expressions x < 2 and ¬(x 2)
are equivalent even though they use di↵erent predicates. This can be a problem in some
cases. As an example, if there is a literal ¬(x 2) in the trail then that cannot be used
directly to assert if x < 2. The reason for this is that the function that interprets the
literal x < 2, valueB , will only check if x < 2 or ¬(x < 2) are in the trail or not. The
rule called T-PROPAGATE can be used for some of these cases.

4.3.9 Clause
A clause is a non-empty disjunctive list of literals. These literals essentially make up
the clause that they are contained in, and if any of the literals evaluate to True then the

24
whole clause evaluates to True. An example would be the clause x > 0 _ y > x which
contains the two literals x > 0 and y > x. In this case, the clause would be satisfied
if x is assigned a values that is greater than 0, no matter what value y might have.
It is common to represent clauses as disjunctive lists in SMT solvers, as we saw when
investigating existing implementations of mcSAT [8, 3]. Clauses also have a function
that returns a list of all of its literals. This enables other components to access the
literals of any given clause. A possible addition to this architecture is to allow the
existence of clauses without any literals. Those would be considered empty clauses and
they would always evaluate to False, which could be useful when constructing conflict
clauses. Implementing this should be trivial since the evaluation-function should just
return False if there are no literals.

4.4 Alternative architectures


4.4.1 Expression enum
An alternative design that was explored was to let each clause be represented as a tree-
structure instead of just a list of literals. With this design, each clause would be an
Expression, where each Expression is in turn either a Term or a Formula. Below is an
example of what this architecture would look like.

Logical formulas can be constructed as recursive trees where the nodes represent the
operators and operands in the expression, also known as S-expressions. A child node
in this tree can itself be another expression, which means that this structure is very
versatile and it is possible to construct arbitrary expressions. With this architecture,
the clause x > 0 _ y > x would be represented the following way:

Disjunction(GreaterThan(x, 0)), GreaterThan(y, x)))

It is possible to represent formulas that are mixing both conjunctions and disjunctions,
which makes it more of a high-level design than the one that was described previously.

25
A downside with this architecture is that it can become expensive to iterate through
the “literals” in the expression. If the literals are listed in an array then we simply
need to iterate through that list to access any of the literals, but if each clause is one
expression then it could be computationally expensive to recurse down the tree to access
each literal. As an example, Boolean constraint propagation becomes fairly simple to
implement and to reason about if there is only a list of literals that need to be evaluated.
For SMT solvers in general, it is common to work with clauses that are disjunctive lists
of literals since it is often easier to work in this more restricted setting.

4.4.2 Trait objects


A slightly di↵erent design that was also explored was to let expressions, terms and
formulas be traits rather than enums or algebraic data types. A trait object of type
Expression would then be treated as something that is either a formula or a term, and
the other parts of the implementation do not need to know exactly how they are im-
plemented. This is essentially an object-oriented design that is more dynamic than our
current approach which is more static. Adding a new term or formula would be as easy
as defining a new struct and then implement these traits for that struct, and through
dynamic dispatch the solver would automatically evaluate them correctly. The under-
lying representation of expressions is abstracted and other parts of the implementation
do not need to know exactly how they are implemented.
Although this architecture sounds tempting, there are some difficulties in practice
with an approach like this. In Rust there is a trait called Sized which is implemented
by the types that have a known size at compile time, but this will not work for trait
objects since they are dynamically sized types. As an example, a formula for representing
equality should somehow contain the two expressions that should be checked for equality.
But if it owns these expressions as trait objects, the compiler will show an error about
how the size of these objects cannot be determined at compile time, and this will result
in the formula itself having an unknown size which can cause other problems in other
parts of the architecture.
This can be circumvented by letting the formula own pointers to the trait objects
instead of the trait objects directly. This will not interfere with the Sized-constraint
on predicates and formulas since pointers always have a known size in Rust. Since the
compiler is strict about making sure that no resources are deallocated prematurely, the
developer can be forced to explicitly specify the lifetimes of these trait objects which
will increase the complexity of the code and make it a bit less readable.
Sized is not the only trait that formulas should implement. Another trait that be-
comes problematic is Clone, which enables cloning of objects. Cloning is used extensively
in the prototype, as we will see later in the Implementation-chapter. This trait is not
compatible with trait objects since it requires that the object that should be cloned
should have a known size. It is only possible to make so called object safe traits into
trait objects, and a trait is object safe as long as the return type is not Self and there
are no generic type parameters. The Clone-trait has a function clone(&self) which
must return the type Self, meaning that it is not possible to implement this for trait
objects.
As we will see later in the report, we also want the ability to store terms in hash
tables. This is not possible if terms are trait objects, since objects need to implement
the traits Eq and Hash to allow them to be stored in hash tables. Implementing Eq and
Hash will enable us to check the equality between these objects and for hashing them
which is essential if we want to store them in hash tables. Storing a pointer to a trait
object is not going to work either, since the pointer can not directly implement these
two traits if the contained type does not have them.

26
Another problem with this approach is that it does not seem to be well suited for
hash consing, since we must know the structure of the data to check for structural
equivalence. As we will see in the next chapter we will use a library called hashconsing
to perform hash consing, but there is a requirement that the data must have a size that
is known at compile time. It is technically possible to hash cons references since they
have a known size, but it does not make much sense in practice. The process of hash
consing would then involve constructing the trait object and then searching the hash
table for the reference to the object that was just created. Even if a similar object exists,
the references will not match and the newly created object will be put in the hash table
even though a similar object might already exist.

4.5 Search
Looking back at our example, we can now represent the full problem using this archi-
tecture. The problem can be represented using four clauses where each clause has two
literals. These literals are formulas with the predicate Equal with the variable-terms
x or y and the constant-terms True or False. It is also simple to evaluate literals. In
order to evaluate a literal we must first evaluate its formula, but before that we need to
evaluate all the terms in the formula, and so on. At the bottom of this chain we have
constants and variables, and their values will propagate upwards so that we can finally
get the value of the literal. However, in order to evaluate variables we also need a model
where we can look up the variable to see if it has a value assigned to it. A trail is also
necessary to keep track of the decisions and propagations that have been made. Next,
we will introduce the parts that are necessary for making assignments to variables along
with the trail and the di↵erent types of trail elements.

27
4.5.1 Trail
The trail keeps track of the decisions and propagations that have been made, and o↵ers
several functions that can modify the trail. It has a list of trail elements along with
functions for adding and removing elements. The trail typically works like a stack,
where new trail elements are pushed and trail elements that should be removed are
popped. There are several data structures that can be used to implement a trail, a good
choice would be a data structure that can do these operations efficiently.
There is a model associated with the trail. The trail and the model must always be
synchronised. As an example, if a model assignment is pushed onto the trail, the same
assignment must also be saved in the model. If a model assignment is removed from the
trail, it must also be removed from the model. In order to make sure that this always
happens, we can create separate functions for adding and removing trail elements to the
trail. Each one of these functions can then make sure that the model is updated at the
same time. As long as these functions are implemented correctly and there are no other
ways to modify the trail, the model will always be synchronised with the trail.
The trail also has the ability to evaluate literals and clauses, either from the trail
elements that are currently in the trail or from the model. The reason for having both a
trail and a model at the same time is that the trail is essentially a timeline based on the

28
decisions and propagations that have been made and it can be used for backtracking,
while the model is just a representation of the current values of the di↵erent terms with
fast access.
Finally, there are functions for checking whether the trail is consistent, complete or
satisfied. These three functions can be implemented by using the value-functions to
evaluate the literals on the trail or the clauses found in the original problem.

4.5.2 TrailElement
A TrailElement is one of three things: a decided literal, a propagated literal, or a model
assignment. Each one of them can have data attached to describe what the element
represents. Decided literals store a literal, propagated literals store the clause that
implies the literal along with the literal itself, and model assignments store a term and
its value. With these three types of trail elements we can represent any decisions or
propagations.

4.5.3 Model
The model is a representation of the first-order interpretation that is used for evaluating
terms. It has a hash table that maps terms to values, and whenever the solver makes a
decision about a value for some variable, this decision is saved in the hash table. This
is a partial model since it will be modified as the algorithm is running, and it will only
be complete if the algorithm can find a satisfiable solution. Whenever a variable needs
to be evaluated, we just check if that variable has a value assigned in this table. It is
important to use a data structure that allows for fast access, as well as inserting and
updating values efficiently. There are three functions defined for the model that allows
us to set, get and clear the value for any of the terms.

4.6 Theory & solver


If we go back to our example, it is now possible to not only create the clauses, but also
to construct a trail and add or remove all the trail elements that are needed. With the
addition of a model we can now also evaluate any clause or literal. As we can recall
from the Background chapter, our first step to solve the example is to make a decision
on some value for either of the terms x and y. Since this decision is theory-specific, we
need to define how a theory can be incorporated into this design.
This can be done by introducing a trait called Theory, and a struct that implements
this theory. As an example, we have chosen to implement a theory that only uses Boolean
variables and it is called BooleanTheory. In order to tie all of these components together,
we will also introduce a struct called Solver together with an enum called State.

29
4.6.1 Theory
In a theory, we should be able to define the following things:
The predicates, functions and constants that belongs to the theory.
Theory propagations.
Making theory decisions on values.
Detecting theory conflicts.
T-CONSUME.
T-BACKJUMP.
Producing explanation-clauses for conflicts.
Since mcSAT has these five rules that are theory-specific, it makes sense to let each
theory decide on how these rules should be implemented. A theory does not necessarily
need to implement all of them in order to function, but there should be a common
interface that all theory-implementations follow. This is done by letting Theory be a
trait that defines these functions.
The predicates, functions and constants from the theory can be used to construct
the instance that the solver should solve. Note that we do not define a common way
to get these components, this is something that the concrete implementation needs to
provide.

30
4.6.2 BooleanTheory
BooleanTheory is the concrete implementation of Theory and it defines how the func-
tions should be implemented for a Boolean theory. Here we can define helper-functions
that construct and return the predicates, functions and constants of the theory. Since
there is no common interface for this, it allows the developer to create any type of helper-
function. As an example, there might be a theory where a certain type of complicated
predicate is used extensively, we can then define a helper-function that constructs and
returns this particular predicate. We do not have to do multiple function-calls to the
theory and then “stitch” together the resulting components to create the predicate.
An alternative design would be to define functions for the Theory-trait that returns all
the predicates, functions and constants for any given theory. It would still be possible
to use any helper-functions from the implemented theory.

4.6.3 Solver
The solver is the main component that is responsible for running the algorithm, it is
essentially the glue that holds all the di↵erent components together to form a working
solver. It contains a number of things:
A list of clauses that should be satisfied.

A list of terms that do not yet have a value assigned.


A timeout that specifies how long the solver is allowed to run before terminating.
The current state of the solver.

A trail.
The underlying theory for the problem that should be solved.
A function that runs the main loop of the algorithm.
Several functions for the di↵erent rules of mcSAT that are not theory-specific.

The heart of the solver is the main loop which resides in the function run(). This loop
needs to somehow figure out which rules that can be applied, select one of them, and
apply it before moving on to the next iteration. The details on exactly how this function
is implemented will be explained further in the implementation chapter. The timeout-
functionality can be implemented in a number of ways, a simple way would be to check
the time when the solver starts and then check the time again after each iteration. If
the time delta is above some threshold then the solver should terminate.

4.6.4 State
In order to keep track of the state that the solver is currently in, we introduce an
enumeration called State. There are four states that the solver can have:
Search - there are no conflicts and the solver is searching for a solution.
Conflict - there is a conflict that must be resolved.

Sat - the problem has been satisfied.


Unsat - the problem cannot be satisfied.

31
The conflict-state also has a conflict clause attached to it, because if there is a conflict
then there has to be a clause that explains it. Since Rust has support for attaching
data directly to enum variants, it makes sense to have the conflict clause be a part of
the conflict-state only. An alternative would be to have an optional clause in the solver
which would be None when there is no conflict.

All of the important parts of the architecture have now been introduced and motivated,
along with some alternative designs. The next step is to implement a prototype and
evaluate how well the chosen architecture works together with Rust.

32
Chapter 5

Implementation

In the following chapter we will go through some of the details of the implementation,
including which dependencies that are used, how the trail is implemented, how clauses
can be constructed, how hash consing works and how a solver can be instantiated. The
implementation can be found at the following url: https://github.com/DennisOrn/
mcsat-rust (version “v0.1.0”).

5.1 Dependencies
The implementation requires four libraries to run:
colored - adds the ability to color the output to the terminal.

hashconsing - adds support for hash consing.


log - provides a lightweight logging API (without implementation).
env logger - a simple and minimal implementation of a logger.
Neither of these four dependencies are essential for the architecture, the hash consing
library is used for convenience purposes and the other three are mainly for debugging.
Worth noting is that three of these dependencies have other dependencies themselves, it
is a common practice for packages to depend on other packages and the dependency-tree
can become very big if we are not careful when choosing our dependencies. Having too
many dependencies can increase the time it takes to build the project, although this
e↵ect would still be diminished by the fact that the Rust compiler uses incremental
building, which prevents the compilation of packages that are already compiled and
unmodified. There was a problem with older versions of the Rust compiler that could
trigger miscompilations in incremental builds, but since version 1.54.0 this is not believed
to be an issue anymore [10].
Apart from these four dependencies, we are also using a development dependency
library called criterion which is a microbenchmarking framework used for measuring
performance improvements or regressions. This can be used to check how long it takes
to run specific function or to solve specific instances, and the library will generate
graphs that displays the average time it took to run the function, among other things.
Development dependencies are not included in the binary when building the prototype
normally, they are only used when running tests or benchmarks.

33
5.2 Expressions
There are some traits that values must conform to: it should be possible to check if two
values are equal to each other, it should be possible to compute the hash of any value
and it should be possible to order values of the same type. It is possible to do operator
overloading in Rust which means that we can implement functionality for adding and
subtracting Value-objects directly without having to access the value member of the
objects. Since a value can have di↵erent types, we need to make sure that we only apply
functions between two values of the same type. Anything else will result in the program
terminating with an error message.

impl Add<Value> for Value {


type Output = Value;
fn add(self, other: Value) -> Value {
match (self, other) {
(Value::Integer(lhs), Value::Integer(rhs)) => {
Value::Integer(lhs + rhs)
}
_ => panic!("Cannot add {} and {}", self, other),
}
}
}

This makes it very simple to apply functions with values. Below is the code for evaluating
functions, it takes a list of values that will act as the arguments for the function. There
is currently an assertion that there should be exactly two arguments for each function,
this is a temporary addition to the code used for debugging purposes only and it can be
removed since there is no set limit on how many arguments a function can take.

impl Function {
pub fn evaluate(&self, args: &Vec<Value>) -> Value {
assert!(args.len() == 2);
match self {
Function::Plus => args[0] + args[1],
Function::Minus => args[0] - args[1],
}
}
}

The function evaluate can be found in many parts of the architecture that makes up
the expressions. It often works similarly to how it works for functions, where we perform
pattern matching on self to figure out how to evaluate correctly. The evaluation of
literals is a bit di↵erent:

pub fn evaluate(&self, model: &Model) -> Option<bool> {


self.formula
.evaluate(model)
.map(|b| if self.is_negated { !b } else { b })
}

In order to evaluate a literal, we must first evaluate the formula that the literal contains.
Since each literal has a flag which indicates whether it is negated or not, we also need
to take that into consideration before returning the result of evaluating the formula.

34
5.3 Hash consing
The library that provides functionality for hash consing has a macro called consign
which can be used to create a factory. This factory is then used to construct objects
that are hash consed.

consign! {
let FACTORY = consign(37) for Term;
}

The number 37 is the initial capacity of this factory. In the documentation for the library
the authors are claiming that this library provides maximal sharing, meaning that any
two equal values are shared. The is under the assumption that only one factory is
created for a given type. In this implementation there are two factories however, one
for formulas and one for terms. Since formulas can contain terms we could potentially
end up in a situation where some terms are stored in both factories at the same time,
using up twice as much memory than what is necessary and preventing maximal sharing.
Some tweaking can be done to prevent this, but we decided to leave this for now since it
does not break anything in the implementation. At worst, some terms are stored twice
but the solver is still functional.
Helper functions were added to simplify the creation of formulas and terms, below
is the helper function for creating an addition-formula for two terms:

pub fn plus(lhs: HConsed<Term>, rhs: HConsed<Term>)


-> HConsed<Term> {
FACTORY.mk(Term::Function(Function::Plus, vec![lhs, rhs]))
}

In most of the implementation we will refer to terms as HConsed<Term>, which is a


reference to a term that is hash consed. It is also possible to define a type alias to make
this less verbose. Removing the hash consing-functionality from the project should be
possible by simply removing the HConsed-wrappers for all terms and formulas.

5.4 Search
In this prototype the trail elements are implemented the following way:

pub enum TrailElement {


DecidedLiteral(Literal),
PropagatedLiteral(Clause, Literal),
ModelAssignment(HConsed<Term>, Value),
}

This means that each enumeration needs ownership of its data. We might be able to
use references instead, for example DecidedLiteral(&Literal), but this requires us to
explicitly specify the lifetime of that reference and for TrailElement. The downside of
not using references here is that each time we are creating a new trail element, we also
need to clone the literal, clause or term that should be stored in it. If we do not do that,
the value will be moved into the trail element which will invalidate the previous owner
of the value. The function below is how model assignments are pushed to the trail:

pub fn push_model_assignment(&mut self, variable: HConsed<Term>,


value: Value) {

35
let element = TrailElement::ModelAssignment(
variable.clone(), value
);
self.elements.push(element);
self.model.set_value(variable.get().clone(), value);
}

There are three functions for pushing trail elements, one for each type of element. The
model assignment also updates the model by inserting the variable and its new value in
the models hash table. If this model assignment is later removed, we also need to make
sure that this value is cleared from the model.
A central part of the trail is evaluating the values of literals, this is done by imple-
menting the functions that were described in the Background-chapter: valueB (L, M ),
valueT (L, M ) and value(L, M ).

pub fn value_literal(&self, literal: &Literal) -> Option<bool> {


let value_b = self.value_b(literal);
if value_b.is_none() {
return self.value_t(literal);
} else {
return value_b;
}
}

fn value_b(&self, literal: &Literal) -> Option<bool> {


let literals = self.all_literals();
let negated_literal = literal.negate();
for l in literals {
if l == literal {
return Some(true);
} else if l == &negated_literal {
return Some(false);
}
}
None
}

fn value_t(&self, literal: &Literal) -> Option<bool> {


literal.evaluate(&self.model)
}

The function value t evaluate the literal using the current model and return the result.
value b will check if the literal or its negation is already in the trail. Lastly, value
will return the result of value b unless it is None, in that case it will return the result
of value t. This is the only one of these three functions that is exposed publicly. A
problem that has already been mentioned is the fact that there can be multiple ways
to represent the same logical expression. This function in its current implementation
cannot see that the two literals x = True and ¬(x = False) mean the same thing.

36
5.5 Theory & solver
There is only one trait in the project and that is Theory. Only two functions are
currently implemented by BooleanTheory: decide and conflict.

pub trait Theory {


fn decide(&self, variable: &HConsed<Term>,
trail: &Trail) -> Option<Value>;
fn conflict(&self, variables: &VecDeque<HConsed<Term>>,
trail: &Trail) -> Option<Clause>;
...
}

The implementation of decide should have the variable whose value should be decided
as a parameter, along with the current trail. Using these two parameters, it should
be possible to make a decision for the variable that results in a consistent trail. The
implementation of conflict can be found below:

fn conflict(&self, variables: &VecDeque<HConsed<Term>>,


trail: &Trail) -> Option<Clause> {
for variable in variables {
let literal_true = Literal::new(
equal(variable.clone(), constant(Value::True)));
let literal_false = Literal::new(
equal(variable.clone(), constant(Value::False)));
let literal_true_value = trail.value_literal(
&literal_true);
let literal_false_value = trail.value_literal(
&literal_false);
match (literal_true_value, literal_false_value) {
(Some(true), Some(true)) => {
let explanation = Clause::new(vec![
literal_true.negate(),
literal_false.negate()
]);
return Some(explanation);
}
_ => (),
}
}
None
}

The main idea behind this implementation is that we want to check if any of the variables
can be evaluated to both True and False at the same time. One way to do this is to
construct two literals, one where the current variable is equal to True and one where it
is equal to False, and then evaluate both of those literals using the trail. If they both
evaluate to True, then the trail is inconsistent and we need to construct a conflict clause.
In this boolean theory we simply take the two literals, negate them and put them in
the same clause that we then return. The implementation of this function could be far
more advanced depending on which theory it should work on, but it seems to work well
enough for this theory.

37
5.5.1 Solver
The solver is implemented as a struct that contains all the data necessary to solve the
problem instance.
pub struct Solver {
theory: Box<dyn Theory>,
state: State,
trail: Trail,
clauses: Vec<Clause>,
variables: VecDeque<HConsed<Term>>,
undecided: VecDeque<HConsed<Term>>,
}

The constructor of the solver requires a theory, the clauses that should be satisfied and
the variables that needs to have a value assigned to them. It should be possible to extract
the variables from the clauses and therefore eliminate the need for passing the variables
to the constructor, but in this prototype they are passed manually to the constructor for
simplicity. The theory is stored as a reference to a trait object of type Theory, meaning
that we can pass along any theory as long as it implements this trait. The theory does
not need to implement any other traits, like Clone or Hash, which means that we do
not have to worry about the restrictions that prevented us from using trait objects to
represent expressions. A solver can be constructed the following way:
pub fn example_solver_unsat_1() -> Solver {
let t = BooleanTheory::new();
let clause1 = Clause::new(vec![
Literal::new(t.Eq(t.Var("x"), t.True())),
Literal::new(t.Eq(t.Var("y"), t.True())),
]);
let clause2 = Clause::new(vec![
Literal::new(t.Eq(t.Var("x"), t.True())),
Literal::new(t.Eq(t.Var("y"), t.False())),
]);
let clause3 = Clause::new(vec![
Literal::new(t.Eq(t.Var("x"), t.False())),
Literal::new(t.Eq(t.Var("y"), t.True())),
]);
let clause4 = Clause::new(vec![
Literal::new(t.Eq(t.Var("x"), t.False())),
Literal::new(t.Eq(t.Var("y"), t.False())),
]);

let clauses = vec![clause1, clause2, clause3, clause4];


let undecided = VecDeque::from(vec![t.Var("x"), t.Var("y")]);
Solver::new(Box::new(t), clauses, undecided)
}

This will construct the unsatisfiable example that was introduced earlier in this report.
Note that we have defined helper functions as a part of the theory to construct the
literals that we need for the four clauses. When doing this, the part of the prototype
that is constructing the solver does not need to have access to or know anything about
the underlying structure of the expressions, it can just use whatever functions that the
theory is exposing publicly.

38
After the solver-object is created, we can just call the function run which contains
the main loop. In each iteration of the main loop, the solver should select an available
rule and apply it before going to the next iteration. If either of the rules SAT or UNSAT
are applied, the solver should break out of the loop and return the result. There are a
few issues that can be identified here, namely how to represent a rule, how to check the
availability of the di↵erent rules and how to apply them. It is not desirable to have just
one function that contains all the logic for the main loop along with the rules and how
they are applied. It is usually difficult to modify functions that are very long, especially
if there are a lot of conditional statements and indentations.
In this prototype, rules are represented using an enum called Rule. Since it is possible
to define enums so that data can be attached directly to each enum variant, it becomes
easy to let each rule have the associated data that it needs in order to work. For example,
the rule PROPAGATE can store the clause and the literal that should be propagated, and
the rule CONFLICT can store the associated conflict clause.
When each rule has these details associated with it, it is easy to apply the rules.
The function apply takes a reference to a Rule and uses pattern matching to decide
what should be done. For example, if the rule PROPAGATE is applied then it just calls the
function push propagated literal on the trail with the data that is attached to the
rule. If the rule CONFLICT is applied, it just changes the solvers state to indicate that
there is a conflict and it attaches the conflict clause to that state. The function takes a
mutable reference to self in order to allow modifications to the members of the solver.

fn apply(&mut self, rule: &Rule) {


match rule {
Rule::Propagate(clause, literal) => {
self.trail.push_propagated_literal(clause, literal);
}
Rule::Conflict(clause) => {
self.state = State::Conflict(clause.clone());
}
...

The available rules are fetched from a function called get available rules, this func-
tion checks which rules that are applicable in the current state and with the current trail.
The function takes an immutable reference to self which means that it does not have
any side e↵ects. The rules that are currently implemented are PROPAGATE, CONFLICT,
SAT, UNSAT, RESOLVE, BACKJUMP, T-DECIDE and T-CONFLICT.
This structure makes it easy to implement the main loop. The loop simply checks
which rules that are available, picks the first one and applies it until the state becomes
either SAT or UNSAT.

pub fn run(&mut self) -> bool {


let mut iteration_counter = 1;
loop {
assert!(self.trail.is_consistent());
match self.state {
State::Sat => {
assert!(self.trail.is_complete());
return true;
}
State::Unsat => return false,
_ => (),
}

39
debug!("ITERATION {}", iteration_counter);

let rules = self.get_available_rules();


let rule = rules.first().unwrap();
self.apply(&rule);

debug!("{}", self);
iteration_counter += 1;
}
}

Since all of the rules are not yet implemented, we cannot make a guarantee that this
prototype will always terminate correctly for all instances. Another thing to consider
regarding termination is that there should ideally be a finite basis of literals, this is not
going to be an issue with a boolean theory where there are only two possible values that
we can assign to variables. Exactly how to implement a finite basis is not something
that has been explored in this project.

40
Chapter 6

Evaluation

The following chapter contains a simple evaluation of how fast the solver is capable
of running a few small examples, along with the overall impression of Rust and its
ecosystem. Possible improvements to the architecture and implementation are discussed.
This project was compiled using Rust version 1.55.0. The computer used for running
the benchmark is a 13-inch MacBook Pro from 2017 with an Intel i5-7267U processor
running macOS Big Sur 11.6. The specific versions of the dependencies used in the
implementations are specified in the file Cargo.toml. The tested version of Z3 is 4.8.12
(64 bit).

6.1 Performance and correctness


A few di↵erent example instances have been tested for the prototype. The first one will
be referred to as unsat-1 and it is the same exact example that was introduced earlier
in the report:

C = (x = True _ y = True) ^ (x = True _ y = False)


^ (x = False _ y = True) ^ (x = False _ y = False)

The second example is a modified version of the previous example:

C = (x = True _ y = True) ^ (x = False _ y = False)

This is satisfiable and it will be referred to as sat-1. The third example is based on the
pigeonhole principle which states that if n objects are to be placed into m containers
and n > m, then at least one container will contain more than one object. This can be
encoded into a SAT problem, where each container can only hold one object. This will
lead to the problem being unsatisfiable if we have for example two objects and only one
container. This example will be referred to as pigeonhole-1 and it can be encoded the
following way:

C = (x = True) ^ (y = True) ^ (x = False _ y = False)

Using the library criterion, the execution time for these three examples can be ob-
tained.

41
0.07

0.06

0.05
Density (a.u.)

0.04

0.03

0.02

0.01

160 180 200 220 240

Average Time (us)


Figure 6.1: The average time it takes for the solver to reach UNSAT for the example
unsat-1. The average execution time over 100 samples is 168.98 s, requiring 18 itera-
tions.

0.2
0.18
0.16
0.14
Density (a.u.)

0.12
0.1
0.08
0.06
0.04
0.02

30 35 40 45 50

Average Time (us)


Figure 6.2: The average time it takes for the solver to reach SAT for the example sat-1.
The average execution time over 100 samples is 36.20 s, requiring 4 iterations.

42
0.14
Density (a.u.) 0.12

0.1

0.08

0.06

0.04

0.02

100 110 120 130 140

Average Time (us)


Figure 6.3: The average time it takes for the solver to reach UNSAT for the example
pigeonhole-1. The average execution time over 100 samples is 108.58 s, requiring 12
iterations.

These graphs were generated by running the same code many times. The way criterion
works is that when a benchmark is performed, it runs the same code over and over for
3 seconds by default to “warm up” before the real measurements are taken. This is a
common thing when doing microbenchmarking since the first run is often the slowest
due to cache misses and page faults.
Running the compiled binary directly produces di↵erent results. There are two
di↵erent types of builds, debug and release. Debug builds will in this prototype print
information about what the solver is doing at each iteration, while the release builds will
only print the result when it finishes. The release build is also more optimized than the
debug build by default. As we can see in the table below, the first run is significantly
slower than the second run for both debug builds and release builds.

Example Build First run Second run


unsat-1 debug 0.305 s 0.014 s
unsat-1 release 0.222 s 0.005 s
sat-1 debug 0.336 s 0.009 s
sat-1 release 0.204 s 0.006 s
pigeonhole-1 debug 0.312 s 0.011 s
pigeonhole-1 release 0.211 s 0.006 s

Table 6.1: The time that it takes to run the compiled binary for the three di↵erent
examples using debug builds and release builds using the Unix-command time, measured
in seconds.

The release builds are consistently faster than the debug builds, which is expected.
The time it takes to run the three examples is very similar for the release builds, and
completing unsat-1 was even slightly faster than sat-1 here even though unsat-1 took
almost 5 times longer to complete than sat-1 during the microbenchmarking. This can
be explained by the significant overhead of running the full application and not just one
function. In order to do a more complete evaluation of the design and how it performs,
we would have to implement other theories and construct more examples, preferably

43
ones that are far larger. This is outside the scope of this report.
A fourth example called pigeonhole-2 was also evaluated. Following is the example
of trying to fit three objects into only two containers:

C = (a = True _ b = True) ^ (c = True _ d = True) ^ (e = True _ f = True)


^ (a = False _ c = False) ^ (a = False _ e = False)
^ (c = False _ e = False) ^ (b = False _ d = False)
^ (b = False _ f = False) ^ (d = False _ f = False)

This example will not terminate in the current version of the implementation since
there is an assertion that fails at iteration 28 because there are no rules that can be
applied. This should be expected for some instances since not all of the rules have been
implemented. By looking at the state of the solver at iteration 28 we can see that the
rule T-CONSUME could be applied if it was implemented. Implementing this rule may not
be enough to make the instance terminate, since other unimplemented rules might be
required at later iterations.
The examples that are running in this implementation are hardcoded, meaning that
the compiler can potentially optimize some parts that would not be optimized if the
example would be given as an input to the solver after it has been compiled. It is hard
to speculate how big of an impact this has on the performance of the prototype, in order
to do a proper evaluation there should also be a way to parse arbitrary inputs to the
solver.

6.1.1 Z3
It is hard to make a fair comparison with Z3 since it is a much larger and far more
complex solver, and because the chosen examples are very small. The total time it takes
to run Z3 on these three examples is around 0.02 seconds each according to the statistics
output when passing the -st-flag and encoding the examples in the SMT-LIB language.
This is equal to 20, 000 s, but since the resolution for these numbers is not as exact as
it is for the benchmarking framework used in the project, it is hard to draw conclusions
about the performance of the Rust prototype other than it works well enough for very
small instances. Running the compiled binary for the prototype is clearly slower than
Z3 the first run, but after that they seem to have similar performance. The execution
time of Z3 is not noticeably a↵ected by running it more than once.

6.2 The Rust experience


6.2.1 The good parts
Using Rust’s ecosystem, it is very easy to run a downloaded project as long as it is
only written in Rust. There is a file named Cargo.toml where the developer lists all
the dependencies that the project has, and by simply running the command cargo
build, the package manager will automatically download the dependencies and build
the project. The build procedure can become more complicated if the project has
components that are written in other languages.
Cargo also contains a built in code formatter called rustfmt, which will format all
the code in the project according to the style guidelines of Rust. This makes it easier
to collaborate on the project since the formatter will make the coding style consistent.
The chosen architecture is strongly typed, given that enums are used extensively. If
a developer uses a code editor with autocompletion, they will easily see all the types of

44
functions, predicates and values that are available and they do not have to guess which
ones that are working.
The Rust-compiler will give the developer informative error messages most of the
time when the project does not build, often pointing out exactly which character or
word that causes the error and why. This is in stark contrast to some other compilers
that sometimes print extremely verbose error messages that can be hard to interpret.
Since the compiler is so strict about ownership, borrowing and lifetimes, it can give
the developer more confidence that the project is stable and functional if it builds. It
does not su↵er from the same problems as some interpreted languages, where a program
can even start running until it crashes because of a simple syntax error.

6.2.2 Challenges
As someone with no prior experience with Rust, it was very easy to fall into the trap
of thinking that something is easy to implement when in fact it is not. The alternative
architecture that uses trait objects is a good example of this since it sounds feasible
in theory, but in practice it turned out that there were a lot of pitfalls. It can be
challenging to rewrite an existing SMT solver in Rust without taking the characteristics
of the language into consideration.
Changing an already existing implementation can sometimes be a very challenging
task in Rust, especially in an implementation like this where many of the parts are
implementing specific traits that are necessary in order for the solver to work correctly.
If we make a change to some struct then we have to make sure that it does not interfere
with any traits that the struct may implement. This is especially important for the
parts that make up the logical expressions in this architecture. For example, each clause
has a list of literals, which has a formula, which can have predicates and terms, and so
on. If we make a change to terms so that they can no longer be cloned, all of the types
that directly or indirectly owns terms will no longer be cloneable. Issues like these will
quickly cascade to other parts of the solver.
One problem with the implementation is that there is a lot of cloning of literals and
terms. This can be avoided by using more references, but it would in turn require us to
write explicit lifetime annotations which would make the code harder to understand for
people without any previous knowledge of Rust.
Another problem is that the solver will not treat the literals x = True and True = x
as equal, which means that we need to make sure that we are consistent when it comes
to the construction of literals. A potential way of solving this could be to apply some
preprocessing to the literals so that we never have symmetric predicates where the left
side is a constant and the right side is a variable. This can done in the constructors
where we create the expressions.

6.3 Possible improvements


While it may be possible to makes a few changes to the architecture and implement
the alternative design using traits and trait objects, it is arguably harder to do so than
using enums. Having to add lifetime annotations to the code will make it less readable,
especially for developers who do not have much prior experience with Rust, and it can
quickly become intimidating to work with.
Variables currently have a string-identifier, it would however be cheaper to have the
identifier as an integer instead as integers take up less memory than a heap-allocated
string. String-identifiers can be more human-readable, but there is no need for an
identifier to be human-readable internally. If we wish to print information about what

45
the solver is doing as it is running, we could introduce a table that maps integer-
identifiers to strings. The term “string interning” is used to describe this method of
storing only one copy of each distinct string and then using a lookup table for fetching
them.
When it comes to performance, a critical part of the solver is the procedure of
looking up which rules that can be applied. There is a lot of room for improvements
in this part of the solver. The function get available rules will return all rules that
can be applied at the current state, and how they can be applied. This is good for
debugging, as we can see the options that are available at any given iteration. To
improve performance we could introduce a priority to the di↵erent rules and check the
rules with the highest priority first and pick the first one that is available. Technically,
the current implementation do prioritize rules implicitly since we always apply the first
rule that is added to the list of available rules, but we still check the availability of
all the relevant rules at each iteration. Another thing worth pointing out is that there
is currently no concurrency implemented for this function, even though it should be
possible to check multiple rules in parallel. The function does not mutate the current
state of the solver, which means that it should be easy to split it into multiple functions
that can run concurrently.

46
Chapter 7

Conclusions and Future Work

We have introduced an architecture for an SMT solver using mcSAT that is possible to
implement using Rust. The goal was not to simply implement an SMT solver in Rust,
but to explore how well suited the language is for this type of application. Though
the prototype works for a few chosen examples, it should not yet be considered a fully
working and complete SMT solver. Nevertheless, the solver can in its current state solve
a few examples and it will print out information about how it was solved, including the
current state and which rules that were applied.
Even though Rust has many features that are good for developing maintainable, safe
and performant applications, it has to be acknowledged that there are a few obstacles
that can make the process of development time consuming. Aspiring Rust-developers
can be thrown o↵ by the fact that the Rust-compiler is strict about ownership, borrowing
and lifetimes. In the end, these things should be seen as features of the language rather
than limitations or obstacles, and when taking them into account it is fully possible to
construct a working SMT solver.
As for both the architecture and the prototype, there is room for improvements. In
this first version, it is more important to make the architecture and the implementation
flexible and easy to understand rather than performant. At such an early stage, there
is no need for premature optimization that could end up hurting the project later on.
The evaluation shows that the performance of the prototype is reasonable for very
small instances, but a more thorough evaluation would be required to show exactly how
well it performs against other state of the art solvers. The architecture in its current
state works well enough for the simple Boolean theory that was explored in this report,
but more work would be needed to verify how well it works for more advanced theories.

47
Bibliography

[1] L. De Moura and D. Jovanović, “A model-constructing satisfiability calculus,”


in International Workshop on Verification, Model Checking, and Abstract
Interpretation. Springer, 2013, pp. 1–12. [Online]. Available: http://csl.sri.com/
users/dejan/papers/mcsat-vmcai2013.pdf
[2] H. Ganzinger, G. Hagen, R. Nieuwenhuis, A. Oliveras, and C. Tinelli, “Dpll (t):
Fast decision procedures,” in International Conference on Computer Aided Verifi-
cation. Springer, 2004, pp. 175–188.
[3] A. Zeljić, C. M. Wintersteiger, and P. Rümmer, “Deciding bit-vector formulas with
mcsat,” in International Conference on Theory and Applications of Satisfiability
Testing. Springer, 2016, pp. 249–266.

[4] D. Jovanović, “Solving nonlinear integer arithmetic with mcsat,” in International


Conference on Verification, Model Checking, and Abstract Interpretation. Springer,
2017, pp. 330–346. [Online]. Available: http://csl.sri.com/users/dejan/papers/
jovanovic-vmcai2017.pdf
[5] “Fearless Concurrency - The Rust Programming Language,” https://doc.rust-lang.
org/book/ch16-00-concurrency.html, accessed: 2022-02-03.
[6] “Why Rust for safe systems programming - Microsoft Secu-
rity Response Center,” https://msrc-blog.microsoft.com/2019/07/22/
why-rust-for-safe-systems-programming/, accessed: 2022-02-03.
[7] J.-C. Filliâtre and S. Conchon, “Type-safe modular hash-consing,” in Proceedings
of the 2006 Workshop on ML, 2006, pp. 12–19.
[8] D. Jovanovic, C. Barrett, and L. De Moura, “The design and implementation of the
model constructing satisfiability calculus,” in 2013 Formal Methods in Computer-
Aided Design. IEEE, 2013, pp. 173–180.

[9] K. Beck, M. Beedle, A. Van Bennekum, A. Cockburn, W. Cunningham, M. Fowler,


J. Grenning, J. Highsmith, A. Hunt, R. Je↵ries et al., “Manifesto for agile software
development,” 2001.
[10] “Announcing Rust 1.54.0 — Rust Blog,” https://blog.rust-lang.org/2021/07/29/
Rust-1.54.0.html, accessed: 2022-02-03.

48

You might also like