Automata learning as a satisfiability modulo theories problem

This is part one of a two part series on automata learning and satisfiability modulo theories. Part two will appear soon.

Deterministic finite automata (DFAs) are useful in a variety of applications. However, the problem of learning a DFA of minimal size from positive (accepted) and negative (rejected) strings can be very hard. In fact, it is the optimization variant of the problem of finding a consistent DFA of a fixed size, which has been shown to be NP-complete. In 2010, Marijn Heule and Sicco Verwer presented an algorithm that encodes the problem of learning a DFA from labeled strings as a satisfiability (SAT) problem. Their algorithm has since won the StaMinA competition, and has led to the creation of the dfasat tool (for which Chris has created an exellent tutorial).

In this post, I present an encoding that takes a satisfiability modulo theories (SMT) perspective. This encoding is faster than the one used in dfasat, and benefits from the continuous efforts by fellow researchers on making SMT solvers more powerful. Moreover, I find it more natural, because it makes a distinction between the logic that is required to solve the problem, and the logic imposed by the background theories.