4 properties making automata particularly interpretable

The precision, speed and deterministic, algorithmic problem-solving strategies of computers are often idealized. Consequently, computers are often seen as unbiased and objective. This view is also transferred to automated decision making using machine-learned models. But this is dangerous for multiple reasons: Between false positives and false negatives, models can be wrong in more than one way. The effect of the human component in the data can be severe and is often ignored or grossly underestimated (see for example this paper here): The data we collect in real life has some context, and this context can introduce a bias.

Intrinsically criminal face according to
Example case: A bias in the training data has a strong impact on the usability of such a system, regardless of the quality or correctness of the learner itself. Figure taken from

For example in psychology, questionnaires and experiments are typically given to other students. On top of the data collection, to use supervised applications, data needs to be labeled. In many cases, human labeling can introduce more errors, e.g. by mislabeling, omission, or misinterpretation of the data sample. Moreover, effects and correlations present in society, e.g. caused by sexism, racism, or poverty can be preserved or amplified in collected data.
All in all, these problems lead to vague demands to (be able to) understand what our predictive models are doing, and why they are doing it. Responses to this demand have been diverse, and lead to the creation of workshops such as Interpretable ML@NIPS and WHI@ICML. Initiatives like the workshop on Fairness, Accountability and Transparency (FATML) can also be seen in this light. A paper I really like, The Mythos of Model Interpretability, sheds some light on the different definitions, needs, and motivations researchers and practitioners bring to the table. I think one key made in this paper, despite seemingly trivial, is:

If you don’t specify your needs for interpretation or explanations, you cannot expect your needs to be met by the model.

It seems that computer scientists tend to forget this. It is not too much of a surprise: We’re used to extract meaning from syntactical and mathematical structures because we use these structures to describe how computers work. But not every machine learning practitioner or receiver of a machine learned decision is a computer scientist, and not every mathematical description is readily accessible and understandable to computer scientists either.

In our work, we use finite state machines, as depicted in the next figure. Most computer scientists are taught finite state machines very early on, as one of the first formal systems to encounter—only to never really hear of them again. They are related to other, more expressive automata models like push-down automata, Büchi machines, hiddenMarkovv models, and other less well-known variants. In the field of grammatical inference/grammar learning, inferring such models from given data is the main task.

Finite State Machine with accepting and rejecting states.
Figure: Finite State Machine with accepting and rejecting states. Accepting samples: aa, b, bba; Rejecting samples: a, aaa, aabb

Finite state machines and variants are generators (or acceptors) of sequence data. They can accept or reject a given string, and therefore be used to cluster sequences. For a given string, seen as a prefix, an automaton can be used to obtain a list of possible continuations or a distribution over possible continuations. In this way, automata can be used for sequence prediction. Finite state machines are not Turing complete and have a limited expressiveness. They will not approximate any function very well. But in practice, a lot of problems are still described fairly well; in fact, they are almost as expressive as hidden Markov models which have an internal memory that is logarithmic to the number of states. For problems that require limited memory, e.g. high-level description of phenomena, they are a good choice. Very common use cases of automata are in software engineering, where they are used for specifying the desired behavior of systems to be implemented.

In terms of interpretation, they I think that 4 key properties make them very easy:

  1. Automata have an easy graphical representation as cyclic, directed, labeled graphs, offering a hierarchical view of sequential data.

Instead of looking a large set of long sequences, we can look at a model that has loops and cycles. It is a much more compact representation of the same data.

  1. Computation of automata is transparent.

In each step of the computation can be verified manually (e.g. visually), and compared to other computation paths through the latent state space. This makes it possible to analyze training samples and their contribution to the final model. It is also possible to answer questions like “What would happen if the data were different at this stop of the sequence?” or “What other data leads to the same computation outcome?”.

  1. Automata are generative models.

Sampling from the model, e.g. “pressing play”, helps to understand what it describes: By generating a wide range of possible computation paths, tools like model checkers can be used to query properties in a formal way, e.g. using temporal logic. This can help to analyze the properties of the model in a formal way.

  1. Automata are well studied in theory and practice.

We know a lot about composition and closure properties of automata and their sub-classes. We can relate them to equally expressive formalisms. In many cases, this allows us to think about the model as a composition of smaller parts and makes it easy for humans to transfer their knowledge onto it: The model is frequently used in system design as a way to describe system logic. We can use this knowledge to understand a learned model, and relate it to known functions.

We try to summarize these points, together with some more examples, in our paper online on arxiv. The abstract reads:

Automaton models are often seen as interpretable models. Interpretability itself is not well defined: it remains unclear what interpretability means without first explicitly specifying objectives or desired attributes. In this paper, we identify the key properties used to interpret automata and propose a modification of a state-merging approach to learn variants of finite state automata. We apply the approach to problems beyond typical grammar inference tasks. Additionally, we cover several use-cases for prediction, classification, and clustering on sequential data in both supervised and unsupervised scenarios to show how the identified key properties are applicable in a wide range of contexts.

I am very happy and grateful to receive your thoughts and feedback on it. What do you think about the interpretability and understandability of automata?