In linguistic applications, typical tasks are translating a sentence or deciding whether a given string belongs to a specific language. In the past, popular models for learning rules were finite state machines, pushdown automata, and hidden Markov models. We understand these models fairly well, and they each describe a class in the Chomsky hierarchy. This makes them well suited to modeling formal systems. But when it comes to describing natural language and solving problems in NLP, the rules imposed by formal grammars are often too…Continue Reading “Looking beyond Automata Models: Transducing and Grammar Learning with Neural Machines”

The precision, speed and deterministic, algorithmic problem-solving strategies of computers are often idealized. Consequently, computers are often seen as unbiased and objective. This view also carries over to automated decision making using machine-learned models. But this is dangerous for multiple reasons: Between false positives and false negatives, models can be wrong in more than one way. The effect of the human component in the data can be severe and is often ignored or grossly underestimated (see for example this paper): The data we collect in…Continue Reading “4 properties making automata particularly interpretable”

I am very happy to announce that we finally have a nice introduction to our dfasat tool: a short Python notebook tutorial (html preview) originally developed for a 2-hour hands-on session at the 3TU BSR winter school. The notebook walks you through basic usage and parameter setting. It also contains a small task to familiarize the user with the effect of different parameter settings. At the moment, dfasat has about 30 different options to choose from. Some can be combined, whereas other combinations have never been tried….Continue Reading “A passive automata learning tutorial with dfasat”

Table of the performance metrics.

The Sequence PredIction ChallengE (SPICE), organized and co-located with the International Conference on Grammatical Inference (ICGI) 2016, was won by Chihiro Shibata, who combined LSTM neural networks and strictly piecewise grammars (SP-k, proposed by Heinz et al.), the latter capturing long-term dependencies in the input words. The combination beat competitors that used “pure” LSTM- and CNN-based neural networks. Overall, none of the networks used were very deep (2 hidden layers), and deeper networks decreased performance. The task of the competition was to predict a (ranked) list of most…Continue Reading “The Performance of PDFA-Learning in the SPiCE Competition”