
The Performance of PDFA-Learning in the SPiCE Competition

The Sequence PredIction ChallengE (SPiCE), organized at and co-located with the International Conference on Grammatical Inference (ICGI) 2016, was won by Chihiro Shibata, who combined LSTM neural networks with strictly piecewise grammars (SP-k, proposed by Heinz et al.), the latter capturing long-term dependencies in the input words. This combination beat the competitors using “pure” LSTM- and CNN-based neural networks. Notably, none of the networks used were very deep (two hidden layers), and deeper networks decreased performance.
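As a rough illustration of why strictly piecewise grammars capture long-term dependencies: an SP-k grammar forbids certain subsequences of length k, no matter how far apart the symbols occur. A minimal sketch (the alphabet and forbidden subsequence below are made up for illustration; this is not Shibata's actual system):

```python
from itertools import combinations

def sp_allows(word, forbidden, k=2):
    """A strictly k-piecewise constraint rejects a word as soon as any
    forbidden length-k subsequence occurs, however far apart the symbols are."""
    return {sub for sub in combinations(word, k)}.isdisjoint(forbidden)

# Hypothetical SP-2 constraint: at most one 's' per word.
forbidden = {("s", "s")}
print(sp_allows("tasks", forbidden))  # False: 's' ... 's' is a forbidden subsequence
print(sp_allows("task", forbidden))   # True
```

Such constraints encode exactly the kind of long-distance information that a fixed-window model misses, which is presumably why they complement the neural networks.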

The task of the competition was to predict a ranked list of the five most likely continuations (a_1, …, a_5) for a given prefix (y_0, …, y_i), learning only from a training set of complete words.
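To make the required output concrete, here is a toy baseline sketch (my own illustration, not any competitor's method): estimate next-symbol frequencies from the training words and return the five highest-ranked continuations for a prefix.

```python
from collections import Counter, defaultdict

def train_bigram(words):
    """Count which symbol follows which, including an end-of-word marker
    so that stopping can also be predicted."""
    counts = defaultdict(Counter)
    for word in words:
        padded = ["<s>"] + list(word) + ["</s>"]
        for cur, nxt in zip(padded, padded[1:]):
            counts[cur][nxt] += 1
    return counts

def top_continuations(counts, prefix, n=5):
    """Rank the n most likely next symbols a_1, ..., a_n for a prefix."""
    last = prefix[-1] if prefix else "<s>"
    return [sym for sym, _ in counts[last].most_common(n)]

# Toy usage with a made-up alphabet
model = train_bigram([["a", "b", "a", "c"], ["a", "b", "b"], ["b", "a", "c"]])
print(top_continuations(model, ["a", "b"]))  # e.g. ['a', 'b', '</s>']
```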

One of my students (competing as team PING) placed 7th, using the dfasat tool. The main goal was to test a Python interface for dfasat (early release here). But what can we take away from placing 7th? Is PDFA learning not competitive for sequence prediction? The answer is a solid jein (yes and no): by using dfasat, we assumed that every problem set was generated by a probabilistic deterministic finite state automaton (PDFA). In practice, most problem sets were generated by HMMs or contained linguistic data, and neither type is necessarily captured well by our PDFA models. The results, outlined in the following table, reflect this. On the HMM problems we obtain decent scores; that is expected, because our PDFA models are not quite as expressive as the HMMs used to generate the data, but the gap is not too large. On the linguistic data we really struggle to obtain reasonable scores (e.g. problem 10).

Table of the performance metrics.
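For context on what the PDFA assumption buys us at prediction time, here is a minimal sketch with a hand-written toy automaton (not the dfasat interface): follow the prefix through the deterministic transitions and rank the symbol probabilities of the state that is reached.

```python
def pdfa_top_continuations(transitions, probs, prefix, start=0, n=5):
    """Follow the prefix deterministically, then rank the n most probable
    next symbols (including a '</s>' stop symbol) in the reached state."""
    state = start
    for sym in prefix:
        if (state, sym) not in transitions:
            return []  # the prefix leaves the learned model
        state = transitions[(state, sym)]
    ranked = sorted(probs[state].items(), key=lambda kv: kv[1], reverse=True)
    return [sym for sym, _ in ranked[:n]]

# Toy two-state PDFA, purely for illustration
transitions = {(0, "a"): 1, (1, "b"): 0}
probs = {0: {"a": 0.7, "</s>": 0.3},
         1: {"b": 0.9, "</s>": 0.1}}
print(pdfa_top_continuations(transitions, probs, ["a", "b"]))  # ['a', '</s>']
```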

But problem 9 is a very interesting case: it contains software traces. For this problem type, the PDFA model obtained the second-best score and beat most of the RNN and CNN approaches. I expect that LSTM/RNN approaches can reach equally good or better scores, but they require a lot more data to learn a model of equal predictive quality. I am planning to analyze the character-level networks used by the competitors (e.g. with the methods used here) to better understand which aspects of the data they managed to learn.

I will add a more detailed description of the problem sets later on.


Success at the RERS challenge 2016 for automata learning

This year, a team from Radboud University and TU Delft used automata learning to compete in the RERS challenge 2016. The challenge provides (generated) source code, and the task is to (dis)prove certain LTL formulas and to analyze which error states are reachable. Information on this challenge can be found here: http://www.rers-challenge.org/2016/. Commonly, learning is not used in this competition and only white-box methods are applied.

This year, however, automata learning was applied with great success. For the problems where LTL formulas had to be (dis)proven, the team achieved a perfect score; no other team obtained as many results here. On the reachability problems they performed well, but did not top the rankings. The team applied state-of-the-art learning algorithms and did not tweak or alter them for the challenge.
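To illustrate how a learned model supports the reachability part of the challenge, here is a minimal sketch with a toy model (my own illustration, not the team's actual tool chain): once a finite-state model of the program has been learned, reachable error states can be found with a plain breadth-first search over the model.

```python
from collections import deque

def reachable_errors(transitions, start, error_states):
    """Breadth-first search over a learned finite-state model; returns the
    error states reachable from the initial state."""
    seen, queue, found = {start}, deque([start]), set()
    while queue:
        state = queue.popleft()
        if state in error_states:
            found.add(state)
        for (src, _sym), dst in transitions.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return found

# Toy learned model: error state 3 is reachable via "a" then "c"
transitions = {(0, "a"): 1, (0, "b"): 2, (1, "c"): 3}
print(reachable_errors(transitions, start=0, error_states={3}))  # {3}
```

Of course, on a model learned in a black-box fashion such a search only witnesses reachability up to the accuracy of the model, which is exactly the trade-off discussed below.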

It is interesting that a black-box technique can obtain such good scores compared to white-box methods. Black-box techniques use less information and cannot give 100% guarantees, yet they produced more results here. It seems one can trade confidence for scaling to bigger problems.

More information will follow.


Useful links added

We made a page with some useful resources on automata learning. Of course, any help is welcome, so if you know of tools, benchmarks, use cases, or anything related, please contact us.