Finite State Automata for Autonomous Driving

This is our team’s first attempt at applying automata learning theory to problems in autonomous driving. Our starting point, learning patterns of car-following behavior, is the most fundamental task in daily driving scenarios. It has been widely recognized the first milestone towards autonomous or semi-autonomous driving is a good cruise-controller for save car-following. The goal is to succinctly describe how a car follows the car in front of it, often called the lead car.

The research present here was accepted in a paper titled “Car-following Behavior Model Learning Using Timed Automata” at The 20th World Congress of the International Federation of Automatic Control, one of the three top conferences in area of automatic control.

We learn a timed automaton model from the Next Generation SIMulation dataset on the I-80 highway. This dataset is from a program funded by the U.S. Federal Highway Administration. It contains car trajectory data, and is so far unique in the history of traffic research, providing a great and valuable basis for validation and calibration of microscopic traffic models. A timed automaton is essentially a finite state machine, consisting of a finite set of states describing the current states, connected by transitions labeled from a finite alphabet. A timed automaton additionally has a guard on each transitions that imposes a time restriction in form of an interval: If the time passed since arriving in the state falls within the interval, the guard is active, otherwise the inactive guard blocks the transition. It imposes a semi-markov condition on the time passed since the last event. The input to a timed automaton is a “time word”: A sequence of symbols (representing a discrete event, like acceleration) annotated with the time passed since the last symbol.

The model we learn from traces of discrete events extracted from the dataset is highly succinct and interpretable for car-following behavior analysis. Using a subsequence clustering technique on the states of the automaton model (i.e., the learned latent state space), the timed automation is partitioned into some regions. Each identified cluster has an interpretation as a semantic pattern, e.g. representing “approaching” and “short/medium/long distance car-following”. A complete car-following period consists of multiple such patterns. The following Figure 1 shows the timed automaton we learned. All clusters (indicating patterns) are distinguished with diff erent colors.

Figure 1: A timed automaton model representing car-following behavior. The colored rectangles represent the clusters identified in a subsequence clustering step on the latent state space.

There are loops with signi cantly large occurrences in cluster 6, e.g., state sequence: 1-6-11-16-1 with symbolic transitions loop: d-j-c-j. We use a clustering as a symbolic representation for the original numeric data, see the code book in Figure 2. The relative distances of “c” and “d” are very close, see the code book in Table 2, but negative and positive respectively. They are associated with “j”, which has a very small speed di fference. This sequence can be interpreted as the steady car-following behavior at short distances, i.e., adapting the speed di fference with the lead vehicle around 0. Similarly interesting and signi cant loops can also be seen in cluster 2 and cluster 4, which are steady long distance and steady medium distance car-following behaviors respectively. An intermediate state S15 in cluster 5 has many incoming transitions, which explains how to transfer between clusters. For the example S6-S15-S4 with transitions “h, i”, i.e., slowing down and speeding up to catch up, from the short distance following in cluster 6 to the medium distance following in cluster 4. The time split can also be seen in two branches of [0, 37] i and [38, 542] i from S15. They share the same symbolic transition condition but have distinct time guards. It means the “i” speed up action followed by short or long duration of “h”, i.e., after how much time the subject vehicle driver notices that their relative distance has been expanded by the lead vehicle and begins to catch up.

Figure 2: Code book of clustering centroids for numeric data.

Figure 3 illustrates a complete car-following example in our dataset.

Figure 3: A car-following example plotted in a 3d feature space.


It starts from the bottom (colored orange), passes through clusters 6, 5, and 3, then finishes in cluster 4. In the beginning, the subject vehicle is following the lead vehicle at short distances. Then the lead vehicle speeds up, see the positive relative speed and the increasing relative distance in cluster 5. The subject vehicle then also speeds up to approach the lead vehicle, see the negative relative speed and the decreasing relative distance in cluster 3. Finally, it follows the lead vehicle at medium distances in cluster 4 2 . We can see that in cluster 6 and cluster 4, the subject car enters an unconscious reaction region, also called a steady car-following episode, i.e., the relative distance and the relative speed are both bounded in a small area. Cluster 3 and 5 can be both treated as intermediate transition processes. Source code as well as an animated video can be found in our code repository on bitbucket.

Imagine that the vehicle under observation is following another car. Its driving status, e.g. approaching, short distance following, or long distance following can be recognized by tracking its states and the corresponding cluster in our model. In future work, we will consider more complex driving scenarios including behaviors such as lane changing, turning, etc. Precise recognition or identification helps autonomous vehicles to better understand its surrounding environment and other traffic.

Another interesting further application of our work is on human-like cruise controller design. The drawbacks of current automatic cruise control (ACC) system lie in inconsistencies between systems and human drivers: 1) driver’s overconfidence or distrust on the system; 2) a mode awareness error when the system consists of two types of ACCs e.g., a high-speed range ACC and a low-speed range ACC; 3) a difference in timing of acceleration/deceleration between drivers and system [1]. The reason is that the control algorithm of an ACC focuses more on mathematical optimization of safety or comfort rather than driving behaviors.

Note that in this line of our work, the model is learned from a large population of drivers’ car-following data. However, it is possible to learn such a controller from a single driver if enough of his/her driving data are available. This is a promising approach for designing a specified car-following controller that actually mimics an individual driver’s driving behavior and habit! Another advantage of our model is an active control strategy, e.g., we can force a state switching from short-distance following to a medium distance in the automaton. We have already done this part of simulation in our journal version.


Why is learning so effective in software testing?

The Communications of ACM published an article on automata learning in software engineering last February. The techniques described in the article are used to obtain models for the (input/output) behaviour of software. Even without access to source code, one can now use model checking or other bug finding tools on these models. The article shows many successful applications. Why is this possible at all?