AVS #3: Measuring Safety – Why AI cannot do it: Rare Event Analysis

AI is the single most relevant technology behind the emergence of Vehicle Automation, so the topic of AV Safety naturally follows. But there is a problem: AI and Safety do not fit in one sentence. Why?

AI is all about achieving the goal without doing the systems engineering legwork: the machine builds the control model itself. Driving is an extremely complex task, so AI seems the way to go. However, an AI model is empirical. It requires evidence (training) to identify the pay-offs.

Safety, on the other hand, is the very opposite of empirical: evidence is scarce. It is a no-brainer for a human to realise that driving into a wall at 60 mph is dangerous, yet the AI still needs empirical evidence to learn it.

For this reason, encountering a novel road situation in an AV is always safety-critical. Some argue that the number of road situations is finite, postulating the existence of a Set of All Scenarios, while others believe the complexity is so vast that we will never get there, because more rain to occlude the sensors is always an option.

What is the answer? Nobody knows yet. We will likely figure it out in a decade or two, but AV Safety cannot wait that long. That is why AI Black Box systems need a fence. It is our task to draw the map, and we use advanced statistical methods to navigate the Operational Design Domain.

Example: finding the biased coin

Instead of the virtually infinite complexity of Roadcraft, let us begin with the basics. Imagine that you are betting on coin tosses. Should you choose (H)eads or (T)ails? An ideal coin is unbiased, 50-50. In reality, we do not know the bias and want to find out. Even if it is just 49.5-50.5, we earn £1 for each £100 turned over.

Figure 1. How many coin tosses, before we know it really is 50-50? (Image: "Pound Coins" by wwarby)

The Beta distribution models a random variable on the [0,1] interval and lets us draw conclusions from limited evidence. Each point on the curve indicates how likely a given bias is. After 10 tosses, as visualised in Fig. 2, all we can say is that there is a fair chance that the bias lies somewhere between 60-40 and 40-60, which is not very helpful. Finding a precise value requires many thousands of tosses.
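To make the coin example concrete, here is a minimal sketch of the Beta update, assuming a uniform prior (so the posterior after h heads and t tails is Beta(h + 1, t + 1)). The function names and the evenly split toss counts are illustrative assumptions, not taken from the figure. It estimates how much of our belief falls between 40-60 and 60-40 as evidence accumulates.

```python
import math

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x, for 0 < x < 1 (computed in log space)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

def prob_bias_between(heads, tails, lo, hi, steps=10_000):
    """P(lo <= bias <= hi) under the Beta(heads + 1, tails + 1) posterior.

    Assumes a uniform prior; integrates the density with the midpoint rule.
    """
    a, b = heads + 1, tails + 1
    width = (hi - lo) / steps
    total = sum(beta_pdf(lo + (i + 0.5) * width, a, b) for i in range(steps))
    return total * width

# Hypothetical, perfectly balanced outcomes for 10 to 10,000 tosses.
for heads, tails in [(5, 5), (50, 50), (500, 500), (5000, 5000)]:
    p = prob_bias_between(heads, tails, 0.4, 0.6)
    print(f"{heads + tails:>6} tosses: P(0.4 <= bias <= 0.6) = {p:.3f}")
```

With only ten tosses split 5-5, roughly half of the belief lies inside that band; narrowing the band to something like 49.5-50.5 is what pushes the required number of tosses into the thousands.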

How is this relevant to AV Safety testing? How safe is your AV? It is impossible to give an exact number, but we can tell whether it is safe enough. This means working with a one-sided confidence bound (an upper bound on the failure rate), e.g. being 95% sure that the vehicle is at least 99.9% reliable.
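As a rough sketch of what "95% sure of 99.9% reliability" demands, assume a uniform prior on the failure rate and a run of tests with no failures at all. The posterior is then Beta(1, n + 1), whose cumulative distribution has the closed form 1 - (1 - x)^(n + 1). The function names below are hypothetical and the zero-failure assumption is a simplification.

```python
import math

def confidence_failure_rate_below(n_passes, max_failure_rate):
    """P(failure rate <= max_failure_rate) after n_passes failure-free tests.

    Assumes a uniform prior, giving a Beta(1, n_passes + 1) posterior.
    """
    return 1.0 - (1.0 - max_failure_rate) ** (n_passes + 1)

def passes_needed(max_failure_rate, confidence):
    """Smallest number of failure-free tests that reaches the confidence."""
    n_plus_one = math.log(1.0 - confidence) / math.log(1.0 - max_failure_rate)
    return max(0, math.ceil(n_plus_one) - 1)

n = passes_needed(max_failure_rate=0.001, confidence=0.95)
print(n, round(confidence_failure_rate_below(n, 0.001), 4))
```

Under these assumptions the answer is close to three thousand consecutive passes, and a single observed failure pushes the requirement higher still.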

Figure 2: Evolution of the Beta-distributed belief about the coin bias over the first 10 tosses, from yellow as first to blue as tenth. Before the 1st toss the prior is uniform; after it, the distribution is linear. Gradually, the distribution narrows and peaks around the true bias.

Managing limited evidence

To further illustrate our approach to safeguarding AV safety with limited testing evidence, let us return to the previous article, AVS #2. There, we identified a big weakness of ISO 22737: a test is passed after 5 successes. Again, assume pessimistically that a failure happens once every 100 occurrences, i.e. 1%. Repeating the test 5 times yields a chance of detecting the failure of just 4.9%. This means that 95% of the time, the test is passed with an unsafe system!
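The 4.9% figure follows from treating the repeats as independent trials: the chance of seeing at least one failure in n repeats is 1 - (1 - p)^n. A small sketch, with hypothetical function names, reproduces it and shows how many repeats would be needed to catch a 1% failure rate 95% of the time.

```python
import math

def detection_probability(failure_rate, repeats):
    """Chance that at least one of `repeats` independent tests fails."""
    return 1.0 - (1.0 - failure_rate) ** repeats

def repeats_for_detection(failure_rate, target):
    """Smallest number of repeats whose detection probability meets target."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - failure_rate))

print(f"{detection_probability(0.01, 5):.3f}")   # 0.049
print(repeats_for_detection(0.01, 0.95))         # 299
```

Independence between repeats is itself an assumption; correlated test conditions would make detection even less likely.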

How can we predict probabilities given limited testing available?

There is no escape from the sheer number of tests, but we can be clever about it. Let us pool all the tests along two independent parameters: the Scenario Type and the Event Variation each test belongs to. The Scenario Type aggregation assumes that failure probabilities are similar within each class of road scenario, such as T-junctions, right-turns, etc. This aggregation allows us to draw probability estimates from much larger pools, significantly improving the confidence in the result.

Next, for Event Variation, we assume that scenarios vary in similar ways. A set of, e.g., T-junctions varies internally by relative approach time, light conditions, presence of occlusions, precipitation, etc. A right-turn or a roundabout varies by similar parameters. We assume that the bias each variation introduces in the collision likelihood is similar regardless of the Scenario Type.

Finally, we obtain a matrix of Scenario probabilities and Variation biases. While this approach loses some potential precision in the individual estimates, in return it maximises the use of information across the testing domain, tightening the estimates whilst minimising the computational effort.
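One possible reading of this matrix, sketched below, is a multiplicative model: each cell's failure probability is the pooled rate of its Scenario Type times the relative bias of its Event Variation. This is an assumption made for illustration, not necessarily the method used in practice, and it relies on a reasonably balanced test plan. The function name, the labels and the counts are all hypothetical.

```python
def pooled_estimates(counts):
    """Pool (failures, tests) counts along both axes.

    counts: {scenario_type: {variation: (failures, tests)}}
    Returns (scenario_rate, variation_bias, matrix), where
    matrix[s][v] = scenario_rate[s] * variation_bias[v].
    Assumes at least one failure overall, so the overall rate is non-zero.
    """
    scenario_tot, variation_tot = {}, {}
    total_f = total_n = 0
    for scenario, row in counts.items():
        for variation, (f, n) in row.items():
            sf, sn = scenario_tot.get(scenario, (0, 0))
            scenario_tot[scenario] = (sf + f, sn + n)
            vf, vn = variation_tot.get(variation, (0, 0))
            variation_tot[variation] = (vf + f, vn + n)
            total_f += f
            total_n += n
    overall = total_f / total_n
    scenario_rate = {s: f / n for s, (f, n) in scenario_tot.items()}
    variation_bias = {v: (f / n) / overall for v, (f, n) in variation_tot.items()}
    matrix = {s: {v: scenario_rate[s] * variation_bias[v] for v in variation_bias}
              for s in scenario_rate}
    return scenario_rate, variation_bias, matrix

# Made-up counts purely for illustration: (failures, tests) per cell.
example = {
    "T-junction": {"clear": (1, 100), "rain": (3, 100)},
    "roundabout": {"clear": (2, 100), "rain": (6, 100)},
}
rates, biases, matrix = pooled_estimates(example)
print(rates, biases, matrix["roundabout"]["rain"])
```

Each row rate is estimated from every test of that Scenario Type and each column bias from every test of that Event Variation, which is where the larger pools come from.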


Evidence is scarce, and humans are masters of finding solutions with limited data. In pure software, we can pay for mistakes with terabytes of training data; when the AI is part of a cyber-physical-social system such as an AV, every bit of information matters. That is why we push our mathematical toolkit to its limits to enable Autonomous Vehicle Safety.

Written by: Dr Marcin Stryszowski – Lead Engineer

Please get in touch if you have any questions or have got a topic in mind that you would like us to write about. You can submit your questions / topics via: Tech Blog Questions / Topic Suggestion.



© Copyright 2010-2022 Claytex Services Ltd All Rights Reserved
