Is Deep Learning Really the Solution for Everything in Self-Driving Cars?

Posted: 11/07/2017

One of the biggest movements in the automotive industry today is the rise of self-driving cars (also called autonomous or intelligent vehicles). The degree to which a vehicle is allowed to drive autonomously is called its level of automation. There are six levels in total: Level 0 (fully controlled by the driver), Level 1 (single ADAS functions, such as steering control or acceleration) and Level 2 (steering and acceleration can be controlled by the car, but the driver needs to be ready to take over). The best-known example of Level 2 automation is Tesla’s Autopilot. Level 3 automation still requires the driver to be present, but can completely shift safety-critical functions to the vehicle under certain traffic or environmental conditions. Level 4 makes the vehicle fully autonomous except for some exceptional driving scenarios. Finally, Level 5 represents a fully autonomous self-driving car.

The current state of the technology already includes Level 2 (Tesla’s Autopilot), and Audi claims that its next A8 model will feature Level 3 automation ([1]).

The higher the level of automation, the more “intelligent” the vehicle needs to become. Vehicle intelligence can be divided into four parts: sensing, perception, prediction and planning. In the sensing stage, data is gathered from the sensors; the perception stage uses this information to create an understanding of the environment by means of computer vision; prediction makes use of existing scenarios and training data to anticipate new outcomes and situations (such as the next movement of an object); and planning is the phase where the car makes purposeful decisions about its actions.

The last three of the four stages make use of machine learning.
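As a rough illustration, the four stages can be chained as a simple processing pipeline. The sketch below is purely illustrative: all function names, data structures and threshold values are hypothetical, not part of any real system.

```python
# Illustrative sketch of the sense -> perceive -> predict -> plan pipeline.
# Everything here (names, data, the 20 m threshold) is a made-up toy example.

def sense():
    # Gather raw data from (simulated) sensors: camera, radar, ...
    return {"camera": [[0.1, 0.4], [0.3, 0.9]], "radar": [12.5]}

def perceive(raw):
    # Turn raw sensor data into an environment model (ML stage 1).
    return {"objects": [{"type": "pedestrian", "distance_m": raw["radar"][0]}]}

def predict(world):
    # Anticipate how detected objects will move (ML stage 2).
    return [{"object": o, "will_cross": o["distance_m"] < 20}
            for o in world["objects"]]

def plan(predictions):
    # Decide on an action (ML stage 3).
    return "brake" if any(p["will_cross"] for p in predictions) else "keep_lane"

action = plan(predict(perceive(sense())))
print(action)  # "brake" for this toy input: the pedestrian is within 20 m
```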

Evolved from the study of pattern recognition and computational learning theory in artificial intelligence, machine learning explores the study and construction of algorithms that can learn from and make predictions on data – such algorithms overcome following strictly static program instructions by making data-driven predictions or decisions, through building a model from sample inputs. ([2])  

A special case of machine learning is deep learning, where the algorithms used mimic the human brain, by using so-called artificial neural networks (ANN). To quote the authors in [3]:

“From a layman’s standpoint, deep learning is a high performance, dynamic way of computerized decision-making that can learn features, objects, and patterns automatically and more accurately with the more (and better quality) data you give it. A deep learning system identifies and classifies patterns utilizing a set of analytical layers. The first layer does a relatively primitive task, such as identifying the edge of an image. It then sends the output to the next layer, which does a slightly more complex task, such as identifying the corner of the image. This process continues through each successive layer until every feature is identified. In the final, deepest layer, the system should reliably and quickly recognize the pattern.”

Deep learning is becoming very popular in the area of vehicle intelligence (especially in the progressive Silicon Valley circles) and some argue that it should be the preferred machine learning technique in a fully intelligent, self-driving vehicle. In this article we discuss some of the current approaches in deep learning and review arguments for and against one of the main questions currently discussed in the self-driving car industry: “is deep learning really the solution for everything?”

How deep learning is used and what its advantages are

The main application of deep learning within the automotive domain is advanced computer vision and perception. Visual tasks including, but not limited to, lane detection, pedestrian detection, road-sign recognition and blind-spot monitoring are handled more effectively with deep learning.

The main difference between machine learning and deep learning is the depth to which the system can autonomously teach itself. Whereas classical machine learning uses features extracted from the input (training data) and makes predictions based on a single layer or a few layers of nodes, a deep neural network contains many hidden layers that add new features and exceed human coding capacity. This makes deep learning more powerful for complex computing tasks such as object recognition. Noteworthy is the improvement achieved with convolutional neural networks, where the input is the whole image, so feature extraction is embedded in the network itself. The following picture represents a neural network used to recognize an image of a cat (an early, famous example of deep learning in computer vision) ([4]).
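To make the “first layer identifies edges” idea from the quote above concrete, here is a minimal sketch of a single convolution in NumPy. In a real CNN the filter weights are learned from data; the hand-crafted Sobel filter below only illustrates what an early layer typically ends up computing.

```python
import numpy as np

# One convolutional "layer" with a fixed vertical-edge filter (Sobel).
# Real CNNs learn their filters; this fixed kernel is only illustrative.

def conv2d(image, kernel):
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy image: dark left half, bright right half -> one vertical edge.
image = np.hstack([np.zeros((5, 3)), np.ones((5, 3))])
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])

response = conv2d(image, sobel_x)
# The filter responds strongly only near the column containing the edge.
print(response)
```

Stacking many such learned filters, with non-linearities in between, is what lets deeper layers respond to corners, shapes and eventually whole objects.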


Although in certain scenarios, such as driving in high resolution mapped cities or along fixed routes, simple machine learning algorithms are sufficient to handle the above tasks, in more complex situations, such as multiple unknown destinations or changing routes, deep learning is the more suitable option ([3]).  

But in what ways is deep learning specifically applied in self-driving cars? There are two main approaches, which both have their own advantages and shortcomings. 

The first one uses semantic abstraction, where the problem of autonomous driving is broken down into several components: algorithms that are each focused on only one part of the task. For example, one component could focus on pedestrian detection, another on detecting lane markings and a third on detecting objects beyond the lanes. In the end, these components are “glued together” into a master network that makes the driving decisions. Alternatively, a single network can be constructed that detects and classifies multiple classes, or even performs semantic segmentation.
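A toy sketch of the “glue” idea: independent components each solve one sub-task and a master rule combines their outputs. The component outputs, thresholds and action names below are invented for illustration; each component would in practice be a trained model of its own.

```python
# Toy semantic-abstraction sketch: per-task components feed a master
# decision rule. All values and thresholds are hypothetical.

def detect_pedestrians(frame):
    return [{"distance_m": 8.0}]      # stand-in for a pedestrian detector

def detect_lanes(frame):
    return {"offset_m": 0.1}          # stand-in for a lane-marking detector

def master_decision(pedestrians, lanes):
    # "Glue" layer: combine component outputs into one driving decision.
    if any(p["distance_m"] < 10 for p in pedestrians):
        return "brake"
    if abs(lanes["offset_m"]) > 0.5:
        return "correct_steering"
    return "continue"

frame = None  # placeholder for a camera frame
decision = master_decision(detect_pedestrians(frame), detect_lanes(frame))
print(decision)  # "brake": the pedestrian at 8 m triggers the safety rule
```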

The advantages of such a system are its lower tolerance for mistakes, the ability to pinpoint errors more easily and the capability to manage unpredictable situations better. Its shortcomings, however, are also significant, since it requires a huge amount of pre-work and complex programming ([3]).

The second approach is the more “disruptive” end-to-end learning approach. This is where the car actually teaches itself how to drive, based on a huge set of human driving data. Although this approach also has big shortcomings, such as the need for a huge training data set and the difficulty of training and tuning it properly, it is very promising for the future of intelligent vehicles.

In the seminal paper [5], the authors from Nvidia trained a convolutional neural network (CNN) to map raw pixels from a single front-facing camera directly to steering commands.
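The following is a deliberately tiny stand-in for that idea, not the Nvidia network itself: a linear model trained by gradient descent to map flattened “pixel” vectors directly to a steering value, imitating a synthetic “expert driver”. All data here is randomly generated for illustration.

```python
import numpy as np

# Toy end-to-end "behavior cloning": learn a direct mapping from raw pixel
# vectors to a steering value. A tiny linear stand-in for the deep CNN
# trained in [5]; the data and the "expert" mapping are synthetic.

rng = np.random.default_rng(0)
n_samples, n_pixels = 200, 16

true_w = rng.normal(size=n_pixels)               # the "expert driver" mapping
images = rng.normal(size=(n_samples, n_pixels))  # flattened camera frames
steering = images @ true_w                       # recorded steering angles

w = np.zeros(n_pixels)
lr = 0.1
for _ in range(500):
    # Gradient of the mean squared error between prediction and expert.
    grad = images.T @ (images @ w - steering) / n_samples
    w -= lr * grad

error = np.mean((images @ w - steering) ** 2)
print(f"mean squared steering error after training: {error:.6f}")
```

The interesting property, as in the Nvidia paper, is that nothing in the training loop mentions lanes or pedestrians: the system only ever sees pairs of (image, human steering command).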

Shortcomings and challenges of deep learning

Processing power:

First of all, since deep learning requires such a high level of computing power, a very powerful “brain” is needed to handle the big-data and processing requirements. Currently, the most suitable technology is the GPU (graphics processing unit), since it is designed to handle heavy image-processing tasks (known, for example, from the computer gaming industry). Nvidia and Intel are both positioning themselves as leading suppliers of the “brains” for the intelligent-vehicle market.

However, it is still a challenge to have a low cost GPU that operates within the energy consumption and other boundaries, such as heat management, that is required for a market-ready vehicle. Moreover, companies still struggle with bandwidth and synchronization issues.

Available training data:

As noted before, an end-to-end learning system in particular needs to be fed a huge amount of training data, in order to cover as many driving scenarios as possible and to fulfil minimum safety requirements.

Some claim that at least a billion kilometers of training data from realistic road scenarios are needed in order to make conclusions about the safety of the vehicle. Not only that, the data needs to be diverse enough to be useful (driving one kilometer back and forth a billion times won’t do the job!)
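A quick back-of-envelope calculation shows how demanding that billion-kilometer figure is. The average speed, fleet size and daily driving hours below are illustrative assumptions, not figures from the article's sources.

```python
# Back-of-envelope: how long does it take to collect a billion kilometres
# of real driving data? All parameters below are assumed for illustration.

target_km = 1_000_000_000
avg_speed_kmh = 50          # assumed mixed urban/highway average
fleet_size = 500            # assumed number of test vehicles
hours_per_day = 8           # assumed driving hours per vehicle per day

km_per_day = fleet_size * hours_per_day * avg_speed_kmh
years = target_km / km_per_day / 365
print(f"{years:.1f} years")  # roughly 13.7 years under these assumptions
```

Even a sizable test fleet needs over a decade of continuous driving, which is why simulation and fleet-wide data collection (as Tesla does with customer cars) are so attractive.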


Safety:

One of the main safety challenges of deep neural networks is that they are unstable under so-called adversarial perturbations ([6]). For example, minimal modifications to camera images, such as resizing, cropping or a change in lighting conditions, can cause the system to misclassify the image.
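The instability can be demonstrated in miniature. In the sketch below a linear classifier stands in for a deep network (the real phenomenon in [6] concerns deep nets, so this is only an analogy): a small, bounded change to every input component, computed in the direction that hurts the classifier most, flips the predicted class.

```python
import numpy as np

# Minimal adversarial-perturbation illustration. A linear classifier
# stands in for a deep network; weights and input are toy values.

w = np.array([0.5, -0.3, 0.8, 0.2])      # classifier weights (invented)
x = np.array([0.1, 0.2, 0.05, 0.1])      # input classified as positive

def predict(v):
    return 1 if w @ v > 0 else -1

eps = 0.2                                 # max change per input component
x_adv = x - eps * np.sign(w)              # FGSM-style step against the class

print(predict(x), predict(x_adv))         # 1 then -1: the label flips
```

Each component of the input changed by at most 0.2, yet the decision reversed; for image classifiers the analogous pixel changes can be invisible to a human.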

Additionally, safety-assurance and verification methods for machine learning are in general poorly studied. The prevailing automotive safety standard, ISO 26262, has no way to define safety for self-learning algorithms such as deep learning. Due to the fast pace of the technology, there is thus still no standardized way to address the safety aspect.

A prominent example of a safety failure is the 2016 Tesla Autopilot accident, where the sensors of the vehicle were blinded by the sun and the system failed to recognize the truck coming from the right, leading to the crash [9]. This shows that a lot still needs to be investigated before we can conclude that the current configuration of a (partially) self-driving car is safe.

What is some of the other criticism of deep learning? We asked Dr. George Siogkas, founder of CVRLab and consultant to Panasonic Automotive and Industrial GmbH about his opinion, which can be summarized in the following points:

  • Generalisation to scenarios unseen during training is doubtful and has to be thoroughly investigated.
  • Problems with very solid geometrical foundations (optical flow, stereo vision, structure from motion) could in theory be solved more efficiently with other methods.
  • Annotation time and effort increase exponentially. Realistic simulations would help, but the tools are not mature yet.
  • Debugging system failures becomes harder (for now). This is especially problematic for end-to-end systems. Powerful visualisation tools for network layers are a first step here.
  • There is a need for very powerful, automotive-grade hardware that is also affordable for mass production.
  • Deep learning on data coming from sensor fusion is not solved yet and poses great difficulty, due to the diverse nature of the data and its sheer volume.
  • As with all ground-breaking technologies, there is a trend of selling it as a "silver bullet" to get funding. This is dangerous, especially when combined with marketing hype that misrepresents autonomous vehicles.

Other AI techniques in self-driving cars

In the previous paragraphs we have been focusing on deep learning and its pros and cons. But there are a number of other machine learning techniques that are currently being used in ADAS applications. To name a few [11]: 

Regression algorithms: 

Estimate the relationship between two or more variables and compare the effects of variables measured on different scales. They are mostly used to develop image-based models for prediction and feature selection. 
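A minimal sketch of the idea: fitting a least-squares model that relates an object's distance to the size of its bounding box in the image. The relationship and the data points below are invented purely for illustration.

```python
import numpy as np

# Minimal least-squares regression sketch: fit distance to an object as a
# function of 1 / bounding-box height. Data is made up for illustration.

box_heights_px = np.array([100.0, 50.0, 25.0, 20.0, 10.0])
distances_m = np.array([5.0, 10.0, 20.0, 25.0, 50.0])

# Model: distance ~= a * (1 / height) + b, solved with np.linalg.lstsq.
X = np.column_stack([1.0 / box_heights_px, np.ones_like(box_heights_px)])
(a, b), *_ = np.linalg.lstsq(X, distances_m, rcond=None)

predicted = a * (1 / 40.0) + b
print(f"predicted distance for a 40 px box: {predicted:.1f} m")
```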

Pattern recognition and classification algorithms: 

Are used to filter out irrelevant data points and reduce the data.

Clustering algorithms: 

Are applied to images that are unclear or difficult to detect. They use inherent structures in the data to best organize it into groups of maximum commonality.
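The classic example of this family is k-means, sketched below on synthetic 2-D points. The data and the deliberately simple initialization are chosen so the toy example converges cleanly; real implementations handle initialization and empty clusters more carefully.

```python
import numpy as np

# Minimal k-means sketch: group unlabeled 2-D points into k = 2 clusters
# by alternating nearest-centroid assignment and centroid updates.
# Synthetic data: two blobs near (0, 0) and (5, 5).

rng = np.random.default_rng(1)
points = np.vstack([rng.normal(0, 0.5, (20, 2)),
                    rng.normal(5, 0.5, (20, 2))])

# Initialize one centroid in each blob to keep this toy example stable.
centroids = np.array([points[0], points[20]])
for _ in range(10):
    # Assign each point to its nearest centroid, then recompute centroids.
    dists = np.linalg.norm(points[:, None] - centroids[None], axis=2)
    labels = dists.argmin(axis=1)
    centroids = np.array([points[labels == i].mean(axis=0) for i in range(2)])

print(np.round(centroids, 1))  # centroids settle near (0, 0) and (5, 5)
```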

Decision matrix algorithms: 

These are mainly used for decision making. These algorithms are good at systematically identifying, analyzing, and rating the performance of relationships between sets of values and information.
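In its simplest form, a decision matrix scores each candidate action against weighted criteria and picks the highest total. The criteria, weights and ratings below are invented for illustration.

```python
# Simple weighted decision matrix: score candidate maneuvers against
# weighted criteria and pick the best. All values here are invented.

criteria_weights = {"safety": 0.5, "comfort": 0.2, "progress": 0.3}

# Each maneuver is rated 0-10 on every criterion.
maneuvers = {
    "brake":       {"safety": 9, "comfort": 4, "progress": 2},
    "change_lane": {"safety": 6, "comfort": 7, "progress": 8},
    "keep_lane":   {"safety": 3, "comfort": 9, "progress": 9},
}

def score(ratings):
    # Weighted sum of the criterion ratings.
    return sum(criteria_weights[c] * r for c, r in ratings.items())

best = max(maneuvers, key=lambda m: score(maneuvers[m]))
print(best, {m: round(score(r), 1) for m, r in maneuvers.items()})
```

In a real system the ratings would themselves come from the perception and prediction stages rather than being hard-coded.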

Although computer vision is the main area of application of machine learning for (partially) autonomous vehicles, there are other applications of machine learning or AI in general that are also important. The following two are examples in the area of vehicle-driver interaction.  

Driver state monitoring using machine learning

This of course applies only to partially automated driving systems, where the driver is still involved. In [7] the authors proposed a method for non-intrusive, real-time detection of visual distraction using vehicle dynamics. The method used is the so-called support vector machine (SVM). An SVM is a non-probabilistic binary linear classifier that divides examples (represented as points in a plane) into two categories separated by a gap that is as wide as possible. There is also a non-linear version of this method.
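The SVM idea can be sketched with a linear classifier trained by sub-gradient descent on the hinge loss. This is not the system from [7]: the two “driver state” features and the data below are synthetic, and a production SVM would use a proper solver.

```python
import numpy as np

# Minimal linear SVM-style classifier: sub-gradient descent on the hinge
# loss over synthetic 2-D data. Illustrative only; not the system in [7].

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (30, 2)),   # class -1 (e.g. "attentive")
               rng.normal(+2, 0.5, (30, 2))])  # class +1 (e.g. "distracted")
y = np.array([-1] * 30 + [+1] * 30)

w, b, lr, lam = np.zeros(2), 0.0, 0.1, 0.01
for _ in range(200):
    mask = y * (X @ w + b) < 1                 # points violating the margin
    # Sub-gradient of regularized hinge loss over the whole data set.
    grad_w = lam * w - (y[mask, None] * X[mask]).sum(axis=0) / len(X)
    grad_b = -y[mask].sum() / len(X)
    w, b = w - lr * grad_w, b - lr * grad_b

accuracy = (np.sign(X @ w + b) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

The hinge loss only penalizes points inside or on the wrong side of the margin, which is what pushes the separating line toward the widest possible gap.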

Natural language processing and in-car voice commands

Another interesting branch of artificial intelligence is natural language processing (NLP), in which the machine learns to understand and generate natural language, enabling interaction between computers and human language. The most prominent examples of NLP are Amazon’s Alexa and Apple’s Siri. In automotive applications, NLP is already used for some basic entertainment purposes, such as choosing your favorite music, or in some cases even for processing commands such as rolling down a window or locking the doors.
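To show just the last step of such a system, the mapping from an utterance to an in-car action, here is a deliberately trivial keyword matcher. Production voice assistants use full NLP pipelines (speech recognition, intent models); the intents and keywords below are hypothetical.

```python
# A deliberately trivial keyword-based command parser. Real voice
# assistants use full NLP pipelines; this only sketches the final
# utterance-to-action mapping. Intents and keywords are hypothetical.

INTENTS = {
    "play_music":  ["play", "music", "song"],
    "open_window": ["open", "roll down", "window"],
    "lock_doors":  ["lock", "doors"],
}

def parse_command(utterance):
    words = utterance.lower()
    # Count keyword matches per intent; pick the best-scoring intent.
    scores = {intent: sum(kw in words for kw in keywords)
              for intent, keywords in INTENTS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

print(parse_command("Please roll down the window"))  # open_window
print(parse_command("Lock the doors"))               # lock_doors
```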

When extending this logic to a fully self-driving car, it becomes obvious that the occupant will have to be able to communicate with the car, for example to change the destination or the route. As the author in [10] points out, this will not be one-way communication, since the car will have to give feedback on the occupant’s commands, for example to point out dangerous scenarios.


The aim of this article was to shed a critical light on deep learning and its use in the realm of autonomous vehicles. Even though this special machine learning method is a very promising candidate for the main type of algorithm used to let cars drive autonomously, it still has some serious shortcomings, some of which we have pointed out above. On top of that, many other effective machine learning methods are currently being used in ADAS applications and will be applied in future self-driving cars. All in all, deep learning still seems to be the most promising method for automotive computer vision and will certainly become even more important as the technology evolves.


[1] “Audi’s A8 self-driving tech depends on regulatory changes”. Source:

[2] Wikipedia: “Machine Learning”. Source:

[3] “I see. I think. I drive. (I learn).” – KPMG article. Source:

[4] “10 Misconceptions about Neural Networks” – Turing Finance. Source:

[5] “End to End Learning for Self-Driving Cars” – Mariusz Bojarski, Davide Del Testa, Daniel Dworakowski, Bernhard Firner, Beat Flepp, Prasoon Goyal, Lawrence D. Jackel, Mathew Monfort, Urs Muller, Jiakai Zhang, Xin Zhang, Jake Zhao, Karol Zieba (Nvidia). Source:

[6] “Safety Verification of Deep Neural Networks” – Xiaowei Huang, Marta Kwiatkowska, Sen Wang and Min Wu. Source:

[7] “Real-Time Detection System of Driver Distraction Using Machine Learning” – Fabio Tango, Marco Botta (CRF). Source:

[8] “Machine learning used in self-driving cars” – KDnuggets. Source:

[9] “Understanding the fatal Tesla accident on Autopilot and the NHTSA probe”. Source:

[10] “In-car voice commands: NLP for self-driving cars”. Source:

[11] “Machine learning algorithms in self-driving cars” – Dr. Anshul Saxena. Source:


