Deep learning


What is deep learning

Deep learning is a type of machine learning in which a model learns to perform classification tasks directly from images, text, or sound. Put simply, it teaches computers to do what comes naturally to humans: learn from examples. It is a key technology behind many recent advances in AI. In driverless cars, it makes it possible to recognize a stop sign or to distinguish a pedestrian from a lamppost. Virtual assistants such as Alexa, Siri, and Google Assistant use it to adapt to your voice and accent, which makes the interaction feel more natural. It is also the key to voice control in consumer devices like phones, tablets, TVs, and hands-free speakers. Deep learning is achieving results that were not possible before. Although it was first theorized in the 1980s, it is getting a lot of attention lately for two main reasons, both of them requirements that have only recently become practical to meet:

  1. Deep learning requires large amounts of labelled data. For example, driverless car development requires millions of images and thousands of hours of video.
  2. Deep learning requires substantial computing power. High-performance GPUs have a parallel architecture that is efficient for deep learning. When combined with clusters or cloud computing, this enables development teams to reduce training time for a deep learning network from weeks to hours or less.

In deep learning, a computer model learns to perform classification tasks directly from images, text, or sound. Deep learning models can achieve state-of-the-art accuracy, sometimes exceeding human-level performance. Models are trained by using a large set of labelled data and neural network architectures that contain many layers.

Why Deep Learning Achieves State-of-the-Art Accuracy

Three technological enablers make this degree of accuracy possible:

  • Easy access to massive sets of labelled data.
  • Increased computing power: high-performance GPUs accelerate training on the massive amounts of data needed for deep learning, reducing training time from weeks to hours.
  • Pretrained models built by experts.

Day-to-day applications of deep learning

Deep learning applications are used in industries from automated driving to medical devices:

  • Automated Driving: Automotive researchers are using deep learning to automatically detect objects such as stop signs and traffic lights. In addition, deep learning is used to detect pedestrians, which helps decrease accidents.
  • Aerospace and Defense: Deep learning is used to identify objects from satellites that locate areas of interest, and identify safe or unsafe zones for troops.
  • Medical Research: Cancer researchers are using deep learning to automatically detect cancer cells. Teams at UCLA built an advanced microscope that yields a high-dimensional data set used to train a deep learning application to accurately identify cancer cells.
  • Industrial Automation: Deep learning is helping to improve worker safety around heavy machinery by automatically detecting when people or objects are within an unsafe distance of machines.
  • Electronics: Deep learning is being used in automated hearing and speech translation. For example, home assistance devices that respond to your voice and know your preferences are powered by deep learning applications.
  • Other everyday examples: an ATM rejects a counterfeit bank note, and a smartphone app gives an instant translation of a foreign street sign.

How does it work?

First, let us look at how a deep learning model processes data.

Most deep learning methods use neural network architectures, which is why deep learning models are often referred to as deep neural networks.



A deep neural network combines multiple nonlinear processing layers, using simple elements operating in parallel and inspired by biological nervous systems. It consists of an input layer, several hidden layers, and an output layer. The layers are interconnected via nodes, or neurons, with each hidden layer using the output of the previous layer as its input.
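The layer structure described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the layer sizes, the random weights, the choice of ReLU and softmax as nonlinearities, and the helper names (make_layer, dense, forward) are all assumptions made for the example. It uses only the standard library.

```python
import math
import random


def relu(values):
    """Nonlinear activation: negative values become zero."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """One fully connected layer: each neuron is a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def make_layer(n_in, n_out, rng):
    """Hypothetical helper: small random weights and zero biases (untrained)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(inputs, layers):
    """Pass the input through every layer; each layer consumes the previous layer's output."""
    activation = inputs
    for index, (weights, biases) in enumerate(layers):
        activation = dense(activation, weights, biases)
        is_output_layer = index == len(layers) - 1
        activation = softmax(activation) if is_output_layer else relu(activation)
    return activation


rng = random.Random(0)
layer_sizes = [8, 16, 16, 4]  # input, two hidden layers, output (assumed sizes)
layers = [make_layer(n_in, n_out, rng)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

sample = [rng.random() for _ in range(layer_sizes[0])]
probabilities = forward(sample, layers)
print(probabilities)  # four class probabilities from an untrained network
```

Because the weights are random, the output probabilities are meaningless until the network is trained; the point here is only the flow of data from layer to layer.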

Let’s say we have a set of images where each image contains one of four different categories of object, and we want the deep learning network to automatically recognize which object is in each image. We label the images in order to have training data for the network.

Using this training data, the network learns the features specific to each object and associates them with the corresponding category. Each layer in the network takes in data from the previous layer, transforms it, and passes it on. The network increases the complexity and detail of what it is learning from layer to layer. Notice that the network learns directly from the data; we do not choose which features are learned.
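To make learning from labelled examples concrete, here is a deliberately small sketch. It is not a deep network: it trains a single softmax layer with plain gradient descent on a cross-entropy loss, and the "images" are hypothetical four-number feature vectors generated for the example. A deep network is trained on the same principle, except that the error is propagated back through every hidden layer (backpropagation). The function names, learning rate, and epoch count are assumptions.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, biases, features):
    """Class probabilities for one sample."""
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)


def classify(weights, biases, features):
    """Index of the most probable category."""
    probs = predict(weights, biases, features)
    return probs.index(max(probs))


def train(data, n_classes, epochs=100, learning_rate=0.1):
    """Fit weights by gradient descent on the cross-entropy loss."""
    n_features = len(data[0][0])
    weights = [[0.0] * n_features for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        for features, label in data:
            probs = predict(weights, biases, features)
            for c in range(n_classes):
                # Gradient of the loss with respect to the score of class c:
                # predicted probability minus the one-hot label.
                error = probs[c] - (1.0 if c == label else 0.0)
                for j in range(n_features):
                    weights[c][j] -= learning_rate * error * features[j]
                biases[c] -= learning_rate * error
    return weights, biases


# Toy labelled data (an assumption for illustration): four categories,
# where category k has a large value at position k and noise elsewhere.
rng = random.Random(0)
data = []
for label in range(4):
    for _ in range(20):
        features = [rng.uniform(0.0, 0.2) for _ in range(4)]
        features[label] += 1.0
        data.append((features, label))
rng.shuffle(data)

weights, biases = train(data, n_classes=4)
correct = sum(classify(weights, biases, x) == y for x, y in data)
accuracy = correct / len(data)
print(accuracy)
```

The labels are the only supervision: nothing in the code says which position matters for which category. The weights that encode this are found by the training loop.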

Deep learning models are trained by using large sets of labelled data and neural network architectures that learn features directly from the data without the need for manual feature extraction. A convolutional neural network (CNN, or ConvNet) is one of the most popular algorithms for deep learning with images and video. Like other neural networks, a CNN is composed of an input layer, an output layer, and many hidden layers in between. The flow diagram below describes the internal process of a CNN.
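As a rough companion to this description, the sketch below shows three operations commonly found in the hidden layers of a CNN: convolution, a ReLU nonlinearity, and max pooling. The article does not name these operations, so treat them as an assumption about a typical CNN. The 6x6 toy image and the hand-picked edge kernel are also illustrative only; in a real CNN the kernel values are learned during training.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the
    elementwise products. Like most deep learning libraries, this does not
    flip the kernel, so strictly speaking it is a cross-correlation."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k_rows) for j in range(k_cols))
             for c in range(out_cols)]
            for r in range(out_rows)]


def relu2d(feature_map):
    """Set negative responses to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(cols)]
            for r in range(rows)]


# 6x6 toy image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]

# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

feature_map = relu2d(convolve2d(image, kernel))
pooled = max_pool2d(feature_map)

for row in feature_map:
    print(row)   # each row is [0, 3, 3, 0]: strong response at the edge
print(pooled)    # [[3, 3], [3, 3]]
```

The feature map is largest where the window straddles the boundary between the dark and bright halves, which is the sense in which a convolutional layer detects a feature. Pooling then shrinks the map while keeping the strongest responses.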

In summary, deep learning is an AI function that mimics the workings of the human brain in processing data for use in detecting objects, recognizing speech, translating languages, and making decisions.