Introduction to Deep Learning

Deep learning is an incredibly powerful branch of machine learning that is based on learning representations of data rather than task-specific algorithms. A deep learning model passes data through a number of layers of transformations before producing its output.


Traditional machine learning algorithms typically rely on a set of rules or features defined over the data, and these are usually hand-engineered, which is why they tend to be brittle in practice. For example, if you want to perform face detection, the first thing you have to do is recognize the mouth, eyes, ears, and so on in the image; if you find all of them, then you can say there is a face in the image. And to recognize each of these parts, you have to hand-define a set of features.

The key idea of deep learning is that the features are learned directly from raw data. You just take a bunch of images of faces, and the deep learning algorithm develops a hierarchical representation: it first detects lines and edges, uses those lines and edges to detect corners and mid-level features like eyes, noses, and mouths, and then composes these together into higher-level features like the jawline, which can finally be used to detect the face itself.

The fundamental building block of a neural network is a single neuron, also called a perceptron.


Forward Propagation:
We define a set of inputs to the neuron as x1 through xm, and each input has a corresponding weight, w1 through wm. We multiply each input by its weight, sum all of these products into a single number, and pass that sum through a non-linear activation function to produce the final output y. But this is not quite complete: the neuron also has a bias term, w0. The purpose of the bias is to let the activation function shift to the left or right regardless of the inputs, so the full output is y = g(w0 + x1*w1 + ... + xm*wm).
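
As a concrete sketch, here is this forward pass in Python. The input values, weights, and bias below are made-up numbers purely for illustration, and the sigmoid (introduced later in this post) stands in for the activation function g:

```python
import math

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def neuron_forward(inputs, weights, bias):
    # Weighted sum of the inputs plus the bias term: z = w0 + sum(xi * wi)
    z = bias + sum(x * w for x, w in zip(inputs, weights))
    # Pass the sum through the non-linear activation to get the output y.
    return sigmoid(z)

# Made-up example values:
x = [0.5, -1.2, 3.0]    # inputs x1..x3
w = [0.4, 0.1, -0.7]    # weights w1..w3
b = 0.2                 # bias term w0
print(neuron_forward(x, w, b))  # a single number between 0 and 1
```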

We can rewrite the output equation in linear-algebra terms using vectors and a dot product. Collecting the inputs into a vector X and the weights into a vector W, the output becomes y = g(w0 + X · W), where g is the non-linear activation function.
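
A minimal sketch of the same computation in vectorized form, using NumPy's dot product with the same made-up values as above:

```python
import numpy as np

def neuron_forward(x, w, b):
    # z = w0 + X . W, then apply the non-linearity g (sigmoid here)
    z = b + np.dot(x, w)
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])   # input vector X
w = np.array([0.4, 0.1, -0.7])   # weight vector W
print(neuron_forward(x, w, 0.2)) # same result as the scalar version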

One common example of an activation function is the sigmoid function. It takes any real number as input and transforms it into a scalar output between 0 and 1. And there are many activation functions other than the sigmoid.
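
For illustration, here is a small sketch showing how the sigmoid squashes its inputs; tanh and ReLU are included only as two familiar examples of other activations:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

# The sigmoid maps any real input to a value strictly between 0 and 1:
for z in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(f"sigmoid({z}) = {sigmoid(z):.5f}")

# tanh outputs values in (-1, 1); ReLU zeroes out negative inputs:
print(np.tanh(2.0), relu(-3.0), relu(3.0))
```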


