Posts

Convolutional Neural Networks (Part-4)

  AlexNet

AlexNet is considered the first paper/model to revive interest in CNNs when it won the ImageNet challenge in 2012. AlexNet is a deep CNN trained on ImageNet, and it outperformed all other entries that year. It was a major improvement, with the next best entry achieving a 26.2% top-5 test error rate. Compared to modern architectures, the paper used a relatively simple layout.

ZFNet

ZFNet is a modified version of AlexNet that achieves better accuracy. One major difference between the approaches is that ZFNet used 7x7 filters whereas AlexNet used 11x11 filters. The intuition is that large filters in the early conv layers discard a lot of pixel information, which can be retained by using smaller filter sizes there. The number of filters increases as we go deeper. This network also used ReLUs for its activations and was trained using batch stochastic gradient descent.

GoogLeNet

The GoogLeNet architecture is very different f...
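The filter-size intuition above can be made concrete with the standard valid-convolution output-size formula, out = (W - K + 2P) / S + 1. As a minimal sketch (the 227x227 input and the strides here are illustrative assumptions based on the commonly cited AlexNet conv1 configuration):

```python
def conv_output_size(w, k, s, p=0):
    """Spatial output size of a conv layer: (W - K + 2P) // S + 1."""
    return (w - k + 2 * p) // s + 1

# AlexNet-style first layer: 11x11 filter, stride 4 on a 227x227 input.
alexnet_out = conv_output_size(227, 11, 4)   # -> 55

# ZFNet-style first layer on the same input: 7x7 filter, stride 2.
zfnet_out = conv_output_size(227, 7, 2)      # -> 111

print(alexnet_out, zfnet_out)
```

The smaller filter with the smaller stride keeps a much larger spatial map (111x111 vs 55x55), which is one way to see how less pixel information is thrown away in the first layer.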

Convolutional Neural Networks (Part-3)

  Forward and Backward Propagation using the Convolution Operation

For the forward pass, we move through the CNN layer by layer and at the end obtain the loss using the loss function. When we work the loss backwards, layer by layer, we receive the gradient of the loss from the previous layer as ∂L/∂z. In order for the loss to be propagated to the other gates, we need to find ∂L/∂x and ∂L/∂y. Now, let's assume the function f is a convolution between an input X and a filter F. The basic difference between convolution and correlation is that the convolution operation rotates the filter by 180 degrees. Input X is a 3x3 matrix and filter F is a 2x2 matrix, as shown below: The convolution between input X and filter F gives us an output O. This can be represented as: To derive the equations of the gradients for the filter values and the input matrix values, we will consider that ...
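The "convolution = correlation with a 180-degree-rotated filter" relationship can be sketched in plain Python for the 3x3 input / 2x2 filter case described above (the concrete numbers in the usage example are illustrative, not taken from the post):

```python
def rotate180(f):
    """Rotate a 2D filter by 180 degrees (flip rows, then flip each row)."""
    return [row[::-1] for row in f[::-1]]

def conv2d_valid(x, f):
    """Valid convolution: slide the 180-degree-rotated filter over x."""
    fr = rotate180(f)
    kh, kw = len(fr), len(fr[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [
        [
            sum(x[i + m][j + n] * fr[m][n]
                for m in range(kh) for n in range(kw))
            for j in range(ow)
        ]
        for i in range(oh)
    ]

X = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
F = [[1, 2],
     [3, 4]]
print(conv2d_valid(X, F))  # 2x2 output O
```

Note that a 3x3 input convolved with a 2x2 filter gives a 2x2 output, matching the setup used in the derivation.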

Convolutional Neural Networks (Part-2)

  Cross-Correlation

Cross-correlation is a measure of similarity of two series as a function of the displacement of one relative to the other. It is also known as a sliding dot product or sliding inner product. It is commonly used for searching a long signal for a shorter, known feature, and has applications in pattern recognition, single particle analysis, electron tomography, averaging, cryptanalysis, and neurophysiology.

Steps followed in cross-correlation:
1. Take two matrices with the same dimensions.
2. Multiply them one by one, element by element (i.e., not the dot product, just simple element-wise multiplication).
3. Sum the elements together.

VGG16

VGG16 is a convolutional neural network model proposed by K. Simonyan and A. Zisserman from the University of Oxford in the paper "Very Deep Convolutional Networks for Large-Scale Image Recognition". The model achieves 92.7% top-5 test accuracy in Image...
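The three cross-correlation steps above can be sketched as a single function (the example matrices are illustrative):

```python
def cross_correlation_step(a, b):
    """One cross-correlation step for two same-sized matrices:
    multiply element by element, then sum everything."""
    assert len(a) == len(b) and all(len(ra) == len(rb) for ra, rb in zip(a, b))
    return sum(x * y
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))

A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]
print(cross_correlation_step(A, B))  # 1*5 + 2*6 + 3*7 + 4*8 = 70
```

In a full CNN layer this step is repeated at every position as the filter slides over the input, which is where the "sliding dot product" name comes from.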

Convolutional Neural Networks (Part-1)

  Convolutional Neural Networks

Convolutional neural networks are designed to process data through multiple layers of arrays. This type of neural network is used in applications like image recognition and face recognition. The primary difference between a CNN and an ordinary neural network is that a CNN takes its input as a two-dimensional array and operates directly on the images, rather than relying on the separate feature-extraction step that other neural networks depend on. The classic, and arguably most popular, use case for these networks is image processing. Image classification is the task of taking an input image and outputting a class (a cat, dog, etc.) or a probability over classes that best describes the image. When a computer sees an image (takes an image as input), it sees an array of pixel values. Depending on the resolution and size of the image, it might see a 32 x 32 x 3 array of numbers (the 3 refers to the RGB values). What we want the computer to do is to be able to differentiate ...
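The "array of pixel values" view can be sketched in plain Python: a 32 x 32 RGB image is just 32 rows of 32 pixels, each pixel holding three channel values (the all-zero pixels here are a placeholder, not real image data):

```python
# A 32 x 32 x 3 image as nested lists: rows -> pixels -> [R, G, B].
image = [[[0, 0, 0] for _ in range(32)] for _ in range(32)]

rows = len(image)               # 32
cols = len(image[0])            # 32
channels = len(image[0][0])     # 3 (R, G, B)
total_values = rows * cols * channels
print(rows, cols, channels, total_values)  # 32 32 3 3072
```

So even this tiny image hands the network 3,072 raw numbers; the convolution layers described in the later parts are what turn those numbers into usable features.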