Valid Padding in Convolutional Neural Networks

neural_network
machine_learning

#1

Hi all,
I was going through the valid padding technique in convolutional neural nets in the Udacity deep learning course, where I came across a quiz: we have a 28x28 image on which we run a 3x3 convolution with an input depth of 3 and an output depth of 8, and we have to write the output dimensions. I have attached a screenshot below with the solution.

I am unable to understand the output dimension for valid padding with a stride of 1. Why does the output dimension turn out to be 26*26? Please help me with that.
Thanks in advance,
Syed Danish


#2

This type of problem is easier to explain through visualization. I would recommend looking at the convolution demo here. (It's in the summary part of the convolutional layer explanation.)


#3

I used some examples to understand this. You can draw them and work this out the way I did.

Stride mechanism for valid padding -

Case 1:
image size 4x4x3, filter size = 3x3.

I start at index 0, take 1 stride, and reach the end of the image, so it's time for me to go downwards.
That stride gives me 1 output pixel, and the final place where I stopped gives me 1 more. So in total I get 2 output pixels for the left-to-right pass.
In the same manner, I shift my filter from top to bottom and again get 2 pixels in that direction.

We can think of it like this: for each stride, I get 1 output pixel.
I asked myself: how many strides do I take before I reach the last position when traversing from left to right? With a stride of 1, the number of strides must be (input_layer_size - filter_size), which for the above example is 4 - 3 = 1.
Then for the place where I stop, at the end of the image, I get 1 more pixel. So overall 1 + 1 = 2 pixels.
Now coming to your example -
No. of strides I take before I reach the last position (in L to R traversal) = 28 - 3 = 25, so I get 25 output pixels.
Now for the last position I get 1 more output pixel => total output pixels = 25 + 1 = 26.
Similarly, I move downwards and get 26 again, so the output is 26x26. I hope you are able to visualise it.
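
If drawing it out is tedious, the same counting can be sketched in a few lines of Python. This is only an illustration of the argument above, not anything from the course; the function name valid_positions is made up for this example. It slides the filter along one axis and records every start index where the filter still fits entirely inside the image.

```python
def valid_positions(input_size, filter_size, stride=1):
    """Return the start indices where the filter fits fully inside one axis.

    Hypothetical helper for illustration; valid padding means the filter
    is never allowed to hang over the edge of the image.
    """
    positions = []
    start = 0
    # The filter covers indices start .. start + filter_size - 1,
    # so it fits as long as start + filter_size <= input_size.
    while start + filter_size <= input_size:
        positions.append(start)
        start += stride
    return positions


print(valid_positions(4, 3))        # start indices for the 4x4 example
print(len(valid_positions(28, 3)))  # output width for the 28x28 quiz image
```

Each start index produces one output pixel, so the length of the returned list is the output size along that axis.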

Generally -
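(stride_length) * x + filter_size <= input_layer_size
The dimensions will then be (x+1, x+1, depth), where x is the largest number of strides you can take, i.e. the largest whole number satisfying the inequality above, and depth is the output depth (8 in your quiz).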
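
Solving that inequality for the largest whole x gives x = floor((input_layer_size - filter_size) / stride_length). A minimal sketch of the rule in Python follows, assuming a square image and a square filter; the function name valid_output_shape and its parameter names are made up for this example.

```python
def valid_output_shape(input_size, filter_size, stride, output_depth):
    """Output shape of a valid-padding convolution on a square image.

    Hypothetical helper for illustration. x is the largest whole number
    with stride * x + filter_size <= input_size.
    """
    if filter_size > input_size:
        raise ValueError("filter does not fit inside the image")
    x = (input_size - filter_size) // stride  # number of strides possible
    return (x + 1, x + 1, output_depth)


print(valid_output_shape(28, 3, 1, 8))  # the quiz: 28x28 image, 3x3 filter
print(valid_output_shape(4, 3, 1, 8))   # the 4x4 example
```

For the quiz this gives x = 25, so the output is 26 x 26 x 8. The input depth of 3 does not appear in the result, because each filter spans all input channels and produces a single output channel.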