1- By applying a convolution we can isolate specific features of an image. In this example the filters emphasize sharp edges and straight lines, so only those features pass through the convolution strongly.
The first image is the original picture.
The second image emphasizes the horizontal lines; this is clearest in the top left of the picture.
The third image emphasizes the vertical lines.
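As a rough sketch of how these two filtered results can be produced (the specific filter values and the SciPy ‘ascent’ sample image are assumptions, following the standard convolution exercise):

    import numpy as np
    from scipy import misc  # newer SciPy versions expose this image as scipy.datasets.ascent()

    # Load the 512x512 grayscale sample image.
    image = misc.ascent().astype(float)

    # 3x3 Sobel-style filters: one responds to horizontal lines, the other to vertical lines.
    horizontal_filter = np.array([[-1, -2, -1],
                                  [ 0,  0,  0],
                                  [ 1,  2,  1]])
    vertical_filter = np.array([[-1, 0, 1],
                                [-2, 0, 2],
                                [-1, 0, 1]])

    def convolve(img, kernel, weight=1.0):
        # Slide the 3x3 kernel over every interior pixel, clipping results to 0-255.
        size_y, size_x = img.shape
        out = np.zeros_like(img)
        for y in range(1, size_y - 1):
            for x in range(1, size_x - 1):
                value = np.sum(img[y-1:y+2, x-1:x+2] * kernel) * weight
                out[y, x] = min(255, max(0, value))
        return out

    horizontal_edges = convolve(image, horizontal_filter)  # the second image above
    vertical_edges = convolve(image, vertical_filter)      # the third image above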
Pooling reduces the amount of information in the image while still preserving the features that were identified. In this example the original image is 512×512 pixels; after pooling, the code ‘new_x = int(size_x/2)’ and ‘new_y = int(size_y/2)’ reduces the image to 256×256 pixels. We are using max pooling: for each 2×2 block, it looks at a pixel and its immediate neighbors, takes the largest value, and writes it to the new image. The pooled image is therefore only ¼ the size of the original.
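A minimal sketch of that 2×2 max-pooling step, reusing the ‘horizontal_edges’ array from the sketch above:

    # 2x2 max pooling: halve each dimension, keeping the largest value in every 2x2 block.
    size_y, size_x = horizontal_edges.shape
    new_x = int(size_x / 2)
    new_y = int(size_y / 2)

    pooled = np.zeros((new_y, new_x))
    for y in range(0, size_y, 2):
        for x in range(0, size_x, 2):
            # Take the largest of the pixel and its immediate neighbors in the 2x2 block.
            pooled[int(y / 2), int(x / 2)] = np.max(horizontal_edges[y:y+2, x:x+2])

    # pooled.shape is now (256, 256): one quarter of the original 512x512 pixels.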
The CNN performed better because the convolution layers allowed it to recognize features more effectively. Adding more convolution layers improved the accuracy, but it also lengthened the run time, since the data had to pass through additional filters. Using the end_epochs callback function, I ran the code with two layers of convolution and max pooling; a sketch of this setup appears after the results below.
When both layers were set to 32 filters, the accuracy was 99.1%.
When the layers were set to 64 and 16 filters, the accuracy was 98.78%.
When the layers were set to 64 and 32 filters, the accuracy was 98.83%.
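The following is a sketch of that two-convolution-layer setup with an accuracy-based stopping callback in the spirit of end_epochs; the MNIST dataset, the 99% threshold, and the dense-layer sizes are assumptions, not the exact original code:

    import tensorflow as tf

    # Callback in the spirit of end_epochs: stop training once accuracy passes a threshold.
    class EndEpochs(tf.keras.callbacks.Callback):
        def on_epoch_end(self, epoch, logs=None):
            if logs and logs.get('accuracy', 0) > 0.99:  # assumed target accuracy
                print('\nReached target accuracy, stopping training.')
                self.model.stop_training = True

    # MNIST is assumed as the dataset behind the reported accuracies.
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train = x_train.reshape(-1, 28, 28, 1) / 255.0
    x_test = x_test.reshape(-1, 28, 28, 1) / 255.0

    # Two convolution + max-pooling layers; 32/32 filters match the first run above
    # and can be swapped for 64/16 or 64/32 to reproduce the other two results.
    model = tf.keras.models.Sequential([
        tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(2, 2),
        tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
        tf.keras.layers.MaxPooling2D(2, 2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=10, callbacks=[EndEpochs()])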