Building a platform for exploring and visualizing deep convolutional networks
Deep artificial neural networks are showing great promise on tasks involving images, such as object recognition and image classification. In recent years, there has been a steady increase in computing power, which has opened up the possibility of training deeper and more complex artificial neural networks. This, in addition to improved training methods, has been a major contributing factor in creating a new wave of AI within computer science. However, as artificial intelligence becomes more complex, it gets increasingly harder to explain an AI's reasoning. It is intriguing that computers achieve human-like results on image classification tasks, but from a research point of view, the reason why they perform so well might be even more interesting.

In this thesis, we aim to gain a better understanding of deep convolutional neural networks. We attempt to increase our knowledge of these networks by creating a platform where one can apply different visualization methods to different pretrained network architectures. First, we introduce the concept of convolutional neural networks and how they work. We then give an introduction to the field of visualizing neural networks by explaining several state-of-the-art methods which aim to give a better understanding of neural networks. After introducing multiple methods, we explain our implementation of a selected few. We also introduce the platform we created for handling the different visualization techniques. Included in this platform is a user interface, which simplifies the process of applying visualization techniques to different networks and retrieving the results. Finally, we examine the implemented techniques, trying to explain their behavior and what information they can give us about a convolutional network. Additionally, we try to combine different visualization methods, to see if they offer any useful information beyond what each method offers individually.
Interpreting the results of the visualizations proved to be a challenging task, but we still feel there was some information to be gained from each distinct method. Certain techniques showed results which could be useful for troubleshooting faulty networks, while others indicated features which might be vital for correctly classifying images. Combining different techniques yielded results that were difficult to interpret clearly, but could prove to be a path worth researching further.