What is Filter Visualization?
Deep Neural Networks are often seen as black-boxes: they map some input to some output, and we can make them do this surprisingly well. However, we usually have no idea how this mapping works. Particularly Convolutional Neural Networks (CNNs), which employ “convolutions” as filters, achieved some impressive results (before Vision Transformers came along).
Filter Visualization can help us understand what kind of patterns the convolutional filters in CNNs detect.
Why would we want to do it?
Visualizing filters can help us to get an understanding of what the neural network is doing. The method can also be used to identify filters that are not required for the model, because they are redundant copies of other filters, or compute not valuable features at all.
How does it work?
Filter visualization aims to find the input $x$ that activates a certain convolutional filter the most. Mathematically, this means we are solving $$ \arg \max_x \mathcal{L} (f(x)) = \sqrt{\sum_i \sum_j f(x)_{ij}^2} $$
where $f(x)_{ij}$ refers to the value at position $i,j$ in the feature map (the output of the filter) computed by $f$.
In practice, we solve this optimization problem via gradient descent (or, in this case, ascend, since we aim to maximize the activation). That is: we start with a randomly initialized input, $x$, calculate the gradient of the magnitude of the filter activation $\nabla_x \mathcal{L}$, and iteratively update $x$ to increase the magnitude.
|
|
Results
What we observe is that, the deeper we go, the more abstract the features become. While the lower features - lines with different orientations, certain colors, and color blobs - are comparatively straightforward to intepret, guessing the meaning of the lower-level features feels more like a Rorschach test.
These observations are evidence for the hypothesis that neural networks learn increasingly abstract, high-level features in upper layers. On the other hand, this also means that we can not really get an understanding of what these lower layers are doing.