Filter Visualization

What is Filter Visualization?

Deep neural networks are often seen as a black box: they map some input to some output, and we can make them do this surprisingly well. However, we usually have no idea how this mapping works. Filter visualization is one way to peek inside the box: it synthesizes the input that activates a chosen convolutional filter the most, giving us a visual impression of what that filter responds to.

Why would we want to do it?

Visualizing filters can help us get an understanding of what the neural network is doing. The method can also be used to identify filters that are not required by the model, because they are copies of other filters or do not compute valuable features at all.

How does it work?

Filter visualization, as mentioned above, aims to find the input $x$ that activates a certain convolutional filter $f$ the most. Mathematically, this means we are solving $$ \arg \max_x \mathcal{L}(x) \quad \text{with} \quad \mathcal{L}(x) = \sqrt{\sum_i \sum_j f(x)_{ij}^2}, $$

where $f(x)_{ij}$ refers to the value at position $i,j$ in the feature map computed by $f$.
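
In code, this loss is simply the L2 (Frobenius) norm of the filter's feature map. A minimal PyTorch sketch, with a dummy random tensor standing in for $f(x)$:

import torch

# dummy feature map standing in for f(x); in practice it comes from a conv layer
feature_map = torch.randn(64, 64)

# sqrt of the sum of squared activations, i.e. the L2 / Frobenius norm
loss = feature_map.pow(2).sum().sqrt()

# equivalent formulation
assert torch.isclose(loss, torch.linalg.norm(feature_map))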

In practice, we solve this optimization problem via gradient descent (or rather, gradient ascent, since we aim to maximize the activation). That is: we start with a randomly initialized $x$, calculate $\nabla_x \mathcal{L}$, and iteratively update $x$ accordingly.
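
Concretely, each iteration moves $x$ a small step in the direction of the gradient, $$ x \leftarrow x + \eta \, \nabla_x \mathcal{L}(x), $$ where the step size $\eta$ is 0.01 in the snippet below; the gradient is additionally normalized by its RMS before the update so that the effective step size stays comparable across iterations.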

import torch
from torchvision.models import resnet101

# assuming a pretrained ResNet-101 from torchvision (matching the results below)
net = resnet101(pretrained=True).cuda().eval()
filter_no = 0  # index of the filter to visualize

# start from a randomly initialized input image
x = torch.randn(size=(3, 256, 256))

# add a batch dimension and enable gradients w.r.t. the input
x_v = x.unsqueeze(0).cuda().requires_grad_(True)

for i in range(50):
    # forward pass through the first convolutional layer only
    out = net.conv1(x_v)
    f = out[0, filter_no]

    # L2 norm of the selected filter's feature map
    loss = f.pow(2).sum().sqrt()

    # zero gradients from the previous iteration
    if x_v.grad is not None:
        x_v.grad.zero_()

    loss.backward()

    with torch.no_grad():
        # gradient normalization and update (gradient ascent)
        x_v.grad /= x_v.grad.pow(2).mean().sqrt() + 1e-6
        x_v += x_v.grad * 0.01
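
The snippet above only targets the first convolutional layer (net.conv1). To visualize filters of deeper layers, such as the last block of the ResNet-101 shown in the results below, one option is to register a forward hook that captures the intermediate feature map during a full forward pass. A sketch of this idea, assuming (as a hypothetical choice) that we hook the last block of layer4:

import torch
from torchvision.models import resnet101

net = resnet101(pretrained=True).cuda().eval()
filter_no = 0  # index of the filter to visualize (arbitrary choice)

# the hook stores the activation of the hooked module on every forward pass
activations = {}

def store_activation(module, inputs, output):
    activations["feat"] = output

# hook the last bottleneck block of layer4; any intermediate module works the same way
handle = net.layer4[-1].register_forward_hook(store_activation)

x_v = torch.randn(1, 3, 256, 256, device="cuda", requires_grad=True)

for i in range(50):
    net(x_v)  # full forward pass; the hook captures the intermediate feature map
    f = activations["feat"][0, filter_no]

    loss = f.pow(2).sum().sqrt()

    if x_v.grad is not None:
        x_v.grad.zero_()
    loss.backward()

    with torch.no_grad():
        x_v.grad /= x_v.grad.pow(2).mean().sqrt() + 1e-6
        x_v += x_v.grad * 0.01

handle.remove()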

Results

What we observe is that the deeper we go, the more abstract the features become. While the low-level features - lines with different orientations, certain colors, and color blobs - are comparatively straightforward to interpret, guessing the meaning of the higher-level features feels more like a Rorschach test.

Filter visualization of the first convolutional layer of a ResNet 101

These observations are evidence for the hypothesis that neural networks learn increasingly abstract, high-level features in upper layers. On the other hand, this also means that we cannot really get an understanding of what these upper layers are doing.

Filter visualization of the last layer of a ResNet 101


Last Updated: 27 Jul. 2022
Categories: Deep Learning
Tags: Deep Learning · Introspection · Explainability