Challenge: Plotting Attention Heatmaps | Applying Transformers to NLP Tasks

Section 3. Chapter 4

Visualizing attention weights with a heatmap helps you interpret how a transformer model distributes its focus across a sentence. You can use matplotlib to plot a heatmap in which both the x-axis and the y-axis are labeled with the tokens of the sentence. Each cell shows the attention weight between a pair of tokens: the row corresponds to the query token, and the column corresponds to the key token.

Start by splitting your sentence into tokens:

sentence = "Transformers help models focus on important words."
tokens = sentence.split()
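It can help to inspect the result before going further. The short check below is only an illustration: it shows that splitting on whitespace keeps punctuation attached to the last word, so the final token is "words." rather than "words". Real transformer tokenizers usually produce subword tokens instead, so treat this split as a simplification for the exercise.

```python
sentence = "Transformers help models focus on important words."
tokens = sentence.split()  # split on whitespace only

print(tokens)
# ['Transformers', 'help', 'models', 'focus', 'on', 'important', 'words.']
print(len(tokens))
# 7
```

The token count matters because the attention matrix needs one row and one column per token.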

Next, define your attention matrix as a NumPy array. The value at row i, column j is the attention weight that query token i assigns to key token j, so the matrix needs one row and one column per token:

import numpy as np
attention = np.array([
    [0.20, 0.10, 0.05, 0.10, 0.25, 0.10, 0.20],
    [0.05, 0.30, 0.10, 0.10, 0.15, 0.20, 0.10],
    [0.10, 0.15, 0.35, 0.10, 0.10, 0.10, 0.10],
    [0.10, 0.10, 0.10, 0.30, 0.10, 0.15, 0.15],
    [0.15, 0.10, 0.10, 0.10, 0.30, 0.10, 0.15],
    [0.10, 0.10, 0.15, 0.10, 0.10, 0.35, 0.10],
    [0.20, 0.15, 0.10, 0.10, 0.10, 0.10, 0.25],
])
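Before plotting, a quick sanity check can catch a mistyped matrix. The sketch below assumes the weights come from a softmax over the key tokens, as they would in a standard transformer, so each row should sum to 1. It also confirms that the matrix shape matches the number of tokens. The variable names `n` and `row_sums` are illustrative only.

```python
import numpy as np

sentence = "Transformers help models focus on important words."
tokens = sentence.split()

attention = np.array([
    [0.20, 0.10, 0.05, 0.10, 0.25, 0.10, 0.20],
    [0.05, 0.30, 0.10, 0.10, 0.15, 0.20, 0.10],
    [0.10, 0.15, 0.35, 0.10, 0.10, 0.10, 0.10],
    [0.10, 0.10, 0.10, 0.30, 0.10, 0.15, 0.15],
    [0.15, 0.10, 0.10, 0.10, 0.30, 0.10, 0.15],
    [0.10, 0.10, 0.15, 0.10, 0.10, 0.35, 0.10],
    [0.20, 0.15, 0.10, 0.10, 0.10, 0.10, 0.25],
])

n = len(tokens)

# One row (query) and one column (key) per token
print(attention.shape == (n, n))

# Assumption: softmax-normalized weights, so every row sums to 1
row_sums = attention.sum(axis=1)
print(np.allclose(row_sums, 1.0))
```

Both checks should print True for the example matrix. `np.allclose` is used instead of `==` because floating-point sums are rarely exactly 1.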

To plot the heatmap, use the following code:

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 6))
im = ax.imshow(attention, cmap="viridis")

ax.set_xticks(np.arange(len(tokens)))
ax.set_yticks(np.arange(len(tokens)))
ax.set_xticklabels(tokens, rotation=45, ha="right")
ax.set_yticklabels(tokens)

cbar = plt.colorbar(im, ax=ax)
cbar.set_label("Attention Weight")

ax.set_title("Attention Heatmap")
ax.set_xlabel("Key Tokens")
ax.set_ylabel("Query Tokens")

plt.tight_layout()
plt.show()

With the viridis colormap used here, brighter (yellow) cells indicate higher attention weights and darker (purple) cells indicate lower ones. Other colormaps may reverse this, so always check the colorbar. Each row shows how one query token distributes its attention across the key tokens. For example, if the cell at row focus and column important is bright, the model assigns a large weight to important when processing focus. This visualization helps you understand which parts of the input sentence influence each other and is useful for diagnosing or interpreting model behavior in natural language processing tasks.
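The same reading can be done numerically. The sketch below, using the example matrix from this chapter, looks up a single cell by token name and then finds the key token with the largest weight for each query token. The names `q`, `k`, and `top_keys` are illustrative and not part of the original lesson code.

```python
import numpy as np

sentence = "Transformers help models focus on important words."
tokens = sentence.split()

attention = np.array([
    [0.20, 0.10, 0.05, 0.10, 0.25, 0.10, 0.20],
    [0.05, 0.30, 0.10, 0.10, 0.15, 0.20, 0.10],
    [0.10, 0.15, 0.35, 0.10, 0.10, 0.10, 0.10],
    [0.10, 0.10, 0.10, 0.30, 0.10, 0.15, 0.15],
    [0.15, 0.10, 0.10, 0.10, 0.30, 0.10, 0.15],
    [0.10, 0.10, 0.15, 0.10, 0.10, 0.35, 0.10],
    [0.20, 0.15, 0.10, 0.10, 0.10, 0.10, 0.25],
])

# Look up one cell by token name: row = query, column = key
q = tokens.index("focus")
k = tokens.index("important")
print(f"focus -> important: {attention[q, k]:.2f}")

# For each query token (row), find the key token (column) with the largest weight
top_keys = attention.argmax(axis=1)
for query, key_idx in zip(tokens, top_keys):
    print(f"{query:>12} attends most to {tokens[key_idx]}")
```

In this example matrix, the focus-to-important weight is 0.15, and every token except Transformers gives its largest weight to itself, which is why the diagonal of the heatmap stands out.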

Now, run the code to see the resulting heatmap, and then create your own visualization in the task below.


import numpy as np
import matplotlib.pyplot as plt

sentence = "Transformers help models focus on important words."
tokens = sentence.split()

attention = np.array([
    [0.20, 0.10, 0.05, 0.10, 0.25, 0.10, 0.20],
    [0.05, 0.30, 0.10, 0.10, 0.15, 0.20, 0.10],
    [0.10, 0.15, 0.35, 0.10, 0.10, 0.10, 0.10],
    [0.10, 0.10, 0.10, 0.30, 0.10, 0.15, 0.15],
    [0.15, 0.10, 0.10, 0.10, 0.30, 0.10, 0.15],
    [0.10, 0.10, 0.15, 0.10, 0.10, 0.35, 0.10],
    [0.20, 0.15, 0.10, 0.10, 0.10, 0.10, 0.25],
])

fig, ax = plt.subplots(figsize=(8, 6))
im = ax.imshow(attention, cmap="viridis")

ax.set_xticks(np.arange(len(tokens)))
ax.set_yticks(np.arange(len(tokens)))
ax.set_xticklabels(tokens, rotation=45, ha="right")
ax.set_yticklabels(tokens)

cbar = plt.colorbar(im, ax=ax)
cbar.set_label("Attention Weight")

ax.set_title("Attention Heatmap")
ax.set_xlabel("Key Tokens")
ax.set_ylabel("Query Tokens")

plt.tight_layout()
plt.show()

Task

Plot an attention heatmap for the sentence "Attention helps models understand context." using the following attention matrix:

attention = [
    [0.25, 0.15, 0.20, 0.20, 0.20],
    [0.10, 0.40, 0.15, 0.20, 0.15],
    [0.15, 0.10, 0.35, 0.20, 0.20],
    [0.20, 0.15, 0.20, 0.25, 0.20],
    [0.15, 0.20, 0.20, 0.20, 0.25],
]

Use matplotlib to create a heatmap;
Label both axes with the sentence tokens;
Add a colorbar labeled "Attention Weight";
Title the plot "Attention Heatmap".

Solution
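One possible solution is sketched below. It follows the same steps as the example earlier in this chapter and is not necessarily the reference solution for the course. The figure size and the viridis colormap are assumptions carried over from that example, and the task's list is converted to a NumPy array for consistency.

```python
import numpy as np
import matplotlib.pyplot as plt

sentence = "Attention helps models understand context."
tokens = sentence.split()  # 5 whitespace-separated tokens

# Attention matrix from the task, converted to a NumPy array
attention = np.array([
    [0.25, 0.15, 0.20, 0.20, 0.20],
    [0.10, 0.40, 0.15, 0.20, 0.15],
    [0.15, 0.10, 0.35, 0.20, 0.20],
    [0.20, 0.15, 0.20, 0.25, 0.20],
    [0.15, 0.20, 0.20, 0.20, 0.25],
])

# Figure size is an arbitrary choice for a 5x5 grid
fig, ax = plt.subplots(figsize=(6, 5))
im = ax.imshow(attention, cmap="viridis")

# Label both axes with the sentence tokens
ax.set_xticks(np.arange(len(tokens)))
ax.set_yticks(np.arange(len(tokens)))
ax.set_xticklabels(tokens, rotation=45, ha="right")
ax.set_yticklabels(tokens)

# Colorbar with the required label
cbar = plt.colorbar(im, ax=ax)
cbar.set_label("Attention Weight")

# Required title
ax.set_title("Attention Heatmap")

plt.tight_layout()
plt.show()
```

The largest weight in this matrix is 0.40, at row helps and column helps, so that cell should be the brightest in the plot.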
