single
Challenge: Plotting Attention Heatmaps
Swipe to show menu
Visualizing attention weights with a heatmap helps you interpret how a transformer model distributes its focus across a sentence. You use matplotlib to plot a heatmap where both the x-axis and y-axis represent the tokens from the sentence. Each cell in the heatmap shows the attention weight between a pair of tokens: the row corresponds to the query token, and the column corresponds to the key token.
Start by splitting your sentence into tokens:
sentence = "Transformers help models focus on important words."
tokens = sentence.split()
Next, define your attention matrix as a NumPy array. Each value represents the attention weight from one token to another:
import numpy as np
attention = np.array([
[0.20, 0.10, 0.05, 0.10, 0.25, 0.10, 0.20],
[0.05, 0.30, 0.10, 0.10, 0.15, 0.20, 0.10],
[0.10, 0.15, 0.35, 0.10, 0.10, 0.10, 0.10],
[0.10, 0.10, 0.10, 0.30, 0.10, 0.15, 0.15],
[0.15, 0.10, 0.10, 0.10, 0.30, 0.10, 0.15],
[0.10, 0.10, 0.15, 0.10, 0.10, 0.35, 0.10],
[0.20, 0.15, 0.10, 0.10, 0.10, 0.10, 0.25],
])
To plot the heatmap, use the following code:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(8, 6))
im = ax.imshow(attention, cmap="viridis")
ax.set_xticks(np.arange(len(tokens)))
ax.set_yticks(np.arange(len(tokens)))
ax.set_xticklabels(tokens, rotation=45, ha="right")
ax.set_yticklabels(tokens)
cbar = plt.colorbar(im, ax=ax)
cbar.set_label("Attention Weight")
ax.set_title("Attention Heatmap")
ax.set_xlabel("Key Tokens")
ax.set_ylabel("Query Tokens")
plt.tight_layout()
plt.show()
Brighter or darker colors indicate higher or lower attention values, depending on the colormap. When you look at the heatmap, you can see which words the model pays the most attention to when processing each token. For example, if the cell at row focus and column important is bright, the model strongly connects focus to important in its internal representation. This visualization helps you understand which parts of the input sentence influence each other and is useful for diagnosing or interpreting model behavior in natural language processing tasks.
Now, execute the code to see the resulting heatmap and then write your first visualizing plot.
123456789101112131415161718192021222324252627282930313233import numpy as np import matplotlib.pyplot as plt sentence = "Transformers help models focus on important words." tokens = sentence.split() attention = np.array([ [0.20, 0.10, 0.05, 0.10, 0.25, 0.10, 0.20], [0.05, 0.30, 0.10, 0.10, 0.15, 0.20, 0.10], [0.10, 0.15, 0.35, 0.10, 0.10, 0.10, 0.10], [0.10, 0.10, 0.10, 0.30, 0.10, 0.15, 0.15], [0.15, 0.10, 0.10, 0.10, 0.30, 0.10, 0.15], [0.10, 0.10, 0.15, 0.10, 0.10, 0.35, 0.10], [0.20, 0.15, 0.10, 0.10, 0.10, 0.10, 0.25], ]) fig, ax = plt.subplots(figsize=(8, 6)) im = ax.imshow(attention, cmap="viridis") ax.set_xticks(np.arange(len(tokens))) ax.set_yticks(np.arange(len(tokens))) ax.set_xticklabels(tokens, rotation=45, ha="right") ax.set_yticklabels(tokens) cbar = plt.colorbar(im, ax=ax) cbar.set_label("Attention Weight") ax.set_title("Attention Heatmap") ax.set_xlabel("Key Tokens") ax.set_ylabel("Query Tokens") plt.tight_layout() plt.show()
Swipe to start coding
Plot an attention heatmap for the sentence "Attention helps models understand context." using the following attention matrix:
attention = [
[0.25, 0.15, 0.20, 0.20, 0.20],
[0.10, 0.40, 0.15, 0.20, 0.15],
[0.15, 0.10, 0.35, 0.20, 0.20],
[0.20, 0.15, 0.20, 0.25, 0.20],
[0.15, 0.20, 0.20, 0.20, 0.25],
]
- Use
matplotlibto create a heatmap; - Label both axes with the sentence tokens;
- Add a colorbar labeled "Attention Weight;"
- Title the plot "Attention Heatmap."
Solution
Thanks for your feedback!
single
Ask AI
Ask AI
Ask anything or try one of the suggested questions to begin our chat