Evaluating Predictions: MRR and Hits@k

When you evaluate a link prediction model for knowledge graphs, you want to measure how well the model ranks the correct triples among all possible candidates. Two widely used ranking metrics are Mean Reciprocal Rank (MRR) and Hits@k.

Mean Reciprocal Rank (MRR) measures the average of the reciprocal ranks of the correct answers. For each test query, you rank all possible candidate entities by their predicted scores. The reciprocal rank is $\frac{1}{\mathrm{rank}}$, where $\mathrm{rank}$ is the position of the correct entity in the sorted list (with rank 1 being the highest score). You then average the reciprocal ranks over all queries:

$$\mathrm{MRR} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{\mathrm{rank}_i}$$

where $N$ is the number of queries and $\mathrm{rank}_i$ is the rank of the correct entity for the $i$-th query.
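
To make the formula concrete, here is a minimal sketch that computes MRR directly from a list of ranks (the ranks below are made-up values, with rank 1 being the best position):

import numpy as np

# Made-up ranks of the correct entity for four queries (rank 1 is best)
ranks = np.array([3, 1, 5, 2])

# MRR is the mean of the reciprocal ranks
mrr = np.mean(1.0 / ranks)
print("MRR:", mrr)  # (1/3 + 1/1 + 1/5 + 1/2) / 4 ≈ 0.508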

Hits@k measures the proportion of test queries for which the correct entity appears in the top $k$ predictions:

$$\mathrm{Hits@}k = \frac{1}{N} \sum_{i=1}^{N} \mathbb{I}(\mathrm{rank}_i \leq k)$$

where $\mathbb{I}$ is the indicator function.
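
The indicator simply checks whether each rank falls within the cutoff, so Hits@k can be computed the same way; a minimal sketch, reusing the same made-up ranks as above:

import numpy as np

# Made-up ranks of the correct entity for four queries (rank 1 is best)
ranks = np.array([3, 1, 5, 2])
k = 3

# The comparison plays the role of the indicator function I(rank <= k)
hits_k = np.mean(ranks <= k)
print(f"Hits@{k}:", hits_k)  # 3 of the 4 ranks are <= 3, so 0.75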

Let's walk through a step-by-step calculation. Suppose you have a test set with three queries, and for each query, your model assigns scores to candidate entities. The correct entity's rank for each query is as follows:

  • Query 1: Correct entity is ranked 2;
  • Query 2: Correct entity is ranked 1;
  • Query 3: Correct entity is ranked 4.

First, calculate the reciprocal ranks:

  • Query 1: $\frac{1}{2} = 0.5$;
  • Query 2: $\frac{1}{1} = 1.0$;
  • Query 3: $\frac{1}{4} = 0.25$.

To get the MRR, take the mean:

$$\mathrm{MRR} = \frac{0.5 + 1.0 + 0.25}{3} \approx 0.583$$

For Hits@1, count how many times the correct entity is ranked first: only once (Query 2), so

$$\mathrm{Hits@}1 = \frac{1}{3} \approx 0.333$$

For Hits@3, count how many times the correct entity is ranked in the top 3: Query 1 and Query 2 both qualify, so

$$\mathrm{Hits@}3 = \frac{2}{3} \approx 0.667$$
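
You can verify the hand calculation above with a few lines of NumPy, starting from the ranks of the three queries:

import numpy as np

# Ranks of the correct entity for the three queries in the walk-through
ranks = np.array([2, 1, 4])

mrr = np.mean(1.0 / ranks)
hits1 = np.mean(ranks <= 1)
hits3 = np.mean(ranks <= 3)

print("MRR:", round(mrr, 3))       # 0.583
print("Hits@1:", round(hits1, 3))  # 0.333
print("Hits@3:", round(hits3, 3))  # 0.667

In practice you usually start from raw prediction scores rather than precomputed ranks. The code below derives the rank of the correct entity for each query by sorting the scores, then computes MRR and Hits@3 from those ranks.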
import numpy as np

# Example: predicted scores for 3 queries, each with 4 candidate entities
# Rows: queries, Columns: candidate entities
scores = np.array([
    [0.2, 0.9, 0.3, 0.5],  # Query 1
    [0.8, 0.1, 0.4, 0.7],  # Query 2
    [0.6, 0.2, 0.9, 0.1]   # Query 3
])

# The index of the correct entity for each query
correct_indices = np.array([1, 0, 2])

def compute_mrr_and_hitsk(scores, correct_indices, k=3):
    mrr = 0
    hits_k = 0
    num_queries = scores.shape[0]
    for i in range(num_queries):
        # Sort scores descending, get ranking
        ranking = np.argsort(-scores[i])
        # Find rank (1-based) of the correct entity
        rank = np.where(ranking == correct_indices[i])[0][0] + 1
        mrr += 1.0 / rank
        if rank <= k:
            hits_k += 1
    mrr /= num_queries
    hits_k /= num_queries
    return mrr, hits_k

mrr, hits3 = compute_mrr_and_hitsk(scores, correct_indices, k=3)
print("MRR:", mrr)
print("Hits@3:", hits3)
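
Continuing from the code above, the same function also gives Hits@1: pass k=1 so that only rank-1 predictions count as hits. In this particular score matrix every correct entity happens to be the top-scored candidate, so MRR, Hits@1, and Hits@3 all come out to 1.0.

# Reuse the function above with k=1 to get Hits@1
mrr, hits1 = compute_mrr_and_hitsk(scores, correct_indices, k=1)
print("Hits@1:", hits1)  # 1.0 here, since every correct entity is ranked first
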
Note
Study More

Other evaluation metrics for knowledge graphs include Mean Rank (MR), Area Under the Curve (AUC), and precision/recall at various cutoffs. These metrics can provide additional perspectives on model performance, especially in different application scenarios.
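
As a quick illustration, Mean Rank is simply the average of the raw ranks rather than their reciprocals (lower is better, and it is not bounded to [0, 1]); a minimal sketch using the ranks from the earlier walk-through:

import numpy as np

# Mean Rank (MR): average of the raw ranks of the correct entities
ranks = np.array([2, 1, 4])
mean_rank = np.mean(ranks)
print("Mean Rank:", mean_rank)  # (2 + 1 + 4) / 3 ≈ 2.33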
