Scoring Functions and Triple Plausibility
When working with knowledge graph embeddings, you often need a way to determine how likely a given triple — composed of a head entity, a relation, and a tail entity — is to be true within the graph. For example, the triple (Paris, capital_of, France) asserts that Paris is the capital of France. Scoring functions are mathematical formulas that take the vector representations (embeddings) of the head, relation, and tail and output a numerical score reflecting how plausible the triple is.
A typical scoring function is written as:
$$f(h, r, t) = -\lVert h + r - t \rVert$$

where $h$, $r$, and $t$ are the embedding vectors for the head, relation, and tail, and $\lVert \cdot \rVert$ denotes a vector norm (such as L1 or L2). Because the norm is negated, a higher score (closer to zero) means the triple is more likely to be valid, while a lower, more negative score suggests it is less plausible. These functions are central to training and evaluating embedding models, as they guide the model to assign higher scores to true triples and lower scores to false ones.
```python
import numpy as np

def tr_score_L2(head_emb, rel_emb, tail_emb):
    """
    TransE-style L2 scoring function for a triple (h, r, t).
    Returns the negated distance, so higher scores (closer to zero)
    indicate higher plausibility.
    """
    return -np.linalg.norm(head_emb + rel_emb - tail_emb, ord=2)

# Example usage:
head = np.array([0.3, 0.7, 0.5])
rel = np.array([0.2, -0.1, 0.4])
tail = np.array([0.5, 0.6, 0.9])

score = tr_score_L2(head, rel, tail)
print("Triple plausibility score (L2):", score)
```
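To see how the score separates true triples from false ones, the minimal sketch below continues the snippet above and compares the original triple against a corrupted one; the corrupted tail vector is a hypothetical value chosen purely for illustration:

```python
# Hypothetical embedding of an unrelated entity (illustrative value only):
corrupted_tail = np.array([-0.8, 0.1, -0.4])

true_score = tr_score_L2(head, rel, tail)             # h + r lands exactly on t
false_score = tr_score_L2(head, rel, corrupted_tail)  # h + r is far from t

print("True triple score:     ", true_score)    # -0.0 (the maximum possible score)
print("Corrupted triple score:", false_score)   # about -1.91
```

This is exactly the contrast a training objective exploits: plausible triples should score near zero, corrupted ones far below it.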
Besides the L2 (Euclidean) distance, you can use other distance metrics as scoring functions, such as the L1 (Manhattan) distance or even more complex similarity measures. The choice of metric can significantly influence how the model learns and which patterns it captures in the data.
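As a point of comparison, here is a minimal sketch of the same translation-based score under the L1 norm; the function name `tr_score_L1` and the toy embedding values are ours, introduced purely for illustration:

```python
import numpy as np

def tr_score_L1(head_emb, rel_emb, tail_emb):
    """
    TransE-style L1 (Manhattan) scoring function.
    Higher scores (closer to zero) indicate higher plausibility.
    """
    return -np.linalg.norm(head_emb + rel_emb - tail_emb, ord=1)

# Toy embeddings (illustrative values only):
head = np.array([0.3, 0.7, 0.5])
rel = np.array([0.2, -0.1, 0.4])
tail = np.array([0.6, 0.5, 0.8])

# L1 sums absolute coordinate-wise differences, so it penalizes many small
# deviations less sharply than L2, which squares each difference.
print("Triple plausibility score (L1):", tr_score_L1(head, rel, tail))
```

Swapping `ord=1` for `ord=2` is the only change needed, which makes it easy to treat the norm as a hyperparameter and compare both empirically.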