Random Walk–Based Embedding Intuition
Random walk–based embedding methods such as DeepWalk and Node2Vec use the idea of generating sequences of nodes by simulating random walks on a graph. You can think of these sequences as being similar to sentences in natural language processing, where each node is like a word and the sequence captures the local structure around each node. By collecting many such node sequences, you can learn embeddings that capture both the neighborhood and structural roles of nodes in the graph.
When you perform a random walk starting from a node, you repeatedly move to a random neighbor for a fixed number of steps, recording the sequence of nodes you visit. These node sequences effectively summarize the graph's connectivity patterns. By treating these sequences as "contexts," you can use algorithms similar to word2vec to learn vector representations (embeddings) for each node. These embeddings can then be used for downstream tasks like node classification, link prediction, or clustering, because they encode both local and global graph information.
```python
import networkx as nx
import numpy as np

# Create a small graph
G = nx.Graph()
edges = [
    ("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"),
    ("C", "D"), ("D", "E"), ("E", "F"), ("F", "C")
]
G.add_edges_from(edges)

def random_walk(graph, start_node, walk_length):
    walk = [start_node]
    current = start_node
    for _ in range(walk_length - 1):
        neighbors = list(graph.neighbors(current))
        if len(neighbors) == 0:
            break
        next_node = np.random.choice(neighbors)
        walk.append(next_node)
        current = next_node
    return walk

np.random.seed(42)
walks = []
for node in G.nodes():
    walk = random_walk(G, node, walk_length=5)
    walks.append(walk)

for i, walk in enumerate(walks):
    print(f"Random walk from node {list(G.nodes())[i]}: {walk}")
```
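Once you have walks like these, the word2vec analogy becomes concrete: skip-gram training treats each node in a walk as a "target" and the nodes within a small window around it as its "context." The sketch below shows how such (target, context) pairs can be extracted from walks; `skipgram_pairs` is an illustrative helper name, not a library function.

```python
def skipgram_pairs(walks, window=2):
    """Extract (target, context) pairs from walks, skip-gram style."""
    pairs = []
    for walk in walks:
        for i, target in enumerate(walk):
            lo = max(0, i - window)
            hi = min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # a node is not its own context
                    pairs.append((target, walk[j]))
    return pairs

walks = [["A", "B", "C", "D", "E"]]
print(skipgram_pairs(walks, window=1))
# [('A', 'B'), ('B', 'A'), ('B', 'C'), ('C', 'B'),
#  ('C', 'D'), ('D', 'C'), ('D', 'E'), ('E', 'D')]
```

These pairs are exactly what a word2vec-style model consumes: nodes that co-occur in many walks end up with similar embeddings.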
DeepWalk uses uniform random walks, meaning at each step, you randomly choose one of the current node's neighbors with equal probability. This approach captures the general connectivity patterns of the graph and is simple to implement.
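"Equal probability" here means each neighbor is chosen with probability 1 / degree of the current node. A minimal sketch (reusing the small graph from above; `uniform_step_probs` is a hypothetical helper):

```python
import networkx as nx

G = nx.Graph([("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"),
              ("C", "D"), ("D", "E"), ("E", "F"), ("F", "C")])

# In DeepWalk, the probability of stepping from `node` to each of its
# neighbors is uniform: 1 / degree(node).
def uniform_step_probs(graph, node):
    neighbors = list(graph.neighbors(node))
    return {nbr: 1 / len(neighbors) for nbr in neighbors}

print(uniform_step_probs(G, "C"))  # C has 4 neighbors, each gets 0.25
```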
Node2Vec introduces biased random walks that interpolate between breadth-first and depth-first exploration. Two parameters control the bias: the return parameter p sets how likely the walk is to step back to the node it just left, and the in-out parameter q sets how likely it is to move further away from that node. A low p keeps the walk close to its starting region (breadth-first search–like behavior, capturing local neighborhood structure), while a low q pushes the walk outward (depth-first search–like behavior, capturing broader community structure), allowing you to tune how local or global the context should be.
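A minimal sketch of this biased step rule: each candidate neighbor is weighted 1/p if it is the previous node, 1 if it is also a neighbor of the previous node (same distance from it), and 1/q otherwise. `node2vec_walk` is an illustrative helper written for this example, not a library function.

```python
import random
import networkx as nx

G = nx.Graph([("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"),
              ("C", "D"), ("D", "E"), ("E", "F"), ("F", "C")])

def node2vec_walk(graph, start_node, walk_length, p=1.0, q=1.0):
    """One Node2Vec-style biased walk over an unweighted graph."""
    walk = [start_node]
    while len(walk) < walk_length:
        current = walk[-1]
        neighbors = list(graph.neighbors(current))
        if not neighbors:
            break
        if len(walk) == 1:
            # The first step has no "previous" node, so choose uniformly.
            walk.append(random.choice(neighbors))
            continue
        previous = walk[-2]
        weights = []
        for nbr in neighbors:
            if nbr == previous:
                weights.append(1.0 / p)   # return to the previous node
            elif graph.has_edge(nbr, previous):
                weights.append(1.0)       # stay at distance 1 from previous
            else:
                weights.append(1.0 / q)   # move further away
        walk.append(random.choices(neighbors, weights=weights, k=1)[0])
    return walk

random.seed(42)
print(node2vec_walk(G, "A", walk_length=6, p=0.25, q=4.0))
```

With p = 0.25 and q = 4.0, backtracking and staying local are favored (BFS-like); swapping them (p = 4.0, q = 0.25) would push the walk outward (DFS-like).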
1. What is the main idea behind using random walks for node embedding?
2. How does Node2Vec differ from DeepWalk in its approach to random walks?