Lesson 15 of 15

PageRank

PageRank

PageRank is the algorithm that made Google. It measures the importance of nodes in a graph based on the structure of incoming links — a node is important if important nodes link to it.

The Formula

PR(u) = (1 - d) / N  +  d * Σ PR(v) / out_degree(v)
                          v → u
  • d = damping factor (usually 0.85) — probability of following a link
  • (1 - d) / N = probability of jumping to a random node
  • The sum is over all nodes v that link to u

Iterative Computation

Repeat until convergence (or for a fixed number of iterations):

def page_rank(graph, iterations=20, damping=0.85):
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}

    for _ in range(iterations):
        new_rank = {}
        for node in graph:
            score = (1 - damping) / n
            for src in graph:
                if node in graph[src] and len(graph[src]) > 0:
                    score += damping * rank[src] / len(graph[src])
            new_rank[node] = score
        rank = new_rank

    return rank

Intuition

  • Nodes with many incoming links get high rank
  • Links from high-rank nodes are worth more than links from low-rank nodes
  • The damping factor models a random surfer who occasionally jumps to a random page

Your Task

Implement page_rank(graph, iterations, damping) where graph is a dict {node: [outgoing neighbors]}. Return a dict of {node: rank} scores.

Pyodide loading...
Loading...
Click "Run" to execute your code.