Problem

PageRank is an algorithm used by Google to rank the importance of different websites. While there have been changes over the years, the central idea is to assign each site a score based on the importance of the other pages that link to it.

More mathematically, suppose there are N sites, and each site i has a certain count Ci of outgoing links. Then the score for a particular site Sj is defined as:

score(Sj) = (1 - d) / N + d * (score(Sx) / Cx + score(Sy) / Cy + ... + score(Sz) / Cz)

Here, Sx, Sy, ..., Sz denote the sites that have outgoing links to Sj, and d is a damping factor, usually set to around 0.85, modeling the probability that a user keeps following links (so 1 - d is the chance they jump to a random page instead).
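To make the formula concrete, here is a minimal Python sketch (illustrative values, uniform starting scores) of one application of it on a 3-site cycle 0 → 1 → 2 → 0:

```python
# One application of the PageRank score formula on the cycle 0 -> 1 -> 2 -> 0,
# starting from uniform scores.
d, N = 0.85, 3
score = [1 / 3, 1 / 3, 1 / 3]
C = [1, 1, 1]          # every site has exactly one outgoing link

# Site 1 is linked to only by site 0, so its sum has a single term:
new_score_1 = (1 - d) / N + d * (score[0] / C[0])
print(new_score_1)     # uniform scores are the fixed point of this graph
```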

Given a directed graph of links between various websites, write a program that calculates each site’s page rank.

Examples

Example 1

Input: 
N = 3
Links: [[1], [2], [0]]  // Site 0->1, Site 1->2, Site 2->0
d = 0.85
iterations = 100

Output: [0.333, 0.333, 0.333] (approximately)
Explanation:
In this circular graph, each site links to exactly one other site.
After convergence, all sites have equal PageRank since the graph is symmetric.
Each site receives equal "link juice" from its predecessor.

Example 2

Input:
N = 4
Links: [[1, 2], [3], [3], [0]]  // Site 0->1,2; Site 1->3; Site 2->3; Site 3->0
d = 0.85
iterations = 100

Output: [0.320, 0.174, 0.174, 0.333] (approximately)
Explanation:
Sites 0 and 3 have the highest PageRank because:
- Site 0 receives the full rank of site 3
- Site 3 receives links from both sites 1 and 2
Sites 1 and 2 have lower PageRank: each receives only half of site 0's contribution, since site 0 splits its influence between them.

Example 3

Input:
N = 2
Links: [[1], []]  // Site 0->1, Site 1 has no outgoing links
d = 0.85
iterations = 100

Output: [0.351, 0.649] (approximately)
Explanation:
Site 1 is a dangling node (no outgoing links); its rank is redistributed uniformly each iteration.
It still ends up with the higher PageRank because it receives all of site 0's contribution,
while site 0 is fed only by the uniform redistribution and the teleportation term.

Example 4

Input:
N = 3
Links: [[1, 2], [], [0]]  // Site 0->1,2; Site 1 no links; Site 2->0
d = 0.5
iterations = 50

Output: [0.375, 0.3125, 0.3125] (approximately)
Explanation:
With a lower damping factor (0.5), the random teleportation effect is stronger, so ranks stay closer to uniform.
Site 0 has the highest rank because it receives site 2's full contribution.
Sites 1 and 2 end up equal: each receives half of site 0's rank plus the same share of the dangling redistribution.

Example 5

Input:
N = 1
Links: [[]]  // Single site with no outgoing links
d = 0.85
iterations = 10

Output: [1.0]
Explanation:
Single site case: the site gets all the PageRank (probability 1).
No links to distribute, so the site retains full rank.

Solution

Method 1 - Power Iteration with Matrix Approach

Intuition

PageRank can be computed using the power iteration method on a transition matrix. We construct a matrix where each entry represents the probability of moving from one page to another. The PageRank vector is the stationary distribution of this Markov chain, found by repeatedly multiplying the matrix with an initial rank vector until convergence.
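As a toy illustration of power iteration (a hypothetical 2-state chain, not a PageRank matrix): repeatedly multiplying any starting distribution by a row-stochastic matrix drives it to the stationary distribution.

```python
# Power iteration on a 2-state Markov chain.
# M[i][j] = probability of moving from state i to state j (rows sum to 1).
M = [[0.9, 0.1],
     [0.5, 0.5]]

v = [0.5, 0.5]                       # any starting distribution works
for _ in range(100):
    # One step of the chain: v_new[j] = sum_i M[i][j] * v[i]
    v = [sum(M[i][j] * v[i] for i in range(2)) for j in range(2)]

# The stationary distribution solves v = vM; for this chain it is [5/6, 1/6].
print(v)
```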

Approach

  1. Build the transition matrix M where M[i][j] = probability of going from page i to page j
  2. For each page i with outgoing links, distribute its rank equally among its links: M[i][j] = 1/Ci if i links to j
  3. Handle dangling nodes (pages with no outgoing links) by distributing their rank equally to all pages
  4. Apply damping factor: M’ = d * M + (1-d)/N * J, where J is matrix of all 1/N
  5. Initialize PageRank vector with equal probabilities: PR = [1/N, 1/N, …, 1/N]
  6. Iterate: PR_new = M’ᵀ * PR (i.e., PR_new[j] = Σᵢ M’[i][j] * PR[i]) until convergence or max iterations reached
  7. Return final PageRank vector
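For instance, steps 1-4 applied to Example 1's cycle look like this (a quick sketch; the full solutions below do the same thing inline):

```python
# Transition matrix for the cycle 0 -> 1 -> 2 -> 0, then damping applied.
N, d = 3, 0.85
links = [[1], [2], [0]]

M = [[0.0] * N for _ in range(N)]
for i in range(N):
    for j in links[i]:
        M[i][j] = 1.0 / len(links[i])   # split rank evenly over outgoing links

# M' = d * M + (1 - d)/N * J, where J is the all-(1/N) matrix
Mp = [[d * M[i][j] + (1 - d) / N for j in range(N)] for i in range(N)]

# Every row of M' still sums to 1, so it remains a valid transition matrix.
print([sum(row) for row in Mp])
```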

Code

#include <vector>
#include <cmath>
using namespace std;

class Solution {
public:
    vector<double> pageRank(int N, vector<vector<int>>& links, double d = 0.85, int maxIter = 100, double eps = 1e-6) {
        // Build transition matrix
        vector<vector<double>> M(N, vector<double>(N, 0.0));
        vector<int> outCount(N, 0);
        
        // Count outgoing links
        for (int i = 0; i < N; i++) {
            outCount[i] = links[i].size();
        }
        
        // Fill transition matrix
        for (int i = 0; i < N; i++) {
            if (outCount[i] == 0) {
                // Dangling node: distribute equally to all nodes
                for (int j = 0; j < N; j++) {
                    M[i][j] = 1.0 / N;
                }
            } else {
                // Regular node: distribute among linked nodes
                for (int neighbor : links[i]) {
                    M[i][neighbor] = 1.0 / outCount[i];
                }
            }
        }
        
        // Apply damping factor
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++) {
                M[i][j] = d * M[i][j] + (1.0 - d) / N;
            }
        }
        
        // Initialize PageRank vector
        vector<double> pr(N, 1.0 / N);
        
        // Power iteration
        for (int iter = 0; iter < maxIter; iter++) {
            vector<double> newPr(N, 0.0);
            
            // Matrix-vector multiplication: newPr = M^T * pr
            for (int j = 0; j < N; j++) {
                for (int i = 0; i < N; i++) {
                    newPr[j] += M[i][j] * pr[i];
                }
            }
            
            // Check convergence
            double diff = 0.0;
            for (int i = 0; i < N; i++) {
                diff += abs(newPr[i] - pr[i]);
            }
            
            pr = newPr;
            
            if (diff < eps) break;
        }
        
        return pr;
    }
};
import "math"

func pageRank(N int, links [][]int, d float64, maxIter int, eps float64) []float64 {
    if d == 0 {
        d = 0.85
    }
    if maxIter == 0 {
        maxIter = 100
    }
    if eps == 0 {
        eps = 1e-6
    }
    
    // Build transition matrix
    M := make([][]float64, N)
    for i := range M {
        M[i] = make([]float64, N)
    }
    
    outCount := make([]int, N)
    
    // Count outgoing links
    for i := 0; i < N; i++ {
        outCount[i] = len(links[i])
    }
    
    // Fill transition matrix
    for i := 0; i < N; i++ {
        if outCount[i] == 0 {
            // Dangling node: distribute equally to all nodes
            for j := 0; j < N; j++ {
                M[i][j] = 1.0 / float64(N)
            }
        } else {
            // Regular node: distribute among linked nodes
            for _, neighbor := range links[i] {
                M[i][neighbor] = 1.0 / float64(outCount[i])
            }
        }
    }
    
    // Apply damping factor
    for i := 0; i < N; i++ {
        for j := 0; j < N; j++ {
            M[i][j] = d*M[i][j] + (1.0-d)/float64(N)
        }
    }
    
    // Initialize PageRank vector
    pr := make([]float64, N)
    for i := range pr {
        pr[i] = 1.0 / float64(N)
    }
    
    // Power iteration
    for iter := 0; iter < maxIter; iter++ {
        newPr := make([]float64, N)
        
        // Matrix-vector multiplication: newPr = M^T * pr
        for j := 0; j < N; j++ {
            for i := 0; i < N; i++ {
                newPr[j] += M[i][j] * pr[i]
            }
        }
        
        // Check convergence
        diff := 0.0
        for i := 0; i < N; i++ {
            diff += math.Abs(newPr[i] - pr[i])
        }
        
        pr = newPr
        
        if diff < eps {
            break
        }
    }
    
    return pr
}
import java.util.Arrays;

class Solution {
    public double[] pageRank(int N, int[][] links, double d, int maxIter, double eps) {
        // Build transition matrix
        double[][] M = new double[N][N];
        int[] outCount = new int[N];
        
        // Count outgoing links
        for (int i = 0; i < N; i++) {
            outCount[i] = links[i].length;
        }
        
        // Fill transition matrix
        for (int i = 0; i < N; i++) {
            if (outCount[i] == 0) {
                // Dangling node: distribute equally to all nodes
                for (int j = 0; j < N; j++) {
                    M[i][j] = 1.0 / N;
                }
            } else {
                // Regular node: distribute among linked nodes
                for (int neighbor : links[i]) {
                    M[i][neighbor] = 1.0 / outCount[i];
                }
            }
        }
        
        // Apply damping factor
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++) {
                M[i][j] = d * M[i][j] + (1.0 - d) / N;
            }
        }
        
        // Initialize PageRank vector
        double[] pr = new double[N];
        Arrays.fill(pr, 1.0 / N);
        
        // Power iteration
        for (int iter = 0; iter < maxIter; iter++) {
            double[] newPr = new double[N];
            
            // Matrix-vector multiplication: newPr = M^T * pr
            for (int j = 0; j < N; j++) {
                for (int i = 0; i < N; i++) {
                    newPr[j] += M[i][j] * pr[i];
                }
            }
            
            // Check convergence
            double diff = 0.0;
            for (int i = 0; i < N; i++) {
                diff += Math.abs(newPr[i] - pr[i]);
            }
            
            pr = newPr;
            
            if (diff < eps) break;
        }
        
        return pr;
    }
}
from typing import List

class Solution:
    def page_rank(self, N: int, links: List[List[int]], d: float = 0.85, max_iter: int = 100, eps: float = 1e-6) -> List[float]:
        # Build transition matrix
        M = [[0.0 for _ in range(N)] for _ in range(N)]
        out_count = [len(links[i]) for i in range(N)]
        
        # Fill transition matrix
        for i in range(N):
            if out_count[i] == 0:
                # Dangling node: distribute equally to all nodes
                for j in range(N):
                    M[i][j] = 1.0 / N
            else:
                # Regular node: distribute among linked nodes
                for neighbor in links[i]:
                    M[i][neighbor] = 1.0 / out_count[i]
        
        # Apply damping factor
        for i in range(N):
            for j in range(N):
                M[i][j] = d * M[i][j] + (1.0 - d) / N
        
        # Initialize PageRank vector
        pr = [1.0 / N for _ in range(N)]
        
        # Power iteration
        for _ in range(max_iter):
            new_pr = [0.0 for _ in range(N)]
            
            # Matrix-vector multiplication: new_pr = M^T * pr
            for j in range(N):
                for i in range(N):
                    new_pr[j] += M[i][j] * pr[i]
            
            # Check convergence
            diff = sum(abs(new_pr[i] - pr[i]) for i in range(N))
            
            pr = new_pr
            
            if diff < eps:
                break
        
        return pr
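To sanity-check the examples, here is a compact standalone sketch of the same dangling-aware update (a hypothetical helper, equivalent to the method above but condensed):

```python
def page_rank(N, links, d=0.85, max_iter=100, eps=1e-6):
    # Condensed PageRank iteration: teleportation + uniform dangling
    # redistribution + per-edge contributions, without the full matrix.
    out = [len(l) for l in links]
    pr = [1.0 / N] * N
    for _ in range(max_iter):
        dangling = sum(pr[i] for i in range(N) if out[i] == 0)
        new = [(1 - d) / N + d * dangling / N] * N
        for i in range(N):
            for j in links[i]:
                new[j] += d * pr[i] / out[i]
        done = sum(abs(new[k] - pr[k]) for k in range(N)) < eps
        pr = new
        if done:
            break
    return pr

print(page_rank(3, [[1], [2], [0]]))                         # Example 1's graph
print(page_rank(2, [[1], []]))                               # Example 3's graph
print(page_rank(3, [[1, 2], [], [0]], d=0.5, max_iter=50))   # Example 4's graph
```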

Complexity

  • ⏰ Time complexity: O(k * N²), where k is the number of iterations (typically converges in 50-100 iterations) and N is the number of nodes. Each iteration requires matrix-vector multiplication
  • 🧺 Space complexity: O(N²), for storing the transition matrix

Method 2 - Direct Iterative Update

Intuition

Instead of building the full transition matrix, we can directly apply the PageRank formula in each iteration. For each node, we calculate its new PageRank by summing contributions from all nodes that link to it, divided by their respective outgoing link counts, plus the random teleportation factor.
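Building the reverse adjacency ("incoming links") list is what makes this per-node sum cheap; sketched on Example 2's graph:

```python
# Reverse adjacency: incoming[j] lists every node i that links to j.
links = [[1, 2], [3], [3], [0]]     # Example 2's graph
N = len(links)

incoming = [[] for _ in range(N)]
for i in range(N):
    for j in links[i]:
        incoming[j].append(i)

print(incoming)
```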

Approach

  1. Initialize all PageRank values to 1/N
  2. For each iteration:
    • For each node j, calculate new PageRank using the formula
    • Sum contributions from all nodes i that link to j: score(Si)/Ci
    • Add teleportation factor: (1-d)/N
    • Apply damping: d * (sum of contributions) + teleportation
  3. Handle dangling nodes by distributing their PageRank equally
  4. Continue until convergence or maximum iterations
  5. Normalize if needed to ensure the sum equals 1 (with the dangling-node redistribution in step 3, the sum is already conserved)

Code

#include <vector>
#include <cmath>
using namespace std;

class Solution {
public:
    vector<double> pageRankDirect(int N, vector<vector<int>>& links, double d = 0.85, int maxIter = 100, double eps = 1e-6) {
        // Build incoming links map for efficient lookup
        vector<vector<int>> incoming(N);
        vector<int> outCount(N, 0);
        
        for (int i = 0; i < N; i++) {
            outCount[i] = links[i].size();
            for (int neighbor : links[i]) {
                incoming[neighbor].push_back(i);
            }
        }
        
        // Initialize PageRank
        vector<double> pr(N, 1.0 / N);
        
        for (int iter = 0; iter < maxIter; iter++) {
            vector<double> newPr(N, 0.0);
            double danglingSum = 0.0;
            
            // Calculate dangling nodes contribution
            for (int i = 0; i < N; i++) {
                if (outCount[i] == 0) {
                    danglingSum += pr[i];
                }
            }
            
            // Calculate new PageRank for each node
            for (int j = 0; j < N; j++) {
                // Teleportation factor
                newPr[j] = (1.0 - d) / N;
                
                // Add dangling nodes contribution
                newPr[j] += d * danglingSum / N;
                
                // Add contributions from incoming links
                for (int i : incoming[j]) {
                    if (outCount[i] > 0) {
                        newPr[j] += d * pr[i] / outCount[i];
                    }
                }
            }
            
            // Check convergence
            double diff = 0.0;
            for (int i = 0; i < N; i++) {
                diff += abs(newPr[i] - pr[i]);
            }
            
            pr = newPr;
            
            if (diff < eps) break;
        }
        
        return pr;
    }
};
import "math"

func pageRankDirect(N int, links [][]int, d float64, maxIter int, eps float64) []float64 {
    if d == 0 {
        d = 0.85
    }
    if maxIter == 0 {
        maxIter = 100
    }
    if eps == 0 {
        eps = 1e-6
    }
    
    // Build incoming links map for efficient lookup
    incoming := make([][]int, N)
    outCount := make([]int, N)
    
    for i := 0; i < N; i++ {
        outCount[i] = len(links[i])
        for _, neighbor := range links[i] {
            incoming[neighbor] = append(incoming[neighbor], i)
        }
    }
    
    // Initialize PageRank
    pr := make([]float64, N)
    for i := range pr {
        pr[i] = 1.0 / float64(N)
    }
    
    for iter := 0; iter < maxIter; iter++ {
        newPr := make([]float64, N)
        danglingSum := 0.0
        
        // Calculate dangling nodes contribution
        for i := 0; i < N; i++ {
            if outCount[i] == 0 {
                danglingSum += pr[i]
            }
        }
        
        // Calculate new PageRank for each node
        for j := 0; j < N; j++ {
            // Teleportation factor
            newPr[j] = (1.0 - d) / float64(N)
            
            // Add dangling nodes contribution
            newPr[j] += d * danglingSum / float64(N)
            
            // Add contributions from incoming links
            for _, i := range incoming[j] {
                if outCount[i] > 0 {
                    newPr[j] += d * pr[i] / float64(outCount[i])
                }
            }
        }
        
        // Check convergence
        diff := 0.0
        for i := 0; i < N; i++ {
            diff += math.Abs(newPr[i] - pr[i])
        }
        
        pr = newPr
        
        if diff < eps {
            break
        }
    }
    
    return pr
}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Solution {
    public double[] pageRankDirect(int N, int[][] links, double d, int maxIter, double eps) {
        // Build incoming links map for efficient lookup
        List<List<Integer>> incoming = new ArrayList<>();
        for (int i = 0; i < N; i++) {
            incoming.add(new ArrayList<>());
        }
        
        int[] outCount = new int[N];
        
        for (int i = 0; i < N; i++) {
            outCount[i] = links[i].length;
            for (int neighbor : links[i]) {
                incoming.get(neighbor).add(i);
            }
        }
        
        // Initialize PageRank
        double[] pr = new double[N];
        Arrays.fill(pr, 1.0 / N);
        
        for (int iter = 0; iter < maxIter; iter++) {
            double[] newPr = new double[N];
            double danglingSum = 0.0;
            
            // Calculate dangling nodes contribution
            for (int i = 0; i < N; i++) {
                if (outCount[i] == 0) {
                    danglingSum += pr[i];
                }
            }
            
            // Calculate new PageRank for each node
            for (int j = 0; j < N; j++) {
                // Teleportation factor
                newPr[j] = (1.0 - d) / N;
                
                // Add dangling nodes contribution
                newPr[j] += d * danglingSum / N;
                
                // Add contributions from incoming links
                for (int i : incoming.get(j)) {
                    if (outCount[i] > 0) {
                        newPr[j] += d * pr[i] / outCount[i];
                    }
                }
            }
            
            // Check convergence
            double diff = 0.0;
            for (int i = 0; i < N; i++) {
                diff += Math.abs(newPr[i] - pr[i]);
            }
            
            pr = newPr;
            
            if (diff < eps) break;
        }
        
        return pr;
    }
}
from typing import List

class Solution:
    def page_rank_direct(self, N: int, links: List[List[int]], d: float = 0.85, max_iter: int = 100, eps: float = 1e-6) -> List[float]:
        # Build incoming links map for efficient lookup
        incoming = [[] for _ in range(N)]
        out_count = [len(links[i]) for i in range(N)]
        
        for i in range(N):
            for neighbor in links[i]:
                incoming[neighbor].append(i)
        
        # Initialize PageRank
        pr = [1.0 / N for _ in range(N)]
        
        for _ in range(max_iter):
            new_pr = [0.0 for _ in range(N)]
            
            # Calculate dangling nodes contribution
            dangling_sum = sum(pr[i] for i in range(N) if out_count[i] == 0)
            
            # Calculate new PageRank for each node
            for j in range(N):
                # Teleportation factor
                new_pr[j] = (1.0 - d) / N
                
                # Add dangling nodes contribution
                new_pr[j] += d * dangling_sum / N
                
                # Add contributions from incoming links
                for i in incoming[j]:
                    if out_count[i] > 0:
                        new_pr[j] += d * pr[i] / out_count[i]
            
            # Check convergence
            diff = sum(abs(new_pr[i] - pr[i]) for i in range(N))
            
            pr = new_pr
            
            if diff < eps:
                break
        
        return pr

Complexity

  • ⏰ Time complexity: O(k * (N + E)), where k is the number of iterations, N is the number of nodes, and E is the number of edges. More efficient than matrix approach for sparse graphs
  • 🧺 Space complexity: O(N + E), for storing incoming links and PageRank vectors

Notes

  • Method 1 uses the classic matrix-based power iteration approach, suitable for theoretical understanding
  • Method 2 provides a more memory-efficient implementation, better for large sparse graphs
  • The damping factor d typically ranges from 0.8 to 0.9, with 0.85 being most common
  • Convergence usually occurs within 50-100 iterations for most real-world graphs
  • Dangling nodes (pages with no outgoing links) require special handling to maintain probability conservation
  • The algorithm finds the stationary distribution of a Markov chain representing random web surfing
  • PageRank values represent the long-term probability of being at each page during random navigation
  • Both implementations handle edge cases like isolated nodes, self-loops, and disconnected components
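To illustrate the damping-factor note above, a small sketch (hypothetical helper mirroring Method 2) comparing d = 0.85 with d = 0.5 on Example 2's graph; a lower d pulls every rank toward the uniform 1/N:

```python
def rank(N, links, d, iters=200):
    # Direct PageRank iteration with uniform dangling redistribution.
    out = [len(l) for l in links]
    pr = [1.0 / N] * N
    for _ in range(iters):
        dang = sum(pr[i] for i in range(N) if out[i] == 0)
        new = [(1 - d) / N + d * dang / N] * N
        for i in range(N):
            for j in links[i]:
                new[j] += d * pr[i] / out[i]
        pr = new
    return pr

links = [[1, 2], [3], [3], [0]]          # Example 2's graph
high_d = rank(4, links, 0.85)
low_d = rank(4, links, 0.50)
# The spread (max - min) shrinks as d drops toward 0.
print(max(high_d) - min(high_d), max(low_d) - min(low_d))
```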