H-Index Problem

Problem

Given an array of integers citations where citations[i] is the number of citations a researcher received for their ith paper, return compute the researcher’s h-index.

According to the definition of h-index on Wikipedia: A scientist has an index h if h of their n papers have at least h citations each, and the other n − h papers have no more than h citations each.

If there are several possible values for h, the maximum one is taken as the h-index.

Examples

Example 1:

Input:
citations = [3,0,6,1,5]
Output:
 3
Explanation: [3,0,6,1,5] means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively.
Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, their h-index is 3.

Example 2:

Input:
citations = [1,3,1]
Output:
 1

Solution

Method 1 - Sorting and track counts from start of array

Algorithm

Sort the array of length n
Now, we iterate from left to right with iterator i, then for each index i, n - i is number of citations on the right (inclusive of current element), which more than current citation. We have to find max such value, such that there exist a position in sorted array, that citation[i] >= n - i.

Dry Run

input  = [3,0,6,1,5]
sorted = [0,1,3,5,6]

i = 0, cnt = 5, citation = 0, 0 >= 5 => False, ans = ?
i = 1, cnt = 4, citation = 1, 1 >= 4 => False, ans = ?
i = 2, cnt = 3, citation = 3, 3 >= 3 => True, ans = 3

Code

Java

public class Solution {
    public int hIndex(int[] citations) {
        Arrays.sort(citations);
        int n = citations.length;
        
        for(int i = 0; i < citations.length; i++){
	        int cnt = n - i;
            if(citations[i] >= cnt) {
	            return cnt;
            }
        }
        return 0;
    }
}

Complexity

⏰ Time complexity: O(n log n)
🧺 Space complexity: O(1)

Method 2 -Sorting and track counts from end of array

We can also do the same count from the end of array.

Code

Java

public int hIndex(int[] citations) {
    Arrays.sort(citations);
    int n = citations.length;
    int ans = 0;
    for(int i=n-1; i>=0; i--){
        int cnt = n-i;
        if(citations[i]>=cnt){
            ans = cnt;
        }else{
            break;
        }
    }
    return ans;
}

Complexity

⏰ Time complexity: O(n log n)
🧺 Space complexity: O(1)

Dry Run

Dry running above code for [3,0,6,1,5]:

input  = [3,0,6,1,5]
sorted = [0,1,3,5,6]

i = 4, cnt = 1, citation = 6, 6 >= 1 => ans = 1
i = 3, cnt = 2, citation = 5, 5 >= 2 => ans = 2
i = 2, cnt = 3, citation = 3, 3 >= 3 => ans = 3
i = 1, cnt = 4, citation = 1, 1 < 4 => break

Method 3 - Bucket Sort 🏆

We can also put the citations in a counting sort array, then iterate over the counter array.

First, you may ask why bucket sort. Well, the h-index is defined as the number of papers with reference greater than the number. So assume n is the total number of papers, if we have n+1 buckets, number from 0 to n, then for any paper with reference corresponding to the index of the bucket, we increment the count for that bucket. The only exception is that for any paper with larger number of reference than n, we put in the n-th bucket.

Then we find total which is the total number of papers that has more than i citations, and we stop when total>=i (total number of papers with more than i citations >= i). We don’t need to keep going because we are trying the biggest i possible, we we stop and return the result.

Code

Java

public int hIndex(int[] citations) {
	int n = citations.length;
	int[] buckets = new int[n+1];
	for(int c : citations) {
		if(c >= n) {
			buckets[n]++;
		} else {
			buckets[c]++;
		}
	}
	int total = 0;
	for(int i = n; i >= 0; i--) {
		total += buckets[i];
		if(total >= i) {
			return i;
		}
	}
	return 0;
}

Dry Run

Dry running above code for [3,0,6,1,5]:

input   = [3,0,6,1,5]
buckets = [0,0,0,0,0,0]
index   =  0 1 2 3 4 5

After filling buckets:
buckets = [1,1,0,1,0,2]

Now we start from end of buckets:
i = 5, total = 2, 2 >= 5 => false
i = 4, total = 2, 2 >= 4 => false
i = 3, total = 3, 3 >= 3 => true, return 3

Complexity

⏰ Time complexity: O(n)
🧺 Space complexity: O(n)

Method 4 - Sorting and Binary Search 🥈

We can improvise method 1. In method 1, we sort the array, and then try to find the point such that citation[i] >= (n - i) using linear search. But array is sorted we can do binary search.

Code

Java

public int hIndex(int[] citations) {
	Arrays.sort(citations);
	int n = citations.length;
	int lo = 0, hi = n - 1;
	while (lo <= hi) {
		int mid = lo + (hi - lo) / 2;
		int cnt = n - mid;
		if (citations[mid] == cnt) {
			return cnt;
		} else if (citations[mid] < cnt) {
			lo = mid + 1;
		} else { 
			//(citations[mid] > cnt), mid qualified as a hIndex,
		    // but we have to continue to search for a highest one.
			hi = mid - 1;
		}
	}
	return n - lo;
}

Complexity

⏰ Time complexity: O(n log n)
🧺 Space complexity: O(1)

Problem#

Examples#

Solution#

Method 1 - Sorting and track counts from start of array#

Algorithm#

Dry Run#

Code#

Java#

Complexity#

Method 2 -Sorting and track counts from end of array#

Code#

Java#

Complexity#

Dry Run#

Method 3 - Bucket Sort 🏆#

Code#

Java#

Dry Run#

Complexity#

Method 4 - Sorting and Binary Search 🥈#

Code#

Java#

Complexity#

Problem

Examples

Solution

Method 1 - Sorting and track counts from start of array

Algorithm

Dry Run

Code

Java

Complexity

Method 2 -Sorting and track counts from end of array

Code

Java

Complexity

Dry Run

Method 3 - Bucket Sort 🏆

Code

Java

Dry Run

Complexity

Method 4 - Sorting and Binary Search 🥈

Code

Java

Complexity