Problem

Input - array A with n distinct numbers and a number i ∈ {1, 2, …, n}

Output : ith order statistic i.e. ith smallest element of the input array

Solution

We have already seen multiple solutions here - Kth OR Nth Smallest Element in an Array.

Here will focus on Linear time or O(n) solution.

Finding the ith smallest element may seem difficult at first, but is it possible to find some O(n) or linear time solution.

It turns out that we can actually accomplish this task by slighting changing QuickSort. Recall that after partitioning, the pivot stays in its “rightful” spot and doesn’t get moved in any of recursive calls later on.   So let’s say we’re looking for the ith element and after the first partition, the pivot is at k.

If k=i, then we know we found the ith element and we’re done! If k>i, then recursively partition the left hand side. Otherwise, recursively partition the right side. So, this can be done in 2 ways - randomized as well as deterministic. Deterministic is not that obvious. So, we will take it in the end.

Randomized Selection

  1. Choose a pivot. We know that the pivot property is like elements less than pivot, lie on left of it and more than it stay on right.

  2. We will recurse on the right side of the array. We will

Pseudocode

/**A is the input array
i - is the i'th element we have to select
arrayLength= is the length of the array we are assing.
**/
public int RSelect(int[] A, int arrayLength, int i) {
	if (n == 1) return A[1];
	p = choosePivot();
	//partition A around p
	//Let j = new index of p
	if (j == i) return p;
	if (j > i) return RSelect(1 st part of A, j - 1, i);
	if (j<i) return RSelect(2n d part of A, n - j, i - j)
}

Running Time

What is the running time of this selection algorithm? Well it depends on quality of chosen pivots.

Worst Case

If pivots are chosen in the worst possible way every time, then it will be O(n^2). This would be if we chose the smallest or largest element every time, reducing the problem by only one element.

What about the best pivot? To get the best pivot, our problem size should decrease significantly. This would be the median. So, problem size will be n/2, and then we do partitioning which takes O(n) time. Using the master method:

T(n) <= T(n/2)+O(n)

which is case 2, and we get that this will run in O(n).

Summary of Random-select

Note that randomness is not in data but in code. Also, we find out that selection is moderate problem when compared to sorting, and and hence it took linear time.