Problem

An array A[1..n] contains all the integers from 0 to n except for one number which is missing. In this problem, we cannot access an entire integer in A with a single operation. The elements of A are represented in binary, and the only operation we can use to access them is ”fetch the jth bit of A[i]”, which takes constant time. Write code to find the missing integer. Can you do it in O(n) time?

Examples

Example 1

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
Input:
Full range should be numbers 0..7 (n = 7) but one number is missing.
Array (order arbitrary): [0, 1, 2, 4, 5, 6, 7]
Accessible operation: getBit(i, bit) -> jth bit of A[i].

Binary forms present:
0 -> 000
1 -> 001
2 -> 010
4 -> 100
5 -> 101
6 -> 110
7 -> 111

Missing binary pattern must be 011 which is decimal 3.

Output:
3

Explanation:
Count MSB (bit 2) zeros vs ones:
	bit2=0: {000,001,010} (3 items)
	bit2=1: {100,101,110,111} (4 items)
Since counts differ by one and in the full set (0..7) they'd be equal (4 vs 4), the missing number has bit2 = 0.
Recurse on those with bit2=0 (000,001,010) for next bit (bit1):
	bit1=0: {000,001} (2)
	bit1=1: {010} (1)  -> missing bit1 must be 1.
Recurse on subset needing bit1=1 (010) for bit0:
	bit0=0: {010} (1)
	bit0=1: {}   (0)  -> missing bit0 must be 1.
Assembled bits: bit2=0, bit1=1, bit0=1 => 011 (decimal 3).

Solution

Method 1 - Divide and Conquer approach for array having n=2^k - 1

Let’s assume for now that n is 2^k - 1 for some k. Let’s also look at an example where k = 3.

1
2
3
4
5
6
7
8
000
001
010
011
100
101
110
111

You’ll notice that when there is a full set, like the one shown above, each column (each digit’s place) has the same number of ones and zeros. Of course, for convenience we are showing this as sorted, but in reality, I am not stating that it is.

Let’s take a look at the following list, where 101 is missing.

1
2
3
4
5
6
7
000
001
010
011
100
110
111

We look at the first bit of all of the elements ( O(n) ) and figure out which count is less than the other. We see that for the first bit, there is a number with the 1 in the most significant bit missing. This means that we know that our number has a one in its most significant bit. Basically, we partition into two sets, one where the most significant bit is 1 and one where the most significant bit is 0. The smaller set shows us what bit the missing number has.

We do the same thing on the smaller partition.

Since it is O(n) + O(n/2) + O (n/4) … it is basically O (2n) which is O (n).

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
public static int findMissing(List<Integer> array) {
	return findMissing(array, Integer.SIZE - 1);
}


public static int findMissing(ArrayList<Integer> input, int column) {
	if (column < 0) {
		return 0;
	}

	List<Integer> oddIndices = new ArrayList<Integer>();
	List<Integer> evenIndices = new ArrayList<Integer>();

	for (Integer t: input) {
		if ((t & (1<< column)) == 0) {
			evenIndices.add(t);
		} else {
			oddIndices.add(t);
		}
	}

	if (oddIndices.size() >= evenIndices.size()) {
		return (findMissing(evenIndices, column - 1))<< 1 | 0;
	} else {
		return (findMissing(oddIndices, column - 1))<< 1 | 1;
	}
}

References

stackoverflow, stackoverflow, tian runhe