We denote start of array in binary search by start
, left
, low
, lo
. Similarly,
we denote end of array in binary search by end
, right
, high
, hi
.
Now, with that notation lets look look at when the overflow may occur.
Problem
To find mid, we sometimes use:
mid = (start+end)) / 2;
OR
mid = (start + end) >> 1
To get the middle value, but this can caused OVERFLOW !
When start and end are all about INT_MAX OR Integer.MAX_VALUE
, then (start+end) of course will be overflow!
Fixing the Problem
To avoid the problem we can use
mid = start+(end-start)/2;
We can also use bitwise expression to do the same:
mid = start + ((start - end)>>1)
OR we can use unsigned right shift (>>>
):
mid = (start + end)>>>1
Under the assumption that start
and end
are both non-negative, we know for sure that the upper-most bit (the sign-bit) is zero.
So both start
and end
are in fact 31-bit integers.
end = 0100 0000 0000 0000 0000 0000 0000 0000 = 1073741824
start = 0100 0000 0000 0000 0000 0000 0000 0000 = 1073741824
When you add them together they may “spill” over into the top-bit.
start + end = 1000 0000 0000 0000 0000 0000 0000 0000
= 2147483648 as unsigned 32-bit integer
= -2147483648 as signed 32-bit integer
(start + end) / 2 = 1100 0000 0000 0000 0000 0000 0000 0000 = -1073741824
(start + end) >>> 1 = 0100 0000 0000 0000 0000 0000 0000 0000 = 1073741824
Caveat with unsigned right shift (>>>
) solution
How signed and unsigned right shift solution work?
Unlike the signed right shift operator (>>
), which sign-extends the left operand by filling empty bits with the sign bit, the unsigned right shift operator (>>>
) fills empty bits with zeros when shifting right.
For eg. consider the number -128
.
-128 (decimal) = 10000000 (binary) // Note the leading 1 for negative sign
When doing the signed right shift >>
operation, the signed right shift operator preserves the sign bit. To maintain the negative value, the leftmost bit (which was 1) is replicated and fills the empty bit introduced by the shift.
-128 >> 1 = 11000000 (binary) = -64 (decimal)
When doing the unsigned right shift >>>
operation, the bits are shifted to right, and operand is treated as an unsigned integer, ignoring the sign bit, so shifted value is filled with 0.
-128 >>> 1 = 01000000 (binary) = 128 (decimal) // Note the overflow and positive value
Why no overflow when doing right shift >>>
operation
- When adding 2 large numbers
start
andend
, the most significant bit (MSB) will only be set to 1 if the sum overflows the maximum value representable by the data type. - However, unsigned right shift
>>>
ignores the signed bit. When shifting the bits to the right and filling empty bits with zeros, the>>>
operator essentially discards the carry bit (overflow bit) and retains the significant bits for the average calculation.
Potential issues
While unlikely for positive start
and end
values, if either start
or end
is negative due to exceptional circumstances, overflow can still occur during the addition. In such cases, other approaches for calculating the middle index might be necessary (e.g., start + (end - start) / 2
).
Code Example
public static void main(String args[]) {
int start = Integer.MAX_VALUE / 2 + 2;
int end = Integer.MAX_VALUE / 2;
int m1 = (start + end) / 2;
int m2 = (start + end) >> 1;
int m3 = start + (end - start)/2;
int m4 = start + ((start - end)>>1);
int m5 = (start + end) >>> 1;
System.out.println(m1); // -1073741824 ⇨ overflow
System.out.println(m2); // -1073741824 ⇨ overflow
System.out.println(m3); // 1073741824
System.out.println(m4); // 1073741826
System.out.println(m5); // 1073741824
}