Avoid Overflow when calculating Mid in Binary Search

We denote start of array in binary search by start, left, low, lo. Similarly, we denote end of array in binary search by end, right, high, hi.

Now, with that notation lets look look at when the overflow may occur.

Problem

To find mid, we sometimes use:

  mid = (start+end)) / 2;

mid = (start + end) >> 1

To get the middle value, but this can caused OVERFLOW !

When start and end are all about INT_MAX OR Integer.MAX_VALUE , then (start+end) of course will be overflow!

Fixing the Problem

To avoid the problem we can use

  mid =  start+(end-start)/2;

We can also use bitwise expression to do the same:

mid = start + ((start - end)>>1)

OR we can use unsigned right shift (>>>):

mid = (start + end)>>>1

Under the assumption that start and end are both non-negative, we know for sure that the upper-most bit (the sign-bit) is zero.

So both start and end are in fact 31-bit integers.

end = 0100 0000 0000 0000 0000 0000 0000 0000 = 1073741824
start = 0100 0000 0000 0000 0000 0000 0000 0000 = 1073741824

When you add them together they may “spill” over into the top-bit.

start + end = 1000 0000 0000 0000 0000 0000 0000 0000
= 2147483648 as unsigned 32-bit integer
= -2147483648 as signed 32-bit integer

(start + end) / 2 = 1100 0000 0000 0000 0000 0000 0000 0000 = -1073741824
(start + end) >>> 1 = 0100 0000 0000 0000 0000 0000 0000 0000 = 1073741824

Caveat with unsigned right shift (`>>>`) solution

How signed and unsigned right shift solution work?

Unlike the signed right shift operator (>>), which sign-extends the left operand by filling empty bits with the sign bit, the unsigned right shift operator (>>>) fills empty bits with zeros when shifting right.

For eg. consider the number -128.

-128 (decimal) = 10000000 (binary)  // Note the leading 1 for negative sign

When doing the signed right shift >> operation, the signed right shift operator preserves the sign bit. To maintain the negative value, the leftmost bit (which was 1) is replicated and fills the empty bit introduced by the shift.

-128 >> 1 = 11000000 (binary) = -64 (decimal)

When doing the unsigned right shift >>> operation, the bits are shifted to right, and operand is treated as an unsigned integer, ignoring the sign bit, so shifted value is filled with 0.

-128 >>> 1 = 01000000 (binary) = 128 (decimal) // Note the overflow and positive value

Why no overflow when doing right shift `>>>` operation

When adding 2 large numbers start and end, the most significant bit (MSB) will only be set to 1 if the sum overflows the maximum value representable by the data type.
However, unsigned right shift>>> ignores the signed bit. When shifting the bits to the right and filling empty bits with zeros, the >>> operator essentially discards the carry bit (overflow bit) and retains the significant bits for the average calculation.

Potential issues

While unlikely for positive start and end values, if either start or end is negative due to exceptional circumstances, overflow can still occur during the addition. In such cases, other approaches for calculating the middle index might be necessary (e.g., start + (end - start) / 2).

Code Example

    public static void main(String args[]) {
      int start = Integer.MAX_VALUE / 2 + 2;
      int end = Integer.MAX_VALUE / 2;
      int m1 = (start + end) / 2;
      int m2 = (start + end) >> 1;
      int m3 =  start + (end - start)/2;
      int m4 =  start + ((start - end)>>1);
      int m5 = (start + end) >>> 1;
    
      System.out.println(m1); // -1073741824 ⇨ overflow
      System.out.println(m2); // -1073741824 ⇨ overflow
      System.out.println(m3); // 1073741824
      System.out.println(m4); // 1073741826
      System.out.println(m5); // 1073741824
    }

Problem#

Fixing the Problem#

Caveat with unsigned right shift (>>>) solution#

How signed and unsigned right shift solution work?#

Why no overflow when doing right shift >>> operation#

Potential issues#

Code Example#