
Why Binary Search is Hard to Write Correctly

Binary search is the canonical “simple” algorithm. It’s also the canonical example of an algorithm that almost everyone implements incorrectly. Bentley reported in Programming Pearls that only about 10% of professional programmers can write a correct binary search. The bug in Java’s Arrays.binarySearch wasn’t found until 2006 — nine years after it shipped.

The problem isn’t that binary search is conceptually hard. It’s that there are several off-by-one decisions that interact, and intuition alone can’t keep them straight. The tool that can is the loop invariant.

The problem

Given a sorted array A[0..n-1] and a target value t, find an index i such that A[i] = t, or determine that no such index exists.

The invariant approach

A loop invariant is a property that holds before each iteration of the loop. You prove three things:

  1. Initialization: the invariant holds before the first iteration.
  2. Maintenance: if it holds before an iteration, it holds after.
  3. Termination: when the loop exits, the invariant plus the exit condition imply the postcondition.

For binary search, let’s use indices lo and hi to bound the search range. The invariant we’ll maintain is:

if t exists in A, then t ∈ A[lo..hi)

That is: if the target is anywhere in the array, it’s in the half-open interval [lo, hi). When lo = hi, the interval is empty and we can conclude t is not in the array.

A correct implementation

def binary_search(a, t):
    """
    Return index i such that a[i] == t, or -1 if not found.
    Precondition: a is sorted in non-decreasing order.

    Invariant: if t is in a, then t is in a[lo..hi)
    """
    lo = 0
    hi = len(a)

    while lo < hi:
        mid = lo + (hi - lo) // 2  # avoids overflow
        if a[mid] < t:
            lo = mid + 1
        elif a[mid] > t:
            hi = mid
        else:
            return mid

    return -1

Let’s verify the invariant formally.

Initialization. Before the loop: lo = 0, hi = n. The interval [0, n) is the entire array. If t is anywhere, it’s here. The invariant holds.

Maintenance. Assume the invariant holds at the start of an iteration: if t ∈ A, then t ∈ A[lo..hi). We compute mid = ⌊(lo + hi) / 2⌋.

Case 1: A[mid] < t. Since A is sorted, every element in A[lo..mid] is ≤ A[mid] < t. So t ∉ A[lo..mid], meaning if t ∈ A, then t ∈ A[mid+1..hi). Setting lo = mid + 1 preserves the invariant.

Case 2: A[mid] > t. Every element in A[mid..hi) is ≥ A[mid] > t. So if t ∈ A, then t ∈ A[lo..mid). Setting hi = mid preserves the invariant.

Case 3: A[mid] = t. We return mid immediately, which satisfies the postcondition directly.

Termination. The loop exits when lo ≥ hi. Both updates keep lo ≤ hi, because mid always lies in [lo, hi): lo = mid + 1 ≤ hi, and hi = mid ≥ lo. So the exit condition is exactly lo = hi. The interval [lo, hi) is then empty, so by the invariant, t ∉ A.

Termination guarantee. We need to verify the loop terminates. The quantity hi − lo is a non-negative integer that strictly decreases each iteration. In Case 1, lo increases to mid + 1 > lo. In Case 2, hi decreases to mid < hi (since mid = ⌊(lo + hi)/2⌋ < hi when lo < hi). So the loop terminates.
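The proof can also be backed by a quick empirical cross-check. This sketch repeats the function from above (so the snippet is self-contained) and compares it against a linear scan over small random sorted lists:

```python
import random

def binary_search(a, t):
    """Half-open-interval binary search, as proved above."""
    lo, hi = 0, len(a)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if a[mid] < t:
            lo = mid + 1
        elif a[mid] > t:
            hi = mid
        else:
            return mid
    return -1

def check(trials=1000):
    rng = random.Random(42)
    for _ in range(trials):
        a = sorted(rng.randrange(20) for _ in range(rng.randrange(10)))
        t = rng.randrange(20)
        i = binary_search(a, t)
        if t in a:
            assert a[i] == t   # any matching index is acceptable with duplicates
        else:
            assert i == -1     # absence correctly reported
    return True
```

Note that with duplicates the function may return any matching index, so the check asserts `a[i] == t` rather than a specific position.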

Where bugs hide

Bug 1: overflow in midpoint calculation

The classic mistake:

int mid = (lo + hi) / 2;  // BUG: lo + hi can overflow

If lo and hi are both large (near 2^31 − 1 for 32-bit integers), their sum overflows to a negative number, and mid becomes negative. This was the bug in Java’s standard library.

The fix:

int mid = lo + (hi - lo) / 2;  // safe: hi - lo doesn't overflow

In Python, integers are arbitrary-precision, so this doesn’t apply. But the habit matters.
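Python can still illustrate the failure mode by emulating 32-bit two’s-complement arithmetic. The `int32` helper below is a hypothetical wrapper written for this demonstration, not a library function:

```python
def int32(x):
    """Interpret x as a 32-bit two's-complement signed integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x >= (1 << 31) else x

lo, hi = 2_000_000_000, 2_100_000_000   # both fit comfortably in int32

mid_buggy = int32(lo + hi) // 2          # the sum wraps negative
mid_safe  = lo + int32(hi - lo) // 2     # hi - lo is small, no wrap

print(mid_buggy)  # -97483648: an invalid array index
print(mid_safe)   # 2050000000
```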

Bug 2: wrong interval convention

Mixing up closed and half-open intervals is the most common source of off-by-one errors. If you use the closed interval [lo, hi]:

def binary_search_closed(a, t):
    """Using closed interval [lo, hi]."""
    lo = 0
    hi = len(a) - 1  # different initialization

    while lo <= hi:  # different condition
        mid = lo + (hi - lo) // 2
        if a[mid] < t:
            lo = mid + 1
        elif a[mid] > t:
            hi = mid - 1  # different update
        else:
            return mid

    return -1

The invariant is now: if t ∈ A, then t ∈ A[lo..hi]. The loop condition changes to lo ≤ hi (the interval [lo, hi] is empty when lo > hi). And when A[mid] > t, we set hi = mid − 1 instead of hi = mid, because mid has been checked and the interval is closed.

Both versions are correct. The danger is mixing conventions: using half-open initialization with a closed-interval loop condition, or vice versa. This creates an invariant that doesn’t actually hold, leading to infinite loops or missed elements.
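To see the mix fail concretely, here is a sketch that takes the half-open initialization (`hi = len(a)`) but the closed-interval loop condition (`lo <= hi`). When the target is larger than every element, lo and hi both reach len(a) and the probe runs off the end:

```python
def binary_search_mixed(a, t):
    lo, hi = 0, len(a)      # half-open initialization...
    while lo <= hi:         # ...but closed-interval loop condition
        mid = lo + (hi - lo) // 2
        if a[mid] < t:      # IndexError once lo == hi == len(a)
            lo = mid + 1
        elif a[mid] > t:
            hi = mid
        else:
            return mid
    return -1

try:
    binary_search_mixed([1, 2, 3], 5)
except IndexError:
    print("probe ran past the end of the array")
```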

Bug 3: not reducing the interval

# BUG: infinite loop when hi - lo == 1
if a[mid] < t:
    lo = mid      # should be mid + 1

If lo = 5 and hi = 6, then mid = 5, and setting lo = mid = 5 doesn’t change anything. The interval never shrinks. The invariant is maintained, but termination is violated.
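One way to observe this without hanging the interpreter is to cap the iteration count. The invariant still holds, so a wrong answer is never returned — the loop simply stops making progress:

```python
def buggy_search(a, t, max_iters=64):
    lo, hi = 0, len(a)
    for _ in range(max_iters):
        if lo >= hi:
            return -1
        mid = lo + (hi - lo) // 2
        if a[mid] < t:
            lo = mid          # BUG: should be mid + 1
        elif a[mid] > t:
            hi = mid
        else:
            return mid
    return None               # gave up: the interval stopped shrinking

print(buggy_search([1, 3], 2))   # None — stuck forever at lo=0, hi=1
```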

Variations: lower bound and upper bound

The real power of the invariant approach shows when you need variations. lower_bound finds the first position where you could insert t while maintaining sorted order (the leftmost element ≥ t):

def lower_bound(a, t):
    """
    Return smallest index i such that a[i] >= t.
    If all elements are less than t, return len(a).

    Invariant: a[0..lo) < t  and  a[hi..n) >= t
    """
    lo = 0
    hi = len(a)

    while lo < hi:
        mid = lo + (hi - lo) // 2
        if a[mid] < t:
            lo = mid + 1  # a[mid] < t, so a[0..mid] < t, extend left partition
        else:
            hi = mid      # a[mid] >= t, so a[mid..n) >= t, extend right partition

    return lo  # lo == hi, and a[0..lo) < t, a[lo..n) >= t

The invariant here is different:

A[0..lo) < t   and   A[hi..n) ≥ t

At termination (lo = hi), the entire array is partitioned: everything before lo is strictly less than t, and everything from lo onward is ≥ t. So lo is the answer.
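Python’s standard library ships this exact function as bisect.bisect_left. A quick cross-check (restating lower_bound so the snippet runs on its own):

```python
from bisect import bisect_left

def lower_bound(a, t):
    """Invariant: a[0..lo) < t and a[hi..n) >= t."""
    lo, hi = 0, len(a)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if a[mid] < t:
            lo = mid + 1
        else:
            hi = mid
    return lo

a = [1, 2, 2, 2, 3]
for t in range(5):
    assert lower_bound(a, t) == bisect_left(a, t)

print(lower_bound(a, 2))  # 1 — index of the leftmost 2
```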

upper_bound is analogous — the first position where you could insert tt after all existing copies:

def upper_bound(a, t):
    """
    Return smallest index i such that a[i] > t.

    Invariant: a[0..lo) <= t  and  a[hi..n) > t
    """
    lo = 0
    hi = len(a)

    while lo < hi:
        mid = lo + (hi - lo) // 2
        if a[mid] <= t:   # only difference: <= instead of <
            lo = mid + 1
        else:
            hi = mid

    return lo

The difference between lower_bound and upper_bound is a single character: < vs <= in the comparison. The invariant makes it obvious why.

And with both, you can count occurrences of t in O(log n):

def count_occurrences(a, t):
    return upper_bound(a, t) - lower_bound(a, t)
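The stdlib names for the two bounds are bisect_left and bisect_right, so the same identity works directly off the shelf:

```python
from bisect import bisect_left, bisect_right

def count_occurrences(a, t):
    # upper_bound - lower_bound, using the stdlib equivalents
    return bisect_right(a, t) - bisect_left(a, t)

print(count_occurrences([1, 2, 2, 2, 3], 2))  # 3
print(count_occurrences([1, 2, 2, 2, 3], 5))  # 0
```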

A C++ implementation

For reference, here’s a generic lower_bound in C++ — essentially what std::lower_bound does:

template <typename It, typename T>
It lower_bound(It first, It last, const T& value) {
    while (first != last) {
        auto mid = first + (last - first) / 2;
        if (*mid < value) {
            first = mid + 1;
        } else {
            last = mid;
        }
    }
    return first;
}

The structure is identical to the Python version. The invariant is identical. The correctness proof is identical. The abstraction over iterators doesn’t change the logic at all — and that’s the point. Once you have the invariant right, the implementation in any language is mechanical.

The lesson

Binary search is not hard because it requires cleverness. It’s hard because it requires precision — every bound, every comparison operator, every +1 or −1 must be consistent with a single invariant. Intuition says “split in half, recurse.” But intuition doesn’t tell you whether hi should be mid or mid - 1. The invariant does.

More broadly: any loop that maintains a shrinking interval over integers is doing binary search in disguise. The invariant technique applies to all of them. Two-pointer algorithms, partition routines in quicksort, bisection methods for root-finding — they all have the same structure, and they all have the same class of bugs. State the invariant. Verify the three conditions. The code writes itself.
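As one example of the broader claim: bisection for root-finding maintains the invariant f(lo) ≤ 0 ≤ f(hi) and shrinks the interval the same way. A minimal sketch, assuming f is continuous with a sign change on [lo, hi]:

```python
def bisect_root(f, lo, hi, tol=1e-9):
    """Invariant: f(lo) <= 0 <= f(hi). Terminates because hi - lo halves."""
    assert f(lo) <= 0 <= f(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid      # a root stays in [mid, hi]
        else:
            hi = mid      # a root stays in [lo, mid]
    return lo

# sqrt(2) as the root of x^2 - 2 on [0, 2]
print(bisect_root(lambda x: x * x - 2, 0.0, 2.0))
```

Note that over the reals, lo = mid is fine: the interval halves each iteration regardless, so the integer stagnation of Bug 3 never arises.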