- Understand how binary search eliminates half the search space with each comparison
- Implement both iterative and recursive versions of binary search
- Recognize why sorted data is required and trace through the algorithm step-by-step
- Avoid common pitfalls like integer overflow and off-by-one errors
Binary Search Fundamentals
Here's a question: if I gave you a phone book (remember those?) and asked you to find "Smith, John," would you start at the first page and flip through one by one? Of course not. You'd open somewhere in the middle, see if you're past the S section or before it, and adjust accordingly. You might not realize it, but you're using binary search.
Binary search is one of those algorithms that feels almost too simple to be powerful. And yet, it transforms a problem that could take a million steps into one that takes about 20. That's not just an improvement—that's a completely different category of performance.
The Problem Binary Search Solves
Let's say you have a sorted list of a million numbers, and you need to find a specific one. The naive approach—linear search—means checking each number until you find it. On average, you'd check 500,000 numbers. In the worst case, all million.
Binary search takes a different approach: each time you look at a number, you learn something that lets you eliminate half the remaining possibilities. Look at the middle number. If it's too big, the target must be in the left half. If it's too small, the target must be in the right half. Either way, you just eliminated 500,000 candidates with one comparison.
Do it again. Half of what remains gets eliminated. Again. And again.
How many times can you cut a million in half before you're down to one? About 20 times. That's log₂(1,000,000) ≈ 20.
This is why binary search is O(log n) and linear search is O(n). It's not a constant factor speedup—it's a logarithmic improvement that gets more dramatic as your data grows.
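You can check these logarithms directly in Python (the language used for the code in this lesson):

```python
import math

# Number of halvings needed to shrink n candidates down to one candidate.
for n in (1_000, 1_000_000, 1_000_000_000):
    print(f"{n:>13,} -> about {math.ceil(math.log2(n))} comparisons")
```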
How It Actually Works
The algorithm is remarkably simple:
- Start with the full range: your "search space" is the entire array
- Check the middle element: calculate the midpoint index
- Compare and eliminate half:
  - If the middle element equals your target, you're done
  - If the middle element is less than your target, eliminate the left half (the target must be to the right)
  - If the middle element is greater than your target, eliminate the right half (the target must be to the left)
- Repeat on the remaining half until you find the target or run out of elements
Let's trace through an example. Search for 23 in [3, 7, 12, 18, 23, 31, 45, 52, 68]:
```
Array:  [3, 7, 12, 18, 23, 31, 45, 52, 68]
Target: 23

Step 1:
  low = 0, high = 8
  mid = (0 + 8) // 2 = 4
  arr[4] = 23
  23 == 23 → Found at index 4!
```
That was lucky—we got it on the first try. Let's try a harder one. Search for 7:
```
Array:  [3, 7, 12, 18, 23, 31, 45, 52, 68]
Target: 7

Step 1:
  low = 0, high = 8
  mid = 4
  arr[4] = 23
  7 < 23 → Search left half (indices 0-3)

Step 2:
  low = 0, high = 3
  mid = (0 + 3) // 2 = 1
  arr[1] = 7
  7 == 7 → Found at index 1!
```
Now let's search for something not in the array, like 50:
```
Array:  [3, 7, 12, 18, 23, 31, 45, 52, 68]
Target: 50

Step 1:
  low = 0, high = 8
  mid = 4
  arr[4] = 23
  50 > 23 → Search right half (indices 5-8)

Step 2:
  low = 5, high = 8
  mid = (5 + 8) // 2 = 6
  arr[6] = 45
  50 > 45 → Search right half (indices 7-8)

Step 3:
  low = 7, high = 8
  mid = (7 + 8) // 2 = 7
  arr[7] = 52
  50 < 52 → Search left half (low = 7, high = 6: an empty range)

Step 4:
  low = 7, high = 6
  low > high → Not found!
```
When low exceeds high, we've exhausted the search space. The target doesn't exist.
The Code: Iterative Version
Here's the standard iterative implementation:
```python
def binary_search(arr, target):
    low = 0
    high = len(arr) - 1

    while low <= high:
        mid = (low + high) // 2

        if arr[mid] == target:
            return mid          # Found it
        elif arr[mid] < target:
            low = mid + 1       # Search right half
        else:
            high = mid - 1      # Search left half

    return -1                   # Not found
```
The loop continues while there's still a valid search space (low <= high). Each iteration either finds the target or cuts the search space in half.
Why low <= high and not low < high? Because when low == high, there's still one element to check. If we used <, we'd miss that last element.
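To connect the code to the earlier trace for 50, here is an instrumented sketch of the same loop that prints its state at each step:

```python
def binary_search_trace(arr, target):
    """Binary search that also prints low/high/mid at every iteration."""
    low, high = 0, len(arr) - 1
    step = 1
    while low <= high:
        mid = (low + high) // 2
        print(f"Step {step}: low={low}, high={high}, mid={mid}, arr[mid]={arr[mid]}")
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
        step += 1
    print(f"low={low} > high={high} -> not found")
    return -1

binary_search_trace([3, 7, 12, 18, 23, 31, 45, 52, 68], 50)
```

Running it reproduces the four steps from the trace above and returns -1.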
The Code: Recursive Version
Binary search can also be written recursively:
```python
def binary_search_recursive(arr, target, low=0, high=None):
    if high is None:
        high = len(arr) - 1

    # Base case: search space exhausted
    if low > high:
        return -1

    mid = (low + high) // 2

    if arr[mid] == target:
        return mid
    elif arr[mid] < target:
        # Recursively search right half
        return binary_search_recursive(arr, target, mid + 1, high)
    else:
        # Recursively search left half
        return binary_search_recursive(arr, target, low, mid - 1)
```
This is elegant, but the iterative version is usually preferred in practice. Why? The iterative version uses O(1) space, while the recursive version uses O(log n) space for the call stack. For most applications, this difference doesn't matter, but if you're searching a massive dataset or working on an embedded system, every bit of memory counts.
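You can make the O(log n) call-stack claim concrete by tracking depth in a sketch of the recursive version (a hypothetical search_depth helper, not part of the standard implementation):

```python
def search_depth(arr, target, low=0, high=None, depth=1):
    """Recursive binary search that also reports the deepest call reached."""
    if high is None:
        high = len(arr) - 1
    if low > high:
        return -1, depth
    mid = (low + high) // 2
    if arr[mid] == target:
        return mid, depth
    if arr[mid] < target:
        return search_depth(arr, target, mid + 1, high, depth + 1)
    return search_depth(arr, target, low, mid - 1, depth + 1)

arr = range(1_000_000)          # range supports indexing, so no list needed
idx, depth = search_depth(arr, -1)  # an absent target exhausts the search
print(idx, depth)               # the deepest call is right around 20
```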
The Hidden Trap: Integer Overflow
Look at this line:
```python
mid = (low + high) // 2
```
Seems innocent, right? But there's a subtle bug hiding here. In languages like Java or C++, if low and high are both very large integers (near the maximum int value), low + high could overflow.
The safe way to calculate the midpoint:
```python
mid = low + (high - low) // 2
```
This is mathematically equivalent but avoids overflow. high - low gives you the range size, you halve it, then add it to low. In Python, this doesn't matter because Python integers don't overflow, but in other languages, this is a real issue that has caused bugs in production code.
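In Python the two formulas always agree, which a quick randomized check confirms (in fixed-width languages, only the second form is safe when low + high would exceed the integer maximum):

```python
import random

# The two midpoint formulas are mathematically identical; Python's
# arbitrary-precision integers just mean neither one can overflow here.
for _ in range(10_000):
    low = random.randint(0, 10**9)
    high = random.randint(low, 2 * 10**9)
    assert (low + high) // 2 == low + (high - low) // 2
```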
Why It Must Be Sorted
Binary search only works on sorted data. If the array isn't sorted, the algorithm falls apart.
Think about why: when you check the middle element and it's less than your target, you eliminate the left half because you assume everything to the left is also less than your target. That assumption only holds if the array is sorted.
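A quick demonstration, reusing the iterative implementation from above: on unsorted input, binary search can report a present value as missing.

```python
def binary_search(arr, target):
    """Standard iterative binary search; assumes arr is sorted."""
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1

unsorted = [5, 1, 9, 3, 7]
print(binary_search(unsorted, 1))  # -1: the broken sorted-order assumption hides it
print(unsorted.index(1))           # 1: linear search still finds it
```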
If the array isn't sorted, you have two options:
- Sort it first (costs O(n log n), but then each search is O(log n))
- Use linear search (O(n) per search, but no upfront sorting cost)
The trade-off depends on how many searches you'll do. If you're searching once, linear search might be faster overall. If you're searching thousands of times, sorting once and using binary search pays off.
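One way to make the trade-off concrete is a rough comparison-count model. This is a sketch that ignores constant factors, so the real cutover point depends on your data and hardware:

```python
import math

def linear_cost(n, k):
    """Approximate comparisons for k linear searches over n items."""
    return k * n

def sort_then_binary_cost(n, k):
    """Approximate comparisons to sort once, then run k binary searches."""
    log_n = math.ceil(math.log2(n))
    return n * log_n + k * log_n

n = 1_000_000
print(linear_cost(n, 1), sort_then_binary_cost(n, 1))        # one search: linear wins
print(linear_cost(n, 1000), sort_then_binary_cost(n, 1000))  # many searches: sorting pays off
```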
Performance Analysis
Time complexity:
- Best case: O(1) - target is at the middle on the first check
- Average case: O(log n) - typical case
- Worst case: O(log n) - target not found or at an extreme
Space complexity:
- Iterative: O(1) - just a few variables
- Recursive: O(log n) - call stack depth
Those log n numbers are powerful. For an array of:
- 10 elements: max 4 comparisons
- 100 elements: max 7 comparisons
- 1,000 elements: max 10 comparisons
- 1,000,000 elements: max 20 comparisons
- 1,000,000,000 elements: max 30 comparisons
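Those maximums can be verified empirically with a counting sketch (a hypothetical count_comparisons helper that tallies how many elements each search examines):

```python
def count_comparisons(arr, target):
    """Binary search that returns the number of elements it examined."""
    low, high, checks = 0, len(arr) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        checks += 1
        if arr[mid] == target:
            return checks
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return checks

# Worst case over every present value plus two absent ones, per size.
results = {}
for n in (10, 100, 1_000):
    arr = list(range(n))
    results[n] = max(count_comparisons(arr, t) for t in range(-1, n + 1))
print(results)  # {10: 4, 100: 7, 1000: 10}
```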
Linear search on a billion elements could take a billion comparisons. Binary search takes at most 30. That's the power of logarithmic scaling.
Common Mistakes
Mistake 1: Forgetting low <= high
```python
# Wrong
while low < high:   # Misses the case when low == high
    ...

# Right
while low <= high:  # Checks all elements
    ...
```
Mistake 2: Not updating bounds correctly
```python
# Wrong
if arr[mid] < target:
    low = mid       # This can cause an infinite loop

# Right
if arr[mid] < target:
    low = mid + 1   # Properly excludes mid
```
If you set low = mid instead of low = mid + 1, and arr[mid] keeps being less than the target, low and mid could get stuck pointing to the same position forever.
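The stall is easy to see in a bounded simulation of the buggy update on a two-element window (capped at a few iterations so the demo terminates):

```python
arr = [1, 3]
target = 3
low, high = 0, 1
for step in range(4):            # capped: the real bug would loop forever
    mid = (low + high) // 2      # (0 + 1) // 2 == 0 every time
    if arr[mid] < target:
        low = mid                # buggy update: low never advances past 0
    print(f"step {step}: low={low}, mid={mid}, high={high}")
```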
Mistake 3: Ignoring edge cases
What if the array is empty? What if it has one element? What if the target is smaller than the smallest element or larger than the largest?
Your code should handle these gracefully. The standard binary search implementation does—when the array is empty, low = 0 and high = -1, so the loop never executes and it returns -1.
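These cases can be checked directly against the implementation (repeated here so the snippet runs on its own):

```python
def binary_search(arr, target):
    """Standard iterative binary search over a sorted list."""
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1

assert binary_search([], 5) == -1           # empty array: loop never runs
assert binary_search([7], 7) == 0           # single element, present
assert binary_search([7], 3) == -1          # single element, absent
assert binary_search([3, 7, 12], 1) == -1   # target below the smallest
assert binary_search([3, 7, 12], 99) == -1  # target above the largest
```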
Beyond Just Finding Elements
Binary search isn't just for finding elements. The core idea—eliminate half the possibilities with each comparison—applies to many problems:
- Finding the first or last occurrence of a value in a sorted array with duplicates
- Finding the insertion point for a new element to keep the array sorted
- Searching in rotated sorted arrays (a common interview question)
- Finding peaks in mountain arrays
- Searching in 2D sorted matrices
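The first two variations are available out of the box in Python's standard-library bisect module:

```python
import bisect

arr = [3, 7, 7, 7, 12, 18]

# Leftmost insertion point: for a present value, this is its first occurrence.
print(bisect.bisect_left(arr, 7))    # 1
# Rightmost insertion point: one past the last occurrence.
print(bisect.bisect_right(arr, 7))   # 4
# For an absent value, both return the index where it would be inserted.
print(bisect.bisect_left(arr, 10))   # 4
```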
We'll explore these variations in the next lesson. For now, the key insight is that binary search is a strategy: if you can frame your problem so that you can eliminate half the solution space with each check, you can solve it in O(log n) time.
Where You'll See This
Binary search shows up everywhere:
- Database indexes: B-trees use binary-search-style ordered lookups to find records in O(log n) time
- Git bisect: finding which commit introduced a bug by binary searching through commit history
- Finding breakpoints: locating where a condition changes in a continuous function (numerical analysis, e.g. the bisection method)
- Game AI: alpha-beta pruning in minimax relies on the related idea of cutting away large parts of the search space, though it isn't strictly binary search
- Memory allocation: binary buddy allocators split and merge blocks in power-of-two halves
- Networking: binary exponential backoff, the same powers-of-two reasoning run in reverse
The Bigger Picture
Binary search is more than an algorithm—it's a problem-solving paradigm. The question you should ask when facing any search or optimization problem is: "Can I frame this so each comparison eliminates half the possibilities?"
If the answer is yes, you've probably got an O(log n) solution hiding in there.
The elegance of binary search is that it's simple enough to understand in five minutes but powerful enough to solve problems that would otherwise be intractable. A billion-element array that would take minutes to search linearly? Binary search does it in microseconds.
That's not magic. That's just mathematics—and it's available to anyone who understands how to cut a problem in half.
