Written By:
Nestor Rubio
Last Updated:
May 10, 2023
Considering Binary Search is one of the most popular algorithms taught when learning the foundations of computer science, there’s no telling how you got here but I’m hoping you can gain something from this post. Keep reading to see the ideas behind binary search and its implementation in languages like Java, C, JavaScript, Python, and Go. If this is not entirely clear by the end there is no worries. When I was learning this, it took me some hours of reading and researching to truly say I got a hang of it.
What Problem Does it Solve?
The main problem binary search solves is the problem of searches that could take too long on large datasets. For an array of about 20 items in today’s computing age is virtually not a problem no matter how slow your search algorithm is. Now imagine an array of 100, 1000, 5657864578 items. Suddenly the initial thought we had of checking each individual number one-by-one seems less and less plausible if we want quick results. Imagine finding a needle in that haystack. It would take forever!
The basis of binary search is quite simple and is often compared to that of looking for a name in a dictionary. In fact, you may relate to how this search is done because many of us have done it before. If you want to look for a word in the dictionary, you can flip to a random page and see if it is on that page. If not, then based on the alphabet, you can know whether to look before or after the page you flipped to. This removes the need to search through many, many pages on the side of the dictionary that you know the word will not be on. This can be repeated until you either see the word on a page you flip to or find out that the word your friend used to beat you in Scrabble was not real like they swore it was. Odd because I would have thought “alip” would be a word too. Congratulations, you just visualized and saw a real-life example of a binary search.
There is one key thing to know about why this works though. For starters, the reason we could know at each page we flip to whether to go before or after the page is because there is an expectation that the words are all sorted in alphabetical order. If this was not the case, there would be no way to know whether to look before or after and so we’d be forced to check each word one by one since none of them tell use the relative position of the other. The other thing at play here is that the concept or structure of the book allows us to flip to any page almost instantly. This is important because if we had to close the book and start from the beginning to flip to the page we want or had any complication in getting to the page, then there would be questions on what the best way to find the word in the dictionary would be.
Translating these dictionary aspects into code and how it applies to us coding it, we can see the importance of having sorted values before even going into it. If the array is not sorted, it is not worth even beginning the binary search because it will be futile except in odd miracle cases where it works due to circumstance. Part of the fun of coding is seeing the data you’re given and seeing the best way to manipulate it if needed to solve the problem. In some cases, it may be worth it to sort the dataset before searching so that the benefits of the binary search speed up the process (more on this later). If the data is already sorted, this is the ideal case because you can instantly start the binary search.
The observation of the dictionary’s structure is important. This post will show examples of the binary search implemented on arrays because that is the most effective structure for teaching binary search in due to the immediate access we are given to any index in the array at all times. Something like a Linked List would be close to the example where in order to get to a page you always need to start at the beginning and eventually get to that page because of the lack of immediate access to most values in the list. This could lead to repeatedly checking pages one through whatever even if we know it wont be on those pages and all that extra work can lead to so much added time to finding the result.
Coding Binary Search
When implementing binary search it is often done in one of two ways: iterative or recursive. Each section has its pros and cons. No matter what anyone tells you, there is no “right” answer for the way to implement binary search, although all of us tend to develop a preference along the way.
Both forms of solving the problem follow a series of steps which get repeated in a cycle:
- If we are at the last index that the value should be then we will return whether it is or not. If we are not at the last possible spot then there are some options available and we can move to the next step
- Find the middle index of the available indices
- If the item is in the middle then simply return it
- If the item is greater than the value at the current index then we need to search the right half since we know that in any sorted array, the left side of that midway index will never have the item we are searching
- If the item is less than the value at the current index then we can search the left half of the midway point because we know the right side will not contain the value we seek in any sorted array
Iterative Approach
Iterative in this case means that the steps are repeated sequentially to solve the problem. In a repetitive problem like binary search, this can be simulated with a while loop which is bound by two pointers defining the “scope” of indices that our target value could be in. We can see this at play in the examples below.
- C
- Java
- Python
- Golang
- Typescript
int binarySearchIterative(int arr[], int target, int arrLen) {
int left = 0;
int right = arrLen - 1;
while(left <= right) {
int mid = (left + right) / 2; // calculate middle index
if(arr[mid] == target) {
return mid; // found target at middle index
} else if(target > arr[mid]) {
left = mid + 1; // target is in right half of array so move lowest search to right of mid
} else {
right = mid - 1; // target is in the left half of the array so move highest search to left of mid
}
}
return -1; // target not found
}
public static int binarySearchIterative(int[] list, int target) {
int low = 0;
int high = list.length - 1;
while (low <= high) {
int mid = (low + high) / 2; // calculate middle index
if(list[mid] == target) {
return mid; // found target
} else if(target > list[mid]) {
// target is in right half of array so move lowest search to right of mid
low = mid + 1;
} else {
// target is in the left half of the array so move highest search to left of mid
high = mid - 1;
}
}
return -1; // target not found
}
def binary_search_iterative(arr, target):
left = 0
right = len(arr) - 1
while left <= right:
# calculate midpoint with double slash for floor division
mid = (left + right) // 2
if arr[mid] == target:
return mid # target found
elif target > arr[mid]:
# target is in right half of array so move lowest search to right of mid
left = mid + 1
else:
# target is in the left half of the array so move highest search to left of mid
right = mid - 1
return -1 # target not found
func binarySearchIterative(list []int, target int) int {
low := 0
high := len(list) - 1;
for low <= high {
mid := (low + high) / 2 // get middle index between high and low
if list[mid] == target {
return mid // found the target
} else if target > list[mid] {
low = mid + 1 // since target will be to the right, we shift our lowest point of search to the right of mid
} else {
// this means target < mid so since target will be to the left, we shift our highest point of search to the left of mid
high = mid - 1
}
}
return -1 // target not found
}
function binarySearchIterative(arr: number[], target: number):number {
let left = 0;
let right = arr.length - 1;
while(left <= right) {
// get middle index of left and right with floor division
const mid = Math.floor((left + right) / 2);
if(arr[mid] === target) {
return mid;// target found
} else if(target > arr[mid]) {
// target is in right half of array so move lowest search to right of mid
left = mid + 1;
} else {
// target is in the left half of the array so move highest search to left of mid
right = mid - 1
}
}
return -1; // target not found
}
Recursive Approach
A trick to the recursive approach is framing the problem of finding a target value in some array as a problem that can be broken into several smaller problems where each subproblem can be solved with the same function call. We essentially call the function within itself to solve its own problem! You can see that just like how the iterative versions use two pointers to define the window of indices that our target value could be in, we do the same thing when writing recursive code but we let our function arguments hold our pointers defining that valid window. With each call, the size of the window in which we check for values decreases until we either find the target or see that with the window closed, the target is not in the list.
- C
- Java
- Python
- Golang
- Typescript
int binarySearchRecursive(int arr[], int left, int right, int target) {
if(left > right) {
return -1;
}
int mid = (left + right) / 2;// calculate middle index
if(arr[mid] == target) {
return mid; // found target at middle index
} else if(target > arr[mid]) {
// target is in right half of array so move lowest search to right of mid
return binarySearchRecursive(arr, mid + 1, right, target);
} else {
// target is in the left half of the array so move highest search to left of mid
return binarySearchRecursive(arr, left, mid - 1, target);
}
}
public static int binarySearchRecursive(int[] list, int target, int low, int high) {
if(low > high) {
return -1; // target not found
}
int mid = (low + high) / 2;
if(list[mid] == target) {
return mid; // found target
} else if(target > list[mid]) {
// target is in right half of array so move lowest search to right of mid
return binarySearchRecursive(list, target, mid + 1, high);
} else {
// target is in the left half of the array so move highest search to left of mid
return binarySearchRecursive(list, target, low, mid - 1);
}
}
def binary_search_recursive(arr, target, left, right):
if left > right:
return -1
# calculate midpoint with double slash for floor division
mid = (left + right) // 2
if arr[mid] == target:
return mid # target found
elif target > arr[mid]:
# target is in right half of array so move lowest search to right of mid
return binary_search_recursive(arr,target,mid + 1, right)
else:
# target is in the left half of the array so move highest search to left of mid
return binary_search_recursive(arr,target,left, mid - 1)
func binarySearchRecursive(list []int, target int, low, high int) int {
// if we are at final check return whether it is the target or not
if low == high {
return -1
}
// get middle of low and high
mid := (low + high) / 2
if list[mid] == target {
return mid // target found in the middle index
}
if target > list[mid] {
// since target is to the right, we shift our lowest point of search to the right of mid
return binarySearchRecursive(list, target, mid + 1, high)
} else {
// sice target is to the left, we shift our highest point of search to the left of mid
return binarySearchRecursive(list, target, low, mid - 1)
}
}
function binarySearchRecursive(arr: number[], target: number, low: number, high: number):number {
if(low > high) {
return -1; // target not found
}
const mid = Math.floor((low + high) / 2);
if(arr[mid] === target) {
return mid; // target found
} else if(target > arr[mid]) {
// target is in right half of array so move lowest search to right of mid
return binarySearchRecursive(arr,target,mid+1,high);
} else {
// target is in the left half of the array so move highest search to left of mid
return binarySearchRecursive(arr,target,low, mid-1);
}
}
Time and Space Complexity
Note: There is no pressure to know this information right now but if the term “time and space complexity” is not ringing any bells, you can read more about it in a post I made here.
Earlier I mentioned the idea of weighing benefits of things like sorting and running binary search or opting for another way of finding our target. Part of this process is knowing how much time and memory the operations can cost us.
First, we need to see the pros and cons of using binary search over a linear search in an array we can at least assume for fair comparison reasons is already sorted. Looking at the time complexity first, it is evident with our understanding of binary search that for large datasets, binary search clearly dominates a linear search. Since binary search follows the pattern of cutting the problem to be solved in half with each iteration, it has a runtime that can be denoted as O(log(n). The linear search has to check each item sequentially and the worst case situation is that the item could be at the end. If there’s no way of knowing ahead of time where this target is, we are looking at a runtime that can be denoted as O(n). This gives the binary search an advantage over the linear search because for each extra item, the linear search could do another comparison whereas the binary search requires a much larger number of additional values in order to even add that one extra comparison.
If the array is not already sorted we run into a dilemma. The typical sort function can run in O(nlog(n)) time which well exceeds both O(log(n)) and O(n). It is up to you to decide in a case like this if the extra time being spent on sorting it to find an item is worthwhile. This is typically worth it if the list could be looked through more than once, meaning we invest the initial sorting time to make future searches much faster. If it is a one-off search the issue becomes a little more nuanced.
Moving onto the less clear payoff made when deciding between a binary search and a linear search is the amount of memory needed to complete it. While the iterative approach is fairly simple and allocates memory for some local variables and simply runs through the search to return an integer, the recursive one is not so simple and memory-efficient. Each call the function makes to itself throws an extra function call on the call stack. This means that the state of the very first binary search call will be held in memory until all following calls have been made. Same for the second, third, etc. This can lead to an accumulation of function calls which may lead to a well-known stack overflow error. In today’s day and age of computing, these memory concerns aren’t necessarily showstoppers for small datasets but if the dataset becomes large enough and the system running the program is constrained enough, this could very well be a valid concern. This doesn’t mean the iterative approach is the only one to know though, it never hurts to know another concept in the confusing world of recursion and so being able to binary search in both ways can benefit you greatly in your learning journey.
More To Learn
Integer Overflow
I’ve made mention of large datasets but there’s another topic related to it that may be worth understanding if you’re the curious type. We know that computers don’t have infinite memory and that each number stored in memory could take some small amount of that memory. The typical int in many programming languages is the 32bit integer. Without going into too much detail, this means that the max number that can be stored in the int is 231 – 1 = 2147483647 while the smallest number that can be stored is -231 = 2147483648. If an integer that is larger than the max gets stored in a 32bit integer space, then the excess difference will actually flow over into the negative space and start up from that minimum value. This is not the case for all languages as I have seen that languages like Python, JavaScript, and others handle overflow in different ways. I recommend doing some research to see how your language of choice handles these situations.
The reason this is interesting is that in all our examples we calculate the middle index by taking the elementary average of the high and low indices. For large datasets it is possible these two indices could exceed the 32bit integer space, overflow into negatives, and end up causing us to index out of bounds. We can avoid this with a pretty simple trick:
Replace ( (high + low) / 2 ) with ( low + (high - low) / 2 )
This allows us to calculate our middle index by the adding of the difference to the low point rather than taking the full sum of two potentially large values. Pretty neat if you ask me.
Different Patterns
This format of writing binary search is definitely not the only way to logically structure it. There are many ways to reorganize the logic to perhaps decrease the number of useless comparisons or even speed up the development process by not being so explicit. I encourage you to see other ways of writing it because being able to comprehend the search in its many forms will help solidify your understanding of its foundations and how it works and can be applied to different problems.
On that note, binary search can help for so much more than finding an integer in an array of integers. This is the beauty of coding. Imagine you pulled a sorted list of users from a database. Each user could consist of a username, first name, last name, and date of birth. Searching for a user by username could be done by sorting the list by username and beginning our binary search. The conditions wouldn’t explicitly be integer comparisons but could perhaps make use of the language’s string comparison function to tell us if the target is greater or less than the current user being compared to. This is a brilliant chance to create a mini program and see how this search and its foundations can be applied to solve more complex issues.
Conclusion
Congrats, you made it to the end and have seen not only what problem binary search addresses but also how and why it works as well as seeing some of the implementation details of it. Your learning of algorithms and searches doesn’t have to end here though. Keep doing research and practicing until you can explain it to someone in your life who may not be as well-versed as you. It helps, trust me.
Best of luck in your sorting journey!