Binary Search Algorithm
Popcorn Hack 1: CB MCQ 2020
Which of the following conditions must be met in order for the procedure to work as intended? Explain why.
a) The length of numList must be even
b) The list numList must not contain any duplicate values
c) The values in numList must be in sorted order
d) The value of target must not be equal to -1
The answer is C. The array must be in a sorted order because it cannot search the first half or the second half if the target isn’t guaranteed to be in one of the halves.
Popcorn Hack 2:
Which of the following statements correctly describes a disadvantage of binary search compared to linear search? Explain why your answer is correct and why the others are wrong.
a) Binary search takes more time on average than linear search
b) Binary search cannot be used on unsorted lists without modifications
c) Binary search always returns the first occurrence of the target
d) Binary search can only be used on lists with unique values
The answer is B. Since Binary search must search one of the halves of the data set, for the target to be in one guaranteed half, it must be sorted, so Binary search cannot be used on unsorted lists.
Popcorn Hack 3:
Create a binary search algorithm that returns the index of a value in a data set.
def binary_search_letters(arr, target):
low, high = 0, len(arr) - 1
while low <= high:
mid = (low + high) // 2
guess = arr[mid]
if guess == target:
return mid
elif guess < target:
low = mid + 1
else:
high = mid - 1
return -1
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
print(binary_search_letters(letters, 'g')) # Output should be 6
6
Homework Hack
import pandas as pd
data = pd.read_csv("school_supplies.csv")
data_cleaned = data.dropna()
data_sorted = data_cleaned.sort_values(by="Price")
price_list = data_sorted["Price"].tolist()
print("First few rows of sorted data:")
print(data_sorted.head())
print("Original row count:", len(data))
print("Cleaned row count:", len(data_cleaned))
def binary_search_for_price(arr, target):
low, high = 0, len(arr) - 1
while low <= high:
mid = (low + high) // 2 # Find middle index
if arr[mid] == target:
return mid # Target found
elif arr[mid] < target:
low = mid + 1 # Search right half
else:
high = mid - 1 # Search left half
return -1 # Target not found
# Prices to search
search_prices = [1.25, 6.49, 10.00]
# Use binary search to look for each price
for price in search_prices:
result = binary_search_for_price(price_list, price)
if result != -1:
print(f"Yay! Price ${price:.2f} found at index {result}.")
else:
print(f"Oopsies! Price ${price:.2f} not found in the list.")
First few rows of sorted data:
Product Price
5 Eraser 0.50
14 Paper Clips 0.89
2 Pencil 0.99
9 Glue Stick 1.25
1 Pen 1.50
Original row count: 15
Cleaned row count: 15
Yay! Price $1.25 found at index 3.
Yay! Price $6.49 found at index 12.
Oopsies! Price $10.00 not found in the list.
Explanation of Homework Hack
Code steps:
- loads a CSV file with school supplies data into a pandas dataframe.
- removes any rows that contain missing values.
- sorts the cleaned data by the “Price” column.
- converts the sorted “Price” column into a list for binary search.
- defines a binary search function to efficiently locate a target price.
- loops through the prices given and searches each half for the given price.
- prints whether each price was found or not.