1. Description

You are a product manager and currently leading a team to develop a new product. Unfortunately, the latest version of your product fails a quality check. Since each version is developed based on the previous version, all the versions after a bad version are also bad.

Suppose you have n versions [1, 2, ..., n] and you want to find out the first bad one, which causes all the following ones to be bad.

You are given an API bool isBadVersion(version) which returns whether version is bad. Implement a function to find the first bad version. You should minimize the number of calls to the API.

Constraints:

  1. \(1 \leq bad \leq n \leq 2^{31} - 1\)

1.1. Examples:

Input: n = 5, bad = 4
Output: 4
Explanation:
call isBadVersion(3) -> false
call isBadVersion(5) -> true
call isBadVersion(4) -> true
Then 4 is the first bad version.

Input: n = 1, bad = 1
Output: 1

2. Solution

2.1. Understanding the problem

This is a typical Binary Search problem but I don't understand why they put bad=4 as Input in the example when the template is definitely not using it. I know it's the answer but why put it there?

2.2. Algorithm

The possible responses from the API can be arranged as the following list: [f,f,f,f,t,t,t,...,t]

Basically for this problem, from definition, right_idx is always a bad version and we will continue the search until left_idx == right_idx, and we can return right_idx.

2.3. Code

def firstBadVersion(n):
    """
    :type n: int
    :rtype: int
    """
    left_idx = 1
    right_idx = n

    while left_idx < right_idx:
        mid = (left_idx + right_idx) // 2

        if isBadVersion(mid):
            # right_idx should always be bad version
            right_idx = mid
        else:
            left_idx = mid + 1

    return right_idx


# tests
def isBadVersion(ver: int):
    return ver >= first_bad

first_bad = 3
print(firstBadVersion(5) == first_bad)

first_bad = 4
print(firstBadVersion(5) == first_bad)

first_bad = 1
print(firstBadVersion(5) == first_bad)
True
True
True

2.3.1. Complexity

2.3.1.1. Time complexity:

Olog(n)

2.3.1.2. Space complexity:

O(1)

2.4. Leetcode solution

nil.

<<imports for typing>>


2.4.1. Time complexity:

2.4.2. Space complexity:

3. More analysis

3.1. General thoughts

Two things that we need to be aware of in this example:

  1. using (left_idx + right_idx) // 2 could cause an overflow issue so it's better to use left_idx + (right_idx - left_idx)//2
  2. The termination condition left_idx < right_idx. We use it without the = sign because right_idx = mid, instead of right_idx = mid + 1.
    1. Because we know mid is a bad version, but we don't know if mid - 1 is also a bad version, hence we set right_idx = mid.
    2. Once left_idx == right_idx, we know that we've found our bad version, which is both left_idx and right_idx, so we don't need to go in to the loop again.

I think this is also how git bisect works. There is also a python library bisect for this.

3.2. Related problems

  1. 704 Binary Search
  2. [BROKEN LINK: 6705fa69-9835-4076-b293-cd962e3c5828]

4. Log time