004. Median of Two Sorted Arrays

  • 21.0%

https://leetcode.com/problems/median-of-two-sorted-arrays/

There are two sorted arrays nums1 and nums2 of size m and n respectively.

Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).

1
2
3
4
5
Example 1:
nums1 = [1, 3]
nums2 = [2]

The median is 2.0
1
2
3
4
5
Example 2:
nums1 = [1, 2]
nums2 = [3, 4]

The median is (2 + 3)/2 = 2.5

cpp


https://discuss.leetcode.com/topic/16797/very-concise-o-log-min-m-n-iterative-solution-with-detailed-explanation

Very concise O(log(min(M,N))) iterative solution with detailed explanation

This problem is notoriously hard to implement due to all the corner cases. Most implementations consider odd-lengthed and even-lengthed arrays as two different cases and treat them separately. As a matter of fact, with a little mind twist. These two cases can be combined as one, leading to a very simple solution where (almost) no special treatment is needed.

First, let’s see the concept of ‘MEDIAN’ in a slightly unconventional way. That is:

1
2
3
"if we cut the sorted array to two halves of EQUAL LENGTHS, then
median is the AVERAGE OF Max(lower_half) and Min(upper_half), i.e. the
two numbers immediately next to the cut".

For example, for [2 3 5 7], we make the cut between 3 and 5:

1
[2 3 / 5 7]

then the median = (3+5)/2. Note that I’ll use ‘/‘ to represent a cut, and (number / number) to represent a cut made through a number in this article.

for [2 3 4 5 6], we make the cut right through 4 like this:

[2 3 (4/4) 5 7]

Since we split 4 into two halves, we say now both the lower and upper subarray contain 4. This notion also leads to the correct answer: (4 + 4) / 2 = 4;

For convenience, let’s use L to represent the number immediately left to the cut, and R the right counterpart. In [2 3 5 7], for instance, we have L = 3 and R = 5, respectively.

We observe the index of L and R have the following relationship with the length of the array N:

1
2
3
4
5
6
7
8
9
N        Index of L / R
1 0 / 0
2 0 / 1
3 1 / 1
4 1 / 2
5 2 / 2
6 2 / 3
7 3 / 3
8 3 / 4

It is not hard to conclude that index of L = (N-1)/2, and R is at N/2. Thus, the median can be represented as

1
(L + R)/2 = (A[(N-1)/2] + A[N/2])/2

To get ready for the two array situation, let’s add a few imaginary ‘positions’ (represented as #’s) in between numbers, and treat numbers as ‘positions’ as well.

1
2
3
4
5
[6 9 13 18]  ->   [# 6 # 9 # 13 # 18 #]    (N = 4)
position index 0 1 2 3 4 5 6 7 8 (N_Position = 9)

[6 9 11 13 18]-> [# 6 # 9 # 11 # 13 # 18 #] (N = 5)
position index 0 1 2 3 4 5 6 7 8 9 10 (N_Position = 11)

As you can see, there are always exactly 2*N+1 ‘positions’ regardless of length N. Therefore, the middle cut should always be made on the Nth position (0-based). Since index(L) = (N-1)/2 and index(R) = N/2 in this situation, we can infer that index(L) = (CutPosition-1)/2, index(R) = (CutPosition)/2.

Now for the two-array case:

1
2
3
A1: [# 1 # 2 # 3 # 4 # 5 #]    (N1 = 5, N1_positions = 11)

A2: [# 1 # 1 # 1 # 1 #] (N2 = 4, N2_positions = 9)

Similar to the one-array problem, we need to find a cut that divides the two arrays each into two halves such that

1
2
"any number in the two left halves" <= "any number in the two right
halves".

We can also make the following observations:

  1. There are 2N1 + 2N2 + 2 position altogether. Therefore, there must be exactly N1 + N2 positions on each side of the cut, and 2 positions directly on the cut.
  2. Therefore, when we cut at position C2 = K in A2, then the cut position in A1 must be C1 = N1 + N2 - k. For instance, if C2 = 2, then we must have C1 = 4 + 5 - C2 = 7.
1
2
3
[# 1 # 2 # 3 # (4/4) # 5 #]    

[# 1 / 1 # 1 # 1 #]
  1. When the cuts are made, we’d have two L’s and two R’s. They are

    L1 = A1[(C1-1)/2]; R1 = A1[C1/2];
    L2 = A2[(C2-1)/2]; R2 = A2[C2/2];
    In the above example,

1
2
L1 = A1[(7-1)/2] = A1[3] = 4; R1 = A1[7/2] = A1[3] = 4;
L2 = A2[(2-1)/2] = A2[0] = 1; R2 = A1[2/2] = A1[1] = 1;

Now how do we decide if this cut is the cut we want? Because L1, L2 are the greatest numbers on the left halves and R1, R2 are the smallest numbers on the right, we only need

1
L1 <= R1 && L1 <= R2 && L2 <= R1 && L2 <= R2

to make sure that any number in lower halves <= any number in upper halves. As a matter of fact, since
L1 <= R1 and L2 <= R2 are naturally guaranteed because A1 and A2 are sorted, we only need to make sure:

L1 <= R2 and L2 <= R1.

Now we can use simple binary search to find out the result.

1
2
3
4
5
6
7
8
If we have L1 > R1, it means there are too many large numbers on the left half of A1, 
then we must move C1 to the left (i.e. move C2 to the right);

If L2 > R1, then there are too many large numbers on the left half of A2, and we must move C2 to the left.

Otherwise, this cut is the right one.

After we find the cut, the medium can be computed as (max(L1, L2) + min(R1, R2)) / 2;

Two side notes:

A. since C1 and C2 can be mutually determined from each other, we might as well select the shorter array (say A2) and only move C2 around, and calculate C1 accordingly. That way we can achieve a run-time complexity of O(log(min(N1, N2)))

B. The only edge case is when a cut falls on the 0th(first) or the 2Nth(last) position. For instance, if C2 = 2N2, then R2 = A2[2*N2/2] = A2[N2], which exceeds the boundary of the array. To solve this problem, we can imagine that both A1 and A2 actually have two extra elements, INT_MAX at A[-1] and INT_MAX at A[N]. These additions don’t change the result, but make the implementation easier: If any L falls out of the left boundary of the array, then L = INT_MIN, and if any R falls out of the right boundary, then R = INT_MAX.

I know that was not very easy to understand, but all the above reasoning eventually boils down to the following concise code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2) {
int N1 = nums1.size();
int N2 = nums2.size();
if (N1 < N2) return findMedianSortedArrays(nums2, nums1); // Make sure A2 is the shorter one.

if (N2 == 0) return ((double)nums1[(N1-1)/2] + (double)nums1[N1/2])/2; // If A2 is empty

int lo = 0, hi = N2 * 2;
while (lo <= hi) {
int mid2 = (lo + hi) / 2; // Try Cut 2
int mid1 = N1 + N2 - mid2; // Calculate Cut 1 accordingly

double L1 = (mid1 == 0) ? INT_MIN : nums1[(mid1-1)/2]; // Get L1, R1, L2, R2 respectively
double L2 = (mid2 == 0) ? INT_MIN : nums2[(mid2-1)/2];
double R1 = (mid1 == N1 * 2) ? INT_MAX : nums1[(mid1)/2];
double R2 = (mid2 == N2 * 2) ? INT_MAX : nums2[(mid2)/2];

if (L1 > R2) lo = mid2 + 1; // A1's lower half is too big; need to move C1 left (C2 right)
else if (L2 > R1) hi = mid2 - 1; // A2's lower half too big; need to move C2 left.
else return (max(L1,L2) + min(R1, R2)) / 2; // Otherwise, that's the right cut.
}
return -1;
}

If you have any suggestions to make the logic and implementation even more cleaner. Please do let me know!


https://discuss.leetcode.com/topic/2778/share-my-simple-o-log-m-n-solution-for-your-reference

Share my simple O(log(m+n)) solution for your reference

Binary search. Call 2 times getkth and k is about half of (m + n). Every time call getkth can reduce the scale k to its half. So the time complexity is log(m + n).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
class Solution {
public:
int getkth(int s[], int m, int l[], int n, int k){
// let m <= n
if (m > n)
return getkth(l, n, s, m, k);
if (m == 0)
return l[k - 1];
if (k == 1)
return min(s[0], l[0]);

int i = min(m, k / 2), j = min(n, k / 2);
if (s[i - 1] > l[j - 1])
return getkth(s, m, l + j, n - j, k - j);
else
return getkth(s + i, m - i, l, n, k - i);
return 0;
}

double findMedianSortedArrays(int A[], int m, int B[], int n) {
int l = (m + n + 1) >> 1;
int r = (m + n + 2) >> 1;
return (getkth(A, m ,B, n, l) + getkth(A, m, B, n, r)) / 2.0;
}
};

https://discuss.leetcode.com/topic/5728/share-one-divide-and-conquer-o-log-m-n-method-with-clear-description

Share one divide and conquer O(log(m+n)) method with clear description

// using divide and conquer idea, each time find the mid of both arrays

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
double findMedianSortedArrays(int A[], int m, int B[], int n) {
/* A[0, 1, 2, ..., n-1, n] */
/* A[0, 1, 2, ..., m-1, m] */
int k = (m + n + 1) / 2;
double v = (double)FindKth(A, 0, m - 1, B, 0, n - 1, k);

if ((m+n) % 2 == 0) {
int k2 = k+1;
double v2 = (double)FindKth(A, 0, m - 1, B, 0, n - 1, k2);
v = (v + v2) / 2;
}

return v;
}

// find the kth element int the two sorted arrays
// let us say: A[aMid] <= B[bMid], x: mid len of a, y: mid len of b, then wen can know
//
// (1) there will be at least (x + 1 + y) elements before bMid
// (2) there will be at least (m - x - 1 + n - y) = m + n - (x + y +1) elements after aMid
// therefore
// if k <= x + y + 1, find the kth element in a and b, but unconsidering bMid and its suffix
// if k > x + y + 1, find the k - (x + 1) th element in a and b, but unconsidering aMid and its prefix
int FindKth(int A[], int aL, int aR, int B[], int bL, int bR, int k) {
if (aL > aR) return B[bL + k - 1];
if (bL > bR) return A[aL + k - 1];

int aMid = (aL + aR) / 2;
int bMid = (bL + bR) / 2;

if (A[aMid] <= B[bMid]) {
if (k <= (aMid - aL) + (bMid - bL) + 1)
return FindKth(A, aL, aR, B, bL, bMid - 1, k);
else
return FindKth(A, aMid + 1, aR, B, bL, bR, k - (aMid - aL) - 1);
}
else { // A[aMid] > B[bMid]
if (k <= (aMid - aL) + (bMid - bL) + 1)
return FindKth(A, aL, aMid - 1, B, bL, bR, k);
else
return FindKth(A, aL, aR, B, bMid + 1, bR, k - (bMid - bL) - 1);
}
}

https://discuss.leetcode.com/topic/11478/o-lg-m-n-c-solution-using-kth-smallest-number

O(lg(m+n)) c++ solution using kth smallest number

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
class Solution {
public:
int kth(int a[], int m, int b[], int n, int k) {
if (m < n) return kth(b,n,a,m,k);
if (n==0) return a[k-1];
if (k==1) return min(a[0],b[0]);

int j = min(n,k/2);
int i = k-j;
if (a[i-1] > b[j-1]) return kth(a,i,b+j,n-j,k-j);
return kth(a+i,m-i,b,j,k-i);
}

double findMedianSortedArrays(int a[], int m, int b[], int n) {
int k = (m+n)/2;
int m1 = kth(a,m,b,n,k+1);
if ((m+n)%2==0) {
int m2 = kth(a,m,b,n,k);
return ((double)m1+m2)/2.0;
}
return m1;
}
};

https://discuss.leetcode.com/topic/26926/another-simple-and-neat-solution-binary-search-non-recursion-3-rows-of-core-code-o-log-min-m-n

Another simple and neat solution, binary search, non-recursion, 3 rows of core code, O(log(min(m, n)))

If you solve the k-th minmum value of two sorted arrays, you solve this problem.This is a classical problem of “Divide and conquer”.

Here is another more simple and more neat solution.

Cosider chosing first x numbers from A and k - x numbers from B.if these k numbers are the k minmum numbers of A and B, x must satisfies that A[x + 1] >= B[k - x] and B[k - x + 1] >= A[x] (for better explanation index is base-1).

So this x is what we want.

Obviously, if A[x + 1] < B[k - x + 1] then x must be smaller, else if B[k - x] < A[x] then x must be greater. A nice two-value definition for binary search :)

To simplify edge cases, we cosider each array indefinite, with value of INTMIN when index < 1 and INTMAX when index > n.

Here is the solution of c++ version:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
typedef vector<int> vi;
const int inf = 0x7fffffff, ninf = 0x80000000;
class Solution {
int kth_min(vi& a, vi& b, int k, int n, int m){
#define A(i) (i < 1 ? ninf : (i > n ? inf : a[i - 1]))
#define B(i) (i < 1 ? ninf : (i > m ? inf : b[i - 1]))
int l = 0, r = n + 1, x;
while(l <= r){
x = (l + r) >> 1;
if(A(x) > B(k - x + 1)) r = x - 1;
else if(B(k - x) > A(x + 1)) l = x + 1;
else return max(A(x), B(k - x));
}
return 0; //never execute , just to hide the warning :)
#undef A
#undef B
}
public:
double findMedianSortedArrays(vector<int>& a, vector<int>& b) {
int n = a.size(), m = b.size();
if(n > m) return findMedianSortedArrays(b, a); //make sure that a.size() <= b.size()
if((m + n) & 1) return kth_min(a, b, (m + n + 1) >> 1, n, m);
return (0.0 + kth_min(a, b, (m + n + 1) >> 1, n, m) + kth_min(a, b, ((m + n) >> 1) + 1, n, m)) * 0.5;
}
};

python


https://discuss.leetcode.com/topic/4996/share-my-o-log-min-m-n-solution-with-explanation

To solve this problem, we need to understand “What is the use of median”. In statistics, the median is used for dividing a set into two equal length subsets, that one subset is always greater than the other. If we understand the use of median for dividing, we are very close to the answer.

1
2
3
4
First let's cut A into two parts at a random position i:

left_A | right_A
A[0], A[1], ..., A[i-1] | A[i], A[i+1], ..., A[m-1]

Since A has m elements, so there are m+1 kinds of cutting( i = 0 ~ m ). And we know: len(left_A) = i, len(right_A) = m - i . Note: when i = 0 , left_A is empty, and when i = m , right_A is empty.

1
2
3
4
With the same way, cut B into two parts at a random position j:

left_B | right_B
B[0], B[1], ..., B[j-1] | B[j], B[j+1], ..., B[n-1]

Put left_A and left_B into one set, and put right_A and right_B into another set.

1
2
3
4
Let's name them left_part and right_part :
left_part | right_part
A[0], A[1], ..., A[i-1] | A[i], A[i+1], ..., A[m-1]
B[0], B[1], ..., B[j-1] | B[j], B[j+1], ..., B[n-1]

If we can ensure:

1
2
1) len(left_part) == len(right_part)
2) max(left_part) <= min(right_part)

then we divide all elements in {A, B} into two parts with equal length, and one part is always greater than the other. Then median = (max(left_part) + min(right_part))/2.

To ensure these two conditions, we just need to ensure:

1
2
3
(1) i + j == m - i + n - j (or: m - i + n - j + 1)
if n >= m, we just need to set: i = 0 ~ m, j = (m + n + 1)/2 - i
(2) B[j-1] <= A[i] and A[i-1] <= B[j]

(For simplicity, I presume A[i-1],B[j-1],A[i],B[j] are always valid even if i=0/i=m/j=0/j=n . I will talk about how to deal with these edge values at last.)

So, all we need to do is:

1
2
Searching i in [0, m], to find an object `i` that:
B[j-1] <= A[i] and A[i-1] <= B[j], ( where j = (m + n + 1)/2 - i )

And we can do a binary search following steps described below:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
<1> Set imin = 0, imax = m, then start searching in [imin, imax]

<2> Set i = (imin + imax)/2, j = (m + n + 1)/2 - i

<3> Now we have len(left_part)==len(right_part). And there are only 3 situations
that we may encounter:
<a> B[j-1] <= A[i] and A[i-1] <= B[j]
Means we have found the object `i`, so stop searching.
<b> B[j-1] > A[i]
Means A[i] is too small. We must `ajust` i to get `B[j-1] <= A[i]`.
Can we `increase` i?
Yes. Because when i is increased, j will be decreased.
So B[j-1] is decreased and A[i] is increased, and `B[j-1] <= A[i]` may
be satisfied.
Can we `decrease` i?
`No!` Because when i is decreased, j will be increased.
So B[j-1] is increased and A[i] is decreased, and B[j-1] <= A[i] will
be never satisfied.
So we must `increase` i. That is, we must ajust the searching range to
[i+1, imax]. So, set imin = i+1, and goto <2>.
<c> A[i-1] > B[j]
Means A[i-1] is too big. And we must `decrease` i to get `A[i-1]<=B[j]`.
That is, we must ajust the searching range to [imin, i-1].
So, set imax = i-1, and goto <2>.

When the object i is found, the median is:

1
2
max(A[i-1], B[j-1]) (when m + n is odd)  注释 odd 奇数
or (max(A[i-1], B[j-1]) + min(A[i], B[j]))/2 (when m + n is even)

Now let’s consider the edges values i=0,i=m,j=0,j=n where A[i-1],B[j-1],A[i],B[j] may not exist. Actually this situation is easier than you think.

What we need to do is ensuring that max(left_part) <= min(right_part). So, if i and j are not edges values(means A[i-1],B[j-1],A[i],B[j] all exist), then we must check both B[j-1] <= A[i] and A[i-1] <= B[j]. But if some of A[i-1],B[j-1],A[i],B[j] don’t exist, then we don’t need to check one(or both) of these two conditions. For example, if i=0, then A[i-1] doesn’t exist, then we don’t need to check A[i-1] <= B[j]. So, what we need to do is:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Searching i in [0, m], to find an object `i` that:
(j == 0 or i == m or B[j-1] <= A[i]) and
(i == 0 or j == n or A[i-1] <= B[j])
where j = (m + n + 1)/2 - i
And in a searching loop, we will encounter only three situations:

<a> (j == 0 or i == m or B[j-1] <= A[i]) and
(i == 0 or j = n or A[i-1] <= B[j])
Means i is perfect, we can stop searching.

<b> j > 0 and i < m and B[j - 1] > A[i]
Means i is too small, we must increase it.

<c> i > 0 and j < n and A[i - 1] > B[j]
Means i is too big, we must decrease it.

Thank @Quentin.chen , him pointed out that: i < m ==> j > 0 and i > 0 ==> j < n . Because:

1
2
m <= n, i < m ==> j = (m+n+1)/2 - i > (m+n+1)/2 - m >= (2*m+1)/2 - m >= 0    
m <= n, i > 0 ==> j = (m+n+1)/2 - i < (m+n+1)/2 <= (2*n+1)/2 <= n

So in situation and , we don’t need to check whether j > 0 and whether j < n.

Below is the accepted code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
def median(A, B):
m, n = len(A), len(B)
if m > n:
A, B, m, n = B, A, n, m
if n == 0:
raise ValueError

imin, imax, half_len = 0, m, (m + n + 1) / 2
while imin <= imax:
i = (imin + imax) / 2
j = half_len - i
if i < m and B[j-1] > A[i]:
# i is too small, must increase it
imin = i + 1
elif i > 0 and A[i-1] > B[j]:
# i is too big, must decrease it
imax = i - 1
else:
# i is perfect

if i == 0: max_of_left = B[j-1]
elif j == 0: max_of_left = A[i-1]
else: max_of_left = max(A[i-1], B[j-1])

if (m + n) % 2 == 1:
return max_of_left

if i == m: min_of_right = B[j]
elif j == n: min_of_right = A[i]
else: min_of_right = min(A[i], B[j])

return (max_of_left + min_of_right) / 2.0


https://leetcode.com/discuss/15790/share-my-o-log-min-m-n-solution-with-explanation

120ms, 86.12%, June.25th, 2016

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
class Solution(object):
def findMedianSortedArrays(self, nums1, nums2):
"""
:type nums1: List[int]
:type nums2: List[int]
:rtype: float
"""
m, n = len(nums1), len(nums2)
if m > n:
nums1, nums2, m, n = nums2, nums1, n, m
if n == 0:
raise ValueError

imin, imax, half_len = 0, m, (m + n + 1) / 2
while imin <= imax:
i = (imin + imax) / 2
j = half_len - i
if j > 0 and i < m and nums2[j-1] > nums1[i]:
# i is too small, must increase it
imin = i + 1
elif i > 0 and j < n and nums1[i-1] > nums2[j]:
# i is too big, must decrease it
imax = i - 1
else:
# i is perfect
if i == 0: max_of_left = nums2[j-1]
elif j == 0: max_of_left = nums1[i-1]
else: max_of_left = max(nums1[i-1], nums2[j-1])

if (m + n) % 2 == 1:
return max_of_left

if i == m: min_of_right = nums2[j]
elif j == n: min_of_right = nums1[i]
else: min_of_right = min(nums1[i], nums2[j])

return (max_of_left + min_of_right) / 2.0

https://leetcode.com/discuss/20897/intuitive-python-solution-smallest-two-sorted-arrays-252ms

Intuitive Python O(log (m+n)) solution, by kth smallest in the two sorted arrays, 252ms

144ms, 30.79%, June.25th, 2016

The idea is in the comment:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
class Solution(object):
def findMedianSortedArrays(self, nums1, nums2):
"""
:type nums1: List[int]
:type nums2: List[int]
:rtype: float
"""
l = len(nums1) + len(nums2)
if l % 2 == 1:
return self.kth(nums1, nums2, l // 2)
else:
return (self.kth(nums1, nums2, l // 2) + self.kth(nums1, nums2, l // 2 - 1)) / 2.

def kth(self, a, b, k):
if not a:
return b[k]
if not b:
return a[k]
ia, ib = len(a) // 2, len(b) // 2
ma, mb = a[ia], b[ib]

if ia + ib < k:
if ma > mb:
return self.kth(a, b[ib + 1:], k - ib - 1)
else:
return self.kth(a[ia + 1:], b, k -ia - 1)
else:
if ma > mb:
return self.kth(a[:ia], b, k)
else:
return self.kth(a, b[:ib], k)

https://discuss.leetcode.com/topic/22406/python-o-log-min-m-n-solution

Python O(log(min(m,n)) solution

It’s guaranteed to be O(log(min(m,n)) because every time the findKth function cuts the shorter array by half of its size.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
class Solution:
# @return a float
def findMedianSortedArrays(self, A, B):
l=len(A)+len(B)
return self.findKth(A,B,l//2) if l%2==1 else (self.findKth(A,B,l//2-1)+self.findKth(A,B,l//2))/2.0


def findKth(self,A,B,k):
if len(A)>len(B):
A,B=B,A
if not A:
return B[k]
if k==len(A)+len(B)-1:
return max(A[-1],B[-1])
i=len(A)//2
j=k-i
if A[i]>B[j]:
#Here I assume it is O(1) to get A[:i] and B[j:]. In python, it's not but in cpp it is.
return self.findKth(A[:i],B[j:],i)
else:
return self.findKth(A[i:],B[:j],j)

java


https://discuss.leetcode.com/topic/3367/share-my-iterative-solution-with-o-log-min-n-m

Share my iterative solution with O(log(min(n, m)))

This is my iterative solution using binary search. The main idea is to find the approximate location of the median and compare the elements around it to get the final result.

  1. do binary search. suppose the shorter list is A with length n. the runtime is O(log(n)) which means no matter how large B array is, it only depends on the size of A. It makes sense because if A has only one element while B has 100 elements, the median must be one of A[0], B[49], and B[50] without check everything else. If A[0] <= B[49], B[49] is the answer; if B[49] < A[0] <= B[50], A[0] is the answer; else, B[50] is the answer.
  2. After binary search, we get the approximate location of median. Now we just need to compare at most 4 elements to find the answer. This step is O(1).
  3. the same solution can be applied to find kth element of 2 sorted arrays.

Here is the code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
public double findMedianSortedArrays(int A[], int B[]) {
int n = A.length;
int m = B.length;
// the following call is to make sure len(A) <= len(B).
// yes, it calls itself, but at most once, shouldn't be
// consider a recursive solution
if (n > m)
return findMedianSortedArrays(B, A);

// now, do binary search
int k = (n + m - 1) / 2;
int l = 0, r = Math.min(k, n); // r is n, NOT n-1, this is important!!
while (l < r) {
int midA = (l + r) / 2;
int midB = k - midA;
if (A[midA] < B[midB])
l = midA + 1;
else
r = midA;
}

// after binary search, we almost get the median because it must be between
// these 4 numbers: A[l-1], A[l], B[k-l], and B[k-l+1]

// if (n+m) is odd, the median is the larger one between A[l-1] and B[k-l].
// and there are some corner cases we need to take care of.
int a = Math.max(l > 0 ? A[l - 1] : Integer.MIN_VALUE, k - l >= 0 ? B[k - l] : Integer.MIN_VALUE);
if (((n + m) & 1) == 1)
return (double) a;

// if (n+m) is even, the median can be calculated by
// median = (max(A[l-1], B[k-l]) + min(A[l], B[k-l+1]) / 2.0
// also, there are some corner cases to take care of.
int b = Math.min(l < n ? A[l] : Integer.MAX_VALUE, k - l + 1 < m ? B[k - l + 1] : Integer.MAX_VALUE);
return (a + b) / 2.0;
}

I’m lazy to type. But I found a very good pdf to explain my algorithm:

http://ocw.alfaisal.edu/NR/rdonlyres/Electrical-Engineering-and-Computer-Science/6-046JFall-2005/30C68118-E436-4FE3-8C79-6BAFBB07D935/0/ps9sol.pdf

BTW: Thanks to xdxiaoxin. I’ve removed the check “midB > k”.


https://discuss.leetcode.com/topic/28602/concise-java-solution-based-on-binary-search

Concise JAVA solution based on Binary Search

Explanation

The key point of this problem is to ignore half part of A and B each step recursively by comparing the median of remaining A and B:

1
2
3
if (aMid < bMid) Keep [aRight + bLeft]

else Keep [bRight + aLeft]

As the following: time=O(log(m + n))

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
public double findMedianSortedArrays(int[] A, int[] B) {
int m = A.length, n = B.length;
int l = (m + n + 1) / 2;
int r = (m + n + 2) / 2;
return (getkth(A, 0, B, 0, l) + getkth(A, 0, B, 0, r)) / 2.0;
}

public double getkth(int[] A, int aStart, int[] B, int bStart, int k) {
if (aStart > A.length - 1) return B[bStart + k - 1];
if (bStart > B.length - 1) return A[aStart + k - 1];
if (k == 1) return Math.min(A[aStart], B[bStart]);

int aMid = Integer.MAX_VALUE, bMid = Integer.MAX_VALUE;
if (aStart + k/2 - 1 < A.length) aMid = A[aStart + k/2 - 1];
if (bStart + k/2 - 1 < B.length) bMid = B[bStart + k/2 - 1];

if (aMid < bMid)
return getkth(A, aStart + k/2, B, bStart, k - k/2);// Check: aRight + bLeft
else
return getkth(A, aStart, B, bStart + k/2, k - k/2);// Check: bRight + aLeft
}
谢谢你,可爱的朋友。