31/10/2017

[Java] Partition array in linear time and constant space

Here is an interesting problem: given an integer array and an index, consider the element at the index as a pivot and then reorganize the array so that all the elements less than the pivot are at the start of the array, followed by the elements equal to the pivot, followed by the elements greater than the pivot.

This is trivial to do in a single pass and additional O(N) space, since we scan the array once and store the elements in one of three additional arrays (smaller, equal, greater) and then in another pass we reconstruct the result array.
We can get rid of the extra space but run in O(N^2) time by scanning for each element the array multiple times to make sure it is in the proper place, until we reach the end.

A better solution that runs in O(N) time and O(1) space, is to partition the array itself in 4 sections: smaller, equal, unclassified, larger and process one unclassified element at a time. For the unclassified section, equal is the pointer to the beginning of it and larger the pointer to the end. At the start, smaller and equals are the same and larger is the end of the array. It is not guaranteed that at each pass the array will be in a consistent state but it will be by the end of the algorithm!

So at each pass we inspect the current element and:

- if it is smaller, we swap it with the last of the smaller elements and move both pointers forward. We might swap the element with itself, but we are also reducing the unclassified section by 1
- if it is equal, we move the equals pointer forward, thus reducing the unclassified section by 1, but we do not move smaller because we might have another element to add there at a later point in time, and therefore we might have to swap again the first equal element to borrow space from it
- if it is bigger, we swap it with the first of the bigger elements (which is also the last of the unclassified elements) and only move the larger pointer backwards by 1, reducing the unclassified section by 1 as well. We do not move the equals pointer, because we borrowed space from an unclassified element we still need to check!

We stop as soon as the larger pointer surpasses the equals pointer, meaning we checked all unclassified elements.

The logic is unfortunately not straightforward, so it greatly helps to draw a sample run to verify it. Suppose our input array is: 5,4,3,2,1 and our index is 0, therefore the pivot is 5. We expect the output to be 4,3,2,1,5 since all other elements are smaller than 5, only one is equal to 5, and no other element is bigger than 5.

Start situation, e = equals, s = smaller, l = larger:

5 4 3 2 1
s       l
e

5 is equal to 5, so move equals up by 1

5 4 3 2 1
s e     l

4 is smaller than 5, so swap with smaller and move both smaller and equals up by 1

4 5 3 2 1
  s e   l

3 is smaller than 5, so swap with smaller and move both smaller and equals up by 1

4 3 5 2 1
    s e l

2 is smaller than 5, so swap with smaller and move both smaller and equals up by 1

4 3 2 5 1
      s e
        l


1 is smaller than 5, so swap with smaller and move both smaller and equals up by 1. This is also why we have to run until equals <= larger, otherwise we would miss the last swap!

4 3 2 1 5
        s e
        l

Now equals is bigger than larger, therefore we end our loop with a successful partition.

You can the code for partitionArray on my Gist along with some test cases in PartitionArrayJTests

No comments:

Post a Comment

With great power comes great responsibility