Algorithms in Go: Dutch National Flag
The flag of the Netherlands consists of three colors: red, white and blue. Given balls of these three colors arranged randomly in a line (it does not matter how many balls there are), the task is to arrange them such that all balls of the same color are together and their collective color groups are in the correct order.
For simplicity instead of colors red, white, and blue we will be dealing with ones
, twos
and zeroes
.
Let’s start with our intuition. We have an array of zeroth, ones, and twos. How would we sort it? Well, we could put aside all zeroes
into some bucket, all ones
into another bucket, and all twos
into the third. Then we can fetch all items from the first bucket, then from the second, and from the last bucket, and restore all the items. This approach is perfectly fine and has a great performance. We touch all the elements when we iterate through the array, and then we iterate through all the elements once more when we «reassamble» the array. So, the overall time complexity is O(n) + O(n) ~= O(n)
. The space complexity is also O(n)
as we need to store all items in the buckets.
Can we do better than that? There is no way to improve our time complexity. However, we can think of a more efficient algorithm in regard to space complexity. How would we solve the problem without the additional buckets?
Let’s make a leap of faith and pretend that somehow we were able to process a part of the array. We iterate through part of the array and put encountered zeroes
and ones
at the beginning of the array, and twos
at the end of the array. Now, we switched to the next index i
with some unprocessed value x
. What should we do there?
In this case, there are three possibilities: x
could be equal to 0
, 1
or 2
.
Let’s say x
is equal to 2
. Where should we put it? We already have some twos at the end of the array, so we need to put current two
right before the start of the known twos. We don’t know which value is located before the start of known twos. Let’s call this value y
.
We swap our current value and y
. As we don’t know the value y
at our current index i
, it is possible that we need to swap it once more. So we don’t proceed to the next value, i.e. we don’t increase our running pointer i
. After this operation we have:
We note that we now have one more two
at the end, so we need to shift the start of the known twos. This fact tells us that we need to have one more pointer that points to the start of the processed twos
, let’s denote this pointer as high
. When we encounter another two
we put it right before this pointer.
Now, let’s consider the case when x
is equal to zero
. We need to put this zero
right after the last processed zero
After this operation, we need to shift the end of known zeroes to the right. Therefore, we will have another pointer low
that points to the end of known zeroes. When we encounter another zero
we put it to the right of this pointer.
We swap the items at indices i
and low+1
. As index low+1
goes directly after index low
that denotes the end of zeroes, the value at index low+1
will always be equal to one
.
Now, what do we do with ones? That’s a tricky part because, actually, we do nothing. The item is already in place, so we are free to proceed to the next item.
Ok, this could work when we are in the middle of the processing. But how do we start?
The running pointer is initialized with a value equal to zero. We don’t have any known twos
yet, so the start of twos is equal to the length of the array. In this way, the first encountered two
will be put at the very end of the array which seems logical. We don’t have any known zeroes
either, so the end of zeroes
is equal to minus one. Therefore, the first encountered
zero will be put at the very beginning of the array and that’s exactly what we want.
Right, how do we know where should we stop? We could iterate through the whole array, but at some point the movement becomes redundant. When our running pointer becomes equal to high
, we know for sure that the value at that index is equal to two
and we have already processed it. Therefore, we can stop there. If we don’t have any twos
then the value of high
is equal to the length of the array, and we stop at the last item.
Finally, we can implement the algorithm in Go:
func Sort(arr []int) {
low := -1
high := len(arr)
for i := 0; i < high; {
val := arr[i]
switch val {
case 0:
arr[i], arr[low+1] = arr[low+1], arr[i]
low++
i++
case 1:
i++
case 2:
arr[i], arr[high-1] = arr[high-1], arr[i]
high--
}
}