Ok, this one is definitely my favourite.

The core principle behind a merge sort is divide-and-conquer. This is a technique for solving programming problems which relies on discovering a way to divide the problem up into smaller problems, and then solving each of the smaller problems at the same time (see multicore machines for an explanation of why we can do multiple things at once on a computer).

We start with a list. Note that this list is a bit longer than the bubble sort example one because it demonstrates the advantage of merge sort better.


[1, 7, 7, 11, 18, 1, 23, 5, 10, 14, 3, 6, 20, 23, 28, 11, 28, 13, 8, 22]

The key to thinking about merge sort is that we are trying to maximise the number of operations that we can conduct in parallel (i.e. at the same time). We start by dividing up our list into groups of two (if we can't fit everything into neat groups of two – i.e. if the length of our list is not divisible by two –, we do what we always do in computer science and just try to make it fit as best we can – 'imperfect' solutions that work are fine).

Our list has twenty items (the format I have provided it in is designed to be easy to paste it as Python code, so you can follow along and experiment). First we divide our list into two halves. This means we have:


# let's call this list A
[1, 7, 7, 11, 18, 1, 23, 5, 10, 14]
# let's call this list B
[3, 6, 20, 23, 28, 11, 28, 13, 8, 22]

We now divide our halves again.


# let's call this list A_A because it is the first half of A
[1, 7, 7, 11, 18]
# let's call this list A_B because it is the second half of A
[1, 23, 5, 10, 14]
# let's call this list B_A
[3, 6, 20, 23, 28]
# let's call this list B_B
[11, 28, 13, 8, 22]

We do this again.


# let's call this list A_A_A because it is the first half of A_A
[1, 7]
# let's call this list A_A_B because it is the second half of A_A
[7, 11, 18]
# let's call this list A_B_A because it is the first half of A_B
[1, 23]
# let's call this list A_B_B because it is the second half of A_B
[5, 10, 14]
# let's call this list B_A_A because it is the first half of B_A
[3, 6]
# let's call this list B_A_B because it is the second half of B_A
[20, 23, 28]
# let's call this list B_B_A because it is the first half of B_B
[11, 28]
# let's call this list B_B_B because it is the second half of B_B
[13, 8, 22]

Once we've performed the division at stage 3, we have a lot of lists that are three elements long, and some that are two elements long. What we do here is divide up the three-element lists into a two element list and a one element long list. We leave the two element lists unchanged.


# let's call this list A_A_A because it is the first half of A_A
[1, 7]
# let's call this list A_A_B_A because it is the first half of A_A_B
[7, 11]
# let's call this list A_A_B_B because it is the second half of A_A_B
[18]
# let's call this list A_B_A because it is the first half of A_B
[1, 23]
# let's call this list A_B_B_A because it is the first half of A_B_B
[5, 10]
# let's call this list A_B_B_B because it is the second half of A_A_B
[14]
# let's call this list B_A_A because it is the first half of B_A
[3, 6]
# let's call this list B_A_B_A because it is the first half of B_A_B
[20, 23]
# let's call this list B_A_B_B because it is the second half of B_A_B
[28]
# let's call this list B_B_A because it is the first half of B_B
[11, 28]
# let's call this list B_B_B_A because it is the first half of B_B_B
[13, 8]
# let's call this list B_B_B_B because it is the second half of B_B_B
[22]

We have now finished our division. The next step is to merge and sort the lists. We want to merge all the X_X_X_A (where X_X_X is any three As or Bs in an order). Let's start with A_A_B_A and A_A_B_B (so [7, 11] and [18]). To sort these we do what is called "card-player's trick" where we pick the smallest item of each list (the pop function in most programming languages). First we pick the first item of [7, 11], which is 7. We then compare 7 to 18 – 7 is definitely smaller, so we place 7 as hte first item of our sorted list (so we now have [7]). We now pick the next smallest item (11) and compare it to 18. 11 is smaller, so we place it next in the sorted list – we now have [7, 11]. Because 18 is the only item remaining, we simply push it to the end, so we are left with [7, 11, 18] which is what we want. We do the same process for all the other four-lettered lists (e.g. A_B_B_A). If you don't understand this merging process, have a look at step 8 where I run through another merge.


# let's call this list A_A_A because it is the first half of A_A
[1, 7]
# let's call this list A_A_B (sorted)
[7, 11, 18]
# let's call this list A_B_A because it is the first half of A_B
[1, 23]
# let's call this list A_B_B (sorted)
[5, 10, 14]
# let's call this list B_A_A because it is the first half of B_A
[3, 6]
# let's call this list B_A_B (sorted)
[20, 23, 28]
# let's call this list B_B_A because it is the first half of B_B
[11, 28]
# let's call this list B_B_B (sorted)
[8, 13, 22]

We now sort all the elements at this level


# let's call this list A_A_A (sorted)
[1, 7]
# let's call this list A_A_B (sorted)
[7, 11, 18]
# let's call this list A_B_A (sorted)
[1, 23]
# let's call this list A_B_B (sorted)
[5, 10, 14]
# let's call this list B_A_A (sorted)
[3, 6]
# let's call this list B_A_B (sorted)
[20, 23, 28]
# let's call this list B_B_A (sorted)
[11, 28]
# let's call this list B_B_B (sorted)
[8, 13, 22]

We now merge them again.


# A_A (sorted)
[1, 7, 7, 11, 18]
# A_B (sorted)
[1, 5, 10, 14, 23]
# B_A (sorted)
[3, 6, 20, 23, 28]
# B_B (sorted)
[8, 11, 13, 22, 28]

We now merge again. Let's take A_A (sorted) and A_B (sorted). We have [1, 7, 7, 11, 18] and [1, 5, 10, 14, 23]. We know that these lists are sorted (in ascending order – i.e. from smallest to biggest). We want to form a new list of 10 items (combine the two lists of five). Let's think about the first item we want to place into this list. We know that the smallest item must be the first item of one of our two lists (because the first item of each list is the smallest item, and therefore the overall smallest item will be one of these). We remove the two items (one from each list), so we have 1 and 1. When we compare 1 and 1 we can see that they are equal, so we can push them in in either order into our list of ten sorted items – this list now looks like [1, 1]. We now pick the next two items. These are 7 (from A_A) and 5 (from A_B). Comparing these, it is clear that 7 is bigger than 5. Therefore we push 5 onto the sorted list (which now looks like [1, 1, 5]). We keep the 7, and compare it to the next item from A_B, which is 10. 7 is definitely smaller than 10, so we push it straight onto our sorted list, which now looks like [1, 1, 5, 7]. We remove the next smallest item from A_A (which is 7, again) and compare is to the item from A_B (the 10 we already removed). As before, 7 is smaller than 10, so we push it onto our list – which now looks like [1, 1, 5, 7, 7]. We remove the next item from A_A (i.e. 11) and compare it to the item from A_B (the same 10 we've already removed). 11 is bigger than 10, so we push the 10 onto the sorted list (which now looks like [1, 1, 5, 7, 7, 10, 11]). We now pick the next item from A_B (14) and conclude that 11 is less than 14, so we push 11 onto the list (so we have [1, 1, 5, 7, 7, 10, 11]). Now we pick the next item from A_A (18), which is definitely bigger than 14 (so we push 14 on the list, leaving us with [1, 1, 5, 7, 7, 10, 11, 14]). We remove the final item from A_B (23) and compare it to 18, noticing that 18 is less than 23, so we push 18 and then 23. The final list is therefore [1, 1, 5, 7, 7, 10, 11, 14, 18, 23].


[1, 1, 5, 7, 7, 10, 11, 14, 18, 23]
[3, 6, 8, 11, 13, 20, 22, 23, 28, 28]

We now finish our merging, and are left with the sorted list.


[1, 1, 3, 5, 6, 7, 7, 8, 10, 11, 11, 13, 14, 18, 20, 22, 23, 23, 28, 28]

why is merge sort fast(er)?

We can do operations in parallel – for example, when we need to merge four lists at one stage, we can use multiple processor cores so that we can merge each pair of lists at the same time rather than having to merge the first pair and the second pair sequentially.

UK secondary Computer Science

why is merge sort fast(er)?

further reading