Mergesort Investigation 2 - 2021 program
My July 2021 mergesort program, as the subject of an investigation, deserves some explanation.
A linked list is a computer data structure, an organization of data that makes it cheap to insert, remove, and reorder elements without moving them in memory. Linked lists are usually used to store information that has a sequential notion about it.
My conception of a linked list is to have “nodes”, elements of the list, that contain a numeric value field (the data being organized), and a “next” field, which contains the address of the element of the list that logically follows the node. The linked list as a whole is referenced by the address of its first, “head” node. Other nodes of the list can be found by “walking” the linked list, successively using the “next” field of a node to obtain a reference to the actual next node.
In Go source code, that looks like this:
type Node struct {
    Data uint
    Next *Node
}
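As a tiny illustration of “walking” (not part of the 2021 program, just the idea above in code), here's a three-node list built by hand and traversed to its end:

package main

import "fmt"

// Node as defined above.
type Node struct {
    Data uint
    Next *Node
}

func main() {
    // Build the list 3 → 1 → 2 by hand, head first.
    third := &Node{Data: 2}
    second := &Node{Data: 1, Next: third}
    head := &Node{Data: 3, Next: second}

    // Walk the list: follow each node's .Next until it runs out.
    for n := head; n != nil; n = n.Next {
        fmt.Println(n.Data)
    }
}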
This structure specification did evolve over the course of my investigation. The .Data field originally had type int, a 64-bit signed integer. I changed it to uint, an unsigned 64-bit integer, to accommodate setting .Next elements so a linked list could be sorted from high to low memory addresses.
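The connection to memory addresses is just that a node's address fits naturally in an unsigned integer. A minimal sketch of turning a node pointer into a uint value (an illustration using unsafe, not necessarily how the actual test setup did it):

package main

import (
    "fmt"
    "unsafe"
)

type Node struct {
    Data uint
    Next *Node
}

func main() {
    n := &Node{}
    // A node's address converts to uint via uintptr; a signed int
    // could misrepresent a large address as a negative number.
    n.Data = uint(uintptr(unsafe.Pointer(n)))
    fmt.Printf("%#x\n", n.Data)
}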
My algorithm merges sublists of a power-of-2 size from the beginning to the end of the list, being careful not to lose the .Next pointer to the “rest of the list”, the part of the list not actively being merged.
The merge code looks like this:
for psize > 0 && qsize > 0 && q != nil {
    if p.Data < q.Data {
        appnd(p)
        p = p.Next
        psize--
        continue
    }
    appnd(q)
    q = q.Next
    qsize--
}
for ; psize > 0 && p != nil; psize-- {
    appnd(p)
    p = p.Next
}
for ; qsize > 0 && q != nil; qsize-- {
    appnd(q)
    q = q.Next
}
That’s something like:

1. Compare the node at the head of list p with the node at the head of list q, and append the smaller-valued of the two to the current being-merged list.
2. Keep doing step (1) until we’ve used up one or the other of lists p or q. This is kept track of by one size variable per list.
3. If there are any nodes left on the p list, append them.
4. If there are any nodes left on the q list, append them.

Since only one list, p or q, has nodes left, only one of steps (3) and (4) happens.
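The appnd calls in the merge code are a small helper that hangs a node on the tail of the list being built. The helper itself isn't shown above, so here is a self-contained sketch of the merge step with one plausible appnd written as a closure over the output tail. The mergeRuns name and the scaffolding around it are illustration, not the 2021 program itself:

package main

import "fmt"

type Node struct {
    Data uint
    Next *Node
}

// mergeRuns merges a run of up to psize nodes starting at p with a run
// of up to qsize nodes starting at q, both already sorted, and returns
// the head and tail of the merged run.
func mergeRuns(p *Node, psize int, q *Node, qsize int) (head, tail *Node) {
    var dummy Node
    tail = &dummy

    // appnd hangs a node on the tail of the list being built.
    appnd := func(n *Node) {
        tail.Next = n
        tail = n
    }

    for psize > 0 && qsize > 0 && q != nil {
        if p.Data < q.Data {
            appnd(p)
            p = p.Next
            psize--
            continue
        }
        appnd(q)
        q = q.Next
        qsize--
    }
    for ; psize > 0 && p != nil; psize-- {
        appnd(p)
        p = p.Next
    }
    for ; qsize > 0 && q != nil; qsize-- {
        appnd(q)
        q = q.Next
    }
    return dummy.Next, tail
}

func main() {
    // Two sorted 2-node runs: 3 → 7 and 1 → 5.
    b := &Node{Data: 7}
    a := &Node{Data: 3, Next: b}
    d := &Node{Data: 5}
    c := &Node{Data: 1, Next: d}

    head, _ := mergeRuns(a, 2, c, 2)
    for n := head; n != nil; n = n.Next {
        fmt.Println(n.Data) // prints 1, 3, 5, 7
    }
}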
Above, a representation of an unsorted linked list. It’s actually sorted in decreasing data order, but the algorithm puts linked lists in increasing data order. I’m going to keep the elements in the same places in this and a few following diagrams. I designate these places as memory addresses. Any linked list sort is going to put elements in sorted order by .Next pointer, not by moving data values in memory. The dashed lines exist solely to keep the nodes in increasing memory order. The bold arrow lines represent the .Next pointers that characterize linked lists.
First, the algorithm walks the list, two 1-element lists at a time, swapping high and low value node pointers. After that iteration, the linked list can be imagined to look like this:
The head of the list is the node with data value of 15. If you follow the bold arrows, you get data values ordered like this:
15 → 16 → 13 → 14 → 11 → 12 → 9 → 10 → 7 → 8 → 5 → 6 → 3 → 4 → 2 → 1
No list node “moves” in memory. Only .Next pointers change.
Next, the algorithm walks the list, merging pairs of 2-element lists at a time, creating 4-node lists sorted by increasing data order. After that iteration, the linked list can be imagined to look like this:
The node with data value of 13 is now the head of the list.
Following the .Next pointers, you can see that the list has 4-element sublists sorted in increasing order.
13 → 14 → 15 → 16 → 9 → 10 → 11 → 12 → 5 → 6 → 7 → 8 → 1 → 2 → 3 → 4
After the third iteration, which merges pairs of 4-node sublists, the 9-valued node is the head of the list, which is clearly divided into two 8-node sorted sublists.
The fourth iteration, merging pairs of 8-node sublists, ends up with the 1-valued node as the list head. Following .Next pointers, you find the list sorted by increasing data value, and sorted by decreasing list node address.
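Those four passes are exactly what a bottom-up, iterative mergesort does: the sublist size starts at 1 and doubles each pass until it covers the whole list. Here's a sketch of that outer structure, building on the Node type and mergeRuns helper from the sketch above, with a cutRun helper invented purely for illustration; it shows the shape of the iteration, not the actual 2021 code:

// sortList runs bottom-up mergesort passes over a list of n nodes,
// doubling the sublist size each pass, as in the walkthrough above.
func sortList(head *Node, n int) *Node {
    for size := 1; size < n; size *= 2 {
        var dummy Node
        tail := &dummy
        rest := head
        for rest != nil {
            // Peel off two adjacent runs of up to size nodes each,
            // keeping a pointer to the rest of the list.
            p := rest
            q := cutRun(p, size)
            rest = cutRun(q, size)
            mergedHead, mergedTail := mergeRuns(p, size, q, size)
            tail.Next = mergedHead
            tail = mergedTail
        }
        tail.Next = nil
        head = dummy.Next
    }
    return head
}

// cutRun walks up to size nodes from start, detaches that run from
// whatever follows it, and returns the node after the run (nil if
// the list ended first).
func cutRun(start *Node, size int) *Node {
    for i := 1; start != nil && i < size; i++ {
        start = start.Next
    }
    if start == nil {
        return nil
    }
    next := start.Next
    start.Next = nil
    return next
}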
I took the “constant space” stipulation seriously, because it seemed unusual. Some coding problems have extra requirements to cause job candidates to do some algorithmic thinking, or to keep someone from regurgitating a memorized solution.
My algorithm is iterative, so no (control flow) stack growth at all, much less unbounded stack growth.
It uses five list-node-pointer variables, one integer to count the pairs of sublists merged, one for-loop counter, and two size-of-sublist integer counts. One of the list-node-pointer variables is the formal argument to an anonymous function that exists only to de-clutter flow of control, and could be eliminated.
It has the goofy, puzzling, dramatic slowdowns demonstrated before.
Some factors I had ruled out previously:
- Garbage collection. A C language transliteration exhibited the same drops. Besides that, the timed portion of the program doesn’t generate reclaimable garbage. The nodes of the linked list aren’t allocated and discarded; they’re live for the length of the timed execution.
- Operating system. The same code compiled for macOS and Linux has the same drops, apparently at the same number of nodes in the list.
- Some kind of “adversary” random number generation. I tried two very different PRNGs and got the same results.
The performance drops at obvious power-of-2 numbers of nodes are still puzzling. Since the algorithm marches through large linked lists in power-of-2 chunks from beginning to end, the performance drops at power-of-2 list sizes seem like they have to be related.