Heapsort - overview
Heapsort is a comparison based and in-place sorting algorithm. It was invented by John Williams in 1964.
What is a heap?
A heap is a tree-based data structure that must satisfy two properties to be a valid heap. First it must be a
complete binary tree and secondly it must satisfy the
heap property.
- In a complete binary tree each node has exactly two children. So a heap is entirely filled except for the lowest level. In the lowest level the leaves must be filled from left to right.
- The heap-order property says that: "Every node in the heap is <= to its children" (This is valid for min-heaps. For max-heaps we simply change "<=" against ">=").
A heap can be represented in an array. See the following illustration:
|
An example of a min-heap structure and its corresponding representation in array form. Note
that this heap consists of a binary tree that is complete and that each node is <= its children. |
Since a heap has the property of completeness, we know that there are no "holes" in the array. At each position in the array from 0 to n-1 we will have a valid value.
To find the index of the first child of a node with index
i, we just need to use the formula
2i + 1. The second child index is found at
2i + 2. On the other hand, if we want to get the index of the parent, we would instead use
floor((i - 1) / 2). Since integer division always results in an integer we can simply use
(i-1) / 2.
Description of the heapsort algorithm
Heapsort consists of two parts. First a heap structure is constructed from the input data. For a min-heap the root value now is the smallest value in the heap, and for a max-heap it would be the largest.
In the second step the root value is removed from the heap by swapping the first with the last value. The root value is inserted into the right part of the array and is now considered sorted. Now the heap is one element smaller than before.
Next the algorithm will adjust the heap to restore the heap property and then remove the next smallest value and insert into the right sorted part of the array, by swapping first and last vaues. Then it will again adjust the heap and remove the next smallest value and so on. This will go on until all values in the heap has been moved to the right part. At that point the values in the array are sorted.
Computational complexity
Worst, best and average case performance are all
O(n log n).
Example of an implementation of heapsort in C#
private void SortArrayWithHeapSort()
{
int[] array = { 7, 5, 23, 2, 0, 2, 2 };
BuildHeap(array);
for (int i = array.Length - 1; i > 0; i--)
{
SwapValues(array, 0, i);
int size = i;
Heapify(array, size, 0);
}
// PrintResult(array);
}
private void BuildHeap(int[] a)
{
int size = a.Length;
for (int i = size / 2; i >= 0; i--)
{
Heapify(a, size, i);
}
}
private void Heapify(int[] a, int size, int index)
{
int value = a[index];
while (index < size)
{
int childPos = index * 2 + 1;
if (childPos < size)
{
if (childPos + 1 < size && a[childPos + 1] < a[childPos])
{
childPos++;
}
if (value < a[childPos])
{
a[index] = value;
return;
}
else
{
a[index] = a[childPos];
index = childPos;
}
}
else
{
a[index] = value;
return;
}
}
}
private void SwapValues(int[] a, int i, int j)
{
int temp = a[i];
a[i] = a[j];
a[j] = temp;
}