måndag 7 juli 2014

Bucket sort in C#

Bucket sort - overview

Bucket sort is a good example of a distribution sort algorithm. It is closely related to Radix sort and counting sort.

Description of the bucket sort algorithm

Bucket sort is a sorting algorithm that distributes all values in an array into a number of so called buckets. Then each bucket is sorted by a different algorithm or by applying the same algorithm recursively. Finally the separate buckets are concatenated to get the final sorted array.

The algorithm works like this:
  1. Initialize a list of empty buckets.
  2. Loop through the input array and put each object in a corresponding bucket. This step is called "scattering".
  3. Sort all buckets
  4. Go through the sorted buckets in order and put the elements back into the original array. This step is called "gathering".

Computational complexity

Bucket sort has has a worst case performance of O(n2). In average it runs in O(n + k) time.

Example of an implementation of bucket sort in C#


    private void SortArrayWithBucketSort()
    {
        double[] array = { 0.37, 0.25, 0.86, 0.23, 0.09, 0.21, 0.17, 0.71 };
        double[] sortedArray = BucketSort(array);

        // PrintResult(sortedArray);
    }

    public double[] BucketSort(double[] array)
    {
        List<List<double>> buckets = new List<List<double>>();
        InitializeBuckets(buckets);

        Scatter(array, buckets);

        int i = 0;
        foreach (List<double> bucket in buckets)
        {
            double[] arr = bucket.ToArray();
            InsertionSort(arr);

            foreach (double d in arr)
            {
                array[i++] = d;
            }
        }

        return array;
    }

    private void Scatter(double[] array, List<List<double>> buckets)
    {
        foreach (double value in array)
        {
            int bucketNumber = GetBucketNumber(value);
            buckets[bucketNumber].Add(value);
        }
    }

    private void InsertionSort(double[] array)
    {
        int j;
        double temp;

        for (int i = 1; i < array.Length; i++)
        {
            j = i;
            while (j > 0 && array[j] < array[j - 1])
            {
                temp = array[j];
                array[j] = array[j - 1];
                array[j - 1] = temp;
                j--;
            }
        }
    }

    private int GetBucketNumber(double value)
    {
        double val = value * 10;
        int bucketNumber = (int)Math.Floor(val);
        return bucketNumber;
    }

    private static void InitializeBuckets(List<List<double>> buckets)
    {
        for (int i = 0; i < 10; i++)
        {
            List<double> a  = new List<double>();
            buckets.Add(a);
        }
    }




fredag 4 juli 2014

Counting sort in C#

Counting sort - overview

Counting sort (or histogram sort) is a linear integer sorting algorithm. It counts the number of elements that have the same value in the input array, puts the number of items in a a auxilliary counting array and then uses arithmetic to get the position of each value in the output array. The algorithm was invented by Harold H. Seward in 1954.

Counting sort does not use any comparisons, so it is not a comparison sort. Instead, it uses the values of the elements in the input array as indexes in an array used for counting.

A useful property of counting sort is that it is stable. This property is important when there is satellite data belonging to the elements being sorted.

The stability property is useful for another reason. In radix sort the counting sort algorithm is often used as a subroutine. In order to get Radix sort to work it is required that counting sort is stable. 

Description of the counting sort algorithm

Lets assume that we have an input array containing n values in the range [0..k]. Counting sort first creates an auxilliary counting array (with  k+1 elements) to hold frequencies of the elements in the input array. It then loops through all values in the input array and for each value, it uses that value as index in the counting array and increments that value by 1. For example if the value in the input array is 17, then it would execute countingArray[17]++ for each value.

After this first loop the counting array now holds the frequencies for all values in the input array.

In the next step the counting array is recalculated so that at position i, the value holds information about how many input elements that are less than or equal to the element i. In other words, the element at index=i holds the index position j of where the value i should be put in the output array. If using a zero indexed array we take the number j and decrease by 1, and then we get the final position.

Next an array for the output is created. The input array is then looped through, and for each value v it retrieves the position of where v should be placed in the output array. The value is retrieved from the recalculated counting array. Then it decreases the value in the counting array by 1.

Computational complexity

Counting sort has has a complexity of  O(n + k), where n is the size of the sorted array and k is the size of the helper array.

Example of an implementation of counting sort in C#


    private void SortArrayWithCountingSort()
    {
        int[] arrayToSort = { 2, 6, 6, 10, 10, 3 };
        int[] sorted = CountingSort(arrayToSort);

        // PrintResult(sorted);
    }

    private int[] CountingSort(int[] array)
    {
        int[] count = new int[11];

        for (int i = 0; i < array.Length; i++)
        {
            int value = array[i];
            count[value]++;
        }

        for (int i = 1; i < count.Length; i++)
        {
            count[i] = count[i] + count[i - 1];
        }

        int[] sorted = new int[array.Length];

        for (int i = array.Length - 1; i >= 0; i--)
        {
            int value = array[i];
            int position = count[value] - 1;
            sorted[position] = value;

            count[value]--;
        }

        return sorted;
    }




onsdag 2 juli 2014

Heapsort in C#

Heapsort - overview

Heapsort is a comparison based and in-place sorting algorithm. It was invented by John Williams in 1964.

What is a heap?

A heap is a tree-based data structure that must satisfy two properties to be a valid heap. First it must be a complete binary tree and secondly it must satisfy the heap property.

  • In a complete binary tree each node has exactly two children. So a heap is entirely filled except for the lowest level. In the lowest level the leaves must be filled from left to right.
  • The heap-order property says that: "Every node in the heap is <= to its children" (This is valid for min-heaps. For max-heaps we simply change "<=" against ">=").

A heap can be represented in an array. See the following illustration:

An example of a min-heap structure and its corresponding representation in array form. Note
 that this heap consists of a binary tree that is complete and that each node is <= its children.

Since a heap has the property of completeness, we know that there are no "holes" in the array. At each position in the array from 0 to n-1 we will have a valid value.

To find the index of the first child of a node with index i, we just need to use the formula 2i  + 1. The second child index is found at 2i  + 2. On the other hand, if we want to get the index of the parent, we would instead use floor((i - 1) / 2). Since integer division always results in an integer we can simply use (i-1) / 2.

Description of the heapsort algorithm

Heapsort consists of two parts. First a heap structure is constructed from the input data. For a min-heap the root value now is the smallest value in the heap, and for a max-heap it would be the largest.

In the second step the root value is removed from the heap by swapping the first with the last value. The root value is inserted into the right part of the array and is now considered sorted. Now the heap is one element smaller than before.

Next the algorithm will adjust the heap to restore the heap property and then remove the next smallest value and insert into the right sorted part of the array, by swapping first and last vaues. Then it will again adjust the heap and remove the next smallest value and so on. This will go on until all values in the heap has been moved to the right part. At that point the values in the array are sorted.

Computational complexity

Worst, best and average case performance are all O(n log n).

Example of an implementation of heapsort in C#


    private void SortArrayWithHeapSort()
    {
        int[] array = { 7, 5, 23, 2, 0, 2, 2 };
        BuildHeap(array);

        for (int i = array.Length - 1; i > 0; i--)
        {
            SwapValues(array, 0, i);

            int size = i;
            Heapify(array, size, 0);
        }

        // PrintResult(array);
    }

    private void BuildHeap(int[] a)
    {
        int size = a.Length;

        for (int i = size / 2; i >= 0; i--)
        {
            Heapify(a, size, i);
        }
    }

    private void Heapify(int[] a, int size, int index)
    {
        int value = a[index];

        while (index < size)
        {
            int childPos = index * 2 + 1;

            if (childPos < size)
            {
                if (childPos + 1 < size && a[childPos + 1] < a[childPos])
                {
                    childPos++;
                }

                if (value < a[childPos])
                {
                    a[index] = value;
                    return;
                }
                else
                {
                    a[index] = a[childPos];
                    index = childPos;
                }
            }
            else
            {
                a[index] = value;
                return;
            }
        }
    }

    private void SwapValues(int[] a, int i, int j)
    {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }