CMSC--202 Computer Science II for Majors
Fall 1997 5 November 1997 Three Sorting Routines
This handout presents three different sorting routines: selection sort, merge sort, and quicksort. Each of these routines takes four arguments: an array of items to sort; the low and high indices in the array between which lie the elements we wish to sort; and a comparator that takes two items and returns:
First, here is the swap routine that is used by selectsort
and
by quicksort
.
void swap(array_item_type * x, array_item_type * y) { array_item_type temp = *x; *x = *y; *y = temp; }
selectsort
works by repeatedly finding the next
item in the sequence and putting that item in its proper place.
First, we define a helper function for selectsort.
FindNextIndex
returns the index of the element of the
subarray A[low...high]
which most satisfies the comparison
function comp.
static index_type FindNextIndex(array_type A, index_type low, index_type high, comparefunctype comp) { assert(low <= high); if (low == high) return(low); else if (comp(A[low], A[high]) < 0) return(FindNextIndex(A, low, high - 1, comp)); else return(FindNextIndex(A, low + 1, high, comp)); }
Now, here is selectsort
void selectsort(array_type A, index_type low, index_type high, comparefunctype comp) { if (low < high) { int nextpos = FindNextIndex(A,low,high,comp); swap(&A[low], &A[nextpos]); selectsort(A, low + 1, high, comp); } }
mergesort
relies on the merge
function to merge together
two sorted arrays. mergesort
gets its efficiency from the fact
that at most one comparison needs to be made for each item being
merged.
Think about merging two decks of cards, A
and B
, into a
third deck C
. Decks A
and B
are assumed to each
be already in sorted order. Deck C
stars out empty.
You compare the top cards from each deck and place the smaller of the
two, face down, on the merged deck. When either deck A
or
deck B
becomes empty, you place the entire remaining deck, face
down, onto deck C
. The key here is that the two decks,
A
and B
are sorted to begin with. At most one
comparison gets made for each card merged. When one of the decks
becomes empty, no further comparisons are done, the entire remaining
deck is merged in at once. This is the basis for the following merge
algorithm.
In this algorithm, the ``cards'' are stored in an array.
Deck A
is stored in indices low1..high1
,
inclusive. Deck B
is stored in indices low2..high2
. We
further stipulate that the two ``decks'' are contiguous in the array
(the last card of deck A
comes just before the first card of
deck B
).
The mergesort
algorithm uses the merge
algorithm (below)
to sort the entire array. To sort the whole array A
,
low
would be zero and high would be the maximum index in the
array.
void mergesort(array_type A, index_type low, index_type high, comparefunctype p) { if (low < high) { int mid; mid = (low + high)/2; mergesort(A, low, mid, p); mergesort(A, mid + 1, high, p); merge(A, low, mid, mid + 1, high, p); } }
Mergesort
breaks the array down into smaller and smaller pieces
until the individual pieces are just one item in size. Since one item
is ``sorted,'' we can merge two contiguous items. The
mergesort
algorithm therefore breaks the array down into
smaller chunks on the way down the recursion tree. On the way back
up, it merges these smaller pieces of the array into larger pieces.
One could say that the sorting is done on the way back up the tree.
Note that the recursion pattern is the same as that for postfix
traversal of a binary tree.
void merge(array_type A, index_type low1, index_type high1, index_type low2, index_type high2, comparefunctype comp) { array_item_type *temp; int array_size; index_type temp_index = 0; index_type index1 = low1; index_type index2 = low2; index_type i; /* make sure the two areas are contiguous */ assert(high1 + 1 == low2); /* Make a temporary array to hold the merged subarrays */ array_size = high2 - low1 + 1; temp = (array_item_type *) calloc(array_size, sizeof(array_item_type)); /* while there are elements in both halves, copy the lowest one */ while (index1 <= high1 && index2 <= high2) { if (comp(A[index1],A[index2]) < 0) temp[temp_index++] = A[index1++]; else temp[temp_index++] = A[index2++]; } /* copy any remaining elements */ while (index1 <= high1) temp[temp_index++] = A[index1++]; while (index2 <= high2) temp[temp_index++] = A[index2++]; /* copy the now-sorted elements back into the original array */ for (i = low1; i <= high2; i++) A[i] = temp[i - low1]; free(temp); }
Just as in mergesort
, quicksort
splits the array into
two pieces. However, quicksort
takes the additional step of
placing all the low values in the left piece and all the high values
in the right piece. Consquently, no merge step is needed after the
two halves have been sorted.
The partition
function (below) is responsible for dividing the
array into two pieces.
void quicksort(array_type A, index_type low, index_type high, comparefunctype comp) { if (low < high) { int q = partition(A, low, high, comp); quicksort(A, low, q, comp); quicksort(A, q + 1, high, comp); } }
In quicksort
, the partitioning is done on the way down the
recursion tree. One could say that the sorting is done on the way
down. Notice that the recursion pattern in quicksort
is the
same as that for prefix traversal of a binary tree.
The partition
function starts an index at each end of the array
and moves them inward. Whenever two values that are in the wrong halves
are found, they are swapped. When the two indices meet, the
partitioning is complete.
index_type partition(array_type A, index_type low, index_type high, comparefunctype comp) { array_item_type x = A[low]; index_type i = low - 1; index_type j = high + 1; while (1) { do {j = j - 1;} while (comp(A[j], x) > 0); /* A[j] should follow x */ do {i = i + 1;} while (comp(A[i], x) < 0); /* A[i] should precede x */ if (i < j) swap(&A[i], &A[j]); else return j; } }
Most C systems provide a library function to sort arrays. The function is often quicksort and has the following signature:
void qsort(void * base, size_t nmemb, size_t size, int (*compar)(const void*,const void*))
The type size_t
is system dependent, but is always some kind of
integer. For now, think of it as just an integer.
The qsort
function sorts an array with nmemb
elements of
size size
. The base
argument points to the start of the
array. The comparison function compar
must return an integer
less than, equal to, or greater than zero if its first argument is
considered to be respectively less than, equal to, or greater than its
second argument. If two members compare as equal, their order in the
sorted array is undefined ( i.e., they will be in the correct
sorted order, but which equal one precedes the other is undefined).
Here's an example of the use of qsort
.
#include <stdio.h> #include <stdlib.h> int precedes(const void * x, const void * y) { return (*(int *)x - *(int *)y); } main() { int foo [] = {9,3,5,12,1,7,9,3,6}; int foosize = sizeof(foo)/sizeof(int); int i; for (i = 0; i < foosize; i++) printf("%d ", foo[i]); printf("\n"); qsort(foo, foosize, sizeof(int), precedes); for (i = 0; i < foosize; i++) printf("%d ", foo[i]); printf("\n"); }
Notice that the precedes
function must have the correct
signature for use as the compar
argument. The function must
therefore type convert the arguments appropriately.