Gettysburg College

CS 216
Data Structures

Spring 2024

Assignment 8

Due: Mon, Apr 1, by 11:59pm
  • method stubs: MaxHeap|HeapSort|QuickSort|MergeSort.java
  • junit tests: SortAlgosTest.java
  • coverage-MaxHeap|HeapSort|QuickSort|MergeSort.pdf
  • no API required
Due: Thu, Apr 4, by 11:59pm
  • MaxHeap|HeapSort|QuickSort|MergeSort.java
  • SortAlgosTest.java
  • coverage-MaxHeap|HeapSort|QuickSort|MergeSort.pdf
Due: Fri, Apr 5, by 11:59pm
  • CompareSortAlgos.java
  • CompareSortAlgos.xlsx

Description

Implement the three sorting algorithms discussed in class and provide experimental evaluation and analysis of their running times.

Create a project called SortAlgos with the classes mentioned below.

Javadoc API is not required.

Generic Methods

Some of the methods below are generic but they are not part of a generic class. To declare a generic method, simply put <E> in front of the return type. Here is an example:

public <E> void printArray(E[] items)
{
}
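As a small illustration of how such a method is called, here is a sketch with a hypothetical class GenericMethodDemo and method describe (neither is part of the assignment). The method is static here only so that it can be called from main without creating an object:

```java
import java.util.Arrays;

class GenericMethodDemo {
    // <E> in front of the return type makes this one method generic,
    // even though the enclosing class is not generic.
    static <E> String describe(E[] items) {
        return items.length + " items: " + Arrays.toString(items);
    }

    public static void main(String[] args) {
        Integer[] numbers = {3, 1, 2};
        String[] words = {"b", "a"};
        System.out.println(describe(numbers)); // E is inferred as Integer
        System.out.println(describe(words));   // E is inferred as String
    }
}
```

The caller never writes the type argument; the compiler infers E from the array that is passed in.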

The Tester

See the JUnit Tests section below for how to set up the Tester.

Swaps and Comparisons

Download the SortUtils.jar file into the SortAlgos project folder (next to the bin/ and src/ folders).

To swap the values at two indices in a primitive array:

SortUtils.swap( the-array-name, i, j)

To compare two values:

if ( SortUtils.compare(xxx, yyy) < 0 )       instead of    if ( xxx < yyy )

if ( SortUtils.compare(xxx, yyy) > 0 )       instead of    if ( xxx > yyy )

if ( SortUtils.compare(xxx, yyy) == 0 )      instead of    if ( xxx == yyy )

Max Heap

Implement generic class MaxHeap with the methods given below. The data members are:

MaxHeap(E[] theItems)

Creates a max heap from the given items. The size is initially the same as the number of items in the array.

void pushDown(int i)

(private) Pushes the item at the given index down the items array (down the heap) until it ends up in a place where it has no max-heap property violations with its children.

This method repairs the heap starting from the given index and is used in most of the methods that follow.

Zig-zag down the tree exchanging with the larger child. Stop when no max-heap property violation.

You may assume that the heap is not empty.
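The zig-zag described above can be sketched as follows. This is only an illustration under several assumptions that the assignment does not state: the heap is stored 0-based with the children of index i at 2i+1 and 2i+2, the array and size are passed as parameters rather than kept as data members, and compareTo plus a manual exchange stand in for SortUtils.compare and SortUtils.swap. The class name PushDownSketch is hypothetical.

```java
class PushDownSketch {
    // Repairs the subtree rooted at index i within heap[0..size).
    // Assumes a 0-based layout: children of i are at 2i+1 and 2i+2.
    static <E extends Comparable<E>> void pushDown(E[] heap, int size, int i) {
        while (2 * i + 1 < size) {               // i has at least a left child
            int child = 2 * i + 1;               // start with the left child
            int right = child + 1;
            if (right < size && heap[right].compareTo(heap[child]) > 0) {
                child = right;                   // the right child is the larger one
            }
            if (heap[i].compareTo(heap[child]) >= 0) {
                break;                           // no max-heap violation: stop
            }
            E tmp = heap[i];                     // exchange with the larger child
            heap[i] = heap[child];
            heap[child] = tmp;
            i = child;                           // continue one level down
        }
    }
}
```

Exchanging with the larger child matters: after the exchange the new parent is at least as large as both children, so only the subtree that received the smaller item can still need repair.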

void makeHeap()

(private) Rearranges the data member items so that it represents a max heap.

Repair the heap starting at the farthest non-leaf and moving up toward the root.

Initially start at the very last item, since this is easier. Then think of the formula that can compute the location of the farthest non-leaf from the heap size.

No if-statements. It should be possible to create an empty heap without any extra code.

E pop()

Returns (and removes) the max value in the heap.

The max is always at the top of the heap. Replace the max with the last item and repair the heap from the top.

Note: Since we will be comparing the execution time of sorting algorithms, you may assume that the heap is not empty to avoid the constant check. (Normally, this would have to be handled.)

isLeaf(int i)

(private) Determines if the item at the given index is a leaf.

You may assume the heap is not empty and the index is valid.

Do as little work as possible to determine the answer.

left(int i), right(int i)

(private) Returns the index of the left (respectively right) child of the item at the given index.

toString()

Returns a string representation of this heap (up to its size). This method is given:

public String toString()
{
    return SortUtils.toString(items, size);
}

Heap Sort

Create class HeapSort that implements the HeapSort algorithm.

void sort(E[] items)

The method sorts the given items using the Heap Sort algorithm discussed in class:
  1. create a max heap out of the items
  2. pop the max element from the heap and store it at the end of the given array; each time we pop from the heap a new spot opens up
  3. repeat as many times as needed to put each item in its correct spot; avoid unnecessary work
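The three steps can be condensed into the sketch below. It is a simplified illustration, not the required design: it works on int[] with plain comparisons instead of E[] with SortUtils, it inlines the heap operations rather than using the MaxHeap class, and it assumes a 0-based heap layout (children of i at 2i+1 and 2i+2). The class name HeapSortSketch is hypothetical.

```java
class HeapSortSketch {
    // Sifts a[i] down within a[0..size), exchanging with the larger child.
    static void pushDown(int[] a, int size, int i) {
        while (2 * i + 1 < size) {
            int child = 2 * i + 1;
            if (child + 1 < size && a[child + 1] > a[child]) child++;
            if (a[i] >= a[child]) break;
            int tmp = a[i]; a[i] = a[child]; a[child] = tmp;
            i = child;
        }
    }

    static void sort(int[] a) {
        // Step 1: build a max heap, from the farthest non-leaf up to the root.
        for (int i = a.length / 2 - 1; i >= 0; i--) pushDown(a, a.length, i);
        // Steps 2-3: "pop" the max into the spot that opens up at the end.
        // The loop stops at size 1: the last remaining item is already in place.
        for (int size = a.length; size > 1; size--) {
            int max = a[0];
            a[0] = a[size - 1];       // the last item replaces the max
            a[size - 1] = max;        // the max lands in the newly freed spot
            pushDown(a, size - 1, 0); // repair the smaller heap from the top
        }
    }
}
```

Because the heap shrinks by one on every pop, the freed cell at the end of the array is exactly where the popped maximum belongs, so no second array is needed.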

Quick Sort

Create class QuickSort that implements the QuickSort algorithm.

int partition(E[] items, int i, int j)

Partitions the items in the given range [i,j] (inclusive) around a pivot. The method returns the index where the pivot element ends up after the partition.

Initially pick the middle element in the given range to serve as the pivot, so that it is easy to test/debug the code.

For the final comparison, use method median3 from SortUtils. This method places the median of the first, middle, and last items at the middle index.

Briefly, here is how the partition algorithm works:

  1. pick a pivot element and swap it with the last item in the range
  2. move forward index i until it finds an item bigger than the pivot
  3. move backward index j until it finds an item smaller than the pivot
  4. exchange the values at i, j
  5. repeat steps 2-4 until i,j meet
  6. exchange the pivot and the item at index i; this puts the pivot in its proper spot

Note: You may assume that the range is not empty. This will have been ensured elsewhere.
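One possible reading of steps 1-6 is sketched below. It is an illustration only: it uses int[] with plain comparisons instead of E[] with SortUtils, it always picks the middle element as the pivot (no median3), and the names PartitionSketch, lo, and hi are hypothetical. Stopping each scan on items equal to the pivot is a choice made here, not something the steps prescribe.

```java
class PartitionSketch {
    static void swap(int[] a, int x, int y) {
        int t = a[x]; a[x] = a[y]; a[y] = t;
    }

    // Partitions a[lo..hi] (inclusive, non-empty) and returns the pivot's final index.
    static int partition(int[] a, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;
        swap(a, mid, hi);                        // step 1: park the pivot at the end
        int pivot = a[hi];
        int i = lo, j = hi - 1;
        while (true) {
            while (i <= j && a[i] < pivot) i++;  // step 2: stop at an item >= pivot
            while (j >= i && a[j] > pivot) j--;  // step 3: stop at an item <= pivot
            if (i >= j) break;                   // step 5: the indices have met
            swap(a, i, j);                       // step 4: exchange the two items
            i++;
            j--;
        }
        swap(a, i, hi);                          // step 6: pivot goes to its proper spot
        return i;
    }
}
```

For example, partitioning {3, 8, 5, 1, 9} over the whole range picks 5 as the pivot and returns index 2, with 3 and 1 on its left and 8 and 9 on its right.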

void sort(E[] items, int i, int j)

Sorts the items in the given range [i,j] (inclusive) using the QuickSort algorithm.

Briefly, here is how QuickSort works:

  1. partition the given range
  2. sort each "half" of the range (i.e. each side of the pivot)

Note that the partition algorithm will have put the pivot in its correct spot.

void sort(E[] items, ...) is followed by the single-argument version:

void sort(E[] items)

Sorts the given items using the QuickSort algorithm. This method simply calls the previous one.
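Putting the pieces together, the recursion might look like the sketch below. As before, this is an assumption-laden illustration on int[] with a middle-element pivot and plain comparisons rather than SortUtils; the class name QuickSortSketch is hypothetical. The guard at the top of the range version is one place where the "range is not empty" assumption of partition can be ensured.

```java
class QuickSortSketch {
    static void swap(int[] a, int x, int y) {
        int t = a[x]; a[x] = a[y]; a[y] = t;
    }

    // Same scheme as the six partition steps: pivot parked at the end,
    // two indices moving toward each other, pivot swapped into place last.
    static int partition(int[] a, int lo, int hi) {
        swap(a, lo + (hi - lo) / 2, hi);
        int pivot = a[hi];
        int i = lo, j = hi - 1;
        while (true) {
            while (i <= j && a[i] < pivot) i++;
            while (j >= i && a[j] > pivot) j--;
            if (i >= j) break;
            swap(a, i++, j--);
        }
        swap(a, i, hi);
        return i;
    }

    static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) return;          // 0 or 1 items: nothing to sort
        int p = partition(a, lo, hi);  // the pivot is now in its final spot
        sort(a, lo, p - 1);            // left side of the pivot
        sort(a, p + 1, hi);            // right side of the pivot
    }

    static void sort(int[] a) {
        sort(a, 0, a.length - 1);      // the whole array is the initial range
    }
}
```

Neither recursive call includes index p, which is why the two "halves" never overlap and the recursion always shrinks.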

Merge Sort

Create class MergeSort that implements the MergeSort algorithm.

E[] merge(E[] leftSide, E[] rightSide)

Merges the two primitive arrays into a final sorted primitive array. The given sides are assumed to have already been sorted.

Briefly, here is how the merge algorithm works:

  1. create the result primitive array with enough space to hold both sides
  2. keep two indices starting at the beginning of each side (and an index into the result)
  3. transfer to the result the smaller element from the two sides and advance the corresponding index
  4. at the end transfer the left over elements from one of the sides
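The four steps can be sketched as follows. This is an illustration under assumptions: java.util.Arrays.copyOf is used here as a stand-in for the array-creation helpers in SortUtils (whose signatures are not shown in this handout), compareTo stands in for SortUtils.compare, and the class name MergeSketch is hypothetical.

```java
import java.util.Arrays;

class MergeSketch {
    static <E extends Comparable<E>> E[] merge(E[] left, E[] right) {
        // Step 1: copyOf yields an E[] of the right runtime type and length.
        // Its first cells hold copies of left's items, all overwritten below.
        E[] result = Arrays.copyOf(left, left.length + right.length);
        int l = 0, r = 0, k = 0;       // step 2: one index per side, one for the result
        while (l < left.length && r < right.length) {
            // Step 3: take the smaller front item; <= prefers the left side on ties.
            if (left[l].compareTo(right[r]) <= 0) {
                result[k++] = left[l++];
            } else {
                result[k++] = right[r++];
            }
        }
        // Step 4: at most one of these loops runs, copying the leftovers.
        while (l < left.length) result[k++] = left[l++];
        while (r < right.length) result[k++] = right[r++];
        return result;
    }
}
```

Preferring the left side when two items compare equal keeps equal items in their original relative order, which makes the merge stable.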
Note: As part of the sorting algorithm you will need to create arrays of certain sizes. See methods copyCell and arrayAs in SortUtils.

E[] sort(E[] items, int i, int j)

Returns a sorted array of the given range [i,j] (the given array is not modified) using the MergeSort algorithm.

Briefly, here is how MergeSort works:

  1. find the middle index and sort each half
  2. merge the results from the previous step

Note: As part of the sorting algorithm you will need to create arrays of certain sizes. See methods copyCell and arrayAs in SortUtils.

Note: You may assume that the range is not empty, but put as a comment what you would have written if this assumption were not allowed.
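The recursive structure can be sketched as below. To keep the example short it works on int[] with plain comparisons, so it sidesteps the generic-array creation that the real methods need; the class name MergeSortSketch is hypothetical, and the commented-out line shows one possible way an empty range could be handled.

```java
class MergeSortSketch {
    // Merges two already sorted arrays into a new sorted array.
    static int[] merge(int[] left, int[] right) {
        int[] result = new int[left.length + right.length];
        int l = 0, r = 0, k = 0;
        while (l < left.length && r < right.length) {
            result[k++] = (left[l] <= right[r]) ? left[l++] : right[r++];
        }
        while (l < left.length) result[k++] = left[l++];
        while (r < right.length) result[k++] = right[r++];
        return result;
    }

    // Returns a new sorted array holding items[i..j]; items is not modified.
    static int[] sort(int[] items, int i, int j) {
        // If empty ranges were possible: if (i > j) return new int[0];
        if (i == j) return new int[] { items[i] };   // one item is already sorted
        int mid = i + (j - i) / 2;                   // i <= mid < j, so both halves are non-empty
        return merge(sort(items, i, mid), sort(items, mid + 1, j));
    }

    static int[] sort(int[] items) {
        return sort(items, 0, items.length - 1);
    }
}
```

Each level of the recursion returns freshly allocated arrays, which is why the caller's array is left untouched.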

E[] sort(E[] items)

Returns a sorted array of the given items (the given array is not modified) using the MergeSort algorithm. This method simply calls the previous one.

Note: You may assume that the array is not empty.

JUnit Tests

Create class SortAlgosTest that shows evidence of thorough testing. Here is an example:

numbers = load( ...the numbers... );
numbers = MergeSort.sort(numbers);
assertEquals("[...the expected result...]", SortUtils.toString(numbers));

In particular:

Algorithm Comparison

This portion may be submitted on Friday (there will be a separate dropbox). Create class CompareSortAlgos with the following methods:

Summary of Results

Make sure to add the -Xint flag under "VM Arguments":

1) Right-click on project name; Go to "Run As" -> "Run Configurations"
2) [Left  Panel] Click on "Java Applications" -> "SortAlgos"
3) [Right Panel] Click tab "Arguments" and in box "VM Arguments" write -Xint


Do this part when all sorting algorithms work correctly and you are ready to collect the time measurements that will be plotted in Excel.

Generate three tables (one for each configuration) for testing arrays of sizes 100,000--1,000,000 (inclusive) at size increments of 100,000.
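A timing loop for producing these tables might look like the sketch below. Everything in it is an assumption for illustration: the class name TimingSketch, the helpers makeInput and timeMillis, the use of int[] instead of E[], and Arrays.sort as a placeholder where one of your own sorting methods would go. It prints tab-separated rows that can be pasted into a sheet.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.Consumer;

class TimingSketch {
    // Builds an input of n ints in the requested configuration.
    static int[] makeInput(String config, int n, Random rng) {
        int[] a = new int[n];
        for (int k = 0; k < n; k++) {
            switch (config) {
                case "Sorted":  a[k] = k; break;
                case "Reverse": a[k] = n - 1 - k; break;
                default:        a[k] = rng.nextInt(n); break; // "Random"
            }
        }
        return a;
    }

    // Times one run of the sorter on a private copy; returns milliseconds.
    static double timeMillis(Consumer<int[]> sorter, int[] input) {
        int[] copy = input.clone();           // keep the original input intact
        long start = System.nanoTime();
        sorter.accept(copy);
        return (System.nanoTime() - start) / 1e6;
    }

    public static void main(String[] args) {
        Consumer<int[]> sorter = a -> Arrays.sort(a); // placeholder for your own sort
        Random rng = new Random(216);                 // fixed seed: repeatable inputs
        for (String config : new String[] {"Sorted", "Reverse", "Random"}) {
            System.out.println(config);
            System.out.println("size\tmillis");
            for (int n = 100_000; n <= 1_000_000; n += 100_000) {
                System.out.println(n + "\t" + timeMillis(sorter, makeInput(config, n, rng)));
            }
        }
    }
}
```

Building the input outside the timed region keeps array generation out of the measurement, so the numbers reflect the sort alone.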

Using Excel, create a single document with three sheets, one for each input configuration: sorted, reverse sorted, and random order input array.

  1. Save the document in a single file called CompareSortAlgos.xlsx with three sheets inside called Sorted, Reverse, and Random.

  2. Each plot should show the relationship between input size (x-axis) and run-time (y-axis) for the three sorting algorithms on the respective input configuration. A good step-by-step explanation of how to draw an XY-Scatter Chart in Excel is available here.

  3. Fit a curve/trendline through the data for each algorithm for each plot. Use your judgment as to what curve is the best fit. Excel will allow you to fit a straight line, any polynomial curve (i.e. n^k), exponential curve, etc. Our analysis from class suggested a polynomial trendline of either degree 1 or degree 2.

  4. Show the equation for each curve/trendline. A good step-by-step explanation of how to fit a trendline in Excel through scatter data is available here.

  5. In the Excel file provide a brief explanation of your findings. State which algorithm you expected to be faster, briefly explain why, and state whether the data supports your expectations.