CMU 15-112: Fundamentals of Programming and Computer Science
Class Notes: Efficiency: Sorting


  1. Bubble Sort
  2. Selection Sort
  3. Merge Sort
  4. Sorting Efficiency
  5. Selection Sort vs. Merge Sort

  1. Bubble Sort
    def swap(a, i, j):
        (a[i], a[j]) = (a[j], a[i])

    def bubbleSort(a):
        n = len(a)
        end = n
        swapped = True
        while (swapped):
            swapped = False
            for i in range(1, end):
                if (a[i-1] > a[i]):
                    swap(a, i-1, i)
                    swapped = True
            end -= 1
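One way to see how the swapped flag lets bubbleSort finish early is to count passes. The instrumented variant below is a sketch (the pass counter is our addition, not part of the notes' code):

```python
def bubbleSortCountingPasses(a):
    # same algorithm as bubbleSort, instrumented to count passes
    end = len(a)
    swapped = True
    passes = 0
    while swapped:
        swapped = False
        passes += 1
        for i in range(1, end):
            if a[i-1] > a[i]:
                (a[i-1], a[i]) = (a[i], a[i-1])
                swapped = True
        end -= 1
    return passes

print(bubbleSortCountingPasses([3, 1, 4, 1, 5]))  # 3
print(bubbleSortCountingPasses([1, 2, 3, 4, 5]))  # 1 (already sorted, so it stops)
```

On an already-sorted list, the first pass makes no swaps, so the loop exits after a single pass.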

  2. Selection Sort
    def swap(a, i, j):
        (a[i], a[j]) = (a[j], a[i])

    def selectionSort(a):
        n = len(a)
        for startIndex in range(n):
            minIndex = startIndex
            for i in range(startIndex+1, n):
                if (a[i] < a[minIndex]):
                    minIndex = i
            swap(a, startIndex, minIndex)
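Unlike bubble sort, selection sort makes the same passes regardless of input order; each pass locks the next-smallest value into place. A sketch that prints the list after each pass (the tracing is our addition):

```python
def selectionSortWithTrace(a):
    # selectionSort as above, printing the list after each pass
    n = len(a)
    for startIndex in range(n):
        minIndex = startIndex
        for i in range(startIndex+1, n):
            if a[i] < a[minIndex]:
                minIndex = i
        (a[startIndex], a[minIndex]) = (a[minIndex], a[startIndex])
        print(a)

selectionSortWithTrace([29, 10, 14, 37, 13])
# [10, 29, 14, 37, 13]   <- 10 locked in
# [10, 13, 14, 37, 29]   <- 13 locked in
# ...
```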

  3. Merge Sort
    def swap(a, i, j):
        (a[i], a[j]) = (a[j], a[i])

    def merge(a, start1, start2, end):
        index1 = start1
        index2 = start2
        length = end - start1
        aux = [None] * length
        for i in range(length):
            if ((index1 == start2) or
                ((index2 != end) and (a[index1] > a[index2]))):
                aux[i] = a[index2]
                index2 += 1
            else:
                aux[i] = a[index1]
                index1 += 1
        for i in range(start1, end):
            a[i] = aux[i - start1]

    def mergeSort(a):
        n = len(a)
        step = 1
        while (step < n):
            for start1 in range(0, n, 2*step):
                start2 = min(start1 + step, n)
                end = min(start1 + 2*step, n)
                merge(a, start1, start2, end)
            step *= 2
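The merge helper combines two adjacent sorted runs into one sorted run. A small self-contained sketch of a single merge step (the example list is ours):

```python
def merge(a, start1, start2, end):
    # same merge as above: combines sorted runs a[start1:start2] and a[start2:end]
    index1 = start1
    index2 = start2
    length = end - start1
    aux = [None] * length
    for i in range(length):
        if ((index1 == start2) or
            ((index2 != end) and (a[index1] > a[index2]))):
            aux[i] = a[index2]
            index2 += 1
        else:
            aux[i] = a[index1]
            index1 += 1
    for i in range(start1, end):
        a[i] = aux[i - start1]

a = [3, 7, 9, 2, 5, 8]   # two sorted runs: a[0:3] and a[3:6]
merge(a, 0, 3, 6)
print(a)  # [2, 3, 5, 7, 8, 9]
```

mergeSort repeatedly applies this step, first to runs of size 1, then 2, then 4, and so on, until the whole list is one sorted run.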

  4. Sorting Efficiency
    import time, random

    # Copy the sorts from above to here!

    def builtinSort(a):
        # builtinSort is not defined in the notes above; this minimal wrapper
        # around Python's built-in list.sort is assumed so the test runs
        a.sort()

    def testSort(sortFn, n):
        a = [random.randint(0, 2**31) for i in range(n)]
        sortedA = sorted(a)
        startTime = time.time()
        sortFn(a)
        endTime = time.time()
        elapsedTime = endTime - startTime
        assert(a == sortedA)
        print("%20s n=%d time=%6.3fs" % (sortFn.__name__, n, elapsedTime))

    def testSorts():
        n = 2**8 # use 2**8 for Brython, use 2**12 or larger for Python
        for sortFn in [selectionSort, bubbleSort, mergeSort, builtinSort]:
            testSort(sortFn, n)

    testSorts()

  5. Selection Sort vs. Merge Sort
    When determining the Big-O runtime of a more advanced algorithm, such as Selection Sort or Merge Sort, it isn't enough to look at built-in runtimes or loop nesting; we have to consider how much work the algorithm does overall. Stepping away from the code and considering the algorithm itself can help in this analysis.

    For Selection Sort, consider sorting a worst-case list of length N. How much work does each iteration of the loop do?
    • On the first pass, we need N-1 compares and 1 swap, for N total actions.
    • On the second pass, we need only N-2 compares (since one value is already sorted) plus one swap, for N-1 total actions.
    • On the third pass, we need only N-3 compares plus one swap, for N-2 total actions...
    • We can find the total number of steps by adding the steps across all passes. So the total is about 1 + 2 + ... + (N-1) + N = N(N+1)/2 = O(N^2).
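The N(N+1)/2 total can be checked by instrumenting selection sort to count every compare and swap (the counter is our addition; the algorithm is unchanged):

```python
def selectionSortActionCount(a):
    # counts compares + swaps, mirroring selectionSort above
    n = len(a)
    actions = 0
    for startIndex in range(n):
        minIndex = startIndex
        for i in range(startIndex+1, n):
            actions += 1        # one compare
            if a[i] < a[minIndex]:
                minIndex = i
        (a[startIndex], a[minIndex]) = (a[minIndex], a[startIndex])
        actions += 1            # one swap
    return actions

import random
n = 50
a = [random.randint(0, 100) for _ in range(n)]
print(selectionSortActionCount(a) == n*(n+1)//2)  # True
```

Note that the count is exactly N(N+1)/2 for every input, sorted or not: selection sort cannot exit early.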

    For Merge Sort, again we need to consider how much work each iteration does.
    • On each pass, we need about 3N compares and copies (N compares, N copies down, N copies back).
    • So total cost = (3N steps per pass) x (# of passes)
    • After pass 0, we have sorted lists of size 2^0 (1)
    • After pass 1, we have sorted lists of size 2^1 (2)
    • After pass 2, we have sorted lists of size 2^2 (4)
    • After pass k, we have sorted lists of size 2^k
    • So we need k passes, where N = 2^k
    • So # of passes = k = log2(N)
    • Recall that total cost = (3N steps per pass) x (# of passes)
    • So total cost = (3N)(log2(N)) = O(N log N).
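The pass count can be checked directly, since mergeSort's outer loop just doubles step until it reaches N. The helper below mirrors that loop (it is a sketch for counting only):

```python
import math

def mergeSortPassCount(n):
    # mirrors mergeSort's outer loop: step doubles until it reaches n
    passes = 0
    step = 1
    while step < n:
        passes += 1
        step *= 2
    return passes

for k in [3, 10, 16]:
    n = 2**k
    assert mergeSortPassCount(n) == k == int(math.log2(n))
print(mergeSortPassCount(2**10))  # 10
```

For list lengths that are not powers of 2, the loop still runs about log2(N) times (rounded up), so the O(N log N) bound is unchanged.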

    Important note: for general comparison-based sorting, O(N log N) is the best possible worst-case runtime. That's why Python's built-in sort takes O(N log N) time!