CMU 15-112: Fundamentals of Programming and Computer Science
Class Notes: Efficiency: Sorting


  1. Bubble Sort
  2. Selection Sort
  3. Merge Sort
  4. Sorting Efficiency
  5. Selection Sort vs. Merge Sort

  1. Bubble Sort
    def swap(a, i, j):
        (a[i], a[j]) = (a[j], a[i])

    def bubbleSort(a):
        n = len(a)
        end = n
        swapped = True
        while (swapped):
            swapped = False
            for i in range(1, end):
                if (a[i-1] > a[i]):
                    swap(a, i-1, i)
                    swapped = True
            end -= 1
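One way to see how the swapped flag lets bubbleSort finish early is to count passes. The instrumented variant below is a sketch (the pass counter is our addition, not part of the notes' code):

```python
def bubbleSortCountingPasses(a):
    # same algorithm as bubbleSort, instrumented to count passes
    end = len(a)
    swapped = True
    passes = 0
    while swapped:
        swapped = False
        passes += 1
        for i in range(1, end):
            if a[i-1] > a[i]:
                (a[i-1], a[i]) = (a[i], a[i-1])
                swapped = True
        end -= 1
    return passes

print(bubbleSortCountingPasses([3, 1, 4, 1, 5]))  # 3
print(bubbleSortCountingPasses([1, 2, 3, 4, 5]))  # 1 (already sorted, so it stops)
```

On an already-sorted list, the first pass makes no swaps, so the loop exits after a single pass.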

  2. Selection Sort
    def swap(a, i, j):
        (a[i], a[j]) = (a[j], a[i])

    def selectionSort(a):
        n = len(a)
        for startIndex in range(n):
            minIndex = startIndex
            for i in range(startIndex+1, n):
                if (a[i] < a[minIndex]):
                    minIndex = i
            swap(a, startIndex, minIndex)
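Unlike bubble sort, selection sort makes the same passes regardless of input order; each pass locks the next-smallest value into place. A sketch that prints the list after each pass (the tracing is our addition):

```python
def selectionSortWithTrace(a):
    # selectionSort as above, printing the list after each pass
    n = len(a)
    for startIndex in range(n):
        minIndex = startIndex
        for i in range(startIndex+1, n):
            if a[i] < a[minIndex]:
                minIndex = i
        (a[startIndex], a[minIndex]) = (a[minIndex], a[startIndex])
        print(a)

selectionSortWithTrace([29, 10, 14, 37, 13])
# [10, 29, 14, 37, 13]   <- 10 locked in
# [10, 13, 14, 37, 29]   <- 13 locked in
# ...
```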

  3. Merge Sort
    def swap(a, i, j):
        (a[i], a[j]) = (a[j], a[i])

    def merge(a, start1, start2, end):
        index1 = start1
        index2 = start2
        length = end - start1
        aux = [None] * length
        for i in range(length):
            if ((index1 == start2) or
                ((index2 != end) and (a[index1] > a[index2]))):
                aux[i] = a[index2]
                index2 += 1
            else:
                aux[i] = a[index1]
                index1 += 1
        for i in range(start1, end):
            a[i] = aux[i - start1]

    def mergeSort(a):
        n = len(a)
        step = 1
        while (step < n):
            for start1 in range(0, n, 2*step):
                start2 = min(start1 + step, n)
                end = min(start1 + 2*step, n)
                merge(a, start1, start2, end)
            step *= 2
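The merge helper combines two adjacent sorted runs into one sorted run. A small self-contained sketch of a single merge step (the example list is ours):

```python
def merge(a, start1, start2, end):
    # same merge as above: combines sorted runs a[start1:start2] and a[start2:end]
    index1 = start1
    index2 = start2
    length = end - start1
    aux = [None] * length
    for i in range(length):
        if ((index1 == start2) or
            ((index2 != end) and (a[index1] > a[index2]))):
            aux[i] = a[index2]
            index2 += 1
        else:
            aux[i] = a[index1]
            index1 += 1
    for i in range(start1, end):
        a[i] = aux[i - start1]

a = [3, 7, 9, 2, 5, 8]   # two sorted runs: a[0:3] and a[3:6]
merge(a, 0, 3, 6)
print(a)  # [2, 3, 5, 7, 8, 9]
```

mergeSort repeatedly applies this step, first to runs of size 1, then 2, then 4, and so on, until the whole list is one sorted run.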

  4. Sorting Efficiency
    import time, random

    # Copy the sorts from above to here!

    def builtinSort(a):
        # builtinSort is not defined in the notes above; this minimal wrapper
        # around Python's built-in list.sort is assumed so the test runs
        a.sort()

    def testSort(sortFn, n):
        a = [random.randint(0, 2**31) for i in range(n)]
        sortedA = sorted(a)
        startTime = time.time()
        sortFn(a)
        endTime = time.time()
        elapsedTime = endTime - startTime
        assert(a == sortedA)
        print("%20s n=%d time=%6.3fs" % (sortFn.__name__, n, elapsedTime))

    def testSorts():
        n = 2**8 # use 2**8 for Brython, use 2**12 or larger for Python
        for sortFn in [selectionSort, bubbleSort, mergeSort, builtinSort]:
            testSort(sortFn, n)

    testSorts()

  5. Selection Sort vs. Merge Sort
    When determining the Big-O runtime of a more advanced algorithm, such as Selection Sort or Merge Sort, it isn't enough to look at built-in runtimes or loop nesting; we have to consider how much work the algorithm does overall. Stepping away from the code and considering the algorithm itself can help in this analysis.

    For Selection Sort, consider sorting a worst-case list of length N. How much work does each iteration of the loop do?
    • On the first pass, we need N-1 compares and 1 swap, for N total actions.
    • On the second pass, we need only N-2 compares (since one value is already sorted) plus one swap, for N-1 total actions.
    • On the third pass, we need only N-3 compares plus one swap, for N-2 total actions...
    • We can find the total number of steps by adding the steps across all passes. So the total is about 1 + 2 + ... + (N-1) + N = N(N+1)/2 = O(N^2).
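The N(N+1)/2 total can be checked by instrumenting selection sort to count every compare and swap (the counter is our addition; the algorithm is unchanged):

```python
def selectionSortActionCount(a):
    # counts compares + swaps, mirroring selectionSort above
    n = len(a)
    actions = 0
    for startIndex in range(n):
        minIndex = startIndex
        for i in range(startIndex+1, n):
            actions += 1        # one compare
            if a[i] < a[minIndex]:
                minIndex = i
        (a[startIndex], a[minIndex]) = (a[minIndex], a[startIndex])
        actions += 1            # one swap
    return actions

import random
n = 50
a = [random.randint(0, 100) for _ in range(n)]
print(selectionSortActionCount(a) == n*(n+1)//2)  # True
```

Note that the count is exactly N(N+1)/2 for every input, sorted or not: selection sort cannot exit early.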

    For Merge Sort, again we need to consider how much work each iteration does.
    • On each pass, we need about 3N compares and copies (N compares, N copies down, N copies back).
    • So total cost = (3N steps per pass) x (# of passes)
    • After pass 0, we have sorted lists of size 2^0 (1)
    • After pass 1, we have sorted lists of size 2^1 (2)
    • After pass 2, we have sorted lists of size 2^2 (4)
    • After pass k, we have sorted lists of size 2^k
    • So we need k passes, where N = 2^k
    • So # of passes = k = log2(N)
    • Recall that total cost = (3N steps per pass) x (# of passes)
    • So total cost = (3N)(log2(N)) = O(N log N).
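The pass count can be checked directly, since mergeSort's outer loop just doubles step until it reaches N. The helper below mirrors that loop (it is a sketch for counting only):

```python
import math

def mergeSortPassCount(n):
    # mirrors mergeSort's outer loop: step doubles until it reaches n
    passes = 0
    step = 1
    while step < n:
        passes += 1
        step *= 2
    return passes

for k in [3, 10, 16]:
    n = 2**k
    assert mergeSortPassCount(n) == k == int(math.log2(n))
print(mergeSortPassCount(2**10))  # 10
```

For list lengths that are not powers of 2, the loop still runs about log2(N) times (rounded up), so the O(N log N) bound is unchanged.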

    Important note: for general comparison-based sorting, O(N log N) is the best possible worst-case runtime. That's why Python's built-in sort takes O(N log N) time!