CMU 15-112: Fundamentals of Programming and Computer Science
Class Notes: P vs. NP

  1. Complexity Classes
  2. Solving vs. Verifying
  3. P vs. NP
  4. NP-Completeness
  5. Solving NP-Complete Problems Efficiently
  6. Consequences

  1. Complexity Classes
    • In the Efficiency unit we showed that algorithms fall into different function families. Complexity families are the same idea, but generalized even more.
    • Specifically, we care about two main complexity classes: exponential time and polynomial time. Generally we want to write algorithms that can run in polynomial time. We call these functions 'tractable'.
    • We'll say that a function is in P if it runs in polynomial (O(n**k) for some integer k) time.

  2. Solving vs. Verifying
    • Usually, when we are working with a problem, we want to solve the problem by writing an algorithm that finds a solution.
    • Sometimes, we also care about verifying solutions to a problem. Given a possible solution, how much time does it take to check whether it works?
    • We'll say that a function is in NP if it can be verified in polynomial time, regardless of how long it takes to solve.
    • Example: subsetSum. Given a list of n elements, is there a subset of the list where the elements sum to 0?

      # Obvious solution: findSolution has to generate every possible subset to check each one's sum. # Therefore, it runs in exponential time. Might there be a better way? def findSolution(lst, sumLst = None): if sumLst == None: sumLst = [] if len(sumLst) > 0 and sum(sumLst) == 0: return sumLst elif len(lst) == 0: return None else: result = findSolution(lst[1:], sumLst + [lst[0]]) if result == None: result = findSolution(lst[1:], sumLst) return result # checkSolution only needs to check if the sum is 0. This runs in polynomial time, # so subsetSum is in NP. def checkSolution(solution): return sum(solution) == 0

  3. P vs. NP
    • A lot of the time, we know that a problem is in NP (can be verified in polynomial time), but the obvious solution runs in exponential time.
    • Big Idea: maybe we can prove that, if we can verify in polynomial time, we can also solve in polynomial time. In other words, maybe P = NP. If there exists at least one problem where we can verify in polynomial time and we can prove that it cannot be solved in polynomial time, P != NP (though P will still be a subset of NP in this case).
    • P vs. NP is still unsolved, and is in fact one of the seven Millennium Prize Problems.

  4. NP-Completeness
    • A great number of useful problems are in NP. These include:
      • Solving Sudoku
      • Travelling Salesman Problem
      • Graph Coloring
      • Knapsack Problem
      • Boolean satisfiability problem
      • Scheduling
      • ...
    • For many of these problems, it's possible to transform (or reduce) problem A into problem B in polynomial time. Some problems can be reduced from all other NP problems in polynomial time; these are called NP-Complete.
    • If we can find a polynomial-time solution to any NP-Complete problem, we can prove that P=NP, since that polynomial-time solution could be combined with the polynomial-time reduction to achieve still-polynomial solutions to all other NP problems.
  5. Solving NP-Complete Problems Efficiently
    • We don't have efficient solutions to NP-complete problems. In many cases, however, we can use heuristics to estimate a solution that is good enough.
    • Consider the problem of scheduling final exams:
      • Given a set of courses, students, rooms, and student course schedules, find the shortest possible final exam schedule that guarantees no student has two exams at the same time.
      • This problem is NP-complete, so finding the optimal is not currently feasible.
      • However, we can solve for solutions that are close to optimal. This means that while we might not be able to find the shortest possible exam schedule, we might find one that is close to the shortest.
      • In practice, finding a close solution is often good enough. If this interests you, you might consider reading about constraint satisfaction problems.
    • The ability to identify that a problem you are solving is not efficiently solvable is an important skill for a programmer and can save you lots of wasted time.
  6. Consequences
    • If P=NP, answering questions that are currently considered difficult could suddenly become much easier. Scheduling and other logistical questions could be almost entirely automated, and most mathematical work could be done entirely with computers. This would also impact cryptography, as certain types of encryption would be breakable in much more efficient time.
    • If P!=NP, life would continue on as normal. However, we will also have gained a way to formally demonstrate that certain problems cannot be solved efficiently, which could reduce the number of open questions researchers investigate.