NUIMCrest CS403/SE307/CS355 - Computation and Complexity
Department of Computer Science
National University of Ireland, Maynooth

T Naughton, NUIM
Back to CS403/SE307/CS355 home


Lab 6 - Graph algorithms and polynomial verifiability

First, make sure you are assigned to a team by a demonstrator and go to the space allocated for your team. The object of this lab is to work in teams to answer the problems below. Each team will hand up its solutions at the end of the lab. You may look at last week's lab sheet to remind you of the definition for a graph, directed graph, path, etc.

Recall the PATH problem, defined in last week's lab sheet as the language PATH = {< G, s, t > : G is a directed graph that has a path from s to t}. The algorithm given in lectures to decide this language was as follows.

MP = "On input < G, s, t > :
         1. Create two sets S and S'. Add s to S.
         2. For each edge pair (x, y) in G, if x is in S and y is not in S then add y to S'.
         3. If S' is not empty, assign S = S union S', assign S' = emptyset and go to stage 2.
         4. Check if t is in S. If so, accept; otherwise reject."

Algorithm MP is analysed formally below, but before that, here's an explanation of the analysis. First, we choose to measure time complexity in terms of number of nodes in G. This is because the number of nodes in a graph is the best simple measure of the size of a graph (and is therefore an appropriate measure of the size of the input to MP). Next, for simplicity, we assume that each atomic operation such as creating a set, adding an element to a set, emptying a set, copying an element, and comparing two elements takes a single timestep. Finally, and again for simplicity, we assume that data structures are not sorted so that a search of a set containing at most z elements will take z timesteps. Stage 1 performs three atomic operations and is executed at most once, so we say that stage 1 takes three timesteps. Stage 2 involves a loop through all edge pairs, of which there will be at most m(m - 1) where m is the number of nodes in G. At each iteration of the implicit loop within stage 2, two searches of set S (where S contains at most m elements) will be undertaken as well as possibly adding one more element to S. Stage 3 requires at most one comparison, m copy operations, and one assignment. Stages 2 and 3 will each be exectuted at most m times; the case where no more than one node is added per iteration. Stage 4 involves a search of at most m comparisons, and is executed at most once. The following is a complete proof that PATH is in P.


Proof that PATH is in P. Consider the algorithm MP given above to decide PATH. MP requires the following number of timesteps for each of its stages, where m is the number of nodes in G:
         1. 1 times 3 = O(1)
         2. m times (m(m - 1))(2m + 1) = O(m4)
         3. m times (m + 2) = O(m2)
         4. 1 times m = O(m)
Summing 1 through 4 gives O(m4) which is a polynomial in m. Therefore PATH is in P.

Of course, O(m4) timesteps is pretty inefficient for a solution to the PATH problem, and you could probably come up with a slightly more complicated algorithm that takes only O(m2) timesteps, but the point is, as long as we find one algorithm, any algorithm, that decides PATH in polynomial time, then our proof is complete.


The CONNECTED problem is defined by the following language CONNECTED = {< G > : G is a graph and G is connected}. Graphs are undirected by default, so in this problem each G is undirected. A graph G is said to be connected if for every pair of different nodes x and y in G there is a path from x to y. Nodes x and y do not have to be joined by a single edge for G to be connected. The following algorithm MC decides CONNECTED.

MC = "On input < G > :
         1. Create two sets S and S'. Add any one node in G to S.
         2. For each edge pair (x, y) in G do both of the following. If x is in S and y is not in S then add y to S'. If y is in S and x is not in S then add x to S'.
         3. If S' is not empty, assign S = S union S', assign S' = emptyset and go to stage 2.
         4. Check if ____________________________. If so, accept; otherwise reject."


Problem 1. Fill in the blank in stage 4 of MC, so that MC decides CONNECTED.

Problem 2. Using MC, and in the same way that we proved that PATH was in P, complete the proof below that CONNECTED is in P. You will be forgiven for missing a +1 or -1 here and there, but you must have the correct power of m for each stage. (Hint: remember this is an undirected graph!)

Proof that CONNECTED is in P. Consider the algorithm MC given above to decide CONNECTED. MC requires the following number of timesteps for each of its stages, where m is the number of nodes in G:
         1. _________________________________
         2. _________________________________
         3. _________________________________
         4. _________________________________
Summing 1 through 4 gives __________________ which is a polynomial in m. Therefore CONNECTED is in P.



Polynomial verifiability

The class P contains all languages that can be decided quickly. As you have seen from the previous lecture, and from above, the way to prove that a language is in P is to give a polynomial algorithm to decide it. Because of the relationship between languages and problems, we can say that the class P contains all problems that can be solved quickly.

Another very important class of problems are problems whose solutions can be verified quickly. In this scenario, we are given a problem and a potential solution and asked if one can verify quickly that it is in fact a correct solution to the problem. In terms of languages, we would be given a language L, and a word w, and a certificate c, and asked to verify quickly that w in L. This certificate is like a little hint. For example, the language COMPOSITES is defined COMPOSITES = {x : x in N is the product of two primes}. Given a particular x that we are told is in COMPOSITES, we can verify this fact very quickly if we are given, for example, one of its prime factors as the certificate c.

As another example, let ARRAYELEMENT be a search problem defined as the following language ARRAYELEMENT = {< A, a > : a is an integer, A is an array of integers, and a is in A}. Given a particular < A, a > we can verify very quickly that < A, a > in ARRAYELEMENT if we are also given as the certificate c the particular index in array A at which a appears. To prove that a language L can be verified quickly, we must pick an appropriate certificate c and devise a polynomial algorithm that uses c to verify L. Fortunately, it is often extremely easy to choose an appropriate c that will allow one to verify a language, as the following examples show.


Proof that PATH can be verified in polynomial time. The certificate c that we would like to accompany every < G, s, t > we are asked to verify is in PATH, is the path itself from s to t. (Remember from the previous lab, a path is a tuple of nodes.) The following algorithm MPV uses c to verify PATH.

MPV = "On input < G, s, t, c > :
         1. Check that the first node in path c is s and the last node in c is t.
         2. Check that for every i in the range [0, |c|-2] each pair of nodes (ci, ci+1) in path c is an edge in G.
         3. If both tests pass, accept; otherwise reject."

MPV requires the following number of timesteps for each of its stages, where m is the number of nodes in G:
         1. _________________________________
         2. _________________________________
         3. _________________________________
Summing 1 through 3 gives __________________ which is a polynomial in m. Therefore PATH can be verified in polynomial time.

Problem 3. Fill in the blanks in the analysis of MPV above.

The Hamiltonian path problem, defined as HAMPATH = {< G, s, t > : G is a directed graph that has a path from s to t that goes through every node in G exactly once}.


Proof that HAMPATH can be verified in polynomial time. The certificate c is a path from s to t that goes through every node in G exactly once. The following algorithm MHV uses c to verify HAMPATH.

MHV = "On input < G, s, t, c > :
         1. Check that every node in c is a node in G.
         2. Check that every node in G is a node in c.
         3. Check that s is the first node in c and that t is the last node in c.
         4. Check that for every i in the range [0, |c|-2] each pair of nodes (ci, ci+1) in path c is an edge in G.
         5. If all tests pass, accept; otherwise reject."

MHV requires the following number of timesteps for each of its stages, where m is the number of nodes in G:
         1. _________________________________
         2. _________________________________
         3. _________________________________
         4. _________________________________
         5. _________________________________
Summing 1 through 5 gives __________________ which is a polynomial in m. Therefore HAMPATH can be verified in polynomial time.

Problem 4. Fill in the blanks in the analysis of MHV above.

This is effectively the end of the lab sheet, but if you're one of the top people in the class, or you think you could be, or even if you'd just like to be, read on to see where we're going with this lab sheet.


In this lab, you have proved that some problems (languages) can be solved (decided) quickly, and some can be verified quickly. With a little bit of thought, you will realise that every problem that can be solved quickly can also be verified quickly (you could just ignore c solve the problem from scratch!). But can any problem that can be verified quickly also be solved efficiently? We don't know. For example, we have shown above how to verify HAMPATH quickly. However, no programmer (computer scientist or software engineer) has been clever enough to find a polynomial algorithm to solve HAMPATH. Furthermore, no theorist (computer scientist or mathematician) has been clever enough to prove that HAMPATH is fundamentally a difficult problem that can't be solved in polynomial time.

The question of whether problems that can be verified quickly can also be solved quickly is one of the most important unsolved problems in computer science, and one of the top seven most important mathematical problems of this millenium, according to the Clay Mathematics Institute, who'll give you a cool $1,000,000 if you solve it. Also if you solve it (but probably depending on whether your answer is yes or no) IBM, Intel, Microsoft, Lucent, Apple, Sun, and a few others that haven't even been formed yet, will need to band together to bankroll your salary. More in the next lecture.


End of sheet