Computability and Complexity
Comparing Algorithms
The final part of the course deals with the issue of assessing how difficult specific computational problems are to solve.
Given a description of a particular problem, a number of questions arise:
- Is it possible to design an algorithm which solves the problem?
- Is it possible to design an efficient algorithm which solves the problem?
- How can one tell if a given algorithm is the `best possible'?
These questions are imprecisely formulated:
- What exactly is meant by the term `problem'?
- What distinguishes algorithms which can be implemented from those which cannot?
- What is meant by terms such as `difficulty', `efficiency', and `best'?
COMPUTABILITY THEORY and COMPUTATIONAL COMPLEXITY THEORY are the fields of Computer Science concerned with the questions raised above.
Computability Theory is concerned with identifying one particular class of INTRACTABLE PROBLEMS: those for which no `effective' algorithm exists.
One aim of Computational Complexity Theory is to identify solvable problems that are intractable in the sense that NO EFFICIENT ALGORITHM EXISTS which solves them.
Both areas require a formal definition of the concepts:
- Problem
- Algorithm
Complexity Theory also proposes definitions for the ideas of:
- Problem Difficulty
- Algorithm Efficiency
- Best Algorithm
Computational Problems
At the lowest level of abstraction any computer program may be viewed as a means of producing Specific Output Values from Given Input Data,
i.e. programs compute functions whose domain is the set of valid inputs and whose range is the set of possible outputs.
In practice the input and output are (ultimately) represented in binary. Hence any program may be viewed as computing some function
f : {0,1}^* -> {0,1}^*
({0,1}^* is the set of all finite strings (or words) containing only 0s and 1s.)
In fact it is only necessary to consider functions whose range is {0,1}. These are known as
Decision Problems
Summary
- All computer programs may be viewed as computing functions.
- Since all computers employ binary notation, such functions are defined over sets of binary strings.
- In considering the question `what problems can be solved by computers?' it is sufficient to concentrate on decision problems.
Hence the question
What problems can be solved by computers?
is equivalent to
What decision problems can be solved?
Decision Problems
Any binary string can be viewed as the representation of some natural number. Thus for decision problems on binary strings we can concentrate on the set of functions of the form
f : N -> {0,1}
That is, `problems' of the form:
Input: n a natural number.
Output: 1 if n satisfies some property; 0 if n does not satisfy it.
Examples:
- EVEN: Returns 1 if n is an even number; 0 if n is an odd number.
- PRIME: Returns 1 if n is a prime number; 0 if n is a composite number.
- ADA_PROGRAM: Returns 1 if n written in binary corresponds to the ASCII representation of a syntactically correct ADA program; 0 otherwise.
Notice that the last example shows that we are not restricting attention to merely `arithmetic'
decision problems.
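As a concrete illustration, the first two examples can be written directly as total functions from N to {0,1}. The sketch below uses Python rather than ADA purely for brevity; trial division is just one simple way to realise PRIME, not a method prescribed by the course.

```python
# Decision problems as total functions f : N -> {0,1}.
# EVEN and PRIME as described in the text.
def EVEN(n: int) -> int:
    return 1 if n % 2 == 0 else 0

def PRIME(n: int) -> int:
    if n < 2:          # 0 and 1 are treated as non-prime
        return 0
    d = 2
    while d * d <= n:  # trial division: one simple (slow) realisation
        if n % d == 0:
            return 0
        d += 1
    return 1

print(EVEN(10), PRIME(13), PRIME(12))  # 1 1 0
```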
What is a `reasonable' algorithm?
In all programming systems:
- Programs have a finite number of instructions.
- Programs make use of instructions and operations from a finite set of basic operations.
- Programs should come to a halt for any valid input data.
An effective algorithm is one which meets these criteria.
Suppose we can show that an ADA program cannot be written which solves some problem. Why should one take this as evidence that NO PROGRAM (ALGORITHM) AT ALL exists for the problem?
In other words, how can it be argued that results on computability apply to all types of computer (past, present, and future)?
The answer is that such results are proved with respect to abstract models of computation that capture all the essential elements of a typical computer.
A `typical computer' has:
- Memory
- Input and Output Mechanisms
- A processor handling a finite instruction set.
In an abstract model we can permit an infinite supply of memory. The only constraint on an algorithm is that it uses a finite amount of memory when dealing with any single input (which, if the algorithm terminates, is all that could be accessed).
This is not an unrealistic assumption: we do not want to
classify a problem as
unsolvable just because on some real machines there
may not be enough memory available.
The first abstract model of computation was defined by Alan Turing in 1936. This had:
- An infinite memory of single-bit cells.
- A `processor' dealing with only two types of `instruction':
  - A stop instruction which would return a result (0 or 1).
  - A test instruction which would read and write a single bit of memory, examine an adjacent memory location, and jump to another instruction.
Although the instruction set of the Turing machine is extremely limited, Turing machine programs can be written to solve any decision problem that can be solved on a `normal' computer.
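To make the model concrete, here is a small simulator (Python, for brevity). It uses the conventional state/transition presentation of a Turing machine rather than Turing's two-instruction form described above, and the particular transition table is this sketch's own: the machine shown decides EVEN by inspecting the last bit of the binary input.

```python
# A minimal Turing-machine simulator: a state, a tape of symbols, and a
# transition table delta mapping (state, symbol) -> (state, write, move).
def run_tm(delta, state, tape, accept, reject, max_steps=10_000):
    tape = list(tape)
    pos = 0
    for _ in range(max_steps):
        if state == accept:
            return 1
        if state == reject:
            return 0
        sym = tape[pos] if 0 <= pos < len(tape) else '_'   # '_' = blank
        state, write, move = delta[(state, sym)]
        if 0 <= pos < len(tape):
            tape[pos] = write
        pos += 1 if move == 'R' else -1
    raise RuntimeError("machine did not halt within the step bound")

# Decide EVEN: scan right to the first blank, step back one cell,
# accept iff the last bit of the input is '0'.
delta = {
    ('scan', '0'): ('scan', '0', 'R'),
    ('scan', '1'): ('scan', '1', 'R'),
    ('scan', '_'): ('back', '_', 'L'),
    ('back', '0'): ('acc', '0', 'L'),
    ('back', '1'): ('rej', '1', 'L'),
}
print(run_tm(delta, 'scan', '10110', 'acc', 'rej'))  # 22 is even -> 1
```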
The Church-Turing Hypothesis
The class of decision problems that can be solved by any reasonable model of computation is exactly the same as the class of decision problems that can be solved by Turing machine programs.
Since 1936 several thousand different models of computation have been proposed, e.g.
- Post Systems
- Markov Algorithms
- lambda-calculus
- Gödel-Herbrand-Kleene Equational Calculus
- Horn Clause Logic
- Unlimited Register Machines
- ADA programs on unlimited memory machines
Every single one of these satisfies the Church-Turing hypothesis, i.e. anything that can be solved by a Turing machine program can be programmed in any one of the systems above, and vice-versa.
The importance of the Church-Turing hypothesis is that it allows us, without any
loss of generality, to consider computability results solely with respect to
ADA programs on unlimited memory computers
Suppose P is some decision problem. The Church-Turing hypothesis asserts:
There is no Turing machine program that computes P
IF AND ONLY IF
There is no ADA program which solves P
This is because we can (in principle) write a Turing machine program which exactly simulates the behaviour of any given ADA program and, similarly, an ADA program which exactly simulates the behaviour of any given Turing machine program.
The discussion above has, casually, assumed that there are decision problems for which algorithms do not exist. Why should we believe that there are actually any such problems? (Nothing above has given any evidence for the assumption made.)
We can prove that there exist decision problems for which no algorithm exists by employing a method of argument called
Diagonalisation
This can be used to show that there are `more' different decision problems than there are different ADA programs. If there are `more' decision problems than programs then it must be the case that some decision problems cannot be solved by programs, since a single program can solve only a single decision problem.
Suppose we are given two sets of items, A and B say, e.g.
A = Set of all students registered for the course 2CS46
B = Set of all students registered for the course 2CS42
How can we tell if there are more items in A than there are in B? By `counting' the number of items in each set.
How many different decision problems are there?
A decision problem, f, is specified by saying, for each number n, what the value of f(n) is; f(n) can be either 0 or 1. So each decision problem corresponds to an infinite sequence of 0s and 1s, and different sequences give different decision problems.
Thus there are an infinite number of distinct decision problems.
How many different ADA programs are there which take a single number, n, as input and return a 0 or 1 as their result?
Infinitely many, since there are clearly infinitely many such programs which contain a loop of the form
for i in 1..m loop
  s := 0;
end loop;
(a different program for each value of m).
`Counting' is all right when the sets being compared are finite. `Counting' does not appear to tell us anything when both sets are infinite.
Q: What do we actually do when we `count' the number of items in a set?
A: To each item we attach a unique `label': 1, 2, 3, 4, ...
Thus, in comparing the sizes of two finite sets, A and B, we attach `labels' to the items in A, and `labels' to the items in B, and we say that A is larger than B if we use more `labels' (i.e. numbers) with A than we do with B.
But why use `numbers' as the `labels' to count with?
We could use items in A to label items in B (i.e. count) and, if
A is larger than B, then at least one item in A will not
have been used to label any item in B.
For example, with A = { ALF, BABS, CALUM, DORA, ED, FIONA, GARY } and B = { AVA, BOB, CAROL, DAVE, ELLA, FRANK }, say, we can `count' B by using the `labels':
ALF for AVA
BABS for BOB
CALUM for CAROL
DORA for DAVE
ED for ELLA
FIONA for FRANK
and now GARY has not been used to label any element of B, and so we can say that
A is larger than B
What if both A and B are infinite sets?
If it can be shown that for every way of labelling items in B with items in A there is at least one item of A which is not used as a label then, again, we can state that:
A is larger than B
Diagonalisation
is a method of argument that allows this to be shown.
Let:
A = The set of all decision problems, f : N -> {0,1}.
B = The set of all ADA programs.
We can enumerate (i.e. list) every single ADA program. That is, there is an algorithm which will output every single different ADA program in turn, e.g.
i := 1;
loop
  if ADA_PROGRAM ( i ) = 1 then
    put ( Binary representation of i );
  end if;
  i := i + 1;
end loop;
(Recall that ADA_PROGRAM( n ) is the decision problem which tests if n written in binary corresponds to the ASCII character representation of a syntactically correct ADA program.)
Thus we can talk about a
`First' ADA program, P1
`Second' ADA program, P2
`Third' ADA program, P3
...
`n''th ADA program, Pn
...
i.e. the 1st, 2nd, 3rd, ..., nth, ..., program that is output by
the procedure above.
And so we can form an infinite table, T:
     |    1    |    2    |    3    | .. |    n    | .. |
P1   | *P1(1)  |  P1(2)  |  P1(3)  | .. |  P1(n)  | .. |
P2   |  P2(1)  | *P2(2)  |  P2(3)  | .. |  P2(n)  | .. |
P3   |  P3(1)  |  P3(2)  | *P3(3)  | .. |  P3(n)  | .. |
..   |   ..    |   ..    |   ..    | .. |   ..    | .. |
Pn   |  Pn(1)  |  Pn(2)  |  Pn(3)  | .. | *Pn(n)  | .. |
..   |   ..    |   ..    |   ..    | .. |   ..    | .. |
If it is the case that every decision problem, f, is computed
by at least one ADA program, then in the table above which `labels'
each ADA program with a decision problem:
For every decision problem f
There is an ADA program, Pf,
which is labelled with f.
i.e. There is a row, Pf, of the table such
that Pf( n ) = f( n ) for all n.
What about the following decision problem?
DIAG:
Input: n a natural number.
Output: 1 if Pn( n ) = 0; 0 if Pn( n ) = 1.
(so that DIAG( n ) /= Pn( n ) for every n)
DIAG is well-defined, since our table, T, above exists.
BUT
DIAG cannot be the decision problem
that is labelling
P1 (DIAG(1) /= P1(1))
OR that labelling P2 (DIAG(2) /= P2(2))
OR that labelling P3 (DIAG(3) /= P3(3))
...
OR that labelling Pn (DIAG(n) /= Pn(n)).
And so, the decision problem, DIAG, does not
label any Pn, in the table T.
No matter how the ADA programs are ordered, we can always define a decision problem, DIAG, as above, which does not label any program in the ordering.
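The construction can be illustrated on a finite fragment of the table T. The sketch below (Python, for brevity) flips the diagonal of the first k rows and columns; the particular table entries are invented purely for illustration.

```python
# Diagonalisation over a finite fragment of the table T:
# table[i][i] plays the role of Pi(i); DIAG flips every diagonal
# entry, so it differs from each row Pi at input i.
def diag(table):
    return [1 - table[i][i] for i in range(len(table))]

table = [
    [0, 1, 1, 0],   # P1(1), P1(2), P1(3), P1(4)  (invented values)
    [1, 1, 0, 0],   # P2(...)
    [0, 0, 0, 1],   # P3(...)
    [1, 0, 1, 1],   # P4(...)
]
d = diag(table)
print(d)  # [1, 0, 1, 0] -- differs from row i in column i
```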
In consequence:
There exist decision problems
which cannot be solved by
ANY
ADA-program
There exist decision problems
for which
NO ALGORITHM AT ALL EXISTS
Decision problems for which no algorithm exists are called:
Undecidable
Diagonalisation indicates that undecidable decision problems
exist
It does not help in finding an
explicitly defined
undecidable decision problem.
That is:
It does not identify an undecidable decision problem whose input and output
behaviour can be given with a finite description.
The decision problems EVEN, PRIME, and ADA_PROGRAM are explicitly defined (we can give a finite description of each problem; we do not have to state each individual output for each individual input: that would be an infinite length description).
But all three of these are obviously
decidable
i.e. an algorithm (ADA program) can be written to solve them.
Alan Turing in 1936 exhibited the first explicit example of an undecidable decision
problem:
The Halting Problem
The Halting Problem, HP, is the following decision problem:
Input: An ADA program, P (which takes as input a single number, n), and an input w for P.
Output: 1 if P comes to a halt when run with input w; 0 if P never halts when started with input w.
It may be objected that HP is not a decision problem in the sense that we have been using to date since:
- It has 2 items of input instead of one.
- One of its inputs is a `program', not a number.
However we can define an equivalent decision problem HP2 to which such objections do not apply.
HP2 takes as input a single number, m. This number is a valid input if:
- m has a binary representation of the form x 00000000 y, where x and y are binary strings;
- x is the ASCII binary representation of an ADA program, Px say, which takes a single number as input and returns a zero or one as output.
For a valid input m, the output of HP2 is 1 if Px halts when given the number represented by y as input; 0 otherwise.
Clearly:
- The problem of checking whether an input is valid is decidable.
- No algorithm exists for HP if and only if no algorithm exists for HP2 (since we can translate from inputs for HP to inputs for HP2 and vice-versa).
(Note: the `encoding trick' described above can be applied to transform any decision
problem with `multiple inputs' into an equivalent one with a single number as input).
Given this fact we will deal with the more `natural' problem HP instead of the
precisely encoded version HP2.
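The packing of a program text and an input number into one number can be sketched directly (Python, for brevity; the helper name encode is this sketch's own, and decoding is omitted since it depends on locating the separator exactly as in the scheme above):

```python
# Pack an ASCII program text x and a number w into a single number m
# whose binary representation is  x 00000000 y  (y = w in binary),
# following the separator scheme described for HP2.
def encode(program_text: str, w: int) -> int:
    x = ''.join(f'{ord(c):08b}' for c in program_text)  # ASCII, 8 bits/char
    y = format(w, 'b')
    return int(x + '0' * 8 + y, 2)

# 'A' = 01000001, so encode('A', 1) has binary form 01000001 00000000 1
print(format(encode('A', 1), 'b'))
```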
THEOREM
The Halting Problem (HP) is undecidable
Thus there is no algorithm which solves the Halting Problem.
PROOF
Suppose that there is an ADA program, HALT, which solves the Halting Problem, HP. That is, there is an ADA program, HALT, which:
- Reads an ADA program, P, and a single number, w, as its input.
- If P halts when given w as input then HALT returns 1 as output.
- If P does not halt when given w as input then HALT returns 0 as output.
Since HALT is assumed to exist we can write an ADA program, BAD say, which uses
HALT as a sub-procedure.
procedure BAD is
   Q : integer;
   function HALT ( P : integer;
                   w : integer )
      return boolean is
   begin
      if ADA_PROGRAM (P) then
         --**********************************
         -- Halting problem program goes
         -- here.
         --**********************************
      end if;
   end HALT;
begin
   get (Q);
   if HALT (Q, Q) then
      loop
         put ("NOT OK");
      end loop;
   else
      put ("OK");
   end if;
end BAD;
What does BAD do?
- BAD reads a single number, Q, as input.
- If Q, written in binary, corresponds to an ADA program which takes a single number as input then:
  - BAD tests whether the program represented by Q halts when given the number Q as input (using the Halting Problem procedure, HALT, to perform the test).
  - If the Halting Problem procedure says that Q halts when started with input Q then BAD goes into a non-terminating loop.
  - If the Halting Problem procedure says that Q fails to halt when started with input Q then BAD prints a message and stops.
- HP (which HALT solves) takes an ADA program and a single number as input.
- The input for BAD should be a number which (when written in binary) is an ADA program which takes a single number as input.
- BAD is an ADA program (since HALT is an ADA program).
THEREFORE
There is a number,
BAD_NUMBER,
which when written in binary
corresponds exactly to the
ASCII representation of the
ADA program BAD
BAD_NUMBER is therefore a valid input for the ADA program BAD.
What happens if BAD is given
BAD_NUMBER
as input?
EITHER
BAD halts on input
BAD_NUMBER
OR
BAD does NOT halt on input
BAD_NUMBER
Suppose
BAD halts on input BAD_NUMBER.
This can only happen if
HALT( BAD_NUMBER, BAD_NUMBER )
is false.
But what does this mean? It means that the program corresponding to BAD_NUMBER does not halt on input BAD_NUMBER.
But `the program corresponding to BAD_NUMBER' is BAD, i.e.
HALT( BAD_NUMBER, BAD_NUMBER ) is false
means that
BAD does NOT halt on input BAD_NUMBER
On the other hand, suppose
BAD does NOT halt on input BAD_NUMBER.
This can only happen if
HALT( BAD_NUMBER, BAD_NUMBER )
is true.
But what does this mean? It means that the program corresponding to BAD_NUMBER halts on input BAD_NUMBER.
But `the program corresponding to BAD_NUMBER' is BAD, i.e.
HALT( BAD_NUMBER, BAD_NUMBER ) is true
means that
BAD DOES halt on input BAD_NUMBER
In summary we have shown that:
- If a program to solve the Halting Problem, HP, exists then the program BAD exists.
- BAD on any valid input either halts or does not halt.
- BAD_NUMBER is a valid input for BAD.
- BAD on input BAD_NUMBER fails to halt only if BAD on input BAD_NUMBER halts.
- BAD on input BAD_NUMBER halts only if BAD on input BAD_NUMBER fails to halt.
We can therefore conclude that
No program to solve the
Halting Problem, HP,
can be written
i.e.
The Halting Problem
is
undecidable.
Computational Complexity Theory
Computability Theory is concerned with the question:
For which decision problems do algorithms exist?
Computational Complexity Theory is concerned with the question:
For which decision problems do efficient algorithms exist?
This raises the questions:
- What `resources' do we wish to be employed `efficiently'?
- What do we mean by `efficient'?
The two significant `resources' (or complexity measures) of interest are TIME and SPACE:
TIME: the number of steps an algorithm takes (in the worst case) to return its answer.
SPACE: the amount of memory needed in which to run the algorithm.
Recall that a decision problem takes as input a finite length binary string and returns as output a 0 or a 1; hence a decision problem
f : { 0, 1 }^* -> { 0, 1 }
may be seen as an infinite sequence of decision problems [fn], where
fn : { 0, 1 }^n -> { 0, 1 }
is the problem f limited to inputs containing exactly n bits.
We measure the running time of an algorithm as the number of steps taken on inputs of length n. That is, if A is an algorithm (ADA program, say) that solves a decision problem, f, the running time of A is a function R : N -> N, so that R( n ) is the worst-case number of steps taken on inputs of length n.
The Time Complexity of a decision problem, f, is the running time of the `best' algorithm, A, for f.
Thus we say that a decision problem, f, has
Time Complexity T( n )
if there is an algorithm, A, for f such that the number of steps taken by A on inputs of length n is always less than or equal to T( n ).
The Space Complexity of a decision problem, f, is the amount of memory used by the `best' algorithm, A, for f.
Thus we say that a decision problem, f, has
Space Complexity S( n )
if there is an algorithm, A, for f such that the number of memory locations used by A on inputs of length n is always less than or equal to S( n ).
These definitions allow us to classify decision problems according to their
Time (Space) Complexity.
Thus we can define complexity classes, e.g.
TIME( n ) is the set of decision problems for which an algorithm running in at most n steps on inputs of length n exists.
TIME( n^2 ) is the set of decision problems for which an algorithm running in at most n^2 steps on inputs of length n exists.
TIME( 2^n ) is the set of decision problems for which an algorithm running in at most 2^n steps on inputs of length n exists.
And in general, TIME( T(n) ) is the set of decision problems for which an algorithm running in at most T(n) steps on all inputs of length n exists.
Which complexity classes correspond to decision problems with `efficient' (for run-time)
algorithms?
Intuitively we regard an algorithm with worst-case run-time of n, n^2, n^3, etc. as being `efficient', since the time taken to return an answer for input of length n is `acceptable' even for large values of n, and the amount of additional computation needed for inputs of length n+1 does not `greatly exceed' the time on inputs of length n.
On the other hand, intuitively, we would regard an algorithm with worst-case run-time of 2^n, 3^n, n^n, n!, n^{log n}, etc. as being `inefficient', since even for moderate values of n the time taken to return an answer on input of length n is `unacceptable', and the amount of additional computation needed for inputs of length n+1 considerably exceeds the time on inputs of length n.
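The contrast between the two categories is easy to see numerically; a quick sketch (Python, for brevity) comparing a polynomial step count with an exponential one:

```python
# Compare a polynomial step count (n^3) with an exponential one (2^n)
# for a few input lengths: the exponential column quickly dwarfs the other.
for n in (10, 20, 40, 60):
    print(f"n = {n:3d}   n^3 = {n**3:10d}   2^n = {2**n}")
```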
How can we distinguish run-times in the second category from run-times in the first?
All of the run-times in the first category have the following property: for each `efficient' run-time there is a constant, k, such that the running time is at most n^k when n is large enough.
A decision problem, f, for which there is an algorithm taking at most n^k steps on inputs of length n, for some constant k (i.e. f is in TIME( n^k )), is called a
Polynomial Time Complexity
Decision Problem
All of the run-times in the second category do not have this property, i.e. for each `inefficient' run-time there is NO constant, k, such that the running time is at most n^k when n is large enough.
A decision problem, f, for which there is NO algorithm taking at most n^k steps on inputs of length n, for any constant k (i.e. f is not in TIME( n^k ) for any k), is called a
Non-Polynomial Time Complexity
Decision Problem
It is possible to show (by diagonalisation) that for all run-time functions, T(n), there exist decision problems, f, such that the Time Complexity of f is greater than T(n). (Thus there are decision problems which do have non-polynomial time complexity.)
There are very few `natural' examples of decision problems for which such lower bounds
have been proved.
A major open problem in Computational Complexity Theory is to develop arguments
by which
important computational problems can have their time complexity described exactly.
For `almost all' `interesting' problems no lower bound on Time Complexity larger
than
OMEGA( n ) has been obtained.
Example Open Problems
We have seen that multiplication of 2 n-digit numbers has Time Complexity at worst O( n^{log 3} ). It is conjectured that
Multiplication is in TIME( n log n )
This conjecture remains unproven.
Multiplication of 2 n by n integer matrices can be done in time O( N^{1.41} ) (where N = n^2, the size of the input), i.e. in time O( n^{2.81} ). It is conjectured that
Matrix-Multiplication is in TIME( N log N )
Again this remains unproven.
It is unknown whether there is a polynomial time algorithm that can determine if a
given n-digit number is a Prime Number.
It is unknown whether there is a polynomial time algorithm that will factorise an
n digit number.
Computational Complexity Theory involves a large number of subfields, each of which is ultimately concerned with problems such as those above, e.g.
- Algebraic Complexity Theory
- Complexity of Probabilistic Algorithms
- Complexity of Parallel Algorithms
- Machine-Based Complexity Theory
- Structural Complexity Theory
- Complexity of Approximation Algorithms
- Axiomatic Complexity Theory
- Complexity of Counting and Enumeration
- Average Case Complexity
- Complexity Theory of Boolean Functions
The study of the last field is over 50 years old.
It is convenient to use a single symbol to denote the class of all decision problems for which time-efficient algorithms exist:
P = { f : f has a polynomial time algorithm } = Union over all k of TIME( n^k )
The Complexity Class NP
Consider the following:
Composite:
Input: n-bit number, x.
Output: 1 if x is a composite number; 0 otherwise.
Satisfiability (SAT):
Input: A Boolean expression, F, over n variables (x1, ..., xn).
Output: 1 if the variables of F can be fixed to values which make F equal true; 0 otherwise.
Hamiltonian Cycle (HC):
Input: n-node graph, G( V, E ).
Output: 1 if there is a cycle of edges in G which includes each of the n nodes exactly once; 0 otherwise.
CLIQUE:
Input: n-node graph G( V, E ); positive integer k.
Output: 1 if there is a set of k nodes, W, in V such that every pair of nodes in W is joined by an edge in E; 0 otherwise.
3-Colourability (3-COL):
Input: n-node graph G( V, E ).
Output: 1 if each node of G can be assigned one of three colours in such a way that no two nodes joined by an edge are given the same colour; 0 otherwise.
All 5 of these (and infinitely many other) decision problems share a certain property.
They are
`Checkable' by efficient
(polynomial time) algorithms
What does this mean?
In order to solve each of the decision problems above we have to demonstrate the existence of a particular object which has a certain relationship to the input. That is, to show that:
- x is composite we `only' have to find a factor of x.
- F(X) is satisfiable we `only' have to find an assignment to X that makes F true.
- G(V,E) is hamiltonian we `only' have to find a suitable cycle in G.
- G(V,E) has a k-clique we `only' have to find a suitable set of k nodes in V.
- G(V,E) is 3-colourable we `only' have to find a suitable colouring of V.
So for the examples: given an input instance, I, (number, Boolean expression, graph) one has to find a witness, W, (number, assignment, cycle, set of nodes, colouring) such that a particular relationship holds between the witness and the input, e.g.
- That W divides I with no remainder.
- That W makes I take the value true.
- That W is a cycle in I containing each node of I exactly once.
- That W is a set of k nodes in I each pair of which is connected by an edge of I.
- That W does not assign the same colour to two nodes in I which are joined by an edge of I.
How difficult is it to decide if something is indeed a valid witness for instances of these decision problems?
It is easy. One can check if:
- W is a factor of I (by division).
- W satisfies I (by evaluating the expression).
- W is a hamiltonian cycle in I (by testing if I contains all the edges in W).
- W is a k-clique in I (test if each pair in W is joined by an edge of I).
- W is a colouring of I (test if each edge in I joins nodes with different colours in W).
All of these can be carried out by efficient algorithms.
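For instance, checking a clique witness takes only a few lines of straightforwardly polynomial code. A sketch (Python, for brevity; the representation of a graph as a list of edge pairs is an assumption of this sketch):

```python
from itertools import combinations

# Check a CLIQUE witness: given the edges of a graph, a target size k and
# a candidate witness W (a set of nodes), verify the witness by testing
# every pair in W -- at most k(k-1)/2 edge lookups.
def check_clique(edges, k, W):
    E = {frozenset(e) for e in edges}
    if len(set(W)) != k:
        return 0
    return 1 if all(frozenset(p) in E for p in combinations(W, 2)) else 0

edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(check_clique(edges, 3, [1, 2, 3]))  # genuine witness -> 1
print(check_clique(edges, 3, [2, 3, 4]))  # (2,4) missing   -> 0
```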
In summary:
- For any decision problem, f, with each possible input size, n, there is a set of potential witnesses, Wn.
- To solve a decision problem `all' that is needed is an algorithm to determine if there is a genuine witness, W in Wn, for any given input instance I of size n.
In principle, one could do this by going through each element, W, of Wn in turn and checking if W is a genuine witness for I.
However, the question
`Is W a genuine witness for I?'
(for the decision problem f), e.g.
- Is y a factor of x? (for Composite)
- Does alpha make F(X) equal true? (for SAT)
- Is C a hamiltonian cycle in G(V,E)? (for HC)
- Is K a k-clique in G(V,E)? (for Clique)
- Is chi a 3-colouring of G(V,E)? (for 3-Col)
is itself a decision problem (based on f).
So for any decision problem, f, we can define another decision problem, CHECK(f):
CHECK(f):
Input: I an input instance for f; W a possible witness that f(I) = 1.
Output: 1 if W is a genuine witness that f(I) = 1; 0 otherwise.
Now we have seen that
- CHECK(Composite)
- CHECK(SAT)
- CHECK(HC)
- CHECK(Clique)
- CHECK(3-Col)
are all polynomial time decidable, i.e. all of these have efficient algorithms.
The notation NP denotes the class of all decision problems, f, with the property that CHECK(f) is polynomial time decidable:
NP = { f : CHECK(f) is in P }
Thus for decision problems we have two `variants':
Finding version of f (f itself), which asks: Is there any witness W for I (with f)?
Verifying version of f (CHECK(f)), which asks: Is this specific W a witness for I (with f)?
P is the class of problems with efficient `finding' algorithms.
NP is the class of problems with efficient `verifying' algorithms.
How are P and NP related? If f is in P then certainly CHECK(f) is in P and so f is in NP also. Thus,
P is a subset of NP
Undoubtedly the most important open question in modern Computational Complexity Theory is:
Is P a proper subset of NP?
i.e. are there decision problems, f, such that CHECK(f) is in P (i.e. f is in NP) but for which f is not in P?
Or, in informal terms: are there decision problems, f, such that checking a solution (witness) is much easier (polynomial time) than finding a solution?
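SAT illustrates the gap. In the sketch below (Python, for brevity), an expression over n variables is represented as a Python predicate on a tuple of truth values, which is an assumption of this sketch: verifying one assignment is a single evaluation, while the obvious finding algorithm may try all 2^n assignments.

```python
from itertools import product

# Verifying version: check one specific assignment -- one evaluation.
def check_sat(F, assignment):
    return 1 if F(assignment) else 0

# Finding version: search for any satisfying assignment -- up to 2^n tries.
def find_sat(F, n):
    return 1 if any(F(a) for a in product((False, True), repeat=n)) else 0

# (x1 or x2) and (not x1 or x3)
F = lambda a: (a[0] or a[1]) and (not a[0] or a[2])
print(check_sat(F, (True, False, True)))  # 1
print(find_sat(F, 3))                     # 1
```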
The following problems are a fraction of the tens of thousands of problems which are known to be in NP but for which no efficient algorithm has been found (despite, in some cases, centuries of work on them):
- Hamiltonian Cycle
- Satisfiability
- 3-Colouring
- Clique
- Travelling Salesman Problem
- Integer Knapsack Problem
- Longest Path in a Graph
- Edge Colouring
- Optimal Scheduling
- Minimising Boolean Expressions
- Minimum Graph Partition
- Composite Number
- Primality
With the exception of the last 2, it is generally believed that no efficient algorithm exists for any of these decision problems, i.e. that they are all intractable. This has yet to be proved.
The concept of NP-complete decision problems offers a possible route to resolving whether P = NP or P /= NP.
NP-completeness
The idea underlying the concept of NP-complete problems is to describe formally what can be understood as the `most difficult' problems in NP.
That NP-complete problems are (intuitively) the most difficult in NP follows from the fact that we may prove
P = NP
if and only if
some NP-complete problem, f, has a polynomial-time algorithm
Suppose S and T are two decision problems and suppose we can construct an ADA program (algorithm) to solve T that is of the form
procedure T is
   x : T_input_type;
   y : S_input_type;
   function S ( z : S_input_type )
      return boolean is
   begin
      --***************************************
      -- ADA function for decision problem S
      -- goes in here.
      --***************************************
   end S;
   function transform ( t : T_input_type )
      return S_input_type is
   begin
      --*****************************************
      -- ADA function which transforms an
      -- input instance for T into an input
      -- instance for S.
      --*****************************************
   end transform;
begin
   get (x);
   y := transform (x);
   if S (y) then
      put (1);
   else
      put (0);
   end if;
end T;
Here transform is a function which takes an input instance for T and changes it into an input instance for S. For example, if T is a decision problem on numbers and S is one on graphs, then transform, given a number, would construct a graph using that number.
Now suppose that both transform and S have efficient (polynomial) algorithms. Then the procedure above gives an efficient algorithm for T.
It follows that (with the above procedure form)
There is an efficient algorithm for T
if
there is an efficient algorithm for S
But, more importantly,
If we can prove that there is
NO efficient algorithm for S
then
we have also proved that there is
NO efficient algorithm for T
This means that if we have an efficient procedure, tau, which
- transforms input instances x for T into input instances, tau( x ), for S;
- is such that T( x ) = 1 if and only if S( tau( x ) ) = 1;
then we have established a relationship between the Time Complexity of S and the Time Complexity of T.
If S and T are decision problems for which such an efficient transformation procedure from inputs for T to inputs for S exists then we say that
the decision problem T is polynomially reducible to the decision problem S.
This relationship is denoted
T <=p S
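As a worked example, here is the classical reduction from Hamiltonian Cycle to the Travelling Salesman Problem (two of the problems listed above), sketched in Python for brevity. Only transform need be efficient; the brute-force TSP decision procedure is included purely to check the construction on tiny graphs.

```python
from itertools import permutations

# transform: map an n-node graph to a TSP distance matrix in which
# edges cost 1 and non-edges cost 2.  The graph has a Hamiltonian
# cycle if and only if the transformed instance has a tour of cost <= n.
def transform(n, edges):
    E = {frozenset(e) for e in edges}
    return [[0 if i == j else (1 if frozenset((i, j)) in E else 2)
             for j in range(n)] for i in range(n)]

# Brute-force TSP decision (exponential; used only to check the reduction).
def tsp_tour_within(dist, bound):
    n = len(dist)
    for perm in permutations(range(1, n)):
        tour = (0,) + perm
        if sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n)) <= bound:
            return 1
    return 0

square = [(0, 1), (1, 2), (2, 3), (3, 0)]   # 4-cycle: Hamiltonian
path = [(0, 1), (1, 2), (2, 3)]             # path: not Hamiltonian
print(tsp_tour_within(transform(4, square), 4))  # 1
print(tsp_tour_within(transform(4, path), 4))    # 0
```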
How does the idea of `polynomial reducibility' help in addressing the question
of whether P equals NP?
Suppose we have a decision problem, f, for which we can prove the following:
- f is in NP, i.e. there is an efficient checking algorithm for f.
- For every other decision problem, g, in NP, it is the case that g <=p f, i.e. for every decision problem, g, with an efficient checking algorithm there is an efficient procedure, tau_g, that transforms inputs for g into inputs for f.
Then if there are decision problems in NP for which efficient algorithms do not exist
then
the decision problem f must be
one such problem
In other words, for such a decision problem f:
P = NP
if and only if
f has an efficient algorithm (i.e. f is in P)
A decision problem, f, which has these properties, i.e.
- f is in NP;
- For all g in NP, g <=p f;
(f has an efficient checking algorithm and every problem with an efficient checking algorithm can be efficiently transformed to f) is called an
NP-complete decision problem
The NP-complete decision problems are the `most difficult' problems in NP in the sense
that if
Any single NP-complete problem does not
have an efficient algorithm
then
NO NP-complete problem
has an efficient algorithm
Conversely
If we find an efficient algorithm for
just one NP-complete problem
then
we can construct efficient algorithms
for all
decision problems in NP
Problem:
The definition of NP-completeness requires the following to be done in order
to show that a problem f is NP-complete
-
We have to find an efficient checking algorithm for f (i.e. prove f is in NP)
-
We have to show that every problem with an efficient checking algorithm can be
transformed into f (i.e. prove for all g in NP, g <=p f)
The first is usually (relatively) easy.
But for the second, how can we reduce infinitely many decision problems to a
single problem?
Before dealing with this we can note the following: suppose we have shown
for two decision problems, f and g say, that:
-
f is NP-complete.
-
g is in NP
-
f <=p g
Then we have also shown that g is NP-complete.
Why?
Because we obtain a polynomial reduction from any problem h in NP to g by
-
Converting an input instance x for h to an input instance tau-h(x) for f.
-
Converting the input tau-h(x) for f to an input instance for g.
The first is possible since f is NP-complete. The second since there is a polynomial
reduction from f to g.
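The two-step argument above is just composition of reductions: a polynomial of a polynomial is still a polynomial, so the composite transformation is itself a polynomial reduction. A Python sketch with toy, hypothetical instance encodings (strings; the `#f`/`#g` tags are my own illustration):

```python
def compose_reductions(tau1, tau2):
    """Compose a reduction h <=p f with a reduction f <=p g.
    If tau1 runs in time O(n^a) and tau2 in time O(n^b), the
    composite runs in polynomial time too, giving h <=p g."""
    return lambda x: tau2(tau1(x))

# Toy stand-ins for the two transformations:
tau_h_to_f = lambda x: x + "#f"   # pretend reduction from h to f
tau_f_to_g = lambda y: y + "#g"   # pretend reduction from f to g

tau_h_to_g = compose_reductions(tau_h_to_f, tau_f_to_g)
assert tau_h_to_g("instance") == "instance#f#g"
```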
This argument shows that once we have identified a single NP-complete
decision problem, f, in order to prove that another decision problem, g, is
NP-complete `all' we have to do is find an efficient algorithm for transforming
input instances for f into input instances for g, i.e. prove
f <=p g
But, this still leaves the problem of finding a `first' NP-complete problem, that is
of finding some way of
transforming infinitely many
decision problems
into
just one
decision problem
In 1971 Steven Cook proved one of the most important results in modern computational
complexity theory. This result is (now) known as
Cook's Theorem
SAT is NP-complete
Proof: See next section.
Within a few months of Cook's result dozens of new NP-complete problems had
been identified. By 1994 well over 10,000 basic NP-complete decision problems
were known.
Thus it is known that the following problems are all NP-complete
- CLIQUE
-
Hamiltonian Cycle
-
Travelling Salesman Problem
-
3-Colouring
-
Set Partitioning
-
Edge Colouring
-
Longest Path
-
Integer Knapsack
A proof that a decision problem is NP-complete is accepted as evidence that
the problem is intractable since a fast method of solving a single NP-complete
problem would immediately give fast algorithms for all NP-complete
problems.
It is generally thought to be extremely unlikely that any NP-complete
problem has a fast algorithm.
It has yet to be proved, however, that some NP-complete problem does not have
an efficient algorithm, i.e. the question of whether P=NP is still open.
While most of the classical decision problems have now been determined as
being in P or being NP-complete a number of important problems are
still open. In particular
Primality
Composite Number
Both are known to be in NP. Neither has been shown to be NP-complete. It is
believed that both are in fact solvable by efficient algorithms.
Non-deterministic Ada Programs
A program can alter one `bit' of memory at a time.
The current bit is indexed by an integer value: at_cell
The next bit is always adjacent to the previous one.
The program is in one of a finite number, s say, of states.
The next state entered is chosen non-deterministically from a choice of
two.
A complete program is specified when the action for each state and bit value
is defined.
State s is the unique accept state.
procedure nd_ada_f (T : integer) is
   type cell_value is ('0', '1', 'B');  -- a `bit' of memory holds 0, 1 or blank (B)
   type workspace is array (integer range <>) of cell_value;
   n : integer;
begin
   get(n);
   declare
      w : workspace(-T..T) := (others => 'B');
      procedure do_it_on (w        : in out workspace;
                          position : in out integer;
                          state    : in out integer) is
      begin
         case state is
            -- Schematic: one alternative of this form for each state k.
            -- A subscript k marks a value fixed by the program text for state k.
            when k =>
               if w(position) = 'B' then
                  w(position) := (0 or 1)_k;             -- write a 0 or a 1
                  position    := position (+-1)_k;       -- move to an adjacent cell
                  state       := choose(next1, next2)_k; -- non-deterministic choice
               elsif w(position) = '0' then
                  w(position) := (0 or 1)_k;
                  position    := position (+-1)_k;
                  state       := choose(next1, next2)_k;
               else
                  w(position) := (0 or 1)_k;
                  position    := position (+-1)_k;
                  state       := choose(next1, next2)_k;
               end if;
         end case;
      end do_it_on;
   begin
      at_cell := 1; in_state := 1; get(w(1..n));
      for i in 1..T loop
         do_it_on(w, at_cell, in_state);
      end loop;
      if in_state = accept then return true; end if;
   end;
end nd_ada_f;
The decision problem NCOMP is:
Input:
A non-deterministic program, AP;
An input x in {0,1}^n for AP;
A time bound T.
Question: Is there a `computation path' of AP on x that
accepts in T iterations?
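Since each iteration offers a choice of two next states, the computations of AP on x form a binary tree with at most 2^T paths, and NCOMP can be decided (very slowly) by exploring every path. A Python sketch under my own encoding assumptions, not from the notes: `delta` plays the role of the case statement in do_it_on, symbols are '0', '1' and 'B', and the toy machine below accepts iff the input contains a 1.

```python
def ncomp(delta, accept, x, T):
    """Decide NCOMP by brute force: explore every computation path.

    delta maps (state, symbol) -> ((next1, next2), new_symbol, move);
    returns True iff some path reaches `accept` within T iterations.
    """
    def run(state, cell, tape, steps):
        if state == accept:
            return True
        if steps == T:
            return False
        (n1, n2), nv, ch = delta[(state, tape.get(cell, 'B'))]
        tape = dict(tape)
        tape[cell] = nv                                   # write the new symbol
        return (run(n1, cell + ch, tape, steps + 1) or    # try both
                run(n2, cell + ch, tape, steps + 1))      # branches
    return run(1, 1, {i + 1: b for i, b in enumerate(x)}, 0)

# Toy machine: scan right; on reading '1' one branch jumps to state 2 (accept).
delta = {(1, '0'): ((1, 1), '0', +1),
         (1, '1'): ((2, 1), '1', +1),
         (1, 'B'): ((1, 1), 'B', +1)}
assert ncomp(delta, 2, "001", 4) is True
assert ncomp(delta, 2, "000", 4) is False
```

The exponential blow-up in paths is exactly why this simulation is not an efficient algorithm.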
Computation Tree
An Alternative to NP
ADA_NP is the class of decision problems, f, for which there are:
-
A non-deterministic Ada program, NA, and
-
A polynomial p(n)
that satisfy:
-
For all x in {0,1}^n:
f(x)=1 if and only if NCOMP(NA,x,p(n))=1
Theorem: ADA_NP=NP
Proof:
a) ADA_NP is a subset of NP.
Suppose that f is in ADA_NP.
Let NAf and p(n) be the program and polynomial such that for all x in {0,1}^n:
f(x)=1 if and only if NCOMP(NAf, x, p(n))=1
The witnesses to f(x)=1 are the accepting computation paths, CP,
of NAf on x.
Checkf(CP,x)=1 if CP is a valid accept computation for
NAf on input x.
Given NAf, CP and x this can
be checked by simulating NAf on x using the computation path CP.
This test can be done in polynomial-time, so Checkf is in P, i.e.
f is in NP.
b) NP is a subset of ADA_NP
Let f be in NP, so that Checkf is in P.
Let Wn be the witness set for x in {0,1}^n.
For any w in Wn it must hold that |w| <= p(n) for some polynomial p(n)
(otherwise Checkf(x,w) could not be carried out in polynomial time).
Let the elements of Wn be encoded in binary. Then
for all x in {0,1}^n
there are at most 2^p(n) `witnesses to f(x)=1'.
NAf is the program which:
-
In its first p(n) iterations
generates each w in Wn on some computation path.
-
In the next q(n) iterations (q(n) being the polynomial
running time of Checkf) runs the
Checkf(x,w) algorithm (reaching accept iff
Checkf(x,w)=1).
NCOMP(NAf,x,p(n)+q(n))=1 if and only if NAf has a
path of length p(n)+q(n) ending in accept on input x.
iff NAf constructs a witness w for x after p(n) moves
which checks correctly in q(n) steps
iff there is some w in Wn with Checkf(x,w)=1, i.e. iff f(x)=1.
In summary, Checkf in P implies that f is in ADA_NP.
Hence (from (a) and (b)), ADA_NP=NP.
The Satisfiability Problem (SAT) is:
Input: A set of n Boolean variables
Xn = < x1, x2, ..., xn >
A Boolean formula F over Xn,
(whose length is at most r(n) for some polynomial r)
Question: Is there an assignment, a, of true/false values to the variables Xn
with the property that F(a)=true?
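For intuition, SAT can always be decided by trying all 2^n assignments, while checking one proposed assignment takes only polynomial time in the length of F — which is exactly what places SAT in NP. (Python sketch; the formula F is an arbitrary example of my own.)

```python
from itertools import product

def sat(F, n):
    """Brute-force SAT: try all 2^n assignments -- exponential time."""
    return any(F(a) for a in product([False, True], repeat=n))

def check(F, a):
    """The NP `checking' step: evaluate F at one assignment -- fast."""
    return F(a)

# F = (x1 or x2) and (not x1 or x3) and (not x3)
F = lambda a: (a[0] or a[1]) and (not a[0] or a[2]) and not a[2]
assert sat(F, 3) is True                        # e.g. x1=False, x2=True, x3=False
assert check(F, (False, True, False)) is True   # a witness checks correctly
assert check(F, (True, True, True)) is False
```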
Cook's Theorem
SAT is NP-complete.
Proof:
We have to do 2 things:
-
Prove that SAT belongs to NP;
-
Show that for any decision problem f in NP: f <=p SAT,
i.e. any instance x of f can
be transformed into an instance F(x) of SAT such that f(x)=1 iff SAT(F(x))=1.
That (a) holds has already been shown.
Now let f be any decision problem in NP.
We know that f also belongs to ADA_NP (since NP=ADA_NP).
So there is a non-deterministic Ada program NAf for
f and a polynomial p(n) such that,
for any x in {0,1}^n,
f(x)=1 iff NCOMP( NAf,x,p(n))=1
To show that SAT is NP-complete, we build an instance of SAT
(i.e. a Boolean formula), H(NAf)(Xm(n))
with the property
H(NAf)(Xm(n)) is satisfiable
iff
NCOMP(NAf,x,p(n))=1.
What do we know about NAf on input x in {0,1}^n?
At any value i of its main for loop its status is fully described by
A1) The value of in_state.
A2) The value of at_cell.
A3) The content (0, 1, or B) of each `bit' w(k).
These can change only in accordance with
what the procedure do_it_on allows, given the
initial n-bit input x.
So, suppose we wish to know if a given sequence of p(n) configurations (i.e. state, position, memory)
is a valid accepting computation path of NAf.
Then such a sequence must embody all of the following characteristics at all times i:
C1) At time i=0: in_state=1, w(1..n)=x, all other bits
equal B; at_cell=1.
C2) in_state has exactly one value at time i.
C3) at_cell has exactly one value at time i.
C4) w(k) has exactly one value at time i, (for all k)
C5) in_state=s (accept state) at i=p(n).
C6) w(-p(n)...p(n)), at_cell, and in_state must change
from time i to time i+1 according to the case statement in do_it_on of
NAf.
We construct H(NAf) with the Boolean variables:
(below 0 <= i <= p(n))
V1) 3(2p(n)+1)(p(n)+1) variables
W(v,k,i) with v in {0,1,B}, -p(n) <= k <= p(n)
V2) s(p(n)+1) variables
S(st,i) 1 <= st <= s,
V3) (2p(n)+1)(p(n)+1) variables
A(k,i) -p(n) <= k <= p(n)
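As a sanity check that the construction stays polynomial, the variable totals V1-V3 can be added up; the result is quadratic in p(n). (Python sketch mirroring the counts above; the sample values of p(n) and s are arbitrary.)

```python
def num_variables(p, s):
    """Total Boolean variables in H(NAf), from the counts V1-V3:
    p = p(n), s = number of states."""
    v1 = 3 * (2 * p + 1) * (p + 1)   # W(v,k,i): v in {0,1,B}
    v2 = s * (p + 1)                 # S(st,i):  1 <= st <= s
    v3 = (2 * p + 1) * (p + 1)       # A(k,i)
    return v1 + v2 + v3

assert num_variables(2, 3) == 69   # 45 + 9 + 15
```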
H(NAf)(x) is built so that:
in a satisfying assignment those
variables which are assigned the value true define a valid computation, CP
of NAf on x:
V1T) W(v,k,i)=1 iff
in CP: at time i, `bit' w(k) contains v.
V2T) S(st,i)=1 iff in CP: at time i, in_state=st.
V3T) A(k,i)=1 iff in CP: at time i, at_cell=k.
Thus the only satisfying assignments of H(NAf)(V1,V2,V3)
will be as given by (V1T, V2T, V3T) and
define legal computations of NAf
on x, i.e. capture the conditions (C1) through (C6)
(Below & denotes logical and; + denotes logical or; - denotes negation.
For brevity, juxtaposition also denotes and.)
C1)
S(1,0) & A(1,0) & &_{k=1..n} W(x_k,k,0) &
&_{k=-p(n)..0} W(B,k,0) & &_{k=n+1..p(n)} W(B,k,0)
C2) `exactly one value' == `at least one value'
and at most one value.
&_{i=0..p(n)} [ ( +_{st=1..s} S(st,i) ) &
&_{1<=c<d<=s} ( -S(c,i) + -S(d,i) ) ]
C3) Similarly,
&_{i=0..p(n)} [ ( +_{k=-p(n)..p(n)} A(k,i) ) &
&_{-p(n)<=c<d<=p(n)} ( -A(c,i) + -A(d,i) ) ]
C4) Again, similarly,
&_{i=0..p(n)} &_{k=-p(n)..p(n)} [ ( +_{v in {0,1,B}} W(v,k,i) ) &
&_{v1 /= v2} ( -W(v1,k,i) + -W(v2,k,i) ) ]
C5) S(s,p(n))
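Conditions C2-C4 all instantiate the same `exactly one' pattern: one clause demanding at least one variable true, plus pairwise clauses forbidding any two being true simultaneously. A Python sketch of such a clause generator (DIMACS-style encoding of my own, where a negative integer denotes a negated variable):

```python
from itertools import combinations

def exactly_one(vs):
    """CNF clauses asserting exactly one of the variables vs is true:
    an `at least one' clause plus pairwise `at most one' clauses."""
    return [list(vs)] + [[-a, -b] for a, b in combinations(vs, 2)]

assert exactly_one([1, 2, 3]) == [[1, 2, 3], [-1, -2], [-1, -3], [-2, -3]]
```

For m variables this yields 1 + m(m-1)/2 clauses, which keeps the formula polynomial in size.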
C6) is, superficially, more complicated. However, at time i, from
in_state=st, at_cell=k, and w(k) one of {0,1,B},
the case
statement determines, for time i+1:
the new value of at_cell,
the new content of w(k),
the two possible next states to jump to.
Thus each of the s states is described by a `move' function:
delta (st,v) = ( ( st1 , st2 ), nv, ch )
where v is in {0,1,B}, nv is in {0,1}, and ch = +1 or -1
We thus have (C6) as
&_{i=0..p(n)-1} &_{j=-p(n)..p(n)} &_{delta(st,v)}
MOVE(i,j,st,v,st1,st2,nv,ch)
where
MOVE(i,j,st,v,st1,st2,nv,ch) is:
S(st,i) & A(j,i) & W(v,j,i) ->
( ( S(st1,i+1) & A(j+ch,i+1) & W(nv,j,i+1) ) +
( S(st2,i+1) & A(j+ch,i+1) & W(nv,j,i+1) ) )
Thus,
If at time i
NAf is in_state st
at_cell j which contains v then
at time i+1,
NAf is in_state st1 or st2
at_cell j+ch with cell j containing nv.
So the final formula H(NAf) (V1,V2,V3) is
C1 & C2 & C3 & C4 & C5 & C6 ( V1,V2,V3 )
H(NAf)(V1,V2,V3) is satisfiable
iff
NCOMP (NAf,x,p(n))=1.
If NCOMP(NAf,x,p(n))=1 then setting to true
the variables of V1, V2, V3 that encode the
accepting computation path of NAf gives a satisfying assignment of H.
Conversely, H guarantees that the true variables
of any satisfying assignment must represent a valid computation of
NAf on input x that ends in accept after p(n) steps,
i.e. a witness that NCOMP(NAf, x, p(n))=1.