Annotated List of Selected NP-complete Problems

## An Annotated List of Selected NP-complete Problems

The standard textbook on NP-completeness is:

Michael Garey and David Johnson: Computers and Intractability - A Guide to the Theory of NP-completeness; Freeman, 1979.

David Johnson also runs a column in the journal Journal of Algorithms (in the HCL; there is an on-line bibliography of all issues)

On the Web the following sites may be of interest:

Or trying giving `NP-complete' or `NP and complete' as a search term to

(This is basically a bibliography database, but, you can click on the `on-line papers' button to list electronically readable full texts).

The Center for Discrete Mathematics and Theoretical Computer Science (DIMACS) gives links to technical papers, abstracts, and other information concerning algorithms, approximation techniques and properties of NP-complete problems. Other general algorithmics and complexity related sites may be found at:

A number of relevant journals are available on-line through the University Library Web Pages (these are only accessible to members of Liverpool University). The following journals are all available and publish research papers in the general areas of Algorithmics and Complexity Theory:

• Acta Informatica
• Computer Journal
• Information and Computation
• Journal of Algorithms
• Journal of Computer and System Sciences
• SIAM journal on computing
In the list of NP-complete problems below, the form of a typical entry is as follows:

Number: 0

Name: Wombat Eating Assignment (WEA) [DU0] 2

Input: A set W of n wombats; a forest T of m eucalyptus trees (m>=n); a habitation mapping, mu:W-> subsets of T, such that for each wombat, w, mu (w) defines the set of eucalyptus trees, t in T, in which the wombat lives.

Question: Is there a mapping, E, from the set of wombats to the set of trees which satisfies both:

• Every wombat is assigned a unique eucalyptus tree by E, i.e. if E(u)=t then E(v)/=t for all distinct pairs of wombats u and v;
• E is consistent with the habitation mapping, mu, i.e. if E(w)=t then t belongs to the set of trees, mu(w) in which the wombat lives?

Comments: Can be solved by efficient methods using (0,1)-network flow maximisation methods applied to bipartite graphs, i.e. it's a matching problem.

The first line gives the unique identifying number for the problem (as described above). The second line gives the name of the problem and (if there is one) its usual abbreviation. These are how the problem is usually referred to in research papers, textbooks, etc. If you are looking for material on the Web, then these should give reasonable terms to supply to search engines. The code in square brackets is a reference to the classification in Garey and Johnson's book (with the exceptions of Problem 14, 86, 87, 88).

The next part of the description indicates what a typical instance (i.e. input) consists of: notice, as in the example above, that this will usually consist of a number of distinct objects which must be represented in trying to solve the problem: sometimes these can be quite involved structures, such as graphs, sets, mappings, logical expressions, etc; and, sometimes quite simple forms such as a single integer. The main part of a problem definition is always formulated as a question. This is the core of the decision problem description and defines precisely what property of the input instance must be determined in solving it. Thus, in the example given, one is trying to decide if a given input instance is such that a mapping from wombats to trees, satisfying certain conditions is possible. The final part (which is not always present) gives some (not necessarily useful) comments regarding the problem.

Notice that one way in which the WEA problem could be solved is by exhaustively considering each distinct way of mapping from single wombats to single trees until either one found an assignment that met the required conditions given in the Question; or one had no further mappings left to consider. Clearly the former case would lead to the answer true being returned; the latter to the answer false being given.

### Selected NP-Complete Problems

Number: 1

Name: 3-Satisfiability (3-SAT) [LO2] 2

Input: A set of m clauses - C1 ,C2,...,Cm - over a set of n Boolean valued variables Xn=< x1,x2,...,xn>, such that each clause depends on exactly three distinct variables from Xn. A clause being a Boolean expression of the form yi+yj+yk where each y is of the form x or -x (i.e. negation of x) with x being some variable in Xn. For example if n=4 and m=3 a possible instance could be the (set of) Boolean expressions: C1=(x1+(-x2)+(-x3)); C2=(x2+x3+(-x4)); C3=((-x1)+x3+x4);

Question: Can each variable xi of Xn be assigned a Boolean value alphai in such a way that every clause evaluates to the Boolean result true under the assignment < xi:=alphai:1<=i<=n>?

In the example instance, the assignment < x1:=true;x2:=false;x3:=true; x4:=false> is such that each of the three clauses takes the value true.

Comments: For reasons that will become clearer at the end of the course, this problem had to be Number 1. Some relevant material concerning this problem might be found in the special issue (Volume 81, 1996) of the journal Artificial Intelligence that is available in the Library.

Number: 2

Name: Graph 3-Colourability (3-COL) [GT4] 1

Input: An n-node undirected graph G(V,E) with node set V and edge set E.

Question: Can each node of G(V,E) be assigned exactly one of three colours - Red, Blue, Green - in such a way that no two nodes which are joined by an edge, are assigned the same colour?

Number: 3

Name: Monochromatic triangle [GT6] 2

Input: An n-node undirected graph G(V,E) with node set V and edge set E.

Question: Can the edges, E, of G be partitioned into two disjoint sets E1 and E2, in such a way that neither of the two graphs G1(V,E1) or G2(V,E2) contains a triangle, i.e. a set of three distinct nodes u,v,w such that {u,v}, {u,w}, {v,w} are all edges?

Number: 4

Name: Clique [GT19] 1

Input: An n-node undirected graph G(V,E) with node set V and edge set E; a positive integer k with k<=n.

Question: Does G contain a k-clique, i.e. a subset W of the nodes V such that W has size k and for each distinct pair of nodes u, v in W, {u,v} is an edge of G?

Number: 5

Name: Bipartite Subgraph [GT25] 2

Input: An n-node undirected graph G(V,E) with node set V and edge set E; a positive integer k with k<=|E|.

Question: Is there a subset, F of the edges of G, having size at least k and such that the graph H(V,F) is bipartite?

Comments: G(V,E) is bipartite if the nodes can be partitioned into two disjoint sets U and W such that every edge of G connects a node in U to a node in W, i.e. no two nodes in U (resp. W) form an edge of G.

Number: 6

Name: Vertex Cover [GT1] 1

Input: An n-node undirected graph G(V,E) with node set V and edge set E; a positive integer k with k<=n.

Question: Is there a subset W of V having size at most k and such that for every edge {u,v} in E at least one of u and v belongs to W?

Number: 7

Name: Chromatic Index (Edge Colouring) [OPEN5] 3

Input: An n-node undirected graph G(V,E) with node set V and edge set E; a positive integer k with k<=|E|.

Question: Can the edges of G be assigned exactly one of k colours in such a way that no two edges which have a common node as an endpoint are assigned the same colour?

Comments: Vizing's Theorem from Graph Theory shows that the only `hard' case for this problem is when k is equal to the maximum degree of G. The degree of a node, v, is the number of edges in the graph with v as an end-point; the degree of a graph is the maximum degree of any node in the graph. Chromatic Index was shown to be NP-complete soon after the 1979 edition of Garey and Johnson went to press, hence the OPEN categorisation.

The classical paper on sequential methods is:

• A.M. Gibbons and O.A. Ogunyode: Optimal edge-colouring of almost all simple graphs in polynomial time. Random Graphs `85: Conf. Record of 2nd International Seminar on Random Graphs and Probabilistic Methods in Combinatorics, Poznan, North-Holland, (1985).
There are also a number of methods presenting efficient parallel algorithms for various special types of graph. Some of the ideas presented in these may be useful in formulating sequential techniques, e.g.
• A.M. Gibbons and W. Rytter: Optimally edge-colouring outerplanar graphs is in NC. Theoretical Computer Science, 71, (1990), pp. 401-411
• A.M. Gibbons and W. Rytter: Fast parallel algorithms for optimal edge-colouring of some tree-structured graphs. Fundamentals of Computation Theory (FCT) `87, Springer-Verlag, 1987

Number: 8

Name: Multiprocessor Scheduling [SS8] 4

Input: A set, T, of tasks and for each task t in T a positive (integer) running time len(t). A positive integer D, called the deadline.

Question: Is there a 2-processor schedule for the tasks that completes within the deadline, D, i.e. is there a function sigma:T->N such that both of the following hold:

• For all u>=0 the number of tasks t in T for which sigma(t)<=u< sigma(t)+len(t) is at most 2.
• For all tasks t, sigma(t)+len(t)<=D?

Comments: This is the simplest avatar of a very large number of NP-complete processor scheduling problems and, unsurprisingly given its practical applications in multiprogramming Operating Systems on small parallel systems, there is an enormous range of literature on approximation and heuristic techniques to tackle it, see e.g. relevant sections of this bibliography. The formal framework given by (a) and (b) can be interpreted as meaning: a) at any given time at most two tasks are being executed; b) every task has been completed by the deadline D.

Number: 9

Name: Comparative Divisibility [AN4] 3

Input: A (strictly increasing) sequence A=< a1,a2,...,an> and a (strictly increasing) sequence B=< b1,b2,...,bm> of positive integers.

Question: Is there an integer, c, such that Divides(c,A)> Divides(c,B), where Divides(x,Y) (Y being a sequence of positive integers) is the number of elements, y in Y, for which x is an exact divisor of y?

Comments: You may think that this has an obvious fast algorithm, and, indeed the algorithm in question is obvious: what it is not is efficient. Consider: how many bits are needed to store the input data (assuming, without loss of generality, that an>=bm)? How many steps, however, is this `obvious method' taking in the worst-case? It is important to realise that representing integer values in unary is not considered to be a `reasonable' approach (the number 250-1 requires 250 digits in unary but only 50 digits in binary).

Number: 10

Name: Cyclic Ordering [MS2] 3

Input: A finite set, A, and a collection, C, of ordered triples (a,b,c) of distinct elements from A.

Question: Is there a one-to-one mapping f:A->{1,2,3,...,|A|} (i.e. a function which maps each element of A to a number between 1 and |A| with no two distinct elements of A being mapped to the same number) such that for each (a,b,c) in C one of

f(a) < f(b) < f(c) ; f(b) < f(c) < f(a) ; f(c) < f(a) < f(b)

holds?

Comments: For any triple (a,b,c) there are 6 (six) possible orderings that could result for ((a),f(b),f(c)). The question being asked is whether there is a choice of function that forces every specified triple into one of three `legal' orderings. Garey and Johnson has a typographical error in describing this problem, whereby each triple `in A' is referred to. Obviously `in C' is intended.

Number: 11

Name: Quadratic Diophantine Equations [AN8] 3

Input: Positive integers a, b, and c.

Question: Are there two positive integers x and y such that (a*x*x)+(b*y)=c?

Comments: The comments regarding Problem 9 (Comparative Divisibility) are also pertinent with respect to this problem. Again, there is an `obvious' algorithm that, on the surface, appears to be efficient and is seen not to be so only once one compares the input size (space needed to represent the input data) to the actual computation time in the worst-case.

Number: 12

Name: Maximum 2-Satisfiability [LO5] 3

Input: A set of m clauses C1,C2,...,Cm over n Boolean valued variables Xn, where each clause depends on two distinct variables from Xn; a positive integer k with k<=m.

Question: Is there an assignment of Boolean values to the variables Xn such that at least k distinct clauses take the value true under the assignment?

Comments: For relevant definitions see under Problem 1. 2-SAT, a natural variant of the problem 3-SAT described in Problem 1, can be solved in O(m) steps. The simple variation described here (whereby one asks whether at least some number of clauses can be made true simultaneously) is much more difficult.

Number: 13

Name: Register Sufficiency [PO1] 5

Input: A directed, acyclic graph G(V,A) in which each node has at most two out-going edges; a positive integer k.

Question: Is there a k or fewer register `computation' for G, i.e. is there an ordering < v1,v2,...,vn> of the (n) nodes of G and a sequence < S0;S1;...;Sn> of subsets of V which satisfy all of the following:

• For all i, Si contains at most k nodes from V.
• S0 contains no nodes; Sn contains all of the nodes in V with no in-coming edges in G (recall that G is directed).
• For each i, with 1<=i<=n: vi is in Si; Si-{vi} is a subset of the nodes in Si-1; and Si-1 contains all nodes u for which (vi,u) is an edge in A, i.e. all nodes, u, for which there is an edge from vi to u?

Comments: The background to this problem comes from developing Code Generation methods in compilers for High-Level Languages. On early architectures, performing arithmetic operations with both operands stored in registers was significantly faster than fetching an operand from main memory. Such machines, however, had only a small (16-32) number of fast registers. Thus in generating assembly code to represent the computation of a lengthy arithmetic expression, it was necessary to use registers efficiently. A common first stage in compiling such expressions was to represent the computation as a (so-called) `straight-line program' which in turn could be modelled as a directed acyclic graph. The object of the Register Sufficiency Problem is to determine if such a `straight-line program' can be evaluated using only the specified number of available registers.

Number: 14

Name: Central Slice of Half-Clique 2

Input: An 2n-node undirected graph G(V,E) with node set V and edge set E.

Question: Does either of the following hold true of G(V,E)

• G contains at least (2n2-n)/2+1 edges.
• G contains exactly (2n2-n)/2 edges and G contains an n-clique?

Comments: For the definition of `k-clique' see Problem 4 above. This specific variant of the CLIQUE problem post-dates Garey and Johnson's 1979 text by 5 years. Its classification as NP-complete is, originally, proved in:

• P.E. Dunne: Techniques for the analysis of monotone Boolean networks; Ph.D. dissertation, Univ. of Warwick, October 1984; (Theory of Computation Report No. 69, Dept. of Comp. Sci., Univ. of Warwick, 1984).
The published version appears in
• P.E. Dunne: The Complexity of Central Slice Functions. Theoretical Computer Science, 44, (1986), pp. 247-257
For problems whose inputs are encoded by N bits, there are a number of other NP-complete problems that remain so even when the only `non-trivial' inputs are those with exactly N/2 input bits equal to 1. Three further examples are given in the references cited.

Number: 15

Name: Decision Tree [MS15] 5

Input: A Boolean logic function, f, of n variables, Xn, described by its 1-points, i.e. the set of assignments, alpha=< a1,...,an> such that f(a1,...,an)=1; a positive integer K.

Question: Is there a decision tree for f that has average path length at most K?

Comments: A decision tree is a binary tree in which each non-leaf node is labelled with a variable from Xn, each leaf node is labelled either 0 or 1, the edge from a non-leaf node to its left child is labelled 0, that to its right child is labelled 1. For a given assignment of Boolean values to the variables Xn a path can be traced starting from the root of the decision tree and following the edges marked with the value of the variable labelling each node encountered. The path terminates at a leaf node whose associated label gives the value of the function. The decision tree computes a Boolean function f(Xn) if for every assignment, alpha, to the variables the path traced from the root under alpha terminates in a leaf labelled f(alpha). The average path length of a binary tree with t leaf nodes and root v is: sum from {w:w is a leaf} |Path from x to w|/t.

This problem is a simpler (but still NP-complete) version of the form given in Garey and Johnson. For relevant variations and potential heuristic approaches, the papers

• P.E. Dunne and P.H. Leng, "An algorithm for optimising signal selection in demand-driven circuit simulation", Transactions of the Society for Computer Simulation, vol. 8, no.4, pp. 269-280, 1992
• P.E. Dunne, C.J. Gittings, and P.H. Leng, "Multiprocessor simulation strategies with optimal speed-up", Inf. Proc. Letters, vol. 54, no. 1, pp. 23-33, April 1995
• P.E. Dunne, P.H. Leng, and G.F. Nwana, "On the complexity of Boolean functions computed by lazy oracles", IEEE. Trans. Comput., vol. 44, no. 4, pp. 495-502, April 1995
may provide some ideas of use.

Number: 16

Name: Shortest Common Superstring [SR9] 3

Input: A finite set R={r1,r2,...,rm} of binary strings (sequences of 0 and 1); positive integer k.

Question: Is there a binary string w of length at most k such that every string in R is a substring of w, i.e. for each r in R, w can be decomposed as w=w0rw1 where w0, w1 are (possibly empty) binary strings?

Comments: General problem allows more than two symbols (i.e. not just binary), but this simpler version remains NP-complete.

Number: 17

Name: Longest Circuit [ND28] 2

Input: n-node undirected graph G(V,E); positive integer k.

Question: Does G contain a simple cycle containing at least k nodes?

Comments: Special cases of this are the famous Travelling Salesman Problem and Hamiltonian Circuit Problem. The latter corresponds to the cases k=n; the former to the case with each graph edge being weighted and also having k=n.

Number: 18

Name: Kernel [GT57] 2

Input: n-node directed graph G(V,A).

Question: Does G possess a kernel, i.e. a subset W of the nodes V such that no two nodes in W are joined by an edge in A and such that for each node v in V-W there is a node w in W for which (w,v) is an edge in A?

Number: 19

Name: k-Closure [GT58] 2

Input: n-node directed graph G(V,A); positive integer k<=n.

Question: Is there a subset W of V having size at most k and such that for all edges (u,v) in A either u is in W or v is not in W.

Number: 20

Name: Bandwidth [GT40] 3

Input: n-node undirected graph G(V,E); positive integer k<=n.

Question: Is there a linear ordering of V with bandwidth at most k, i.e. a one-to-one function f:V->{1,2,...,n} such that for all edges {u,v} in G, |f(u)-f(v)|<=k?

Number: 21

Name: Maximum Leaf Spanning Tree [ND2] 4

Input: n-node undirected graph G(V,E); positive integer k<=n.

Question: Does G have a spanning tree in which at least k nodes have degree 1.

• P.E. Dunne: A result on k-valent graphs and its application to a graph embedding problem, Acta Informatica, 24, (1987), pp. 447-459
may be found a) in the HCL Library; b) to be completely incomprehensible; and c) to give a couple of ideas as regards the development of heurstic and approximation techniques.

Other papers of interest are given here

Number: 22

Name: Independent Set [GT20] 1

Input: n-node undirected graph G(V,E); positive integer k<=n.

Question: Does G have an independent set of size at least k, i.e. a subset W of at least k nodes from V such that no pair of nodes in W is joined by an edge in E?

Number: 23

Name: Degree Constrained Spanning Tree [ND1] 4

Input: n-node undirected graph G(V,E); positive integer k<=n.

Question: Does G have a spanning tree in which no node has degree greater than k?

Number: 24

Name: Hamiltonian Path [GT39] 1

Input: n-node undirected graph G(V,E).

Question: Is there a simple path of edges in G that contains every node in V, and thus contains exactly n-1 edges?

Comments: Material regarding the Hamiltonian circuit problem may be found to be relevant.

Number: 25

Name: Graph Partitioning [ND14] 2

Input: 2n-node undirected graph G(V,E); positive integer k<=|E|.

Question: Can the nodes of G be partitioned into 2 disjoint sets U and W each of size n and such that the total number of distinct edges in E that connect a node u in U to a node w in W is at most k?

Number: 26

Name: Cubic Subgraph [GT32] 3

Input: n-node undirected graph G(V,E).

Question: Is there a (non-empty) subset F of the edges of G such that every node in the graph H(V,F) has either degree 3 or degree 0.

Number: 27

Name: Travelling Salesman [ND22] 3-4

Input: A set C of n cities {c1,...,cn}; for each pair of cities (ci,cj) (1<=i< j<=n) a positive integer distance di,j; a positive integer B.

Question: Is there an ordering < pi(1),pi(2),...,pi(n)> of the n cities such that the value sum from i=1 to n-1 dpi(i),pi(i+1)+dpi(n),pi(1) is no more than B?

Number: 28

Name: Steiner Tree in Bipartite Graphs [ND12a] 3

Input: Bipartite graph B(V,W,E); positive integer k.

Question: Is there a subtree of B(V,W,E) that includes all of the nodes of V and has at most k edges?

Number: 29

Name: Bounded Diameter Spanning Tree [ND4] 4

Input: n-node undirected graph G(V,E); for each edge, {u,v} a positive integer weight w(u,v); positive integer B.

Question: Does G have a spanning tree T such that the total sum of the weights of edges in T is at most B and no simple path in T contains more than 5 edges?

Number: 30

Name: Optimal Linear Arrangement [GT42] 3

Input: n-node undirected graph G(V,E); positive integer k<=n.

Question: Is there a one-to-one function f:V->{1,2,...,n} such that sum from {{u,v} in E} |f(u)-f(v)|<=K?

Number: 31

Name: Dominating Set [GT2] 2

Input: n-node undirected graph G(V,E); positive integer k<=n.

Question: Does G contain a dominating set of size at most k, i.e. a subset W of V containing at most k nodes and such that for every node u in V-W (i.e. in V but not in W) there is a node w in W such that {u,w} is an edge of G?

Number: 32

Name: Path with Forbidden Pairs [GT54] 4

Input: n-node directed graph G(V,A); two distinct nodes s and t belonging to V; finite collection C={(a1,b1),...,(ar,br)} of pairs of nodes from V.

Question: Is there a directed path from the node s to the node t in G that contains at most one node from each pair of nodes in the collection C?

Comments: Several simplifications of this problem remain NP-complete: requiring G to be acyclic; requiring the collection C to contain only directed edges from A (rather than arbitrary pairs of nodes); requiring all the pairs to be disjoint.

Number: 33

Name: Oriented Diameter [GT64] 4

Input: n-node undirected graph G(V,E); positive integer k<=n.

Question: Can a direction be placed on each edge {u,v} of E in such a way that the resulting directed graph H(V,A) is such that both of the following hold:

• H(V,A) is strongly connected, i.e. for each distinct pair of nodes, v and w, there exists a directed path from v to w in H and a directed path from w to v in H.
• The diameter of H(V,A) is at most k, i.e. for every pair of nodes v and w, there is a path of at most k edges from v to w and a path of at most k edges from w to v?

Number: 34

Name: Rural Postman [ND27] 3-4

Input: n-node undirected graph G(V,E); subset F of the edges E; positive integer k<=|E|

Question: Is there a (not necessarily simple) cycle of in G that contains every edge in F and has at most k edges in total?

Comments: A simple cycle in a graph is one in which each node visited in the cycle is visited exactly once. A non-simple cycle allows nodes to be visited more than once (although, obviously, each edge can only occur once). For example in the 5 node graph with edges ({1,2};{1,3};{1,4};{1,5};{2,3};{2,4};{2,5};{3,4}), the cycle 1-> 2-> 3-> 1 is a simple cycle; the cycle 1-> 2-> 4-> 3-> 2-> 5-> 1 is a non-simple cycle since 2 is visited twice (notice that in neither case is any edge used more than once).

Number: 35

Name: Longest Path [ND29] 2

Input: n-node undirected graph G(V,E); nodes s and t in V; positive integer k.

Question: Is there a simple path between s and t in G that contains at least k edges?

Comments: The similarity to Problem 17 should be noted.

Number: 36

Name: 3-Dimensional Matching (3DM) [SP1] 3

Input: 3 disjoint sets X, Y, and Z each comprising exactly n elements; a set M of m triples {(xi,yi,zi):1<=i<=m} such that xi is in X, yi in Y, and zi in Z, i.e. M is a subset of XxYxZ.

Question: Does M contain a matching, i.e. is there a subset Q of M such that |Q|=n and for all distinct pairs of triples (u,v,w) and (x,y,z) in Q it holds that u/=x and v/=y and w/=z.

Comments: The variant 2-dimensional matching in which 2 disjoint sets X and Y form the basis of a set of pairs, can be solved by a number of fast methods.

Number: 37

Name: Set Splitting [SP4] 3

Input: A finite set S; A collection C1,...,Cm of subsets of S.

Question: Can S be partitioned into two disjoint subsets - S1 and S2 - such that for each set Ci it holds that Ci is not a subset of S1 and Ci is not a subset of S2?

Number: 38

Name: Set Packing [SP3] 3

Input: A collection C=(C1,...,Cm) of finite sets; a positive integer k<=m.

Question: Are there k sets - D1,...,Dk - from the collection C such that for all 1<=i< j<=k, Di and Dj have no common elements?

Number: 39

Name: Exact Cover by 3-Sets (X3C) [SP2] 3

Input: A finite set X containing exactly 3n elements; a collection C of subsets of X each of which contains exactly 3 elements.

Question: Does C contain an exact cover for X, i.e. a sub-collection of 3-element sets D=(D1,...,Dn) such that each element of X occurs in exactly one subset in D?

Number: 40

Name: Minimum Cover [SP5] 3-4

Input: A finite set S; A collection C=(C1,...,Cm) of subsets of S; a positive integer k<=m.

Question: Does C contain a cover for S comprising at most k subsets, i.e. a collection D=(D1,...,Dt), where t<=k, each Di is a set in C, and such that every element in S belongs to at least one set in D?

Number: 41

Name: Partition [SP12] 3

Input: Finite set A; for each element a in A a positive integer size s(a).

Question: Can A be partitioned into 2 disjoint sets A1 and A2 in a such a way that the sum of the sizes s(x) of elements x in A1 is exactly the same as the sum of the sizes s(y) of the elements y in A2.

Comments: It should be noted that it is not required that A1 and A2 contain equal numbers of elements, although even with this condition the problem is still NP-complete.

Number: 42

Name: Subset Sum [SP13] 3

Input: Finite set A; for each element a in A a positive integer size s(a); a positive integer K.

Question: Is there a subset B of A such that the sum of the sizes, s(x), of the elements x in B is exactly equal to K?

Number: 43

Name: Comparative Containment [SP10] 4

Input: A finite set X; 2 collections R=(R1,...,Rk) and S=(S1,...,Sm) each of which is a set of subsets of X; for each Ri in R, a positive integer weight w(Ri); for each Si in S a positive integer weight w(Si);

Question: Is there a subset Y of X such that: if RY is the set of subsets, Ri of R having Y as a subset of Ri and SY is the set of subsets, Si of S having Y as a subset of Si then the total weight of the sets in RY is at least the total weight of the sets in SY.

Number: 44

Name: Minimum Test Set [SP6] 3

Input: A finite set S; A collection C=(C1,...,Cm) of subsets of S; a positive integer k<=m.

Question: Is there a sub-collection D=(D1,...,Dt) of the sets in C such that: t<=k and for each distinct pair of elements u, v in S there is a set Du,v in D that contains exactly one of u and v?

Number: 45

Name: Minimum Sum of Squares [SP19] 3-4

Input: A set A of n elements; for each element a in A a positive integer size s(a); positive integers k<=n and J.

Question: Can A be partitioned into k disjoint sets A1,...,Ak such that sum from i=1 to k ( sum from {x in Ai} s(x))2<=J?

Number: 46

Name: 3-Partition [SP15] 3

Input: A set A of 3m elements; a positive integer bound B; for each element x in A a positive integer size s(x) that satisfies B/4< s(x)< B/2, and is such that the sum of the sizes of elements in A is exactly mB.

Question: Can A be partitioned into m disjoint sets A1,...,Am such that each Ai contains exactly 3 elements of A and each Ai has total size equal to B?

Number: 47

Name: Subset Product [SP14] 3

Input: Finite set A; for each element a in A a positive integer size s(a); a positive integer K.

Question: Is there a subset B of A such that the product (i.e. result of multiplying together all) of the sizes, s(x), of the elements x in B is exactly equal to K?

Comments: There is a subtle technical distinction between this and Problem 42: the former case has a `pseudo-efficient' algorithm obtained by allowing numbers to be represented in unary; unless all NP-complete problems can be solved by fast algorithms, however, the Subset Product Problem, cannot be solved by `efficient' methods using even this unreasonable input representation.

Number: 48

Name: Bin Packing [SR1] 3

Input: A finite set U of m items; for each item u in U a positive integer size s(u); positive integers B (called the bin capacity) and k<=m.

Question: Can U be partitioned into k disjoint sets U1,...,Uk such that for each Ui (1<=i<=k) the total sum of the sizes of the items in Ui does not exceed B?

Number: 49

Name: Hitting String [SR12] 3

Input: Finite set S={s1,...,sm} each si being a string of n symbols over {0,1,*}.

Question: Is there a binary string x=x1x2...xn of length n such that for each sj in S, sj and x agree in at least one position.

Number: 50

Name: Rectilinear Picture Compression [SR25] 4-5

Input: An n by n matrix M of 0s and 1s; a positive integer K.

Question: Can all of the 1-valued entries in M be covered by a collection of K or fewer rectangles, i.e. is there a sequence of 4-tuples (ai,bi,ci,di) (where for all 1<=i<=K, ai<=bi and ci<=di) such that:

• For each pair (i,j) (1<=i,j<=n) M[i,j]=1 if and only if for some k (1<=k<=K) it holds that ak<=i<=bk and ck<=j<=dk?

Comments: Despite the apparently complicated definition, this is quite a well-motivated problem. Its background is, as its name suggests, from image compression and the task being set can be interpreted as representing a 2-dimensional image using at most K `blocks' of information.

Number: 51

Name: Sequencing with Release Times and Deadlines [SS1] 4-5

Input: A set T of tasks; for each task t in T: a positive integer length len(t); a positive integer release time r(t); and a positive integer deadline d(t).

Question: Is there a one-processor schedule for the tasks, T, that satisfies the release time constraints and meets all of the deadlines, i.e. a one-to-one function sigma from the set of tasks to positive integers such that all of the following hold:

• For any two distinct tasks t and w if sigma(t)> sigma(w) then sigma(t)>=sigma(w)+len(w).
• For all tasks t in T, sigma(t)>=r(t).
• For all tasks t in T, sigma(t)+len(t)<=d(t)?

Comments: Another scheduling problem in the style of Problem 8, but in a single processor environment. The requirements of the schedule, sigma, are that it must: 1) assign a unique starting time to each task in the set (sigma is a one-to-one function); 2) two tasks cannot be running simultaneously (this is condition (a) above, which says that if a task t is scheduled to start after a task w - sigma(t)> sigma(w) - then the earliest time at which t can be scheduled is after w has completed, i.e. sigma(w)+len(w); 3) no task can start before its specified `release time' (condition (b) above); 4) every task has finished no later than its allotted deadline (condition (c) which states that the scheduled starting time of a task t plus the total amount of time that t takes to complete - len(t) - must be no greater than the deadline that is assigned for t, i.e. d(t)).

In common with other scheduling and resource management problems there is a large volume of research into heuristics and approximation methods for this problem. Some relevant references may be found in sections of this bibliography.

For a reminder of what a `one-to-one' function is, see Problem 10.

Number: 52

Name: Precedence Constrained Scheduling [SS9] 3-4

Input: A set T of tasks each of which has length 1 or length 2; a partial ordering << on the set of tasks, T; a positive integer deadline D.

Question: Is there a 2-processor schedule, sigma, for T that meets the overall deadline D and obeys the `precedence constraints' of the partial order <<, i.e. if t<< w then sigma(w)>=sigma(t)+len(t)?

Comments: For a definition of `2-processor schedule' refer to Problem 8. Recall that a partial order << on a set R is an ordering relation which satisfies: for all distinct s and t at most one of s<< t or t<< s holds; and for all distinct s, t, w in R if s<< t and t<< w then s<< w. The bibliography mentioned above may contain useful references.

Number: 53

Input: Positive integers a, b, and c.

Question: Is there a positive integer x whose value is less than c and is such that x2==a(mod b), i.e. the remainder when x2 is divided by b is equal to a?

Comments: The comments made with respect Problem 9 and Problem 11 are also relevant with respect to this problem.

Number: 54

Name: Square-Tiling [GP13] 5

Input: A set C of n `colours'; a collection of tiles, T, each tile t being a 4-tuple < a,b,c,d> of colours where a is the colour at the top of the tile; b that on the right-hand side; c that at the bottom; and d that on the left-hand side of the tile; a positive integer k<=n.

Question: Is there a proper tiling of a k by k square using the tiles in T, i.e. an assignment of a tile A(i,j) in T to each ordered pair (i,j) with 1<=i<=k, 1<=j<=k such that both of the following hold:

• If A(i,j)=< a,b,c,d> and A(i+1,j)=< e,f,g,h> then the colours b and h are identical.
• If A(i,j)=< a,b,c,d> and A(i,j+1)=< e,f,g,h> then the colours c and e are identical?

Comments: Informally, a proper tiling is one in which two tiles which are next to each other are required to have the same colour on touching sides, i.e. the right-hand side of one is coloured the same as the left-hand side of its neighbour and similarly with vertically adjacent tiles: the lower side of one is coloured the same as the top side of its neighbour. Decision problems involving questions about tiling patterns tend to be extremely difficult. For the most general form of the question it can be proven that no algorithm at all exists to solve it, i.e. not even an extremely inefficient one. A tiling problem also constitutes the only known `natural' `hard' member of the following class of problems: given a positive integer n as input, determine the number of `objects' of `size' n having a particular property, e.g. the number of n-node graphs, or the number of n-node graphs with a Hamiltonian circuit.

Number: 55

Name: Crossword Puzzle Construction [GP14] 5

Input: A finite set of characters SIGMA={sigma1,...,sigmak}; a finite set, W={w1,...,w2n} each wi being a sequence of n characters from SIGMA, i.e. a string.

Question: Can an n by n crossword puzzle be built using all of the 2n words (strings) in W, i.e. if C is an n by n table of `blanks', is there an assignment, f, that maps each entry (i,j) of C to some character in SIGMA in such a way that the word formed by taking the n consecutive characters in a row corresponds to a word in W; the word formed by taking the n consecutive characters in a column (reading from top-to-bottom) corresponds to a word in W?

Comments: The definition may look complicated but it isn't. Here is a simple example instance for which a crossword can be constructed. Let SIGMA={A,B,C,D,E,G,O}, n=3, and W={AGE,AGO,BEG,CAB,CAD,DOG}. A 3 by 3 crossword puzzle for this is given by the assignment C(1,1)=C; C(1,2)=A; C(1,3)=B; C(2,1)=A; C(2,2)=G; C(2,3)=E; C(3,1)=D; C(3,2)=O; C(3,3)=G. This results in the grid

```CAB
AGE
DOG
```

whose correctness is self-evident.

Number: 56

Name: Disjunctive non-tautology [LO8] 2

Input: A A set of m products - P1,P2,...,Pm - over a set of n Boolean valued variables Xn=< x1,x2,...,xn>, such that each product depends on exactly three distinct variables from Xn. A product being a Boolean expression of the form yi AND yj AND yk where each y is of the form x or -x (i.e. negation of x) with x being some variable in Xn.

Question: Is there an assignment of Boolean values to the variables in Xn that results in every product taking the value false (equivalently 0)?

Number: 57

Name: Simultaneous incongruences [AN2] 3

Input: A set of n ordered pairs of positive integers {(a1,b1),...,(an,bn)} where ai<=bi for each 1<=i<=n.

Question: Is there a positive integer x such that: for each i, ai does not equal the remainder when dividing x by bi?

Number: 58

Name: Betweenness [MS1] 3

Input: A finite set of size n, A; a set C of ordered triples, (a,b,c), of distinct elements from A.

Question: Is there a one-to-one function, f:A->{1,2,...,n} such that for each triple (a,b,c) in C it holds that either f(a)< f(b)< f(c) or f(c)< f(b)< f(a)?

Comments: If you need to be reminded of what a one-to-one function is then see Problem 10 to which this problem may appear similar.

Number: 59

Name: Minimum weight and/or graph solution [MS16] 4-5

Input: A directed acyclic graph G(V,A) having a single node s, with no incoming edges; a labelling, f(v) of each node having at least one out-going edge in G, as either an and-node or an or-node; a positive integer K.

Question: Is there a sub-graph H(W,B) of G(V,A), i.e. W is a subset of V and B is a subset of A satisfying all of the following:

• s is in W.
• If w is in W and w is an and-node then all of the edges directed out of w in G belong to B.
• If w is in W and w is an or-node then at least one of the edges directed out of w in G belongs to B.
• B contains at most K edges?

Number: 60

Name: Fault-detection in directed graphs. [MS18] 3-4

Input: A directed, acyclic graph G(V,A) such that G has a unique node t with no out-going edges; I the set of nodes in V having no incoming edges; a positive integer K.

Question: Is there a `test set' of size at most K that can detect every `single fault' in G (N.B. not `every single fault' but every `single fault'), i.e. a subset T of the nodes in I such that

• T contains at most K nodes.
• For every node v in V there exists a node u in T such that v lies on a directed path from u to t in G?

Number: 61

Name: Minimum Broadcast Time. [ND49] 3-4

Input: n-node undirected graph G(V,E); a subset V0 of the nodes in V.

Question: Can a message be `broadcast' from the base set V0 to all of the nodes in V in at most 4 steps, i.e. is there a sequence

V0;E1;V1;E2;V2;E3;V3; E4;V4

which satisfies all of the following:

• Vi is a subset of V for each 0<=i<=4.
• Ei is a subset of E for each 1<=i<=4.
• V4=V.
• Each edge in Ei has exactly one of its endpoints in Vi-1 (1<=i<=4)
• No two edges in Ei share a common end-point (1<=i<=4)
• Vi=Vi-1union {w: {v,w} in Ei} (1<=i<=4)?

Number: 62

Name: Disjoint Connecting Paths [ND40] 3-4

Input: n-node undirected graph G(V,E); a set of disjoint pairs of nodes {(s1,t1);...;(sk,tk)}.

Question: Does G contain k mutually node disjoint paths, Pi, with Pi being a path from si to ti?

Comments: Mutually disjoint means that no two paths have any common nodes.

Number: 63

Name: Shortest Weight-Constrained Path [ND30] 3

Input: n-node undirected graph G(V,E); for each edge e in E a positive integer length len(e) and a positive integer weight w(e); specified nodes s and t in V; positive integers K and W.

Question: Is there a path from s to t in G that has both:

• Total weight at most W.
• Total length at most K?

Comments: If all of the edges have the same length or all of the edges have the same weight, then this is simply the normal shortest-path problem for which various efficient algorithms exist, e.g. the Dynamic Programming method of Floyd that is covered in the course. Minimising a single measure, in this context of path lengths, is `easy', but attempting simultaneously to minimise two (or more) distinct measures is not.

Number: 64

Name: Minimum Maximal Matching [GT10] 3

Input: n-node undirected graph G(V,E); positive integers k<=|E|.

Question: Is there a subset, F, of at most k edges from E that forms a maximal matching in G, i.e. no two edges in F have a common endpoint and every edge of G that is not in F has a common endpoint with at least one edge of F?

Number: 65

Name: Partition into Triangles [GT11] 3

Input: (3n)-node undirected graph G(V,E).

Question: Can the nodes of G be partitioned into n disjoint sets - V1,...,Vn - each of which contains exactly 3 nodes and is such that for each Vi={ui,vi,wi}, all three of the edges {ui,vi}, {ui,wi} and {vi,wi} belong to E?

Number: 66

Name: Partition into Forests [GT14] 3

Input: n-node undirected graph G(V,E); positive integer k<=n

Question: Can the nodes of G be partitioned into t<=k disjoint sets V1,...,Vt - in such a way that for each Vi (1<=i<=t), the subgraph Gi(Vi,Ei) induced by Vi contains no cycles, i.e. is a forest (set of trees)?

Comments: If G(V,E) is a graph, then the subgraph induced by a subset W of V is the graph H(W,F) whose edges, F, are formed by all the edges in E that connect two nodes in W.

Number: 67

Name: Partition into Cliques [GT15] 3

Input: n-node undirected graph G(V,E); positive integer k<=n

Question: Can the nodes of G be partitioned into t<=k disjoint sets - V1,...,Vt - in such a way that for each Vi (1<=i<=t), the subgraph Gi(Vi,Ei) induced by Vi is a clique, i.e. a graph in which every pair of nodes is connected by an edge?

Number: 68

Name: Partition into Perfect Matchings [GT16] 3

Input: n-node undirected graph G(V,E); positive integer k<=n

Question: Can the nodes of G be partitioned into t<=k disjoint sets - V1,...,Vt - in such a way that for each Vi (1<=i<=t), the subgraph Gi(Vi,Ei) induced by Vi is a perfect matching, i.e. a graph in which every node is the endpoint of exactly one edge?

Number: 69

Name: Covering by cliques [GT17] 3

Input: n-node undirected graph G(V,E); positive integer k<=n

Question: Are there t<=k subsets - V1,...,Vt of V - such that both of the following hold:

• The subgraph Gi(Vi,Ei) of G(V,E) induced by Vi is a clique.
• For each edge {u,v} in E there is some subset Vi that contains both u and v?

Number: 70

Name: Degree-bounded connected subgraph [GT26] 3

Input: n-node undirected graph G(V,E); non-negative integer d<=n; positive integer k<=|E|.

Question: Is there a subset F of the edges in E having size at least k and such that subgraph G(V,F) of G is connected and has no node of degree greater than d?

Comments: For the definition of `degree of a node' see Problem 7. A graph is connected if for every pair of nodes v and w there is a path connecting v and w in the graph.

Number: 71

Name: Uniconnected subgraph [GT30] 3

Input: n-node directed graph G(V,A); positive integer k<=|A|.

Question: Is there a subset B of the edges in A having size at least k and such that the subgraph H(V,B) of G has at most one directed path between any pair of nodes in V?

Number: 72

Name: Minimum k-connected subgraph [GT31] 3

Input: n-node undirected graph G(V,E); positive integers k<=n and B<=|E|.

Question: Is there a subset F of the edges E having size at most B and such that the subgraph H(V,F) of G is k-connected, i.e. remains connected after removing any set W of a most k-1 nodes and their adjacent edges?

Number: 73

Name: Minimum cut linear arrangement [GT44] 3

Input: n-node undirected graph G(V,E); positive integer k.

Question: Is there a one-to-one function f:V->{1,2,...,n} such that for all i, with 1< i< n, the total number of edges {u,v} in E for which f(u)<=i< f(v) is at most k?

Number: 74

Name: Consecutive Sets [SR18] 4

Input: Finite alphabet SIGMA of t symbols; collection C={SIGMA1,SIGMA2,...,SIGMAn} of subsets of SIGMA; positive integer K.

Question: Is there a string, w, (sequence of symbols) over SIGMA such that |w|< =K and for each 1< =i< =n, all of the symbols in SIGMAi appear in a consecutive block of |SIGMAi| symbols in the string w?

Comments: N.B. minor typo. in Garey and Johnson, W written instead of w.

Number: 75

Name: Multiple choice matching [GT55] 3

Input: n-node undirected graph G(V,E); partition of E into t disjoint sets - E1,...,Et; positive integer k.

Question: Is there a subset F of E having size at least k and satisfying both of the following:

• No two edges in F share a common endpoint.
• F contains at most one edge from each Ei (1<=i<=t)?

Number: 76

Name: Path distinguishers [GT60] 3

Input: n-node directed, acyclic graph G(V,A); specified nodes s and t in V; positive integer k<=|A|.

Question: Is there a subset B of the edges A having size at most k and such that for any pair of distinct paths P and Q from s to t in G, there is an edge in B that is in exactly one of P and Q?

Number: 77

Name: Nestril-Rodl dimension [GT62] 5

Input: n-node undirected graph G(V,E); positive integer k<=n.

Question: Is there a one-to-one function

f:V-> {( a1, a2,...,ak) : 1 <= ai<= n }

such that for all pairs of nodes u and v in V, {u,v} is an edge of G if and only if f(u) and f(v) disagree in all k components?

Number: 78

Name: Shortest total path length spanning tree [ND3] 4-5

Input: n-node undirected graph G(V,E); positive integer B.

Question: Is there a spanning tree T(V,F) of G such that the sum over all pairs of nodes u and v of the length of the path between u and v in T is no greater than B?

Number: 79

Name: Induced Path [GT23] 3

Input: n-node undirected graph G(V,E); positive integer k<=n.

Question: Is there a subset W of the nodes V having size at least k and such that the subgraph H(W,F) of G induced by W is a simple path on |W| nodes?

Number: 80

Name: Bounded Component Spanning Forest [ND10] 3

Input: n-node undirected graph G(V,E); positive integer k<=n.

Question: Can V be partitioned into t disjoint sets - V1,...,Vt - where t<=k, such that for each i (1<=i<=t) both of the following hold:

• The subgraph H(Vi,Ei) of G induced by Vi is connected.
• The number of nodes in Vi is at most 4.

Number: 81

Name: Maximum subgraph matching [GT50] 3

Input: Directed graphs G(V,A), H(W,B); positive integer k.

Question: Is there a subset R of VxW, i.e. pairs of nodes where one node is from V and one from W, having size at least k and such that for every < u,v> and < x,y> in R the edge (u,x) belongs to A if and only if the edge (v,y) belongs to B?

Number: 82

Name: Numerical 3-dimensional matching [SP16] 3

Input: Disjoint sets W, X, and Y each containing exactly m elements; for each element u of W union X union Y, a positive integer size s(u); a positive integer bound B.

Question: Can the 3m element set W union X union Y be partitioned into m disjoint sets A1,...,Am in such a way that:

• Each Ai contains exactly one element from W, exactly one element from X, and exactly one element from Y.
• Each Ai has total size exactly equal to B?

Number: 83

Name: Open Hemisphere [MP6] 4-5

Input: Finite set X of m-tuples of integers; a positive integer k<=|X|.

Question: Is there an m-tuple y of rational numbers such that sum from i=1 to m xiyi<> 0 for at least k of the m-tuples in X, xi (resp. yi) being the ith component of the m-tuple x (resp. y)?

Number: 84

Name: Largest Common Subgraph [GT49] 3-4

Input: 2 n-node undirected graphs G(V,E), H(W,F); positive integer k.

Question: Do there exist subsets E1 of the edges E and F1 of the edges F that satisfy all of the following:

• |E1|=|F1|.
• |E1|>=k, |F1|>=k.
• The graphs G(V,E1) and H(W,F1) are isomorphic, i.e. there is a mapping f between V and W such that {x,y} is an edge in E1 if and only if {f(x),f(y)} is an edge in F1?

Number: 85

Name: Clustering [MS9] 3

Input: Finite set X; for each pair of elements x and y in X, a positive integer distance d(x,y); positive integer B.

Question: Is there a partition of X into 3 disjoint sets - X1,X2,X3 - with which: for each set Xi (1<=i<=3), for all pairs x and y in Xi it holds that d(x,y)<=B?

Number: 86

Name: Protein Folding 5

Input: Three positive integers, t, n, and k; a finite alphabet, SIGMA, of t symbols; a string S of length n.

Question: Is there an embedding of S into a two dimensional grid that has a bond-score of at least k? i.e. if S=s1s2...sn, is there a one-to-one function FOLD:{1,...,n}->NxN such that:

• For all 1 <= i< n if FOLD(i)=(x,y) and FOLD (i+1)=(w,z), then either {x=w and |y-z|=1} or {|x-w|=1 and y=z}. (i.e. consecutive members of S are mapped onto adjacent grid co-ordinates).
• The number of pairs {(x,y);(w,z)} of grid positions for which both
• (x,y) and (w,z) are neighbours, (i.e. {x=w and |y-z|=1} or {|x-w|=1 and y=z}).
• FOLD(i)=(x,y) and FOLD(j)=(w,z) (where i and j satisfy 1<=i< j<=n), and si=sj, i.e. the symbols (in SIGMA) mapped onto (x,y) and (w,z) are identical.
• is at least k?

Comments: The problem arises from an extremely simplified model of protein-folding of interest in computational biology. The NP-completeness proof is highly non-trivial (by a transformation from 3-SAT), is a recent result not mentioned in Garey and Johnson, and is due to Paterson and Przytycka (1996). It is important to note that the alphabet is part of the input. It is an open question as to whether the variant in which an alphabet of a fixed size, e.g. 2 as with binary, remains NP-complete. In should also be remembered that no limit is placed on the grid size, although trivially, an n x n grid will always suffice.

Number: 87

Name: Hamming Centre 3-4

Input: A set S of k binary strings, each of length n; a positive integer r.

Question: Is there an n-bit string y such that for every string x in S, the Hamming distance, H(x,y) is at most r?.

Comments: The Hamming distance, H(x,y), between two binary strings x and y of length n, is the number of bit positions in which x and y have differing values.

This is another recent result not given in Garey and Johnson, its NP-complete classification being given in Frances and Litman (1997).

M. Frances and A. Litman. On Covering Problems of Codes, Theory of Computing Systems, 30, 1997, 113-119

Number: 88

Name: 2-Spanner 3

Input: An undirected graph G(V,E); a positive integer s.

Question: Does G(V,E) contain a 2-spanner with at most s edges, i.e. subset F of the edges E having size at most s and such that for each edge {v,w} in E either {v,w} is in F or (for some node x in V) the edges {v,x} and {x,w} are both in F?

Comments: 2-Spanners (or more generally t-spanners for some positive integer t) are a natural extension of the idea of spanning tree (N.B. A spanner is required neither to be a tree nor, even, a connected graph), and were first considered in the context of problems in communications networks and parallel architectures by Peleg and Ullman (1989). Subsequently, Peleg and Schaffer (1989) showed this problem to be NP-complete, hence its absence from the appendix in Garey and Johnson.

D. Peleg and A.A. Schaffer. Graph Spanners. Jnl. of Graph Theory, 13, (1989), 99-116.

D. Peleg and J.D. Ullman. An Optimal Synchroniser for the Hypercube. SIAM Jnl. on Computing, 18, (1989), 740-747