Algorithm Design Paradigms - Greedy Method

### Greedy Algorithms

This is another approach that is often used to design algorithms for solving optimisation problems. In contrast to dynamic programming, however:

• Greedy algorithms do not always yield a genuinely optimal solution. In such cases the greedy method is frequently the basis of a heuristic approach.
• Even for problems which can be solved exactly by a greedy algorithm, establishing the correctness of the method may be a non-trivial process.

In order to give a precise description of the greedy paradigm we must first consider in more detail the environment in which typical optimisation problems occur. Thus, in the context of greedy algorithms, an optimisation problem will have the following:
• A collection (set, list, etc.) of candidates, e.g. nodes or edges in a graph.
• A set of candidates which have already been `used'.
• A predicate (solution) to test whether a given set of candidates gives a solution (not necessarily optimal).
• A predicate (feasible) to test if a set of candidates can be extended to a (not necessarily optimal) solution.
• A selection function (select) which chooses some candidate which has not yet been used.
• An objective function which assigns a value to a solution.
In other words: an optimisation problem involves finding a subset, S, from a collection of candidates, C; the subset, S, must satisfy some specified criteria, i.e. be a solution, and be such that the objective function is optimised by S. `Optimised' may mean minimised or maximised, depending on the precise problem being solved. Greedy methods are distinguished by the fact that the selection function assigns a numerical value to each candidate, x, and chooses the candidate for which SELECT( x ) is largest, or the candidate for which SELECT( x ) is smallest.
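Concretely, a selection function of this kind is just an argmax (or argmin) over the remaining candidates. A minimal Python sketch, where the candidate names and SELECT values are purely hypothetical:

```python
# Hypothetical candidates with their SELECT values.
candidates = {"a": 3.0, "b": 7.5, "c": 1.2}

def select_largest(c):
    # Greedy choice: the candidate whose SELECT value is largest.
    return max(c, key=c.get)

def select_smallest(c):
    # Greedy choice: the candidate whose SELECT value is smallest.
    return min(c, key=c.get)

print(select_largest(candidates))   # → b
print(select_smallest(candidates))  # → c
```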

All Greedy Algorithms have exactly the same general form. A Greedy Algorithm for a particular problem is specified by describing the predicates `solution' and `feasible', and the selection function `select'.

Consequently, Greedy Algorithms are often very easy to design for optimisation problems.

The General Form of a Greedy Algorithm is

```
function select (C : candidate_set) return candidate;
function solution (S : candidate_set) return boolean;
function feasible (S : candidate_set) return boolean;
--***************************************************
function greedy (C : candidate_set) return candidate_set is
   x : candidate;
   S : candidate_set;
begin
   S := {};
   while (not solution(S)) and C /= {} loop
      x := select( C );
      C := C - {x};
      if feasible( S union {x} ) then
         S := S union {x};
      end if;
   end loop;
   if solution( S ) then
      return S;
   else
      return {};  -- no solution could be found
   end if;
end greedy;
```
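The template can be rendered directly in executable form. The sketch below is a minimal Python version of the template, instantiated (as a hypothetical example, not one from these notes) with the predicates for greedy coin change using the assumed denominations 25, 10, 5 and 1:

```python
# Generic greedy template, following the pseudocode above.
def greedy(candidates, solution, feasible, select):
    candidates = list(candidates)
    s = []
    while not solution(s) and candidates:
        x = select(candidates)      # choose an unused candidate
        candidates.remove(x)        # C := C - {x}
        if feasible(s + [x]):       # can s + {x} still reach a solution?
            s.append(x)             # S := S union {x}
    return s if solution(s) else None

# Hypothetical instance: make change for TARGET using coins of
# denominations 25, 10, 5, 1; the candidate list repeats each coin
# so that a denomination may be used more than once.
TARGET = 63
coins = [25, 10, 5, 1] * TARGET

result = greedy(
    coins,
    solution=lambda s: sum(s) == TARGET,
    feasible=lambda s: sum(s) <= TARGET,
    select=max,  # greedy choice: largest coin first
)
print(result)  # → [25, 25, 10, 1, 1, 1]
```

Note that only `solution', `feasible' and `select' vary between problems; the driver loop is identical for every greedy algorithm.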

As illustrative examples of the greedy paradigm we shall describe algorithms for the following problems:

• Minimal Spanning Tree.
• Integer Knapsack.
For the first of these, the algorithm always returns an optimal solution; for the second, as we shall see, it need not.

### Minimal Spanning Tree

The input for this problem is an (undirected) graph, G( V, E ), in which each edge, e in E, has an associated positive edge length, denoted Length( e ).

The output is a spanning tree, T( V, F ), of G( V, E ) such that the total edge length (the sum of Length( e ) over all e in F) is minimal amongst all the possible spanning trees of G( V, E ).

Note: An n-node tree, T, is a connected n-node graph with exactly n-1 edges.

T( V,F ) is a spanning tree of G( V,E ) if and only if T is a tree and the edges in F are a subset of the edges in E.

In terms of the general template given previously:

• The candidates are the edges of G(V,E).
• A subset of edges, S, is a solution if the graph T(V,S) is a spanning tree of G(V,E).
• A subset of edges, S, is feasible if there is a spanning tree T(V,H) of G(V,E) for which S is a subset of H.
• The objective function which is to be minimised is the sum of the edge lengths in a solution.
• The select function chooses the candidate (i.e. edge) whose length is smallest (from the remaining candidates).
The full algorithm, discovered by Kruskal, is:

```
function min_spanning_tree (E : edge_set) return edge_set is
   S : edge_set;
   e : edge;
begin
   S := {};
   while (H(V,S) is not a tree) and E /= {} loop
      e := shortest edge in E;
      E := E - {e};
      if H(V, S union {e}) is acyclic then
         S := S union {e};
      end if;
   end loop;
   return S;
end min_spanning_tree;
```
Before proving the correctness of this algorithm, we give an example of it running.

The algorithm may be viewed as dividing the set of nodes, V, into n parts or components:

```
{1} ; {2} ; ... ; {n}
```

An edge is added to the set S if and only if it joins two nodes which belong to different components; if an edge is added to S then the two components containing its endpoints are coalesced into a single component.
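This component-merging view is usually implemented with a disjoint-set (union-find) structure. A minimal Python sketch (the class and method names are my own, not from these notes):

```python
# Disjoint-set (union-find): tracks which component each node belongs to.
class UnionFind:
    def __init__(self, nodes):
        self.parent = {v: v for v in nodes}  # each node starts in its own component

    def find(self, v):
        # Follow parent pointers to the component representative,
        # compressing the path as we go.
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]
            v = self.parent[v]
        return v

    def union(self, u, v):
        # Coalesce the components of u and v; returns True exactly when
        # they were previously distinct, i.e. when the edge {u,v} is accepted.
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return False
        self.parent[ru] = rv
        return True

uf = UnionFind(range(1, 8))  # components {1}; {2}; ...; {7}
print(uf.union(1, 2))        # → True: {1} and {2} are coalesced
print(uf.union(2, 1))        # → False: already in one component
```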

In this way, the algorithm stops when there is just a single component

```
{ 1, 2 ,..., n }
```

remaining.

| Iteration | Edge  | Components                        |
|-----------|-------|-----------------------------------|
| 0         | -     | {1}; {2}; {3}; {4}; {5}; {6}; {7} |
| 1         | {1,2} | {1,2}; {3}; {4}; {5}; {6}; {7}    |
| 2         | {2,3} | {1,2,3}; {4}; {5}; {6}; {7}       |
| 3         | {4,5} | {1,2,3}; {4,5}; {6}; {7}          |
| 4         | {6,7} | {1,2,3}; {4,5}; {6,7}             |
| 5         | {1,4} | {1,2,3,4,5}; {6,7}                |
| 7         | {4,7} | {1,2,3,4,5,6,7}                   |

Question: How do we know that the resulting set of edges forms a Minimal Spanning Tree?
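The trace can be reproduced in code. The original graph is not shown in these notes, so the sketch below uses a hypothetical set of edge lengths chosen to be consistent with the trace above (including an edge {2,5} rejected at the missing iteration 6, which is an assumption):

```python
# Kruskal's algorithm; 'comp' maps each node to its current component.
def kruskal(nodes, edges):
    # edges: list of (length, u, v); examine them shortest first.
    comp = {v: {v} for v in nodes}
    mst = []
    for length, u, v in sorted(edges):
        if comp[u] is not comp[v]:         # endpoints in different components?
            merged = comp[u] | comp[v]     # coalesce the two components
            for w in merged:
                comp[w] = merged
            mst.append((u, v))
        if len(mst) == len(nodes) - 1:     # a spanning tree has n-1 edges
            break
    return mst

# Hypothetical edge lengths consistent with the trace above.
edges = [(1, 1, 2), (2, 2, 3), (3, 4, 5), (4, 6, 7),
         (5, 1, 4), (6, 2, 5), (7, 4, 7)]
print(kruskal(range(1, 8), edges))
# → [(1, 2), (2, 3), (4, 5), (6, 7), (1, 4), (4, 7)]
```

With these lengths the edge {2,5} is examined sixth and rejected, because by then nodes 2 and 5 already lie in the single component {1,2,3,4,5}.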

In order to prove this we need the following result.

For G(V,E) as before, a subset, F, of the edges E is called promising if F is a subset of the edges in a minimal spanning tree of G(V,E).

Lemma: Let G(V,E) be as before and W be a subset of V.

Let F, a subset of E, be a promising set of edges such that no edge in F has exactly one endpoint in W.

If {p,q} in E-F is a shortest edge having exactly one of p or q in W, then the set of edges F union { {p,q} } is promising.

Proof: Let T(V,H) be a minimal spanning tree of G(V,E) such that F is a subset of H. Note that T exists since F is a promising set of edges.

Consider the edge e = {p,q} of the Lemma statement.

If e is in H then the result follows immediately, so suppose that e is not in H. Assume that p is in W and q is not in W, and consider the graph T(V, H union {e}) obtained by adding e to T.

Since T is a tree, this augmented graph (which contains one extra edge) must contain a cycle, and that cycle includes the (new) edge {p,q}.

Now since p is in W and q is not in W, there must be some other edge, e' = {p',q'}, in H which is also part of this cycle and is such that p' is in W and q' is not in W. By the choice of e, we know that

```
Length( e ) <= Length( e' )
```

Removing the edge e' from T(V, H union {e}) breaks this cycle and gives a new spanning tree of G(V,E).

The cost of this tree is exactly

```
cost( T ) - Length(e') + Length(e)
```

and this is <= cost( T ).

But T is a minimal spanning tree, so this cost cannot be strictly smaller than cost( T ); hence e and e' have the same length, and the new tree is also minimal. It follows that there is a minimal spanning tree containing F union {e}, and hence this set of edges is promising, as claimed.

Theorem: Kruskal's algorithm always produces a minimal spanning tree.

Proof: We show by induction on k >= 0, the number of edges in S at each stage, that the set of edges in S is always promising.

Base (k = 0): S = {} and obviously the empty set of edges is promising.

Step (k-1 implies k): Suppose S contains k-1 edges, and let e = {p,q} be the next edge that would be added to S. Then:

• p and q lie in different components.
• {p,q} is a shortest such edge.
Let C be the component in which p lies. By the inductive hypothesis the set S is promising, and no edge in S has exactly one endpoint in C, since C is a component. The Inductive Step now follows by invoking the Lemma, with W = the set of nodes in C and F = S.

### Integer Knapsack

In various forms this is a frequently arising optimisation problem.

Input: A set of items U = { u1, u2, ..., uN }, each item ui having a given size s( ui ) and value v( ui ), together with a capacity K.

Output: A subset B of U such that the sum over u in B of s(u) does not exceed K, and the sum over u in B of v(u) is maximised.

No fast algorithm guaranteed to solve this problem has yet been discovered.

It is considered extremely improbable that such an algorithm exists.

Using a greedy approach, however, we can find good approximate solutions. (Note: the plain greedy choice described below does not by itself guarantee any constant fraction of the optimal value; the standard 1/2 guarantee holds for the refinement which returns the better of the greedy solution and the single most valuable item.)

• The items, U, are the candidates.
• A subset, B, is a solution if the total size of B fits within the given capacity, but adding any other item will exceed the capacity.
• The objective function which is to be maximised is the total value.

The selection function chooses that item, ui, for which the value density

```
v( ui ) / s( ui )
```

is maximal.

These yield the following greedy algorithm which approximately solves the integer knapsack problem.

```
function knapsack (U : item_set; K : integer) return item_set is
   C, S : item_set;
   x : item;
begin
   C := U; S := {};
   while C /= {} loop
      x := item u in C such that v(u)/s(u) is largest;
      C := C - {x};
      if (sum over u in S of s(u)) + s(x) <= K then
         S := S union {x};
      end if;
   end loop;
   return S;
end knapsack;
```

A very simple example shows that the method can fail to deliver an optimal solution. Let

```
U = { u1, u2, u3, ..., u12 }

s( u1 ) = 101 ; v( u1 ) = 102
s( ui ) = v( ui ) = 10   for 2 <= i <= 12

K = 110
```

Greedy solution: S = {u1}; Value is 102.

Optimal solution: S = U - {u1}; Value is 110.
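Both figures can be checked directly. The sketch below is a minimal Python rendering of the greedy pseudocode, run on the example above, together with a brute-force search over all 2^12 subsets to confirm the optimal value:

```python
from itertools import combinations

# Greedy knapsack by value density, following the pseudocode above.
def knapsack(items, capacity):
    # items: dict mapping name -> (size, value)
    chosen, used = [], 0
    # Examine items in order of decreasing density v(u)/s(u).
    for name, (size, value) in sorted(
            items.items(), key=lambda kv: kv[1][1] / kv[1][0], reverse=True):
        if used + size <= capacity:
            chosen.append(name)
            used += size
    return chosen

# The example from the notes: s(u1)=101, v(u1)=102;
# s(ui)=v(ui)=10 for 2 <= i <= 12; K = 110.
items = {"u1": (101, 102)}
items.update({f"u{i}": (10, 10) for i in range(2, 13)})

S = knapsack(items, 110)
print(S, sum(items[u][1] for u in S))  # → ['u1'] 102

# Brute-force check of the optimal value (2^12 subsets).
best = max(
    sum(items[u][1] for u in c)
    for r in range(len(items) + 1)
    for c in combinations(items, r)
    if sum(items[u][0] for u in c) <= 110
)
print(best)  # → 110
```

Item u1 has the highest density (102/101 > 1), so greedy takes it first; after that no 10-unit item fits, which is precisely how the method gets stuck at value 102.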