IBMSkip to main content
  Home     Products & services     Support & downloads     My account  
  Select a country 
Journals Home 
 Systems Journal 
Journal of Research
and Development
 ·  Current Issue 
 ·  Recent Issues 
 ·  Papers in Progress 
 ·  Search/Index 
 ·  Orders 
 ·  Description 
 ·  Patents 
 ·  Recent publications 
 ·  Author's Guide 
 Staff 
 Contact Us 
 Related link: 
    IBM Research:
   Mathematical
   Sciences
 
IBM Journal of Research and Development 
Volume 47, Number 1, 2003
Mathematical Sciences at 40
 Table of contents: arrowHTML arrowPDF   This article: HTML arrowPDF          DOI: 10.1147/rd.471.0025arrowCopyright info
  

On greedy algorithms, partially ordered sets, and submodular functions

by B. L. Dietrich and A. J. Hoffman

Recent developments in the use of greedy algorithms in linear programming are reviewed and extended. We find a common generalization of some theorems of Queyranne–Spieksma–Tardella, Faigle–Kern, and Fujishige about greedy algorithms for linear programs in diverse contexts. Additionally, we extend a well-known theorem of Topkis about submodular functions on the product of chains to submodular functions on the product of lattices.

1. Introduction

A greedy algorithm for solving an optimization problem successively maximizes the variables in some chosen order. Greedy algorithms are widely used in many contexts as a heuristic, with a hope and a prayer (sometimes even a proof) that the rapid calculation will yield an acceptable answer. In this essay, however, we are interested specifically in linear programming problems and in cases where the greedy algorithms actually produce correct answers.

While such cases are rare, they have been influential in linear programming history. A sympathetic interpretation of the 1781 paper by Monge [1] allows us to interpret a passage in that remarkable pamphlet as the description of a linear programming problem (actually of the important subclass known as the transportation problem [2]) which is amenable to a greedy solution. In fact, this notion was eponymized in [3] to celebrate its pedigree. A survey of the impact of the Monge concept is given in [4]. We provide a technical description later.

A second influence has been the use of the idea of greedy algorithms for linear programming to give a general context to some famous combinatorial algorithms. A leading example is the introduction by Edmonds [5] of the concept of polymatroid, and the interplay between greedy algorithms on polymatroids with greedy algorithms on matroids (which, in turn, generalize some algorithms for constructing minimum spanning trees for a graph). We should also mention in this spirit the notion of (0, 1) matrices in which the 1's in each row (or in each column) occur consecutively. Although introduced [6] in the first place to provide a greedy (and correct) solution to the caterer problem [7], it became an essential ingredient in many arguments used to prove combinatorial theorems using linear programming duality.

There is, unfortunately, no survey of the subject of greedy algorithms in linear programming more recent than [8], which is more than 15 years old. Our aim in this paper is to review an interesting development in greedy theory in a series of related papers appearing in the last few years, to describe the common themes, to offer generalizations of two of these themes, and to comment on weaknesses or lacunae remaining in our treatment. Central to this work being summarized and extended are the concepts (to be defined below) of partially ordered set and submodular functions.

2. Definitions; products of chains

A partially ordered set P is a set of elements {a, b, c, ... }, together with a binary relation “<” on some pairs such that a < b, b < c imply a < c, and a < a is never true. Also, a > b means b < a. We say that a and b are comparable if a = b, a < b, or b < a. A chain is a partially ordered set in which any two elements are comparable. P is said to be a lattice if, for every pair of elements a, b, there is a unique least upper bound a vee b in P satisfying

ca, cb imply ca vee b, (1a)

and there is a unique greatest lower bound a teepee b in P satisfying

ca, cb imply ca teepee b. (1b)

We later introduce the more general notion of pseudolattice, but we postpone this to Section 3. In the meantime, we frequently use L to denote the partially ordered set with a lattice structure.

A real function r defined on a lattice L is said to be

modular if r(a vee b) + r(a teepee b) = r(a) + r(b); (2a)

submodular if r(aveeb) + r(ateepeeb) ≤ r(a) + r(b);
(2b)

supermodular if r(aveeb) + r(ateepeeb) ≥ r(a) + r(b).
(2c)

Let k chains C1, C2, ... , Ck be given, with respective cardinalities c1, c2, ... , ck. There is a natural partial ordering on the Cartesian product C1 × ... × Ck that yields a lattice. A submodular function r( ) on this lattice is called a Monge array in case k = 2, and we use the same term for general k. Now consider the linear programming problem arising as follows: Construct a (0, 1) matrix A with c1 c2 ... ck rows and c1 + c2 + ... + ck columns defined as follows: The rows correspond to the Cartesian product of the chains, and the columns to the union of the chains, with a 1 in a particular row and column if the element corresponding to that column is one of the multiplicands in the Cartesian product corresponding to the row. Let d be a given nonnegative vector with c1 + ... + ck coordinates corresponding to the columns of A, with r( ) being a submodular function (Monge array) on the product lattice C1 × ... × Ck corresponding to the rows of A. Our linear programming problem is the following: Choose a nonnegative y that satisfies y'A = d', and of all such y, choose one that minimizes (y, r). It is shown in [9] that the following greedy algorithm solves the problem: Let special At be the row of A corresponding to the greatest element (say t) of the product lattice, dbar the least entry of d corresponding to a 1 in special At. Let yt = dbar, replace d with d – dbarspecial At, delete from A the column containing dbar, and continue. This greedy algorithm terminates with an optimum solution, or the news that the problem is not feasible. (This news would be using up the rows of A without attaining a feasible solution.)

Next, consider the following variation of the polymatroid. Let r be a submodular function defined on the lattice of subsets of a finite set U, with r(emptyset) = 0. Let A be the (0, 1) matrix whose rows are the incidence vectors of the subsets of U, d a nonnegative vector with coordinates corresponding to elements of U. Consider the linear programming problem y'A = d, y ≥ 0, minimize (y, r). Again, a greedy algorithm starting with the greatest element in the lattice—U itself—solves the optimization problem.

In [10] it was pointed out that the validity of the greedy algorithm for polymatroids can be shown to be a consequence of the validity of the greedy chain-product algorithm described above. (Among other interesting items in [10] is a proof that the chain-product algorithm is valid over any sublattice of the chain-product, and an allusion to the theorem of Topkis about submodular functions on chain-products; we comment on both later.)

The reasoning in [10] is that, in the polymatroid problem, we can replace each element u of U with a two-element chain. The greater element u' corresponds to including the element u in a subset; the lesser element u" corresponds to omitting the element. This is how the subsets of U are coded. Of course, we need to choose values of the d vector to correspond to the different u", but it is easy to see how this is accomplished. Thus, [10] succeeds in folding this class of polymatroids into the multi-dimensional Monge array.

Further work [11–13] on topics related to those in [10] is discussed later.

3. Modular, consecutive (0, 1) functions

In this section, we present an alternative approach to the greedy algorithms discussed in Section 2.

Let U be a finite set and L a lattice, with least element m and greatest element M. We are given functions f:L → 2U and r:Lspecr3. We assume that

if f(a) = emptyset, then a = m and r(m) = 0. (3)

For each uU, let f(L, u) be defined by

f(a, u) = 1 if uf(a), 0 if u not member f(a). (4)

We assume that the function f satisfies the following:

for all uU, f(L, u) is modular,

and

for all ab, f(a) ≠ f(b). (5)

In addition to the given functions f and r, we are also given a nonnegative function d:Uspecr3, and we consider the following linear programming problem:

Equation 6 (6)

where

x(a) ≥ 0    for all aL,

and

Equation a

For this problem, we propose a greedy algorithm GL. In the first step, we set x(M) to be the minimum of all d(u) such that uf(M), and we let u* be one of the (possibly several) us where the minimum is attained. If we delete from L all elements a such that u* ∈f(a), subtract d(u*) from all d(u) such that uf(M), and delete u* from U, it is observed from (3), (4), and (5) that the remaining elements form a lattice, and these same stipulations still hold for the remaining lattice and remaining U, on which d is still a nonnegative function. We continue this algorithm indefinitely, until we are compelled to stop. We may stop because the remaining d still has at least one positive coordinate, but the remaining part of L is empty or consists just of m with f(m) = 0, or we may stop because d has been reduced to 0.

Theorem 1

Assume that (6) is feasible. Assume also that (3), (4), and (5) hold. Then GL produces an optimum solution to (6) if the following additional conditions hold:

r is submodular (7)

f(a) ⊃ f(b) implies a > b, (8)

a < b < c, f(a, u) = f(c, u) = 1 imply f(b, u) = 1. (9)

The reader should note that the consecutivity condition (9) from [6] is significant in lattice polyhedra generally (see [8, 14–16]). What is different here from general lattice polyhedra is (8), which specifies relations among the various f(L, u) for different u, whereas we are usually concerned just with properties of f(L, u) for each u by itself. But assumption (8) is unavoidable: In fact, if any one of (7), (8), or (9) is false, there is an instance of (6) in which the greedy algorithm GL does not produce an optimum solution.

Proof

To prove the theorem, all we need prove is the following: There exists a feasible solution x* to (6) for which

x*(M) = d(u*). (10)

The reason is that induction will then justify every step of GL.

By the hypothesis of the theorem, (6) is feasible. Further, the feasible region is bounded, so an optimum solution to (6) exists. Let x be an optimum solution in which x(M) is as large as possible, but x(M) is less than d(u*). This means that, for each uf(M), there exists at least one aL, a < M, such that

x(a) > 0 and uf(a). (11)

Let a be a maximal element of the partially ordered set L satisfying (11). By (8) and (5), there is at least one uf(M) (say u'), that is not in f(a). Thus, there exists at least one element bL, b ≠ a, so that

u'f(b), u' notmember f(a). (12)

Suppose that b and a are comparable. Then, by the definition of a, b < a, and we would have, with M > a > b, and (12), a violation of (9). Hence, a and b are incomparable. Let epsilon be a positive number not greater than the minimum of x(b) and x(a). Reduce x(a) and x(b) by epsilon, and raise x(a teepee b) and x(a vee b) by epsilon. By (5), the new x is feasible; by (7), the new x is optimal. If M = a vee b, we have contradicted the definition of x. If a vee bM, we still have a vee b > a, and have contradicted the definition of a. Hence, we have proved (10) and the theorem.

It remains to show that each of (7), (8), and (9) is needed. The necessity of (7) is obvious, and indeed was pointed out in [17]. For (9), let L be the chain a  <  b  <  c, U = {1, 2, 3}, f(a) = {1, 2}, f(b) = {1, 3}, and f(c) = {2, 3}. Let d(1) = d(2) = d(3) = 1, and let r(a) = r(b) = r(c) = 1. Then (7) and (8) are true, (9) is not, and GL does not produce an optimal solution; it does not even produce a feasible solution, though one exists.

For (8), let L, U, r, and d be as above. Let f(a) = {1, 2}, f(b) = {1, 2, 3}, f(c) = {3}. Let r(a)  = r(b) = r(c) = 1. Then GL does not produce an optimal solution, even though (7) and (9) are true.

box

Observe that the theorem clearly covers the case of polymatroids. It also covers the case of chain-products, as we see in the next section. Besides its conclusions, our theorem has the feature of divorcing (partly, but not totally) the lattice L from the set U. In the other examples, and others to be cited later, U comes from the definition of L. However, our separation is not complete because of conditions (5) and (8).

4. Products of lattices

In this section we consider, more generally than the product of chains in [10], the product of lattices and the extension of the results in Section 3 to such products.

Theorem 2

Let L1, ... , Lk be lattices; U1, ... , Uk disjoint sets; fi:LiUi, i = 1, ... , k, each satisfying the conditions in Section 3. If we define the product lattice LL1 × ... × Lk ≡ {(a1, ... , ak):aiLi, i = 1, ... , k}, with ab if and only if aibi for all i; and we define

f ≡ (f1, …, fk) → U = U1 ∪ … ∪ Uk

by

f(a1, …, ak) = f(a1) ∪ … ∪ f(ak),

we will have satisfied the conditions of Section 3 for f, L, and U.

If, in addition, we introduce a submodular r:LR, we have extended the material on chain-products in [10] to lattice products. It is interesting to observe, however, some facts about submodular r defined on lattice products.

Topkis [18] proved that for a product C = C1 × ... × Ck of chains, r is submodular on C if

r(ab) + r(ab) ≤ r(a) + r(b) (13)

holds when a = (a1, ..., ak) and

b = (b1, …, bk) differ in at most two coordinates. (14)

We will prove Theorem 3.

Theorem 3

Let L = L, × ... × L, r:Lspecr3. Let a = (a1, ... , ak) and b = (b1, ... , bk). Assume that (13) holds in the following cases:

For each i, ai and bi are comparable, and

ai = bi with two exceptions. (15)

and

ai = bi with one exception. (16)

Then r: Lspecr3 is submodular.

Note that if L is the product of chains, (16) is immediate, and (15) is the same as (14). We think it interesting that proving r submodular on L needs only (15) when L is restricted to be the product of two element chains.

To prove that (15) and (16) imply (13), we first consider the case k = 2. Assume a = (a1, a2), b = (b1, b2), and define

c1 = (a1b1), d1 = (a1b1);

c2 = (a2b2), d2 = (a2b2);

By (16),

r(a1, c2) + r(a1, d2) ≤ r(a1, a2) + r(a1, b2),

r(b1, c2) + r(b1, d2) ≤ r(b1, a2) + r(b1, b2),

r(c1, a2) + r(d1, a2) ≤ r(a1, a2) + r(b1, a2),

r(c1, b2) + r(d1, b2) ≤ r(a1, b2) + r(b1, b2),

(17)

By (15),

r(c1, c2) + r(a1, b2) ≤ r(c1, b2) + r(a1, c2),

r(c1, c2) + r(b1, a2) ≤ r(c1, a2) + r(b1, c2),

r(d1, d2) + r(a1, b2) ≤ r(d1, b2) + r(a1, d2),

r(d1, d2) + r(b1, a2) ≤ r(d1, a2) + r(b1, d2),

(18)

Adding one half of (17) and (18) yields (13).

We now consider the case of general k ≥ 3, assuming (inductively) that the theorem has been proved for k – 1. Let us consider L = L1 × ... × Lk as L = L*1 × Lk, where L*1 = L1 × ... × Lk-1. This brings us back to the case of two lattices, but we must verify the hypothesis.

First, for fixed ak, r is a submodular function on L*1 by induction. For fixed aL*1, r is a submodular function on Lk by the original hypothesis. Next, (15) holds, from the theorem of Topkis. Thus, the induction step is complete.

5. Pseudolattices

In [11–13], further examples in the spirit of [10] were presented. To explain them, we first define the following: In a partially ordered set P, we say that b covers a if a < b and there is no c such that a < c < b. Next, an ideal special I of P is a subset such that aspecial I, b < a imply bspecial I. Observe that the union (and also the intersection) of two ideals is an ideal. An antichain special A subset P is a set of mutually incomparable elements. If special A is an antichain, then [special A] denotes the ideal of all bP such that ba for some aspecial A. If special I is an ideal, special I+ is the antichain consisting of the maximal elements of special I.

Let a partially ordered set P be given, and consider the lattice L(P) of all  ideals contained in  P. [L(P) is itself a partially ordered set, ordered by inclusion, with union and intersection respectively the operations vee and teepee.] Let specu be the set of elements of P, and define a mapping f:L(P) → 2U by

f(special I) = special I+. (19)

It is shown in [11] that, when P is a forest, a suitable generalization of the greedy algorithm described in [10] is valid. Note that, when P is a forest, the mapping (19) satisfies the hypothesis (and conclusion) of Theorem 1 above. If P is not a forest, then the greedy algorithm is not in general valid. But [12] points out that an additional hypothesis about the submodular function r:L(P) → specr3, if e is covered by at least two elements, then

r(special I ∪ {e}) ≥ r(special I) (20)

is sufficient to justify the greedy algorithm even when P is not a forest. An interpretation of this result in terms of submodular flows is given in [13]. The key idea in both papers to establish that, if special I1 and special I2 are ideals in P, then the antichain

special Af(special I1) + f(special I2) – f(special I1special I2) is an antichain, (21)

and

[special A] ⊂ special I1special I2. (22)

Further, every element in

(special I1special I2) – [special A] is covered by at least two elements. (23)

From (20), (22), and (23), we infer that r([special A]) ≤ r(special I1special I2), hence

r(special I1special I2) + r([special A]) ≤ r(special I1) + r(special I2). (24)

Instead of the arguments used in [11] and [12], we propose to include the result of [11] in the framework of our approach in Theorem 1 by slightly extending the definitions used therein. Note that Theorem 1, as it stands, cannot encompass [12] because (5) is false. On the other hand, if one wants to exploit (20)–(24), we have the difficulty that, in (20), [special A] is not special I1 intersect special I2.

Our tactic is to extend Theorem 1 by using the notion of pseudolattice. A partially ordered set P is a pseudolattice if, for every pair of elements (a, b) there is a designated element a vee* b satisfying a vee* ba, b, and there is a designated element a teepee* b satisfying a teepee* ba, b.

Further, if a and b are comparable, a vee* b = max(a, b), a teepee* b = min(a, b).

Theorem 4

If, in Theorem 1, the term pseudolattice and the pseudolattice operations vee* and teepee* are substituted for the term lattice and the lattice operations vee and teepee, then the theorem remains valid.

The proof is simply to revisit the proof of Theorem 1 and observe that it remains valid in the more general setting without changing a single word. Further, with this change, we accommodate [12] by observing that when we define

special I1 ∨* special I2special I1special I2

and

special I ∧* special I2 ≡ [special A], (25)

where special A is defined by (21), then (5) holds along with all of the other hypotheses of Theorem 4, so [12] is included in the general setting.

6. Some questions

  1. It is easy to see that in the partially ordered set m < a, b, c < M with {a, b, c} mutually incomparable, no function f satisfying our hypotheses is possible. It would be nice to know what partially ordered sets permit functions f.
  2. A theory for combining greedy linear programming problems on polytopes of the form
    {x|special Axb, x ≥ 0} is presented in [17]. We have no appropriate theory if the polytopes are of the form {x|special Ax = b, x ≥ 0}.

Acknowledgments

We thank Don Coppersmith, Milind Dawande, Jon Lee, and George Nemhauser for their contributions to this paper.

References

Received February 20, 2002; accepted for publication August 29, 2002; Internet publication December 9, 2002