IBMSkip to main content
  Home     Products & services     Support & downloads     My account  
  Select a country 
Journals Home 
 Systems Journal 
 ·  Current Issue 
 ·  Recent Issues 
 ·  Papers in Progress 
 ·  Search/Index 
 ·  Orders 
 ·  Description 
 ·  Author's Guide 
Journal of Research
and Development
 Staff 
 Contact Us 
 Related links: 
  Autonomic Computing 
  IBM AC Research 
  IBM eServer and AC 
IBM Systems Journal 
Volume 42, Number 1, 2003
Autonomic Computing
 Table of contents: arrowHTML arrowPDF   This article: HTML arrowPDF          DOI: 10.1147/sj.421.0085arrowCopyright info
  

Competitive algorithms for the dynamic selection of component implementations

by D. M. Yellin

As component-based development matures, more and more applications are built by integrating multiple distributed components. We suggest providing components with multiple implementations, each optimized for a particular workload, and augmenting the component run-time environment with a mechanism for switching between implementations. This mechanism monitors the types of requests the component is receiving, and adaptively switches implementations for optimal application performance. Achieving this optimal performance depends on making good choices as to when and how to switch implementations, a problem we refer to as the adaptive component problem. We first formalize the generic problem and then provide an algorithm, named Delta, for switching implementations in the special case when the component has exactly two implementations. We show that this algorithm is (3 + epsilon)-competitive with respect to the optimal algorithm, where epsilon is a small fraction. We establish a 3-competitive lower bound for the problem, which implies that Delta is close to optimal. We describe the application of these results to the distributed pub/sub problem, and the data structure selection problem.

Many applications are being built by integrating multiple distributed components in order to implement a particular business function. The increasing popularity of component programming models, such as JavaBeans**1 and Web services,2 is predicted to further accelerate the adoption of component-based development (CBD).

An impediment to the adoption of CBD, however, is the inability of the “user” of the components to optimize their performance for use in a particular solution. Web services, with its premise of loosely coupled distributed components working together, makes this issue even more acute, as the development, deployment, and maintenance of components making up a single solution may come from different vendors and run in very different system environments.1

In this paper, we propose a generic framework that addresses this issue. In our formulation, an adaptive component has multiple implementations, each optimized for a particular request workload. A mechanism for switching between implementations is also provided. The underlying system monitors the current request workload for a given component and adaptively switches to the implementation best suited to this workload. By shifting more of the performance optimization burden to the component, performance tuning is simplified and the resulting overall system performance is likely to be improved. Additionally, by having the system monitor the request workload and dynamically switch implementations based upon current usage patterns, the component can better accommodate changing request patterns.

Because switching between implementations can incur a heavy cost, good algorithms are needed for determining, at run time, when to switch between implementations. We call this the adaptive component problem. This paper describes an algorithm, named Delta, for the case when there are exactly two implementations. We show that Delta is (3 + epsilon)-competitive. This algorithm is designed for the case when the cost of switching between implementations is very large compared to the cost of processing a single request. In this case, the value of epsilon is guaranteed to be very small. Because we also show a 3-competitive lower bound, this algorithm is close to optimal.

We show the applicability of this framework to two problems. The first is an adaptive version of the distributed pub/sub problem, where multiple loosely coupled components are reading and writing from a shared data repository. A component can either read and write to this shared repository or create a local data cache for fast access. The second example is an adaptive version of the data structure selection problem, where an application must choose the appropriate internal data structure to use in order to provide the quickest answer to particular queries. We show that both of these are instances of the adaptive component problem and that Delta can be used to decide when to switch implementations, thus optimizing run-time performance. This illustrates the applicability of our framework to a wide range of problems.

Here is an outline for the rest of this paper. The next section “The adaptive component problem” is followed by “The Delta algorithm.” Then, the section “Examples” contains our two case studies. In the section “Competitiveness” we prove that Delta is (3 + epsilon)-competitive, and in the section “Lower bounds,” we establish a 3-competitive lower bound for this problem. The section “Related work” is followed by “Summary and open issues.”

The adaptive component problem

We propose a model where the component developer implements multiple versions of a component, each optimized for a specific request workload. The developer also writes the code that controls the switching between implementations at run time. For each request type that the component can service, and for each implementation of that component, we determine the cost for processing a request of that type. We similarly determine the cost of switching between any two implementations. These costs may be specified by the component developer or may be derived empirically (e.g., by profiling). An on-line algorithm will monitor request workloads at run time and determine when to switch implementations. The rest of this section formalizes these concepts and defines an optimality criterion for determining when to switch implementations.

Let Comp be an adaptive component, and let ReqTypes = {typi} be the set of request types that Comp can process. Let Impls = {implj} be the set of implementations of Comp, with impl1 being the default implementation. Let Cost:ReqTypes × ImplsR be the function that gives the cost for Comp to process a request of a given type using a particular implementation (R denotes the set of reals). Let SwitchCost:Impls × ImplsR be the function that determines the cost in Comp of switching from one implementation to another. SwitchCost(impli, implj) = ∞ iff Comp cannot switch from impli to implj. The cost functions may reflect internal computation costs, network message costs, or a combination of these and other metrics, depending upon the application environment.

We represent the computation to be performed as a sequence of requests, r1, …, rk. The empty sequence is denoted by capital omega. To facilitate the modeling of many different sorts of problems, we allow a single request to be processed by either a single component or by multiple components. Given such a sequence of requests, a switch of implementations may occur after processing any request in the sequence. We represent this by the operation switch(impli, implj), where impli is the current implementation of the component, and implj is the implementation being switched to. Hence, given a sequence of requests alpha = r1, …, rk, we model the adaptive behavior of the component Comp as transforming this sequence to the sequence alpha' = s1, …, sf such that the following comments are true:

  • For 1 ≤ if, either si is a request or si is a switch operation.
  • Removing the switch operations from alpha' produces alpha.
  • If sm = switch(implj, implk) then either sm is the first switch operation in the sequence and j = 1 or the closest preceding switch operation in the sequence is of the form switch(impli, implj).

An on-line algorithm is one that transforms the sequence alpha by deciding whether or not to insert a switch operation after request ri based only upon the sequence seen so far, r1,  …,  ri. We  write alphaA alpha' to indicate that on-line algorithm A (used by Comp) maps alpha into alpha' when processing it. For any request si in the transformed sequence alpha', we say that the implementation implk is active at si if the closest switch operation preceding si in the sequence is of the form switch(implj, implk), or k = 1 and there is no switch operation preceding si. When algorithm A and sequence alpha are understood from the context, we simply denote this by ImplIs(si, implk).

CostalphaA is the cost of Comp to process request sequence alpha using algorithm A.
CostalphaA = Cost
(alpha') =∑ni=1 C(si) where alphaA alpha' = s1, …, sn and where

C(si) = open brace Cost(si, implk) if si is a request, and implIs(si, implk);
SwitchCost(implj, implk) if si is the operation switch(implj, implk).

Let O be an optimal algorithm; that is, for any sequence alpha and any on-line algorithm A, CostalphaOCostalphaA. An on-line algorithm A is c-competitive3,4 iff, for any sequence alpha, there exist constants c and d such that CostalphaAc*CostalphaO + d. Given Comp, RegTypes, Impls, Cost, and SwitchCost, the adaptive component problem is to find an on-line competitive algorithm for this problem instance, that is, to find a c-competitive algorithm for some constant c.

Now consider a special case of the adaptive component problem where Impls = {impl1, impl2}, which we call the adaptive two-implementation-component problem (the adjective “adaptive” is sometimes omitted, for conciseness). We assume that there is at least one request r such that Cost(r, impl1) < Cost(r, impl2), and at least one request r' such that Cost(r', impl1) > Cost(r', impl2). Also, we assume that there are no constraints on the order in which requests must be composed in a sequence (i.e., any interleaving of requests is a legitimate request sequence).

Let SC1 = SwitchCost(impl1, impl2), SC2 = SwitchCost(impl2, impl1), and SC = SC1 + SC2. We call SC the round trip switching cost. The Delta algorithm described in the next section will perform close to optimal when, for any request r, |Cost(r, impl1) – Cost(r, impl2)| much less than SC. That is, the difference in processing cost for a single request when one implementation is active as opposed to the other implementation, is significantly less than the switching cost.

The Delta algorithm

Consider any two-implementation-component problem, where SC is the previously defined round trip switching cost. Let alpha = r1, …, rk be a sequence of requests. Then Cost(alpha, implj) = ∑ki=1 Cost(ri, implj), j = 1, 2. Algorithm Delta is extremely simple and works as follows. Say that implementation impli of component Comp is active, i ∈{1, 2}, and the processing of request rk in the sequence r1, r2, … has just ended. Given impli, denote the other implementation by (impl sub i)bar. If there exists jk such that Cost((rj, …, rk), (impl sub i)bar) ≤ Cost((rj, …, rk), impli) – SC, then Delta instructs Comp to switch implementations.

Delta also has a simple implementation using just three counters: Impl1Cost, Impl2Cost, and MinDelta. The algorithm, when impl1 is assumed active, is given in Figure 1. The algorithm works analogously when impl2 is active.

Figure 1 Figure 1

To see why this implementation is correct, note that by definition Cost(r0) = 0 and if Delta switches to impl2 after rk, then existsj, 1 ≤ jk, such that

(Cost((r1, …, rk), impl1) – Cost((r0, …, rj-1), impl1)
– (Cost((r1, …, rk), impl2) – Cost((r0, …, rj-1), impl2) ≥ SC
maps to
(Cost((r1, …, rk), impl1) – Cost((r1, …, rk), impl2)
– (Cost((r0, …, rj-1), impl1) – Cost((r0, …, rj-1), impl2) ≥ SC

After processing request k in the sequence, the value of MinDelta in the implementation of Figure 1 equals the minimum of 0 and min0≤jk {Cost((r0, …, rj), impl1) – Cost((r0, …, rj), impl2)}.

At this same point Impl1Cost equals Cost((r1, …, rk), impl1), and Impl2Cost equals Cost((r1, …, rk), impl2). Hence the equation above is true iff:

(Impl1CostImpl2Cost) – MinDeltaSC

This is exactly the computation that the algorithm of Figure 1 performs in order to determine when to switch implementations.

The section “Competitiveness,” later, contains a formal proof that Delta is competitive and close to optimal. We now provide some intuition as to why this is the case. Let A be any algorithm and consider B, the “adversary” of A, an algorithm that observes A and tries to devise a sequence on which A performs worse than B. If A keeps impl1 active too long, the adversary will devise the sequence to be costly when impl1 is active and cheap when impl2 is active. Hence, any competitive algorithm must keep switching between implementations when it determines that the cost of keeping the other implementation active is lower. But if A switches implementations too often, the switching costs will dominate and the adversary can choose to keep the same implementation active, thus avoiding all switching costs. The key insight of Delta is to switch exactly when it accumulates SC more in cost than the other implementation would, if it were active. Choosing any value other than SC results in worse performance.

To see why this is the case, say that A makes impl1 active and switches to impl2 after accumulating additional cost k, k < SC (more than it would have accumulated if impl2 were active). As soon as A makes impl2 active, the adversary would design the sequence so that A accumulates additional cost k when impl2 is active. At that point A will switch back to impl1. The adversary would keep impl1 active throughout this sequence. Hence, the cost to the adversary would be just k, while A's cost would be k + SC + k > 3k. A would thus be worse than 3-competitive (how much worse would depend upon the choice of k).

On the other hand, say that A makes impl1 active and A switches to impl2 after accumulating additional cost k, k > SC. The adversary would make impl2 active at the start of the sequence. As soon as A accumulates k in cost and switches to impl2, the adversary would switch to impl1 and would design the sequence so that A accumulates additional cost k when impl2 is active. At that point A will switch back to impl1. Hence, the cost to the adversary would be just SC, while A's cost would be k + SC + k > 3*SC. Hence, A would be worse than 3-competitive.

The reason Delta is (3 + epsilon)-competitive and not just 3-competitive is due to boundary conditions; Delta does not always have the opportunity to switch when it accumulates exactly SC more in cost. For instance, in may be that impl1 is active and Delta has accumulated SC - 1 more in cost than it would have if impl2 had been active. The next request r may cost it rCost = maxrCost(r, impl1) – Cost(r, impl2) more to process in impl1 than in impl2. Therefore Delta will not actually switch until it accumulates SC + rCost – 1 more cost in impl1 than it would have in impl2. The epsilon term bounds the cost of these boundary conditions.

Examples

In this section we apply the Delta algorithm to two problems: the distributed pub/sub problem and the data structure selection problem.

The distributed pub/sub problem. Consider a data server that serves records from a database to many clients. The server and each client reside on separate nodes of the network. Each client can perform a read or a write on the data. We assume that the clients are independent, that is, a client reads and writes data independent of any other client. There is no synchronization between clients. More formally, for any given “run,” there is a linear order in which each client reads and writes records, but the interleaving at the server of reads and writes from different clients is subject to various factors, and cannot be predicted.

Each client can exist in one of two modes: either in subscription (Sub) mode or in nonsubscription (NonSub) mode. In the latter case, for each read that the client wants to perform, it must send a message to the server and receive a reply back. In Sub mode, a client caches a local copy of the database. All reads of the database go against this local copy. In either case, writes must still go to the server. Upon receiving a write update from any client, the server must inform all subscribers (even the writer of the data if he or she is in Sub mode) of the change to the data. This mechanism is known as “pub/sub” (publication/subscription).

Because it is often impossible to statically predict the read/write behavior of clients, and because their behavior changes over time, we consider an adaptive pub/sub strategy that does not require a client to permanently use either Sub or NonSub mode, but that can flexibly switch between these implementations depending upon current workloads. The goal is to find an optimal strategy for switching implementations that minimizes network traffic.

An instance of the pub/sub problem consists of a server and m clients denoted by clientj (1 ≤ jm). Let r1, …, rk be a sequence of requests where request rj is either readj or writej, 1 ≤ jm, indicating that clientj is either reading or writing a data record to the database. We assume that the database contains p – 1 records (by database we simply mean some collection of data items, where each item can be read and written individually). We now focus on how the requests related to clienti, 1 ≤ im, when in NonSub mode, are carried out.

  • readi(r): This request generates a message from clienti to the server asking for record r, and a message back from the server delivering record r.
  • readj(r), ji: This request does not concern clienti (it is only relevant to clientj).
  • writei(r): This request generates a message from clienti to the server asking for record r to be written.
  • writej(r), ji: This request does not concern clienti.

The requests related to clienti, 1 ≤ im, when in Sub mode are carried out as follows.

  • readi(r): Record r is read from the cached data at clienti, thereby avoiding the need for messages to the server.
  • readj(r), ji: This request does not concern clienti.
  • writei(r): This request generates a message from clienti to the server asking for record r to be written, and an update(r) message from the server to clienti.
  • writej(r), ji: This request generates an update(r) message from the server to clienti informing it that record r has been updated.

Switching implementations is carried out through subscribe and unsubscribe messages.

  • subscribe( ): This is shorthand for switch(NonSub, Sub). It generates a message from clienti to the server, subscribing clienti to the database, and a message from the server delivering a local copy of the database to clienti. It puts clienti into Sub mode.
  • unsubscribe( ): This is shorthand for switch(Sub, NonSub). It generates a message from clienti to the server unsubscribing clienti. It puts clienti into NonSub mode, and discards its local copy of the database.

The cost of a request is proportional to the number of messages it generates and their size, such that sending a single message containing a single record has unit cost. Figure 2 gives the costs to clienti for each of these requests.

Figure 2 Figure 2

Algorithm Delta can be applied to the pub/sub problem by monitoring each read and write request at clienti, switching from NonSub to Sub mode when it detects that Sub mode would have processed a previous subsequence of the requests for p + 1 less in network messaging costs than it was processed in NonSub mode. The number p + 1 is the switch cost of the Delta algorithm; SwitchCost(NonSub, Sub) + SwitchCost(Sub, NonSub) = p + 1. It monitors the requests being processed when in Sub mode and decides when to switch to NonSub mode based upon a similar calculation.

Figure 3 gives a sequence alpha involving two clients and a server. For client1, Cost(alpha, NonSub) = 22 and Cost(alpha, Sub) = 7. If p + 1 ≤ 15, then Delta would dictate that client1 should subscribe in the course of this sequence. alpha' is the transformed sequence, if client1 were to subscribe at the beginning of this sequence. In this example, opi indicates that clienti is performing operation op, where op ∈{read, localRead, write, subscribe, unsubscribe}, i ∈{1, 2}, and updatei indicates that the server is sending an update message to clienti. localRead indicates a read operation from the local cache.

Figure 3 Figure 3

There is one subtlety, however, in using Delta for the pub/sub problem. Looking at Figure 2, we see that Delta needs to know not only the reads and writes done by clienti, but also the writes done by any clientj, ji. If Delta is running at clienti, how does it get this information? One possibility is to have Delta run at the server, where all client reads and writes are known, and have the server tell clienti when to switch implementations. However, it may be advantageous to have the control for switching implementations located at the client, thereby providing a more distributed architecture.

In Reference 5 we address the question “How can Delta, running at clienti, know about the write operations performed by other clients without increasing the number of messages that must be exchanged between the client and the server?” In Sub mode the issue does not exist, as the client can infer this from the number of updates messages it receives. In NonSub mode, it is required that the server piggyback a count of the total number of write operations it has received on the reply to each read request by clienti.

Figure 4 illustrates the processing by Delta of sequence alpha involving client1, client2, and a server. The value of the counters as well as the intermediate computations at client1 (in NonSub mode) are shown, as calculated by Delta (cf. Figure 1). In Figure 4, impl1 is in NonSub mode while impl2 is in Sub mode.

Figure 4 Figure 4

The adaptive data structure selection problem. Often an application chooses a data structure that is optimized for particular requests. When different types of requests require different data structures, either duplicate data structures are maintained, or a single data structure is used, chosen presumably for the most frequent type of requests. Consider, for example, a set of records S that has two possible keys, i1 and i2. (My bank allows me to identify myself either by my social security number or by my account number.) Let Comp be the component that encapsulates all operations on S. Comp may support many types of requests that operate on this set, but all of these requests can be categorized as either of type r1, for requests using key i1, or of type r2, for requests using key i2.

One strategy is for Comp two create two index tables for S, one using key i1 and one using key i2. Each index table is implemented as a dictionary,6 allowing one to perform the usual operations of finding, inserting, and deleting elements. But this strategy may not always be feasible, because there may not be sufficient capacity for storing both indices (e.g., a pervasive device), or the overhead in maintaining these two separate indices may be too high. In such cases, we can support two implementations, one that uses key i1, and one that uses key i2. Algorithm Delta is used to dynamically decide when to switch between these implementations. In Reference 5 we present a more complete discussion on the use of Delta in this context.

Competitiveness

For any two-implementation-component problem, let SC1 = SwitchCost(impl1, impl2), SC2 = SwitchCost(impl2, impl1), and let SC be the round trip switching cost SC = SC1 + SC2. Given request r let

reqCost = maxr|Cost(r, impl1) – Cost(r, impl2)|.

Let epsilon = 2*reqCost/SC. In this section we prove the following theorem:

Theorem 1: Algorithm Delta is (3 + epsilon)-competitive for any two-implementation-component problem.

Note that when the cost of processing a single request is significantly less than the cost of switching implementations, then epsilon much less than 1 and Delta is close to optimal.

Proof: Consider any sequence alpha processed by Delta. alpha can always be viewed as consisting of recurring segments of the form omega = β, switch(impl1, impl2), gamma, switch(impl2, impl1), where β and gamma are sequences of requests, during processing of βimpl1 is  active, and  during processing  of gammaimpl2 is active. Note that β and gamma contain no switch statements. Without loss of generality, we can assume that β, gammacapital omega (capital omega is the empty sequence). We will maintain the following invariant:

Invariant: At the start of processing of an omega segment, for any algorithm processing this sequence, either impl1 is active or impl2 is active and there is a debt of SC1. By “debt” we mean that if, at the start of processing of a segment omega impl2 is active, then a cost SC1 has accrued in a preceding segment that has not yet been charged to the algorithm. Initially we know this is true because we assume that all algorithms start in impl1 mode.

Let r be the last request in β. Note that Cost(r, impl1) ≤ Cost(r, impl2) + reqCost. There are two properties of note regarding the cost to any algorithm of processing the β portion of segment omega.

Property 1: Let special A be any algorithm that processes β with impl1 active at both the start and the end of the sequence, and switching between implementations any number of times in between. Then Costβspecial A > Cost(β, impl1) – reqCost; that is, at most reqCost is gained by switching back and forth during the processing of β. To see why this is true, consider the case when special A switches only once; say, special A makes impl1 active during β1, impl2 active during β2, and impl1 active during β3, where β = β1, β2, β3. Assume that β2capital omega, otherwise this is trivially true. If β3capital omega then Cost2, impl2) + SC > Cost2, impl1), otherwise Delta would have switched to impl2 before or at the end of processing β2. If β3 = capital omega then let β'2 be the prefix of β2 with one operation (r) removed from the end of β2. Then Cost'2, impl2) + SC > Cost'2, impl1) since Delta did not switch implementations before completing the processing of β. Since Cost(r, impl2) ≥ Cost(r, impl1) – reqCost, we have

Cost2, impl2) + SC
  = Cost(β'2, impl2) + Cost(r, impl2) + SC
  > Cost(β'2, impl1) + Cost(r, impl1) – reqCost
  = Cost2, impl1) – reqCost

Hence,

Costβspecial A = Cost1, impl1) + Cost2, impl2) + Cost3, impl1) + SC
 > Cost(β, impl1) – reqCost

A similar argument applies to any algorithm that switches multiple times back and forth between implementations during β. This completes the proof of Property 1.

Property 2: For any β1, β2 such that β = β1β2,

Cost(β, impl1) < Cost1, impl1) + Cost2, impl2) + SC + reqCost

That is, it costs at most SC + reqCost more to process β when impl1 is active than processing part of it when impl1 is active and the rest when impl2 is active. If β2 = capital omega, this is trivially true. So assume that β2capital omega and let β'2 be the prefix of β2 with one operation (r) removed from the end of β2. Then

Cost1β'2, impl1) < Cost1, impl1) + Cost(β'2, impl2) + SC

since Delta did not switch implementations before completing the processing of β. Since

Cost(r, impl1) ≤ Cost(r, impl2) + reqCost

we have

Cost(β, impl1) = Cost1β'2, impl1) + Cost(r, impl1)
< Cost1, impl1) + Cost(β'2, impl2) + SC + Cost(r, impl2) + reqCost
= Cost1, impl1) + Cost2, impl2) + SC + reqCost

This completes the proof of Property 2.

Let l1 = Cost(β, impl1) – SCreqCost. Now we determine the cost to any adversary algorithm Adv of processing β by examining the four cases A through D.

  1. Adv starts with impl1 active and ends with impl1 active. If it never switches implementations, its cost is l1 + SC + reqCost. Otherwise, say that it switches from impl1 to impl2 and back to impl1 during β. By Property 1, this can only decrease its cost by reqCost, so its cost is at least l1 + SC. Switching multiple times will not further decrease its cost.

  2. Adv starts with impl1 active and ends with impl2 active. By Property 2, if it switches only once then it may save at most SC + reqCost over the cost of processing β entirely in impl1 but incurs a cost of SC1 in switching implementations. Hence, its cost is at least l1 + SC1. Switching multiple times in processing β will not decrease its cost any further.

  3. Adv starts with impl2 active and ends with impl2 active. If Adv does not change modes at all during this phase, then by Property 2 the cost to Adv is at least l1 to process β. However, according to the Invariant, there is a debt of SC1 to Adv, since it starts β with impl2 active. We remove that debt and charge it to this phase of the algorithm. Hence, Adv will have at least the cost of l1 + SC1. By an argument similar to Property 1, by changing its mode during the processing of this sequence, Adv will not decrease its cost any further.

  4. Adv starts with impl2 active and ends with impl1 active. This is similar to the preceding case, except that it has an additional cost of SC2 for switching to impl1. Hence, its cost is at least l1 + SC1 + SC2 = l1 + SC.

Now consider gamma. Let r' be the last request in gamma. Note that

Cost(r', impl2) ≤ Cost(r', impl1) + reqCost

The two properties stated above for β have analogs in processing gamma (we omit the proofs as they are similar to those given above).

Property 3: Let special A be any algorithm that processes gamma by initially starting with impl2 active, ending with impl2 active, but switching between implementations any number of times in between. Then Costgammaspecial A > Cost(gamma, impl2) – reqCost.

Property 4: For any gamma1, gamma2 such that gamma = gamma1gamma2,

Cost(gamma, impl2) < Cost(gamma1, impl2) + Cost(gamma2, impl1) + SC + reqCost

Let l2 = Cost(gamma, impl2) – SCreqCost. Now we use Properties 1 through 4 above and determine the cost to any adversary Adv of processing gamma by examining the four cases E through H.

  1. Adv starts with impl2 active and ends with impl2 active. If it never switches implementations, its cost is l2 + SC + reqCost. Otherwise, say, it switches from impl2 to impl1 and back to impl2 during processing of gamma. By Property 3, this can decrease its cost by at most reqCost. However, since Adv ends the processing of segment omega with impl2 active (and will start the next segment with impl2 active), we remove SC1 in cost from this phase of the algorithm and pay the debt, thereby maintaining the Invariant. Its costs are therefore l2 + SCSC1 = l2 + SC2.

  2. Adv starts with impl2 active and ends with impl1 active. By Property 4, if it only switches once, then it may save at most SC + reqCost over processing gamma entirely with impl2 active but incurs a cost of SC2 to switch modes. Hence, its cost is at least l2 + SC2. Switching multiple times in processing gamma will not decrease its cost any further.

  3. Adv starts in impl1 mode and ends in impl1 mode. If Adv does not change modes at all during this phase, then by Property 4 the cost to Adv is at least l2 to process gamma. By an argument similar to Property 3, by changing its mode during the processing of this sequence, Adv will not decrease its cost any further.

  4. Adv starts in impl1 mode and ends in impl2 mode. This is similar to the preceding case, except that it has an additional cost of SC1 for switching to impl2. However, since Adv ends the iteration of the omega segment in impl2, we need to remove SC1 in cost from this phase of the algorithm and assign it to the debt in order to maintain the Invariant. Hence, its cost is at least l2.

To analyze the overall complexity of Adv, we consider the cost of executing one complete segment of omega. Adv must follow one of the following eight “paths”: AG, AH, BE, BF, CE, CF, DG, D → H. For instance, the first path says that Adv first implements case A for β and then implements case G for gamma. Note that these eight paths are the only ones possible, as A cannot be combined with E, for example, since case A states that impl1 is active at the end of β, and case E states that impl2 is active at the start of gamma (= end of β). By summing the costs of each of these paths, one sees that CostβgammaAdvl1 + l2 + SC. The cost to Delta is at most l1 + SC + reqCost to process β, SC1 to switch to impl2 at the end of β, l2 + SC + reqCost to process gamma, and SC2 to switch to impl1 at the end of gamma. Hence,

Equation a

Hence, Delta is (3 + epsilon)-competitive on any omega segment.

To complete the proof of this theorem, we need to show that for any arbitrary long sequence that is a strict subsequence of an omega segment, Delta is also (3 + epsilon)-competitive. We leave this as a straightforward exercise, based on the discussion above.

End of proof

Lower bounds

In this section we show that there are instances of the two-implementation-component problem such that no algorithm is better than 3-competitive. The proof involves the dynamic pub/sub problem, given in the section “Examples.”

In analyzing adaptive on-line algorithms, it is useful to differentiate between different types of adversaries.4,7 An oblivious adversary is one that constructs the sequence of requests without regard to the actions taken by the algorithm. An adaptive adversary determines the action corresponding to each element of a sequence on line by taking into account the previous choices made by algorithm. If, additionally, the adversary processes the sequence only after the entire sequence has been generated, it is then said to be an adaptive off-line adversary. This is the most powerful sort of adversary, and we follow the terminology of Reference 8 and call it a strong adversary.

The result of this section shows that no deterministic or randomized algorithm can do better than Delta against a strong adversary for all two-implementation-component problems. Also, since the distinction between strong and oblivious adversaries is irrelevant for deterministic algorithms,7 it also shows that no deterministic algorithm is better than 3-competitive against any type of adversary for all two-implementation-component problems.

Lemma 1: Let Alg be any adaptive pub/sub algorithm controlling an individual client. There exists an arbitrarily long sequence alpha constructed by a strong adversary such that, as the length of alpha tends to infinity, Alg is at best 3-competitive on alpha.

Proof: Consider just two clients, client1 and client2, and let client1 be running algorithm Alg. For any integer l, we can construct a sequence of the form alpha = read1m1 write2n1 read1m2 write2n2read1mk write2nk with the following characteristics (r1m indicates m consecutive client1 requests of type r):

  • The length of the sequence (number of operations) is larger or equal to l
  • For all t, 1 ≤ tk, client1 subscribes after the read1mt operations and unsubscribes after the write2nt operations.

To find this sequence, we begin by “serving” read requests to the client1 until it subscribes and then “serving” write requests to client2, causing update messages to propagate to client1 until client1 unsubscribes. We continue this process until we reach the desired sequence length. Since we are dealing with a strong adversary, it knows exactly when to serve read (write) requests, as it just waits until it sees client1 subscribe (unsubscribe). We assume that client1 forever alternates between Sub and NonSub modes if served enough write and read requests, respectively. Otherwise, this client would trivially not be competitive. Indeed, if it stayed in Sub mode, the adversary could forever feed it update messages whereas the adversary would incur no cost by staying in NonSub mode. Similarly, if it stayed in NonSub mode the adversary could forever feed it read messages and the adversary would incur no cost by staying in Sub mode.

Let R be the total number of read operations and W be the total number of write operations in alpha. The cost to client1 incurred by Alg, which we denote by CostalphaAlg(1), equals 2R + W + (pk + k), as it costs 2R for client1 read messages, W for client1 update messages (= client2 write operations), pk for the k client1 subscribe messages, and k for the k client1 unsubscribe messages. Let M = min{2R, W, (pk + k)}. Now devise the strong adversary special Sspecial A as follows:

  • Case 1: M = 2R. Let special Sspecial A stay in NonSub mode throughout the sequence. Then
    Costalphaspecial Sspecial A(1) = 2R. CostalphaAlg(1) = 2R + W + (pk + k) ≥ 3(2R) = 3*Costalphaspecial Sspecial A(1). Hence, Alg is at best 3-competitive.
  • Case 2: M = pk + k. Let special Sspecial A subscribe directly before each set of read operations and unsubscribe directly before each set of write operations. Then Costalphaspecial Sspecial A(1) = pk + k. CostalphaAlg(1) = 2R + W + (pK + k) ≥ 3(pk + k) = 3*Costalphaspecial Sspecial A(1). Hence, Alg is at best 3-competitive.
  • Case 3: M = W. Let special Sspecial A initially subscribe and then stay in Sub mode throughout the sequence. Then Costalphaspecial Sspecial A(1) = W + p. Recall that as l → ∞, we can assume W → ∞, otherwise the client is trivially not competitive. Hence, as l → ∞, Costalphaspecial Sspecial A(1) → W. Therefore CostalphaAlg(1) = 2R + W + (pK + k) ≥ 3W = 3*Costalphaspecial Sspecial A(1), and Alg is at best 3-competitive.

End of proof

Related work

The ideas in this paper are related to prior work in program optimization, competitive algorithms, and data replication policies.

Some of the work in program optimization discusses optimizing programs by choosing among a set of alternate implementations. For the most part, this work focuses on low-level static optimizations. For example, the early work on SETL examined how to choose the best set representation based upon program analysis.9 Similarly the work on SQL (Structured Query Language) optimization can be viewed as statically choosing the best implementation for retrieving information from a database given a particular query.10

Closer to our strategy is the recent work described in Reference 11, in which the authors propose that the component writer provide multiple implementations of a component, assign costs to the different operations on each component, profile a program made up of multiple components running in a particular context, and construct an object affinity graph for this program. They then use a graph partitioning algorithm to find an optimal partitioning of the graph, thereby finding the optimal implementation for each component. Central to their scheme is the observation that the implementations at different components affect each other. Unlike algorithm Delta, their strategy optimizes a program globally, not just at the level of an individual component. On the other hand, their technique requires a more complicated methodology (profiling and the creation of an object affinity graph) and cannot adjust to dynamically changing contexts as Delta does.

Recent work on dynamic architecture description languages (see Reference 12 for a summary) aims at building software architectures that can respond adaptively to changes. Algorithms like Delta will be important in helping realize the goals of these architectures.

Our adaptive pub/sub application contrasts with previous work on data replication.13,14 An extended version of Reference 15 gives a good overview of the literature on file allocation, file migration, and file replication problems. Although similar in some ways, our adaptive pub/sub application is closer to the work in References 16 and 17, which we discuss later in this section.

Our work draws inspiration from many competitive algorithms.4,8,15,18,19 Although it may seem that this work could be used to solve our problem, a closer examination shows that this is not the case. For instance, one immediate idea for a competitive adaptive pub/sub algorithm, modeled after the snoopy caching algorithm,8,18 is for a client to subscribe after performing (½)p consecutive read operations, and to unsubscribe after receiving p consecutive update operations. The asymmetry between reads and updates is due to the fact that in our model a read is twice as expensive as an update operation. Consider the sequence (read1(½)p-1, write2)n; that is, client1 issues (½)p – 1 reads followed by a write by client2. This is repeated n times. The cost to the snoopy caching algorithm for this sequence with regard to client1 would be n*(p – 2). One can devise an adversary that would process this sequence at cost p + n (the adversary would initially subscribe and then stay in Sub mode). There do not exist constants c and d independent of p such that n(p – 2) ≤ c(n + p) + d for all n. Hence, the snoopy caching algorithm is not a good competitive algorithm for the pub/sub problem.

As mentioned, our work on the adaptive pub/sub problem is most closely related to the work in References 16 and 17. Reference 16 seems to exactly address our problem and to derive a better lower bound than ours (2k-competitive, for any integer k). In their model, however, the data being replicated are always of unit size, that is, variable p – 1 (the size of the data being replicated) always takes value 1. Hence, our work generalizes that result in an important way, by allowing arbitrary size data.

Let us see how the algorithm in Reference 16 would work for our problem. This algorithm re-evaluates the mode a client is in after every k operations (alternatively, the algorithm in Reference 20 re-evaluates after a fixed time period t). If read operations are the majority, the client switches to (stays in) Sub mode; otherwise it switches to (stays in) NonSub mode. Consider the sequence (read1k, write2k)n; that is, k reads by client1 are followed by k writes by some other client. This is repeated n times. Dynamically switching betweenimplementations of a component should become a useful tool in autonomic computing. The cost with respect to client1 would be (p + 1)n + 3kn. One can devise an adversary whose cost would be W = min{nk, (p + 1)n}. There do not exist constants c and d independent of p such that (p + 1)n + 3kncW + d for all n. Because p may be arbitrarily large, this may result in a large competitive factor. Hence, the algorithm would not perform well in the worst case when one generalizes to larger data structures. The algorithm does model other aspects that we have not considered. Most importantly, whereas we assume a fixed data “server,” Reference 16 uses the notion of a primary site that can dynamically change in the course of execution.

Very recently, as this paper was going into production, I became aware of the very relevant work on metrical task systems.21 Metrical task systems are very similar to the adaptive component problem described in this paper, and the results there are consistent with the results of this paper. However, there are two important differences between the two. First, in metrical task systems, it is assumed that SwitchCost(i, j) = SwitchCost(j, i) (symmetry). We make no such assumption. As a matter of fact, for the distributed pub/sub problem described in the section “Examples,” this is not the case. Second, in our model we assume that once a request is received, the processing of the request must be completed before switching to another implementation. We took this approach to better model scenarios where rapid response time is critical (i.e., the response must be made before embarking on the relatively lengthy process of switching implementations). In metrical task systems, however, switching implementations (states) can be done after a request is received but before it is processed. For both of these reasons, our work generalizes metrical task systems in the case of two implementations (or, in the terminology of Reference 21, in the case of two states), and it is not clear that the algorithms and proofs given in metrical task systems apply directly to the adaptive component problem.

Summary and open issues

This paper provides a framework for analyzing the effectiveness of components that switch implementations dynamically at run time based upon run-time workload characteristics. We believe that dynamic selection of component implementations will become an effective tool in optimizing component performance, and therefore the framework given in this paper should become increasingly important.

An important contribution of this paper is the near optimal (3 + epsilon)-competitive algorithm Delta, for the two-implementation-component problem, in which there are only two implementations to choose from. An obvious question is whether there exist competitive algorithms for components with an arbitrary number of implementations. In Reference 5 we show there does not exist an algorithm that is better than k-competitive when the number of implementations is k. Another limitation of Delta is that it requires a switch from one implementation to another in “one shot,” and this can often be expensive. In Reference 5 we show how Delta can switch implementations in an incremental fashion for the two example problems of section “Examples,” thereby amortizing the cost of the switch operation. Still, a more general method of incremental switching between component implementations is needed.

Competitiveness is only one criterion by which to judge an algorithm for switching among implementations. In Reference 5 we mention one other criterion: convergence. Other metrics can be used as well. For instance, it may be that some domains exhibit great periodicity of patterns in event sequences. The ability to learn these patterns and then optimize to them is an important strategy in these domains.

Most importantly, we need experience to see how well Delta works in practice. This will shed light on how to best evolve this algorithm. For instance, it may be that domain knowledge needs to be taken into account in order to obtain better results.

**Trademark or registered trademark of Sun Microsystems, Inc.

Cited references

Accepted for publication August 30, 2002; Internet publication January 16, 2003