IBM®
Skip to main content
    Country/region [change]    Terms of use
 
 
 
    Home    Products    Services & solutions    Support & downloads    My account    

IBM Systems Journal

Real-Time and Event-Based Systems   Volume 47, Number 2, 2008
Table of contents: HTMLPDF This article: HTML PDFDOI: 10.1147/sj.472.0251Copyright info

Generating real-time complex event-processing applications

by Y. Magid,
D. Oren,
D. Botzer,
A. Adi,
B. Shulman,
E. Rabinovich,
and M. Barnea

We propose to exploit the technology for complex event processing (CEP) embodied in the rule-based engine known as IBM Active Middleware Technology™ and extend it to the development of real-time CEP applications. Specifically, we propose to develop a framework that includes an integrated development environment (IDE) for defining rules, and, given a set of rules, generates code for a CEP application and enables us to determine time bounds on the response of this application to a set of supported events. In particular, the IDE helps determine a time bound for the execution time of the code corresponding to each rule. The calculation of time bounds is based on a set of benchmark measurements to be performed on the target hardware and involves code segments corresponding to basic operations. Although we assume the code generation phase produces Java™ code, the same approach can be applied to any other suitable programming language. In support of a feasibility argument for the proposed approach, we present some preliminary experimental results obtained on a partially implemented tool.

Introduction

Complex event processing (CEP) enables the extraction of meaningful and actionable information from event streams. The CEP technology is intended to provide applications with a flexible and scalable mechanism for constructing condensed, refined views of the data. CEP correlates the data (viewed as event streams) in order to detect and report meaningful predefined patterns, thus supplying the application with an effective view of the accumulated incoming data (events), and allowing the application to react to the detections by executing actions.

CEP functionality is particularly effective when processing events from multiple event sources is required—from the perspective of a business, application, or infrastructure within different contexts. Examples of such scenarios where CEP is useful may involve service level agreement (SLA) alerts and compliance checking applicable to the financial markets, banking, and insurance industries, as well as scenarios related to the sensors and actuators domain.

Real-time systems are those systems that are subject to operational deadlines, typically from input event to system response. In order to ensure that these deadlines are achieved, real-time applications are usually monolithic and strongly coupled with the application domain and the system hardware. As a result, developing real-time applications is usually an expensive, time-consuming process.

The following missile-detect-and-engage scenario, in which the system is used to defend a military asset such as a warship, illustrates a case that requires a CEP solution in real time:

  • A number of sensors are used for detecting potential targets. When a sensor detects a potential target it creates an alert-trigger event and notifies the event processing engine of its occurrence. Then, the event-processing engine may create an alert-track event, in which case it notifies specific sensors to track the potential target.

  • The event-processing engine continues to receive sense and tracking data from sensors. After a target is detected it continues to update the target location (which can be defined in ways such as the most recent update, or a specified average over a set of locations).

  • The event-processing engine may identify a potential target as a foe, in which case an alert-confirmation event is generated. Then, the system determines which asset is to be used for defending against the threat, a suitable intercept-asset event is generated, and a notification is sent to the subsystem operating the intercept asset.

Most CEP efforts do not support guaranteed bounds on system response for real-time processing.1-5 Makers of CEP engines focus instead on enhanced performance and on ability to handle large volumes of streaming data. In particular, they strive to achieve maximum throughput and minimum end-to-end latency.

Several methodologies are in use today for designing real-time systems,6 one of the best known being MASCOT (Modular Approach to Software Construction Operation and Test).7 MASCOT focuses on the real-time control and interface definitions between concurrently running processes, rather then functionality of the software. MASCOT was developed in early 1970s for the U.K. Ministry of Defense and is considered an old but very successful method for real-time system design. Other approaches are Real-Time UML,8 Architecture Analysis and Design Language (AADL),9 Ravenscar Profile (a subset of the Ada tasking features for safety-critical, hard real-time systems)10 and Real-Time Java**.11

In this paper we focus on rule-based CEP engines for the real-time domain. Current CEP engines cannot guarantee compliance with real-time requirements and thus cannot be directly used for developing real-time CEP solutions. We propose to develop a framework that, given a set of rules, generates code for a CEP application for which we can determine time bounds on its behavior. In particular, the framework includes an IDE for defining rules, and, given a set of rules, provides a time bound on the execution time of a code segment corresponding to each rule. The calculation of time bounds is based on a set of benchmark measurements to be performed on the target hardware and involves code segments corresponding to basic operations. Although we assume the code-generation phase produces Java bytecode, the same approach can be applied to any other appropriate programming language. Our proposed design exploits the technology used in developing IBM Active Middleware Technology*12,13 and extends it to the real-time domain.14 IBM Active Middleware Technology is a lightweight and flexible rule-based CEP engine intended primarily for automating and monitoring business processes. We make a feasibility argument for the proposed approach by presenting some preliminary experimental results obtained on a partially implemented tool.

The rest of the paper is organized as follows. In the next section we describe IBM Active Middleware Technology, a rule-based CEP engine for non-real-time applications. In the following section we present our proposed approach to generating real-time CEP applications, which is based on extending IBM Active Middleware Technology to the real-time domain. In particular, we discuss the necessary modifications to the situation-definition language and the code-generation approach. In addition, we present the preliminary results of experiments in which we determine bounds on the execution time of the code corresponding to given rules and compare these bounds with measurement data. We conclude with a brief summary.

IBM Active Middleware Technology

IBM Active Middleware Technology is a lightweight and agile CEP engine, capable of detecting complex situations (rules) that arise when processing a context-sensitive composition of events. It is intended for automating and monitoring business processes, facilitating changes to business logic, and supporting an on demand business environment. In this section we introduce the IBM Active Middleware Technology language for defining CEP rules (situations), and we provide a high-level description of the event-processing flow performed by the IBM Active Middleware Technology engine. The IBM Active Middleware Technology language for defining CEP constructs contains the following elements: event, situation, lifespan, and action.

An event is an instantaneous (i.e., it happens at a specific point in time) atomic (i.e., it takes place in its entirety or not at all) occurrence that has significance in the domain of interest. Events can be concrete or inferred. A concrete event has a concrete manifestation in reality, such as a change in the state of an object. An inferred event has no concrete manifestation and can only be inferred from the state of the world (context) and a history of concrete events. A situation, which corresponds to an inferred (complex) event, defines the set of events to be evaluated and the conditions they must satisfy in order to conclude that the event occurred.

A lifespan is the time frame within which situations are to be detected. A lifespan can be initiated by an event, or at startup, or at a specified absolute time. It can be terminated by an event, or at a specified absolute time, or after a specified interval from its initiation time. An action represents a user-defined operation (e.g., a method call, sending an email, or updating a database) that is to be triggered upon detection of a situation or arrival of an event.

The language supports defining the following types (in boldface) and subtypes (in italics) of event correlation patterns, or situations, thus providing a rich language for describing rules:

  • Joining—detect a situation when all required events arrive, either in a predefined order (sequence) or in an arbitrary order (all).

  • Counting—detect a situation when at least, at most, or exactly n events have arrived (the counting can also be over some function of the attributes of the arriving events rather than simply the number of arriving events)—atleast, atmost, nth situation subtypes.

  • Temporal—detect a situation at an absolute time, every specified time interval, or a specified time interval after a certain event has arrived.

  • Aggregation—detect a situation depending on a condition on data aggregated for certain arriving events (e.g., minimum, maximum, sum, average, or percentage) and report this data—report, percentage, crosses situation subtypes.

  • Absence—detect a situation when certain events did not arrive (not), or when some events arrived and others did not (unless).

Each situation definition contains some general parameters, such as a lifespan, a detection condition, whether the situation should be detected immediately (i.e., once all its operands have arrived) or at the end of the lifespan, and so on. In addition, each situation contains some type-specific parameters, such as the quantity for counting situations or an interval for every situation.

The definition of each situation operand event contains the following parameters:

  • Event name

  • Threshold condition—only events satisfying this condition will be considered as operands for this situation.

  • Quantifier—indicates the required behavior when more than one event instance of this event type arrives during the situation lifespan (temporal context), i.e., whether the first, last, or each of the instances should be used as the situation operand.

  • Quantifier type—absolute or relative—is used jointly with the quantifier parameter to determine whether only the absolute first/last/each event should be taken or the first/last/each which causes a situation detection (e.g., the situation condition holds, quantity is met for counting situations, etc.).

  • Override—a condition indicating under what circumstances the arriving event instance should override all previous instances as situation candidate.

  • Retain—a condition indicating under which circumstances the event that was already used to compose a situation detection should be retained to be used for detecting additional situation instances in the same lifespan instance.

  • Some parameters that are more situation-type specific, such as weight/counted for counting situation operands, min/max/sum/average conditions/expressions for aggregation situation operands, etc.

The language defines extensible expression syntax. Expressions are used in various parts of the situation definition in order to filter the events at different levels of the pattern detection (events threshold conditions, situation detection condition, lifespan conditional initiation/termination, etc.), and in order for the notification of a detected situation to contain data meaningful to the user application. The language contains many built-in operators that can be used when composing these expressions (e.g., numeric operators, string operators, operators for querying a user database for obtaining a calculated value or evaluating an “exists” condition). The expression syntax of the language can easily be extended to allow using domain-specific expressions and conditions (e.g., statistical functions, geo-spatial functions.) within the rule definitions. The extension is accomplished via user application code implementing an interface.

The IBM Active Middleware Technology language also supports defining stateless rules, that is, rules based on a single event without a temporal context. Definition of such rules involves defining events that are referenced from another event, under some condition, and the attributes of the referenced events are derived from the attributes of the triggering event by mapping expressions.

The event processing performed within the IBM Active Middleware Technology engine upon arrival of an event can be described as follows:

  • Stateless rules phase—generate the events that are referenced from this event (if any).

  • Events to which actions are mapped are sent to be handled offline by a different module that activates the user-defined actions.

  • For the original event and its referenced events, perform the following:

    • Terminate—terminate all situations for which this event type is a terminator (lifespan terminator). Termination may result in detected situations for situations that are deferred, i.e., detected at the end of their lifespan. In this case the detected situations are processed (recursively, the same way events would be) before continuing processing this event, to allow correct handling of nested situations.

    • addOperand—add this event as a potential operand, or a candidate operand, to all situations in which it is defined as an operand, considering the threshold conditions for this event in the different situations. This operation may also result in situation detection.

    • Initiate—initiate all situations for which this event type is an initiator (lifespan initiator).

    • Recursively process situations detected at the addOperand phase (same as event processing).

  • Report the set of situations detected as a result of the original incoming event; that is, these processing steps are performed repeatedly for the original event, its referenced events, the detected situations as a result of the original event, and its referenced events, and so on until no new situations are detected. We refer to such a collection of situations as the “transitive closure” of the original event.

IBM Active Middleware Technology has two modes of operation—in-memory and persistent. In the in-memory mode, the whole state is kept in memory (i.e., which lifespan instances are open for which situations, which event operand candidates already arrived for which situations, etc.) In the persistent mode, the whole state is kept in a database, and read and updated upon each processing operation. The desired mode of operation is determined at configuration time.

An approach to generating real-time CEP applications

In this section we start by examining the IBM Active Middleware Technology language, and identify the modifications to be made to the language in order to enable the calculation of bounds on the processing time. Next, we describe our approach to generating the application code. Our code generation allows us to calculate bounds on the processing time and is tailored to the given set of CEP rules, so that one can determine if the generated application satisfies the required real-time constraints. We illustrate the code-generation concept through an example in which we include classes that are part of the class infrastructure, which are developed in advance, and one class that is generated according to the given set of rules (situations). We then describe how the generated code is to be analyzed and how the worst-case processing time for each event type is to be calculated. Lastly, we describe the result of an experiment we conducted to provide a “reality check” for the claims made in this paper and in support of a feasibility argument for our approach.

Language restrictions

Examining the language structure and the event-processing flow in IBM Active Middleware Technology, one can identify the properties that are incompatible with time bounds on its response: 1) unlimited number of lifespans as context-related information; 2) unlimited number of candidate events as operands to a situation-detection process; and 3) unbounded database operations.

In IBM Active Middleware Technology, there is no bound on the number of lifespans that can be open at a certain time, nor the number of situations in which an event participates. Indeed, upon event arrival, the processing stages of Terminate, Operand, and Initiate are not bounded, because the event can participate in any number of situations, which may have any number of open lifespans (contexts) at a given time.

Upon arrival, an event is processed according to the situations it participates in, in the appropriate order. Consider an event that is added as a candidate event to a situation operand. If the situation involves immediate detection, then its detection process is invoked. If the situation, however, involves a condition that depends on a set of events, then the arrival of at least one event of each type required allows us to determine whether the situation condition is met. Events for which the situation condition is met are removed unless specified otherwise. Therefore, as long as the situation condition is not met, it is possible that the number of candidate events increases unboundedly. In this case it is not possible to bound the processing time for detecting the situation because all possible combinations of candidates, including the last arrival, have to be reexamined in the attempt to satisfy the situation condition. Moreover, when the situation uses the quantifier each for some of its operands, when the number of candidate events can increase unboundedly, so does the number of detected situation instances as a result of one incoming event.

Running in persistent mode requires database access, and queries using Structured Query Language (SQL) expressions are unbounded operations. It is thus not possible to include accessing of a database in a real-time environment, unless a real-time database is developed and becomes commercially available.

We propose to address these issues by imposing some language restrictions and by providing capabilities for specifying such restrictions:

  • The user can limit the number of simultaneously open lifespan instances for each lifespan type. This way, the context for each incoming event is restricted by the union of the restrictions on all the situation lifespans the event participates in.

  • The user can restrict the number of event instances that are kept as candidates for a given situation operand. Then, the processing time associated with a situation detection is bounded because the number of combinations to be checked for all the operand instances is bounded. This also imposes a bound on the number of situation instances it is possible to detect when the situation uses the quantifier each for some of its operands.

  • SQL queries are not allowed. Thus, SQL is not to be used in expressions and conditions within the language, and configuring the engine to run in persistent mode is not possible.

The code-generation approach

IBM Active Middleware Technology implements the CEP technology. As a generic CEP engine, it has the advantages of flexibility and ease of use when developing an event-based application. Rule definition does not require programming skills. Rules can be modified and reloaded at any time without any code modifications and without shutting down the system.

Whereas we would like to adopt the rule-based approach of IBM Active Middleware Technology, we are not able to use the same generic engine because it cannot provide the bounds on execution times required in the real-time environment. Consequently, we propose a code-generation framework that, for a given set of IBM Active Middleware Technology rules, generates a CEP application that is optimized for the specific set of rules and has known time bounds for the hardware platform on which the application will run. The proposed code-generation approach maintains the advantage of IBM Active Middleware Technology as a generic and flexible CEP engine while upholding the real-time requirements of the system.

When developing the CEP rules, the user has to take into account time bounds for processing the given rules on the target hardware, so that the user can ensure that the real-time constraints are met. The framework we propose includes an IDE that will support rule definitions in the IBM Active Middleware Technology language, similarly to the current IBM Active Middleware Technology IDE, but in addition, given a set of rules, will provide a time bound for the processing time each rule contributes to an event-processing time. The estimate calculation is based on a set of benchmarks executed on the target hardware for measuring time bounds on the basic operations of the programming language.

The code-generation mechanism is based on an infrastructure of abstract base classes for each situation type (such as all, sequence, atleast, and so on) and for containers that manage the event operands candidate instances. Given the specific set of rules, concrete subclasses of these abstract classes are generated according to the rule definitions; that is, the subclasses will implement the rule specifics such as number of operands, operands quantifier (first, last, each), and so on. Concrete classes are also generated to implement each event type in the rule set, based on an event interface.

All expressions used within rule definitions, such as various conditions, situation notification data expressions, mapping expressions for stateless rules, and so on, are generated into code that is integrated into the concrete implementation classes. For stateless rules, each rule (consisting of a condition expression and several mapping expressions) will be kept in a rule repository and evaluated at runtime upon event arrival. Figure 1 illustrates the design of the hierarchy of abstract and generated concrete classes, in which S1 is an example situation of type All. These classes implement the principal algorithms used in IBM Active Middleware Technology for detecting situations, albeit in a different design. Class AbstractSituation is an abstract class of the highest level of abstraction and implements the logic common to all situation types. Extending this class are still abstract classes, one for each situation type, such as AbstractAll. For each situation in the given situation (rule) set, such as situation S1 in this example, a concrete class will be generated, S1All in this case, extending the appropriate situation type and implementing the logic associated with the situation definition. The situation-detection logic uses a container for situation operand candidates. The container is created by extending class AbstractContainer, as shown in Figure 1. In addition, there is a common interface IEvent, implemented by the generated concrete event classes, shown as e1 … en in the figure.

Figure 1  Main abstract and concrete classes and their relationships Figure 1

Figure 2 illustrates the event-processing flow for the real-time system, using the generated code for evaluating expressions. The component RT Event Processor in Figure 2 takes the arriving event as input and produces a set of situations that are made true by the event. The component RT Rule Selector, which consists mainly of generated code, maps each event to the rules (situations) associated with it. To accomplish this, RT Rule Selector accesses the component Rule Repository (which also consists mainly of generated code) and retrieves from it the rules associated with the given event type. These rules are then evaluated by component RT Rule Evaluator to identify the situations detected.

Figure 2  Event processing flow Figure 2

Code-generation example for situations of type All

The following example illustrates the code-generation approach for a situation of type All, focusing on the core of the situation detection logic, as represented in Figure 1. Table 1 shows the base class for this situation type as an abstract class representing a runtime situation-lifespan instance (i.e., a runtime instance of a single situation context).


Table 1 Class AbstractSituation

1abstract public class AbstractSituation {
2  protected CyclicList[] candidates;
3  protected IEventAdder[] adders;
4  protected IEventComposer[] composers;
5  …
6
7  abstract protected IEvent[] compose();
8  abstract protected IEvent createNotification (IEvent[] events);
9  …
10}

This runtime object holds the situation candidate events in this lifespan context (corresponding to class AbstractContainer in Figure 1), an event adder object—responsible for adding candidates when an event instance that is one of this situation's operands arrives, an event composer object—responsible for composing the situation when conditions are met for detection, and several more. All these objects follow the similar design of an abstract base class implementing the common behavior for these operations, extended by concrete classes containing specific implementations according to the event operand parameters chosen in the situation definitions (such as whether the quantifier is first, last, or each, whether override or retain flags are set, etc.). These objects are not described here in detail for the sake of brevity.

The abstract method compose() attempts to detect the situation subject to conditions such as when the situation should be detected (immediately, when its operands arrive, or upon lifespan termination). The detection is done differently for each situation type and therefore the method is implemented by  extending situation-type-specific abstract  classes. Method createNotification() constructs the situation notification object when the situation has been successfully detected. This method is implemented at the concrete class level according to the defined situation attributes for notification. The next step is an abstract class extending the base class, implementing the logic specific to situations of type All, as shown in Table 2.


Table 2 Class AbstractAll

1abstract public class AbstractAll extends AbstractSituation {
2  protected IEvent[] compose() {
3    …
4    IEvent[][] matrix = new IEvent[total][candidates.length];
5
6    for (int i = 0; i < candidates.length; ++i) {
7      …
8
9      int row = 0;
10      while (row < total) {
11        for (int j = 0; j < composers[i].getCount(); ++j)
12          for (int l = 0; l < copyCount; ++l) {
13            matrix[row++][i] = composers[i].get(j);
14            composers[i].get(j).setComposed(true);
15          }
16      }
17  }
18
19  …
20  for (int i = 0; i < total; ++i) {
21    notifications[i] = createNotification(matrix[i]);
22  }
23
24  …
25
26  return notifications;
27  }
28}

Method compose(), implemented at this level, attempts to detect the situation when all situation operands have arrived and uses the composer objects to obtain the result of processing by composer of the specific operand parameters and conditions according to the situation definitions. This method also calls method createNotification() in order to prepare the array of detected situation instances with their required attributes as the method result. Method compose() itself is called after successfully adding an incoming event instance as a situation operand (using the adder object), if the situation was defined to have immediate detection, or upon the lifespan termination if the situation was so defined.

For the next step, suppose now that a situation called AlertTrack of type All is defined by the user. The situation has two event operands, one named AlertTrigger and the other named TrackData. In this case, the situation should be detected when an event of type AlertTrigger and an event of type TrackData have both arrived (assuming no additional conditions). Suppose that both event operands are defined with first quantifier, no threshold condition, and retain and override set to false. Given this situation, the concrete class illustrated in Table 3 is generated.


Table 3 Class AlertTrackAll

1public class AlertTrackAll extends AbstractAll {
2  public AlertTrackAll() {
3    /* 2 operands */
4    candidates = new CyclicList[2];
5    candidates[0] = new CyclicList();
6    candidates[1] = new CyclicList();
7
8    /* Quantifier is ‘first’ */
9    adders = new IEventAdder[2];
10    adders[0] = new FirstEventAdder(candidates[0]);
11    adders[1] = new FirstEventAdder(candidates[1]);
12
13    composers = new IEventComposer[2];
14    composers[0] = new FirstEventComposer(candidates[0]);
15    composers[1] = new FirstEventComposer(candidates[1]);
16    …
17  }
18
19/* Add a TrackData event as candidate */
20  public void add(TrackData td) {
21    adders[0].addEvent(td);
22  }
23
24/* Add an AlertTrigger event as candidate */
25  public void add(AlertTrigger at) {
26    adders[1].addEvent(at);
27  }
28  /* Prepare a detected situation report */
29  protected IEvent createNotification(IEvent[] events) {
30    TrackData td = (TrackData) events[0];
31    AlertTrigger at = (AlertTrigger) events[1];
32
33    AlertTrack track = new AlertTrack();
34    track.trackId = td.trackId;
35    track.x = td.x;
36    track.y = td.y;
37
38    return track;
39  }
40
41  …
42}

The code generator maps the parameters in the situation definition to corresponding operations in AlertTrackAll class. Because situation AlertTrack is of type All over two event types, the AlertTrackAll class extends AbstractAll, as explained above, and the candidates' field in AltertTrackAll is an array of size 2, with two add methods, one for each event type. Because the two events use quantifier first in this example, the corresponding adder and composer code objects are of the type FirstEventAdder and FirstEventComposer, which mimic the behavior of quantifier first.

Calculating time bounds

Given a set of rules, the proposed IDE analyzes the run-time code corresponding to each rule in order to determine bounds on the execution time for these rules. As is further explained below, these bounds are based on the results of benchmarks in which the execution times of basic operations are measured on the target hardware and are stored in a repository or are otherwise provided to the tool as an input. The benchmarks assume that the target platform runs a real-time implementation of the Java Virtual Machine, that is, an implementation with guaranteed limits on the execution time for primitive operations, such as memory allocation and numeric operations, as well as guaranteed limits on the execution time for garbage collection operations.

To derive the worst-case execution time, the tool analyzes the basic behavior of the operations performed in the run-time code, such as method compose in AbstractAll, and method consume in AbstractSituation, as shown in the previous section. The execution times of these operations depend, in turn, on the number of candidate events that are kept in memory, as well as on the execution times of the various event adders (such as FirstEventAdder), composers (such as FirstEventComposer), and consumers. These execution times can all be benchmarked in advance. Consider for example method compose for situation type All, in class AbstractAll shown in Table 2. In order to calculate the worst-case execution time for this method, the tool analyzes what is the worst case in terms of the current runtime state upon which the method is called. In this case we can see that the worst case occurs when the built matrix is maximal, which will happen when the composers' getCount invocation will return the maximal value (according to the real-time restrictions of the definitions in terms of number of candidates accumulated in the candidate list), that is, when the candidate lists are full. Taking into account that this state is the worst-case state, the tool can calculate the processing time of method compose in this case, while relying on the benchmarks of the building blocks.

In addition, the tool analyzes the execution times of rule-specific operations, such as the createNotification method in the AlertTrackAll class shown in the example in the previous section. The createNotification method, for example, is composed of low-level Java primitives (array access, expression evaluation, etc.), which are also benchmarked in advance. The tool uses these benchmarks in computing the time bounds for executing the specific method.

After breaking down the code segment (possibly corresponding to rule definitions elements, as explained above) into primitives and other operations that have been benchmarked in advance, the tool adds up the worst-case execution times for these primitives and operations to obtain the worst-case execution time for the code segment. For instance, the createNotification method in AlertTrackAll consists of two array access operations, the creation of a new object, and assigning values to three variables. The worst-case execution times of these primitives are summed up to give the worst-case execution time to create a notification of a detected situation. Worst-case execution times for the other operations in AlertTrackAll may be determined in a similar manner. Because the number of candidates kept by the runtime situation-lifespan instance is limited, it is possible to reach a worst-case execution time for the compose operation, because the number of candidate combinations is bounded. Also, because the number of simultaneous lifespans is restricted by the language, it is possible to bound the time required to process a rule starting at the arrival time of an event participating in the rule.

Experimental results

We present here the results of an experiment we conducted in order to demonstrate the feasibility of our proposed method. The goal of the experiment was to compare, for a set of situations, the worst-case execution time bound calculated using the proposed approach, with an execution time that approximates the actual execution time. We expect the results to show that the calculated bound is in fact a bound, in the sense that the measured execution time is consistently lower than the bound. In addition, we expect the results to establish that the bound is sufficiently tight, and therefore a useful one, by showing that the measured execution time is of the same order of magnitude as the calculated bound.

We first defined a set of rules. The definitions make use of two event types, AlertTrigger and TrackData, and two situations, AlertTrack and Alert, both of type All. The AlertTrack situation has two operands, AlertTrigger and TrackData, and its detection is immediate, that is, the situation is detected as soon as events of both types have occurred. The Alert situation has a single operand, AlertTrigger, and is detected immediately upon every arrival of an AlertTrigger event. In our definition set there is a single lifespan, initiated at startup and never ending, which is used as context for both situations. Moreover, the size of the accumulated set of candidate events for the situation operands is limited to 10 candidates per operand.

Next we implemented a minimal prototype of the proposed real-time event processor by implementing only the strictly necessary components of the infrastructure of abstract classes, according to the requirements for the above definition set. In addition, we manually implemented the code that would be generated by the framework, the specific situations in the definitions set, and the rule selector that attempts detection for the given situations upon the arrival of an incoming event.

Then we implemented a rudimentary prototype of the analyzer component (a component of the proposed framework). Given a set of definitions and the results of benchmarks for basic operations on the target platform, the analyzer calculates bounds for the processing times of the event types used in our definitions. Although the prototype implementation is not hard-coded for this specific definition set, it is tailored to these definitions in the sense that the sections of code necessary for situation types, or combinations of situation operands parameters other than the ones required by our definition set, have not been implemented. For example, for the sake of this experiment we implemented quantifier type relative first, whereas other quantifiers such as relative each did not have to be implemented. As previously described, the analyzer uses the assumed structure of the generated code and the different parameters in the situation definitions to determine the composition of the generated code and the worst-case runtime state, and then uses the measured time bounds for basic operations to calculate the time bound for processing a given event. For situations of type All, for instance, the worst-case state occurs when the candidates list for all situation operands is at full capacity, or is at full capacity after adding the incoming event to the list (because the table to be processed contains as entries all possible combinations of candidate events).

All measurements were performed in a non-real-time environment, on a Lenovo T60 ThinkPad*, with Intel** Core** 2 Duo 1.83-GHz processor, 2.00 GB memory, using Microsoft Windows** XP operating system and Java 1.4.2 (neither real-time Java nor a real-time platform were used). Unlike a real-time platform, which guarantees time bounds on the basic operations performed, the system used does not offer any such guarantees. Using this system means that the performance observed for basic operations (atomic operations, linked list access, etc.) is closer to average execution times rather than worst-case values. As a result, the calculations of the analyzer do not reflect the absolute worst-case execution time, but a case that is worst case in terms of the maximum amount of processing required per event (i.e., the calculations are worst-case in terms of runtime-state but use close to average values for the performance of basic operations). We refer to these results as average-worst-case.

We ran benchmarks on the above platform and collected data for the duration of basic operations used in both the infrastructure and the generated code. These measurements were fed to the analyzer component, along with the situation definitions set. The calculated bounds for both AlertTrigger and TrackData events, shown in Table 4 as “Processing time bound” can be thought of as “average-worst-case processing time.” Next, we measured the performance of the prototype in terms of the processing time for each event type. Starting from an empty runtime state (system startup, no events recorded), we avoid event streams that involve events of a single type because the detection of situation AlertTrack depends on the arrival of both event types. In addition, using a scenario in which both event types arrive randomly in an arbitrary order and with uniformly distributed probabilities would not be sufficient, because it is not possible to obtain separately the average processing times for each event type (we obtain instead an overall average event processing time). Also, this overall average processing time is expected to be significantly faster than the bounds calculated by the analyzer (by at least an order of magnitude), because the probability to reach a state close to the worst-case state (i.e., full capacity of the operands candidates list) is very low (when events of both types are received, then the situation is detected and the events are consumed, and thus the events are not likely to accumulate and grow into a list of 10 candidates for a single operand). Performing these measurements, we obtained the overall average event processing time 0.0014066 ms (which is an order of magnitude lower than the calculated bounds, as expected).


Table 4 Numerical results
 Capacity used (%)AlertTriggerTrackData
Processing time (ms)100.0027630.0013
200.0032260.00172
300.004010.002446
400.0049460.003283
500.005990.004426
600.0073430.00573
700.009010.00745
800.010730.009166
900.013070.011406
1000.0156260.014063
Processing time bound (ms) 0.0307550.019641
Average processing time (ms) 0.015470.0138533

In the next step of the experiment, we engineered a controlled constant runtime state that is close to the worst-case state. We prepared in advance the AlertTrack situation in a state in which 10 instances of the opposite operand to the measured event type, and nine instances of the same operand as the measured event type, already occupy the candidate lists. Into this constant state an event of the measured type is added (and processed) repeatedly, and the average performance is measured. The state is artificially preserved as constant throughout the measurements, so that each incoming event finds the same state. The behavior for the Alert situation does not change; the worst-case state of full-capacity candidate lists is not applicable for this situation, because each arriving AlertTrigger event leads immediately to situation detection and the event is consumed. Therefore, we expected the measured processing times for both events to be lower than the bound of the analyzer, but of the same order of magnitude. Also, the actual performance of the TrackData event processing would be quite closer to the bound of the analyzer, because it only participates in the AlertTrack situation, which is run under a controlled constant worst-case state. These measurements were performed in a scenario consisting of 300,000 events of the measured type, and each run was repeated several times in order to make sure the behavior was stable. The results are listed in Table 4 in the row labeled “Average processing time.”

In order to validate the correlation assumption between the size of the candidate lists when an event arrives, and the processing time for this event, we performed a series of additional measurements, also under a controlled constant state, when the size of the candidate lists varies. Table 4 shows the processing times when the occupancy of the candidate list capacity varies between 10 and 100 percent.

As can be seen in Table 4, the worst-case processing time bounds calculated by the analyzer for both event types are reasonable bounds. Because the measured processing times are consistently lower than these bounds and of the same order of magnitude, the bounds are rather tight. The results of Table 4 also show, as expected, a direct correlation between the occupancy of the candidate lists capacity, and the measured processing time.

Conclusion

In this paper we proposed an approach to generating real-time CEP applications that is based on extending to the real-time domain the technology embodied in IBM Active Middleware Technology tool. When implemented, this approach should facilitate the development and maintenance of real-time CEP systems.

Although our proposal is based on IBM Active Middleware Technology, the approach could also be applied to other rule-based CEP engines, with different rule definition languages. Moreover, the proposed code-generation process could also be adapted to implementation languages other than Java.

*Trademark, service mark, or registered trademark of International Business Machine Corporation in the United States, other countries, or both.
**Trademark, service mark, or registered trademark of Sun Microsystems, Inc., Intel Corporation, or Microsoft Corporation, in the United States, other countries, or both.

Cited references

Accepted for publication November 12, 2007; Published online April 18, 2008.


    About IBMPrivacyContact