IBM Research
IBM Research
Performance Modeling & Analysis

Speakers' Bureau

Talks

Many of our researchers in the area of performance modeling and analysis are available to present their work at your site.


arrow A Stastical Approach to Predictive Detection
arrow Variance Reduction Techniques for Value-at-Risk With Heavy-Tailed Risk Factors
arrow Overview of Performance Management Activities in IBM Research
arrow Predictive Detection: A New Approach to Service Level Management
arrow Automated Drilldown: A New Approach to Problem Isolation
arrow ETE: A Flexible, Scalable Approach to Measuring End to End Response Times and Their Components
arrow Event Mining: A Discovery-Based Approach to Event Management
arrow FARM: A Framework for Exploring Mining Spaces with Multiple Attributes
arrow Techniques for Improving Web Performance and Efficiently Serving Dynamic Content
arrow Web Traffic Modeling and Server Performance Analysis
arrow Optimal Stochastic Scheduling in Multiclass Parallel Queues
arrow Resource Management and Scheduling in High-Performance Parallel Systems
arrow Resource Allocation Problems in Computer Science
arrow Threshold-Based Priority Policies for Parallel-Server Systems
arrow Towards Discovery of Event Correlation Rules
arrow Using Control Theory to Achieve Service Level Objectives In Performance Management

Abstracts for Talks

Variance Reduction Techniques for Value-at-Risk With Heavy-Tailed Risk Factors (Philip Heidelberger)

The calculation of value-at-risk (VAR) for large portfolios of complex instruments is among the most demanding and widespread computational challenges facing the financial industry. Current methods for calculating VAR include comparatively fast numerical approximations - especially the linear and quadratic (delta-gamma) approximations - and more robust but more computationally demanding Monte Carlo simulation. The linear and delta-gamma methods typically rely on an assumption that the underlying market risk factors have a Gaussian distribution over the VAR horizon. But there is ample empirical evidence that market data is more accurately described by heavy-tailed distributions. Capturing heavy tails in VAR calculations has to date required highly time-consuming Monte Carlo simulation. We describe two methods for computationally efficient calculation of VAR in the presence of heavy-tailed risk factors, specifically when risk factors have a multivariate t distribution. The first method uses transform inversion to develop a fast numerical algorithm for computing the distribution of the heavy-tailed delta-gamma approximation. For greater accuracy, the second method uses the numerical approximation to guide in the design of an effective Monte Carlo variance reduction technique; the algorithm combines importance sampling and stratified sampling. This method can produce enormous speed-ups compared with standard Monte Carlo. (Joint work with Paul Glasserman and Perwez Shahabuddin, Columbia University)

Towards Discovery of Event Correlation Rules (Joseph Hellerstein)

For large installations, event management is critical to ensuring service quality by responding rapidly to exceptional situations. Key to this is having experts encode their knowledge (e.g., in rules, state machines, codebooks) about the relationship between event patterns and actions to take. Unfortunately, doing so is time-consuming and knowledge-intensive. We propose reducing this burden by using offline decision support consisting of visualizing and mining event histories to discover patterns in event data. Our experience with a wide variety of production data has identified several patterns of interest, such as event bursts and partial periodicities. Herein, we use production data to illustrate how to visualize and mine event patterns, and we describe a tool we have developed to aid in pattern discovery.

FARM: A Framework for Exploring Mining Spaces with Multiple Attributes (Joseph Hellerstein)

Mining for frequent itemsets typically involves a preprocessing step in which data with multiple attributes are grouped into transactions, and items are defined based on attribute values. We have observed that such fixed attribute mining can severely constrain the patterns that are discovered. Herein, we introduce mining spaces, a new framework for mining multi-attribute data that includes the discovery of transaction and item definitions (with the exploitation of taxonomies and functional dependencies if they are available). We prove that special downward closure properties hold for mining spaces, a result that allows us to construct efficient algorithms for mining patterns without the constraints of fixed attribute mining. We apply our algorithms to synthetic data and to real world data collected from a production computer network. The results show that by exploiting the special kinds of downward closure in mining spaces, execution times for mining can be reduced by a factor of three to four.

Using Control Theory to Achieve Service Level Objectives In Performance Management (Joseph Hellerstein)

A widely used approach to achieving service level objectives for a target system (e.g., an email server) is to add a controller that manipulates the target system's tuning parameters. We describe a methodology for designing such controllers for software systems that builds on classical control theory. The classical approach proceeds in two steps: system identification and controller design. In system identification, we construct mathematical models of the target system. Traditionally, this has been based on a first-principles approach, using detailed knowledge of the target system. Such models can be difficult to build, and too complex to validate, use, and maintain. In our methodology, a statistical (ARMA) model is fit to historical measurements of the target being controlled. These models are easier to obtain and use and allow us to apply control-theoretic design techniques to a larger class of systems. When applied to a Lotus Notes groupware server, we obtain model fits with $R¬{2}$ no lower than 75\% and as high as 98\%.
In controller design, an analysis of the models leads to a controller that will achieve the service level objectives. We report on an analysis of a closed-loop system using an integral control law with Lotus Notes as the target. The objective is to maintain a reference queue length. Using root-locus analysis from control theory, we are able to predict the occurrence (or absence) of controller-induced oscillations in the system's response. Such oscillations are undesirable since they increase variability, thereby resulting in a failure to meet the service level objective. We implement this controller for a real Lotus Notes system, and observe a remarkable correspondence between the behavior of the real system and the predictions of the analysis. This allows us to select the proper parameter for the controller from the analysis alone.

A Statistical Approach to Predictive Detection (Joseph Hellerstein)

Service providers typically define quality of service problems using threshold tests, such as ``Are HTTP operations greater than 12 per second on server XYZ?" This paper explores the feasibility of {\em predicting} violations of threshold tests. Such a capability would allow providers to take corrective actions in advance of service disruptions. Our approach estimates the probability of threshold violations for specific times in the future. We model the threshold metric (e.g., HTTP operations per second) at two levels: (1) nonstationary behavior (as is done in workload forecasting for capacity planning) and (2) stationary, time-serial dependencies. Using these models, we compute the probability of threshold violations. We assess our approach using measurements of HTTP operations per second collected from a production web server. These assessments suggest that our approach works well if (a) the actual values of predicted metrics are sufficiently distant from their thresholds and/or (b) the prediction horizon is not too far into the future.

Techniques for Improving Web Performance and Efficiently Serving Dynamic Content, Arun Iyengar

This talk presents techniques for designing Web sites which need to handle large request volumes, provide high availability, or generate significant dynamic content. These techniques include using load balancers to distribute load among multiple Web servers, Web server accelerators, and caching dynamic pages. We present the architecture of a scalable and highly available Web server accelerator we have developed which significantly improves Web server performance. We also present a publishing system we have developed for efficiently generating complex dynamic pages from simpler fragments. The talk will also describe new techniques we have developed for keeping cached dynamic data current and synchronizing caches with underlying databases. The work described in this talk has been used to improve performance at several highly accessed Web sites and been incorporated into several products. We will describe how our publishing system is being used at the official Web site for the 2000 Olympic Games. We will also illustrate how a slightly earlier version of our technology was deployed at the official Web site for the 1998 Olympic Winter Games. Performance and high availability were critical for this site not only because it was one of the most popular on the Web but also because the site was providing results which were constantly changing. The 1998 Olympic Games Web site achieved quick response times even under peak loads which set world records and was available 100% of the time.

Web Traffic Modeling and Server Performance Analysis (Mark Squillante, Li Zhang)

We present a study of the traffic patterns for a dynamic and heavily-accessed Web server environment, and the impact of such traffic patterns on Web server performance. Using the data from the official Web site during the 1998 Winter Olympic Games in Nagano, Japan, we develop traffic models to represent the process of user requests that are the input to the geographically-distributed Web server systems. Our analysis of the traffic data illustrates traffic patterns that exhibit both light-tailed and heavy-tailed behaviors. We then feed these traffic processes into one of the Web server systems modeled as a general single-server queue and analyze the waiting-time process, which models the latency encountered by user requests, a key measure of quality of service (QoS).

Optimal Stochastic Scheduling in Multiclass Parallel Queues (Mark Squillante)

In this talk we consider the problem of scheduling different classes of customers on multiple distributed servers to minimize an objective function based on per-class mean response times. This problem arises in a wide range of distributed systems, networks and applications. Within the context of our model, we observe that the optimal sequencing strategy at each of the servers is a simple static priority policy. Using this observation, we argue that the globally optimal scheduling problem reduces to finding an optimal routing matrix under this sequencing policy. We formulate the latter problem as a nonlinear programming problem and show that any interior local minimum is a global minimum, which significantly simplifies the solution of the optimization problem. In the case of Poisson arrivals, we provide an optimal scheduling strategy that also tends to minimize a function of the per-class response time variances. Applying our analysis to various static instances of the general problem leads us to rederive many results, yielding simple approximation algorithms whose guarantees match the best known results.

Resource Management and Scheduling in High-performance Parallel Systems (Mark Squillante)

Scheduling in large-scale parallel systems has been and continues to be an important and challenging research problem. Several key factors, including the increasing use of off-the-shelf clusters of workstations to build such parallel systems, have resulted in the emergence of a new class of scheduling strategies, broadly referred to as dynamic coscheduling. Unfortunately, the size of both the design and performance spaces of these emerging scheduling strategies is quite large, due in part to the numerous dynamic interactions among the different components of the parallel computing environment as well as the wide range of applications and systems that can comprise this parallel environment. This in turn makes it difficult to fully explore the benefits and limitations of the various proposed dynamic coscheduling approaches for large-scale systems solely with the use of simulation and/or experimentation. To gain a better understanding of the fundamental benefits and limitations of different dynamic coscheduling methods, we formulate a general mathematical model of this class of scheduling strategies within a unified framework that allows us to investigate a wide range of parallel environments. We derive a matrix-analytic analysis based on stochastic decomposition, which is %shown to be asymptotically exact, and a fixed-point iteration. A large number of numerical experiments were performed in part to examine the accuracy of our approach. We note that these numerical results are in excellent agreement with detailed simulation results, often within 5\% and always less than 10\%, especially given the various dynamic system, application and policy complexities. Our mathematical model and analysis is then used to explore several fundamental design and performance tradeoffs associated with the class of dynamic coscheduling policies across a broad spectrum of parallel computing environments.

Resource Allocation Problems in Computer Science (Joel Wolf (local))

Resource allocation problems (RAPs) form a class of non-linear combinatorial optimization problems that arise quite naturally and frequently in computer performance optimization. One can find in the literature, for example, RAP applications on such diverse computer science topics as buffer management, query optimization, load balancing, scheduling, communications and chip placement. RAPs can be categorized by the fact that they possess a (generally single) constraint of a very simple nature. Specifically, there is a fixed amount of some resource, which must be allocated amongst some number of competing activities. The resource might be some fixed number of processors or workstations, a memory, cache or buffer of some fixed size, a total amount of time, bandwidth, or the like. The objective function itself might be any standard performance metric, such as average response time, throughput, cache hit rate, maximum task completion time, and so on. Because of their widespread applicability to performance optimization problems (and their inherent beauty), we believe that techniques for solving RAPs are a useful addition to any computer scientist's knapsack. Therefore, in this talk we shall learn how to recognize and solve some of the most commonly occurring RAPs. We emphasize that this talk will NOT require much background in optimization theory.

Threshold-Based Priority Policies for Parallel-Server Systems (Cathy Xia)

Across a wide spectrum of parallel and distributed computer systems, it is quite common that a job can be processed more efficiently on one server than on another. However, ``greedily'' scheduling jobs where they are processed most efficiently could lead to unbalanced workload or even result in an unstable system. Here we propose a general class of threshold-based priority queueing policies, along with a simple algorithm that determines where (which queues) to set thresholds. To derive the optimal threshold values, we develop approximations for major performance measures, making use of a combination of priority queue and server vacation models.

 
Privacy Terms of use Contact IBM www.research Research Sites Page Contact