 |
Performance
Modeling & Analysis
|
|
|
|
Speakers' Bureau
|
Talks |
|
Many
of our researchers in the area of performance modeling and
analysis are available to present their work at your site.
|
|
|
Abstracts for Talks
Variance Reduction Techniques for Value-at-Risk With
Heavy-Tailed Risk Factors (Philip Heidelberger)
The
calculation of value-at-risk (VAR) for large portfolios of complex instruments
is among the most demanding and widespread computational challenges
facing the financial industry. Current methods for calculating VAR include
comparatively fast numerical approximations - especially the linear
and quadratic (delta-gamma) approximations - and more robust but more
computationally demanding Monte Carlo simulation. The linear and delta-gamma
methods typically rely on an assumption that the underlying market risk
factors have a Gaussian distribution over the VAR horizon. But there
is ample empirical evidence that market data is more accurately described
by heavy-tailed distributions. Capturing heavy tails in VAR calculations
has to date required highly time-consuming Monte Carlo simulation. We
describe two methods for computationally efficient calculation of VAR
in the presence of heavy-tailed risk factors, specifically when risk
factors have a multivariate t distribution. The first method uses transform
inversion to develop a fast numerical algorithm for computing the distribution
of the heavy-tailed delta-gamma approximation. For greater accuracy,
the second method uses the numerical approximation to guide in the design
of an effective Monte Carlo variance reduction technique; the algorithm
combines importance sampling and stratified sampling. This method can
produce enormous speed-ups compared with standard Monte Carlo. (Joint
work with Paul Glasserman and Perwez Shahabuddin, Columbia University)
Towards Discovery of Event Correlation Rules
(Joseph Hellerstein)
For
large installations, event management is critical to ensuring service
quality by responding rapidly to exceptional situations. Key to this
is having experts encode their knowledge (e.g., in rules, state machines,
codebooks) about the relationship between event patterns and actions
to take. Unfortunately, doing so is time-consuming and knowledge-intensive.
We propose reducing this burden by using offline decision support consisting
of visualizing and mining event histories to discover patterns in event
data. Our experience with a wide variety of production data has identified
several patterns of interest, such as event bursts and partial periodicities.
Herein, we use production data to illustrate how to visualize and mine
event patterns, and we describe a tool we have developed to aid in pattern
discovery.
FARM: A Framework for Exploring Mining Spaces with Multiple Attributes
(Joseph Hellerstein)
Mining
for frequent itemsets typically involves a preprocessing step in which
data with multiple attributes are grouped into transactions, and items
are defined based on attribute values. We have observed that such fixed
attribute mining can severely constrain the patterns that are discovered.
Herein, we introduce mining spaces, a new framework for mining multi-attribute
data that includes the discovery of transaction and item definitions
(with the exploitation of taxonomies and functional dependencies if
they are available). We prove that special downward closure properties
hold for mining spaces, a result that allows us to construct efficient
algorithms for mining patterns without the constraints of fixed attribute
mining. We apply our algorithms to synthetic data and to real world
data collected from a production computer network. The results show
that by exploiting the special kinds of downward closure in mining spaces,
execution times for mining can be reduced by a factor of three to four.
Using Control Theory to Achieve Service Level Objectives In Performance
Management (Joseph Hellerstein)
A
widely used approach to achieving service level objectives for a target
system (e.g., an email server) is to add a controller that manipulates
the target system's tuning parameters. We describe a methodology for
designing such controllers for software systems that builds on classical
control theory. The classical approach proceeds in two steps: system
identification and controller design. In system identification, we construct
mathematical models of the target system. Traditionally, this has been
based on a first-principles approach, using detailed knowledge of the
target system. Such models can be difficult to build, and too complex
to validate, use, and maintain. In our methodology, a statistical (ARMA)
model is fit to historical measurements of the target being controlled.
These models are easier to obtain and use and allow us to apply control-theoretic
design techniques to a larger class of systems. When applied to a Lotus
Notes groupware server, we obtain model fits with $R¬{2}$ no lower than
75\% and as high as 98\%.
In controller design, an analysis of the models leads to a controller
that will achieve the service level objectives. We report on an analysis
of a closed-loop system using an integral control law with Lotus Notes
as the target. The objective is to maintain a reference queue length.
Using root-locus analysis from control theory, we are able to predict
the occurrence (or absence) of controller-induced oscillations in the
system's response. Such oscillations are undesirable since they increase
variability, thereby resulting in a failure to meet the service level
objective. We implement this controller for a real Lotus Notes system,
and observe a remarkable correspondence between the behavior of the
real system and the predictions of the analysis. This allows us to select
the proper parameter for the controller from the analysis alone.
A Statistical Approach to Predictive Detection (Joseph
Hellerstein)
Service providers
typically define quality of service problems using threshold tests,
such as ``Are HTTP operations greater than 12 per second on server XYZ?"
This paper explores the feasibility of {\em predicting} violations of
threshold tests. Such a capability would allow providers to take corrective
actions in advance of service disruptions. Our approach estimates the
probability of threshold violations for specific times in the future.
We model the threshold metric (e.g., HTTP operations per second) at
two levels: (1) nonstationary behavior (as is done in workload forecasting
for capacity planning) and (2) stationary, time-serial dependencies.
Using these models, we compute the probability of threshold violations.
We assess our approach using measurements of HTTP operations per second
collected from a production web server. These assessments suggest that
our approach works well if (a) the actual values of predicted metrics
are sufficiently distant from their thresholds and/or (b) the prediction
horizon is not too far into the future.
Techniques
for Improving Web Performance and Efficiently Serving Dynamic Content,
Arun Iyengar
This
talk presents techniques for designing Web sites which need to handle
large request volumes, provide high availability, or generate significant
dynamic content. These techniques include using load balancers to distribute
load among multiple Web servers, Web server accelerators, and caching
dynamic pages. We present the architecture of a scalable and highly
available Web server accelerator we have developed which significantly
improves Web server performance. We also present a publishing system
we have developed for efficiently generating complex dynamic pages from
simpler fragments. The talk will also describe new techniques we have
developed for keeping cached dynamic data current and synchronizing
caches with underlying databases. The work described in this talk has
been used to improve performance at several highly accessed Web sites
and been incorporated into several products. We will describe how our
publishing system is being used at the official Web site for the 2000
Olympic Games. We will also illustrate how a slightly earlier version
of our technology was deployed at the official Web site for the 1998
Olympic Winter Games. Performance and high availability were critical
for this site not only because it was one of the most popular on the
Web but also because the site was providing results which were constantly
changing. The 1998 Olympic Games Web site achieved quick response times
even under peak loads which set world records and was available 100%
of the time.
Web Traffic Modeling and Server Performance Analysis
(Mark Squillante, Li Zhang)
We present
a study of the traffic patterns for a dynamic and heavily-accessed Web
server environment, and the impact of such traffic patterns on Web server
performance. Using the data from the official Web site during the 1998
Winter Olympic Games in Nagano, Japan, we develop traffic models to
represent the process of user requests that are the input to the geographically-distributed
Web server systems. Our analysis of the traffic data illustrates traffic
patterns that exhibit both light-tailed and heavy-tailed behaviors.
We then feed these traffic processes into one of the Web server systems
modeled as a general single-server queue and analyze the waiting-time
process, which models the latency encountered by user requests, a key
measure of quality of service (QoS).
Optimal Stochastic Scheduling in Multiclass Parallel
Queues (Mark Squillante)
In
this talk we consider the problem of scheduling different classes of
customers on multiple distributed servers to minimize an objective function
based on per-class mean response times. This problem arises in a wide
range of distributed systems, networks and applications. Within the
context of our model, we observe that the optimal sequencing strategy
at each of the servers is a simple static priority policy. Using this
observation, we argue that the globally optimal scheduling problem reduces
to finding an optimal routing matrix under this sequencing policy. We
formulate the latter problem as a nonlinear programming problem and
show that any interior local minimum is a global minimum, which significantly
simplifies the solution of the optimization problem. In the case of
Poisson arrivals, we provide an optimal scheduling strategy that also
tends to minimize a function of the per-class response time variances.
Applying our analysis to various static instances of the general problem
leads us to rederive many results, yielding simple approximation algorithms
whose guarantees match the best known results.
Resource Management and Scheduling in High-performance
Parallel Systems (Mark Squillante)
Scheduling
in large-scale parallel systems has been and continues to be an important
and challenging research problem. Several key factors, including the
increasing use of off-the-shelf clusters of workstations to build such
parallel systems, have resulted in the emergence of a new class of scheduling
strategies, broadly referred to as dynamic coscheduling. Unfortunately,
the size of both the design and performance spaces of these emerging
scheduling strategies is quite large, due in part to the numerous dynamic
interactions among the different components of the parallel computing
environment as well as the wide range of applications and systems that
can comprise this parallel environment. This in turn makes it difficult
to fully explore the benefits and limitations of the various proposed
dynamic coscheduling approaches for large-scale systems solely with
the use of simulation and/or experimentation. To gain a better understanding
of the fundamental benefits and limitations of different dynamic coscheduling
methods, we formulate a general mathematical model of this class of
scheduling strategies within a unified framework that allows us to investigate
a wide range of parallel environments. We derive a matrix-analytic analysis
based on stochastic decomposition, which is %shown to be asymptotically
exact, and a fixed-point iteration. A large number of numerical experiments
were performed in part to examine the accuracy of our approach. We note
that these numerical results are in excellent agreement with detailed
simulation results, often within 5\% and always less than 10\%, especially
given the various dynamic system, application and policy complexities.
Our mathematical model and analysis is then used to explore several
fundamental design and performance tradeoffs associated with the class
of dynamic coscheduling policies across a broad spectrum of parallel
computing environments.
Resource Allocation Problems in Computer Science
(Joel Wolf (local))
Resource
allocation problems (RAPs) form a class of non-linear combinatorial
optimization problems that arise quite naturally and frequently in computer
performance optimization. One can find in the literature, for example,
RAP applications on such diverse computer science topics as buffer management,
query optimization, load balancing, scheduling, communications and chip
placement. RAPs can be categorized by the fact that they possess a (generally
single) constraint of a very simple nature. Specifically, there is a
fixed amount of some resource, which must be allocated amongst some
number of competing activities. The resource might be some fixed number
of processors or workstations, a memory, cache or buffer of some fixed
size, a total amount of time, bandwidth, or the like. The objective
function itself might be any standard performance metric, such as average
response time, throughput, cache hit rate, maximum task completion time,
and so on. Because of their widespread applicability to performance
optimization problems (and their inherent beauty), we believe that techniques
for solving RAPs are a useful addition to any computer scientist's knapsack.
Therefore, in this talk we shall learn how to recognize and solve some
of the most commonly occurring RAPs. We emphasize that this talk will
NOT require much background in optimization theory.
Threshold-Based Priority Policies for Parallel-Server
Systems (Cathy Xia)
Across
a wide spectrum of parallel and distributed computer systems, it is
quite common that a job can be processed more efficiently on one server
than on another. However, ``greedily'' scheduling jobs where they are
processed most efficiently could lead to unbalanced workload or even
result in an unstable system. Here we propose a general class of threshold-based
priority queueing policies, along with a simple algorithm that determines
where (which queues) to set thresholds. To derive the optimal threshold
values, we develop approximations for major performance measures, making
use of a combination of priority queue and server vacation models.
|