IBM Research

Computer Science

Innovation Matters

Performance Modeling and Analysis

AMBIENCE - Automatic Model Building using InferENCE

For several decades, performance modeling has been of great theoretical and practical importance in the design, engineering and optimization of computer and communication systems and applications. A modeling approach is particularly efficient in providing architects and engineers with qualitative and quantitative insights about the system under consideration.

Two streams of traditional performance modeling methods appear in the literature. The first applies inference techniques, such as neural networks, learning theory or statistical inference, to linear systems. This approach is inherently weak at capturing non-linear system behavior. The second uses queueing network models. The primary advantage of a queueing model is that it captures the fundamental relationship between performance and capacity. However, traditional modeling with queueing networks requires knowledge of the service demands of each type of request at each device. In real systems, such service demands can be technically very difficult to measure. Even if the instrumentation can be done, it is very costly, time-consuming and intrusive to the system. A principal difficulty in building a valid queueing network model of an IT system is the fine-tuning of the service requirements.

Performance modeling and analysis framework for on-demand system infrastructure

In this project, we developed an optimization-based inference technique to tackle this important yet highly challenging problem. It is formulated as a parameter estimation problem using a general Kelly-type queueing network. A general Kelly-type queueing network has the property that its stationary queue length distributions have a product form. This allows a clean, analytical formulation of the problem. A typical on-demand system processes different types of requests from clients. The network dispatcher (ND) routes each request to one of the front-end servers following some dispatching policy. Some requests are also processed in the back-end server. We consider the case where aggregate and end-to-end measurement data (i.e., system throughput, utilization of the servers, and end-to-end response times) are available. Note that such data are typically much easier to obtain than model parameters such as service requirements. Each set of measurements taken while the working environment (load, scripts, etc.) is held constant is referred to as an experiment.
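To make the link between service requirements and the measurable quantities concrete, here is a minimal sketch of a forward model. It assumes an open multi-class network in which every station behaves like a processor-sharing queue, a standard product-form special case; the page does not state which service disciplines the actual model uses. The function name predict, the two-tier layout and all numbers are hypothetical.

```python
from typing import List, Tuple


def predict(arrival_rates: List[float],
            demands: List[List[float]]) -> Tuple[List[float], List[float]]:
    """Predict utilizations and mean end-to-end response times.

    arrival_rates[j] : throughput of request type j (requests/second)
    demands[i][j]    : mean service requirement of type j at station i (seconds)

    Assumption (not from the source): every station is a processor-sharing
    queue in an open product-form network, so the mean time a type-j request
    spends at station i is demands[i][j] / (1 - utilization[i]).
    """
    # Utilization law: rho_i = sum_j lambda_j * s_ij
    utilizations = [sum(lam * s for lam, s in zip(arrival_rates, row))
                    for row in demands]
    if any(u >= 1.0 for u in utilizations):
        raise ValueError("a station is saturated (utilization >= 1)")

    # End-to-end response time of type j: sum of per-station residence times.
    responses = [sum(row[j] / (1.0 - u) for row, u in zip(demands, utilizations))
                 for j in range(len(arrival_rates))]
    return utilizations, responses


# Hypothetical example: two request types, a front-end tier and a back-end tier.
rates = [10.0, 5.0]                 # requests/second for type 0 and type 1
service = [[0.02, 0.04],            # front-end demands (seconds)
           [0.00, 0.06]]            # back-end demands; type 0 never reaches it
utils, resp = predict(rates, service)
print("utilizations:", utils)
print("response times:", resp)
```

The inference problem described above runs in the opposite direction: the utilizations, throughputs and response times are measured, and the entries of the demands matrix are the unknowns.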

First, we formulated the overall problem as a set of tractable, quadratic programs, one for each set of end-to-end measurements. Then, based upon that formulation, we developed a novel and highly robust method for solving the problem. The robustness of the method means the model performs well in the presence of noisy data, and further is able to detect and remove outlying experiments within the procedure itself. This robustness comes at a very low computational cost.
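The idea of fitting service requirements to end-to-end measurements and discarding outlying experiments can be illustrated with a deliberately simplified sketch. It is not the formulation used in the project: it assumes a single request type, two tiers, the processor-sharing relation R = s_front/(1 - rho_front) + s_back/(1 - rho_back), an unconstrained least-squares fit (the simplest quadratic program), and a basic drop-the-worst-residual rule. The names fit_demands and fit_with_trimming, the tolerance and the data are all hypothetical.

```python
from typing import List, Tuple

# One experiment: (front-end utilization, back-end utilization,
#                  measured mean end-to-end response time in seconds).
Experiment = Tuple[float, float, float]


def fit_demands(experiments: List[Experiment]) -> Tuple[float, float]:
    """Least-squares estimate of (s_front, s_back) via the 2x2 normal equations."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for rho_f, rho_b, resp in experiments:
        x1, x2 = 1.0 / (1.0 - rho_f), 1.0 / (1.0 - rho_b)
        a11 += x1 * x1
        a12 += x1 * x2
        a22 += x2 * x2
        b1 += x1 * resp
        b2 += x2 * resp
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("experiments do not vary enough to identify both demands")
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


def residual(exp: Experiment, s_front: float, s_back: float) -> float:
    """Measured minus predicted response time for one experiment."""
    rho_f, rho_b, resp = exp
    return resp - (s_front / (1.0 - rho_f) + s_back / (1.0 - rho_b))


def fit_with_trimming(experiments: List[Experiment], tol: float = 0.01,
                      min_experiments: int = 3):
    """Refit repeatedly, dropping the worst experiment while its error exceeds tol."""
    kept, dropped = list(experiments), []
    while True:
        s_front, s_back = fit_demands(kept)
        worst = max(kept, key=lambda e: abs(residual(e, s_front, s_back)))
        if abs(residual(worst, s_front, s_back)) <= tol or len(kept) <= min_experiments:
            return s_front, s_back, dropped
        kept.remove(worst)
        dropped.append(worst)


# Synthetic, noise-free data generated from known demands (hypothetical values).
TRUE_FRONT, TRUE_BACK = 0.02, 0.03
loads = [(0.2, 0.1), (0.3, 0.2), (0.4, 0.2), (0.5, 0.3), (0.6, 0.4), (0.7, 0.5)]
data = [(f, b, TRUE_FRONT / (1 - f) + TRUE_BACK / (1 - b)) for f, b in loads]
data[2] = (0.4, 0.2, 3.0 * data[2][2])   # corrupt one experiment on purpose

s_front, s_back, dropped = fit_with_trimming(data)
print("estimated demands:", s_front, s_back)
print("dropped experiments:", dropped)
```

A production formulation would typically also constrain the demands to be non-negative and use a more principled outlier test; both are omitted here to keep the sketch short.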

After the model is calibrated, we can use the model to do what-if analysis and capacity planning. We can help answer questions such as: How many users can the system support with the current infrastructure? What level of service quality is being delivered for each service? How fast can the site architecture be scaled up, or down? What components should be upgraded? What are the potential bottlenecks?
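Two of these questions, the potential bottleneck and the supportable load, can be sketched as follows once service requirements are known. The sketch reuses the assumption of an open network of processor-sharing stations, measures load as a scale factor on a baseline traffic mix rather than as a number of users, and uses hypothetical names (bottleneck, max_load_scale), demands and response-time targets.

```python
from typing import List, Optional, Tuple


def evaluate(rates: List[float],
             demands: List[List[float]]) -> Tuple[List[float], Optional[List[float]]]:
    """Return (utilizations, response times); response times are None if saturated."""
    utils = [sum(r * s for r, s in zip(rates, row)) for row in demands]
    if max(utils) >= 1.0:
        return utils, None
    resp = [sum(row[j] / (1.0 - u) for row, u in zip(demands, utils))
            for j in range(len(rates))]
    return utils, resp


def bottleneck(rates: List[float], demands: List[List[float]]) -> int:
    """Index of the station with the highest utilization for this traffic mix."""
    utils, _ = evaluate(rates, demands)
    return max(range(len(utils)), key=lambda i: utils[i])


def max_load_scale(rates: List[float], demands: List[List[float]],
                   targets: List[float], iterations: int = 60) -> float:
    """Largest multiple of the baseline rates that still meets every target."""
    def meets_targets(k: float) -> bool:
        _, resp = evaluate([k * r for r in rates], demands)
        return resp is not None and all(r <= t for r, t in zip(resp, targets))

    low, high = 0.0, 1.0
    while meets_targets(high):       # grow the bracket until a target is violated
        high *= 2.0
    for _ in range(iterations):      # bisection; response times grow with load
        mid = (low + high) / 2.0
        if meets_targets(mid):
            low = mid
        else:
            high = mid
    return low


# Hypothetical calibrated model: two request types, front-end and back-end tiers.
base_rates = [10.0, 5.0]            # requests/second
calibrated = [[0.02, 0.04],         # front-end demands (seconds)
              [0.00, 0.06]]         # back-end demands (seconds)
sla = [0.2, 0.5]                    # response-time targets per type (seconds)

hot_spot = bottleneck(base_rates, calibrated)
scale = max_load_scale(base_rates, calibrated, sla)
print("bottleneck station:", hot_spot)
print("supportable load multiple:", scale)
```

The same functions can be rerun with modified demands, for example halving a row to represent a faster server, to compare upgrade options.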

In an on-demand system infrastructure, real-time system measurement data flow continuously into the modeling component to keep the models and their parameters up to date. Performance predictions and appropriate system control actions are generated from the models. The system scheduling and admission control policies, as well as the dispatching policies at the network dispatcher (ND), are adjusted accordingly to keep the system operating in an optimal state.

We have applied our modeling technique to several pilot engagements and obtained successful results.

Selected Publications

A Comprehensive Toolset for Workload Characterization, Performance Modeling and On-line Control. Li Zhang, Zhen Liu, Anton Riabov, Monty Schulman, Cathy Xia and Fan Zhang. In Performance TOOLS Conference 2003.

A smart hill-climbing algorithm for application server configuration, Bowei Xi, Zhen Liu, Mukund Raghavachari, Cathy H. Xia and Li Zhang, WWW 2004.

Analysis of Performance Impact of Drill-down Techniques for Web Traffic Models, Cathy H. Xia, Zhen Liu, Mark S. Squillante, Li Zhang, and Naceur Malouch, Proceedings of the 18th International Teletraffic Congress (ITC18), Berlin, Germany 2003.

Parameter Inference of Queueing Models for IT Systems using End-to-End Measurements, Zhen Liu, Laura Wynter, Cathy H. Xia and Fan Zhang, Performance Evaluation, to appear.

Profile-based Traffic Characterization of Commercial Web Sites, Zhen Liu, Mark S. Squillante, Cathy H. Xia, Shun-Zheng Yu, Li Zhang, Proceedings of the 18th International Teletraffic Congress (ITC18), Berlin, Germany 2003.

Web Workload Service Requirement Analysis: A Queueing Network Approach, Li Zhang, Cathy H. Xia, Mark S. Squillante, W. Nat Mills, MASCOTS 2002.

Innovators Corner
Zhen Liu

What is the most exciting potential future use for the work you're doing?
With the increasingly successful on-demand hosting services at IBM, capacity planning, with or without quality-of-service guarantees, becomes crucial to the profitability and customer satisfaction of our service engagements. Performance modeling methodologies are key elements of capacity planning. The development of state-of-the-art predictive modeling technologies allows IBM to achieve cost savings while improving customer satisfaction. Through interactions with many performance engineers, I realized that the biggest pain point of performance modeling is the parameter tuning of performance models. It is very time-consuming to obtain a valid model with appropriate parameterization. Together with my team members, I started this adventure of developing a brand-new approach to performance modeling: automating the model calibration process. This research project has given rise to the tool AMBIENCE (for Automatic Model Building using InferENCE), which is now being deployed in a variety of IBM internal and external engagements.

What is the most interesting part of your research?
It is exciting to be in an environment where we can develop fundamental theories and apply them to practical systems and business engagements. The most interesting part I found in performance modeling research is that it bridges the gap between mathematics and computer systems.

What inspired you to go into this field?
When I was starting my Ph.D. thesis, parallel and distributed computing were hot topics. After looking into these areas, I discovered that performance issues were critical and central to those fields, whether in programming models or in computer architecture. Traditional performance models are not well suited to characterizing synchronization in such systems. I thus decided to develop a performance modeling framework for that area and, two years later, proposed an extended queueing network framework, referred to as Synchronized Queueing Networks, for the quantitative modeling of the dynamics of parallel programs.

What is your favorite invention of all time?
LaTeX as a text processing system. I have used different kinds of text processing systems; none of them is comparable to LaTeX in terms of ease of use and the quality of the presentation of mathematics.

Team Members
Research Team
Carlos Fonseca
Zhen Liu
Laura Wynter
Cathy Xia
Fan Zhang
Li Zhang

Related Links
Discipline: Computer Science
Research Area: Performance Modeling and Analysis
Research Site: Watson
