IBM Israel
Skip to main content
Search IBM Research
   Home  |  Products & services  |  Support & downloads  |  My account
Select a Country Select a country
IBM Research Home IBM Research Home
IBM Haifa Labs Homepage IBM Haifa Labs Home
IBM Haifa Labs Homepage IBM Haifa Labs Leadership Seminars

Testing and Debugging

Workshop Homepage
Author Instructions
Important Dates
Workshop Co-chairs
Program Committee
Paper Submission

Benchmark and Framework for Encouraging Research on Multi-threaded Testing Tools (Abstract)
(Klaus Havelund, Scott Stoller and Shmuel Ur)

A problem that has been getting prominence in testing is that of looking for intermittent bugs. Multi-threaded code is becoming very common, mostly on the server side. As there is no silver bullet solution, research focuses on a variety of partial solutions. In this paper (invited by PADTAD 2003) we outline a proposed project to facilitate research. The project goals are as follows. The first goal is to create a benchmark that can be used to evaluate different solutions. The benchmark, apart from containing programs with documented bugs, will include other artifacts, such as traces, that are useful for evaluating some of the technologies. The second goal is to create a set of tools with open API's that can be used to check ideas without building a large system. For example an instrumentor will be available, that could be used to test temporal noise making heuristics. The third goal is to create a focus for the research in this area around which a community of people who try to solve similar problems with different techniques, could congregate.

Concurrent Bug Patterns and How to Test Them (Abstract)
(Eitan Farchi, Yarden Nir and Shmuel Ur)

We present and categorize a taxonomy of concurrent bug patters. We then demonstrate how the taxonomy can be used to create new heuristics for ConTest that improve its bug finding ability. Other testing techniques such as coverage can be developed from this new bug taxonomy. We also demonstrate how concurrent bug patterns can be derived from concurrent design patterns. Further research is required to complete the concurrent bug taxonomy.

A Classification of Concurrency Failures in Java Components (Abstract)
(Brad Long and Paul Strooper)

The Java programming language supports concurrency. Concurrent programs are hard to test due to their inherent non-determinism. This paper presents a classification of concurrency failures based on a petri-net model of Java concurrency. By systematically analyzing the model, we come up with a complete classification of concurrency failures. We plan to use the classification to determine suitable verification and test strategies that can be used for each of the different failure classes. We illustrate how the model can be used to guide test case selection by constructing concurrency flow graphs for each method in the component under test. We use a producer-consumer monitor to discuss how this approach could be applied in practice in conjunction with an existing tool for deterministic test execution.

Efficient On-the-Fly Data Race Detection in Multithreaded C++ Programs (Abstract)
(Eli Pozniansky and Assaf Schuster)

Data race detection is highly essential for debugging multithreaded programs and assuring their correctness. Nevertheless, there is no single universal technique capable of handling the task efficiently, since the data race detection problem is computationally hard in the general case. Thus, all currently available tools, when applied to some general case program, usually result in excessive false alarms or in a large number of undetected races.

Another major drawback of currently available tools is that they are restricted, for performance reasons, to detection units of fixed size. Thus, they all suffer from the same problemóchoosing a small unit might result in missing some of the data races, while choosing a large one might lead to false detection.

We present a novel testing tool, called MultiRace, which combines improved versions of Djit and Locksetótwo very powerful on-the-fly algorithms for dynamic detection of apparent data races. Both extended algorithms detect races in multithreaded programs that may execute on weak consistency systems, and may use two-way as well as global synchronization primitives.

By employing novel technologies, MultiRace adjusts its detection to the native granularity of objects and variables in the program under examination. In order to monitor all accesses to each of the shared locations, MultiRace instruments the C++ source code of the program. It lets the user fine-tune the detection process, but otherwise is completely automatic and transparent.

This paper describes the algorithms employed in Multi-Race, gives highlights of its implementation issues, and suggests some optimizations. It shows that the overheads imposed by MultiRace are often much smaller (orders of magnitude) than those obtained by other existing tools.

Heuristics for Finding Concurrent Bugs (Abstract)
(Yosi Ben-Asher, Yaniv Eytani and Eitan Farchi)

This paper presents new heuristics that increase the probability of manifesting concurrent bugs. The heuristics are based on cross-run monitoring. A contended shared variable is chosen and random context switching is performed at accesses to that variable. The relative strength of the new heuristics is analyzed. In comparison to previous works, our heuristics increase the frequency of bug manifestation. In addition, the new heuristics were able to find bugs that previous methods did not discover.

Replay Debugging of Real-Time Systems Using Time Machines (Abstract)
(Henrik Thane, Daniel Sundmark, Joel Huselius and Anders Pettersson)

In this paper we present a new approach to deterministic replay using standard components. Our method facilitates cyclic debugging of real-time systems with industry standard real-time operating systems using industry standard debuggers. The method is based on a number of new techniques: A new marker for deterministic differentiation between loop iterations (as well as subroutine calls) for deterministic reproduction of interrupts and task preemptions, an algorithm for finding well-defined starting points of replay sessions, as well as a technique for using conditional breakpoints in standard debuggers to replay the target system.

We also propose and discuss different methods for deterministic monitoring, and provide benchmarking results from an industrial case study demonstrating the feasibility of our method.

Previously published solutions to the problem of debugging real-time systems have based on the concept of deterministic replay: where significant system events like task-switches of multitasking software and external inputs are recorded during run-time, and later replayed (re-executed) off-line. Previous works have been based on either non-standard hardware, specially designed compilers or modified real-time operating systems. The reliance on non-standard components has limited the success of the approach. Even though this idea has been around for 20 years, no industrial application for debugging of real-time systems of the method has been presented.

Choosing Among Alternative Pasts (Abstract)
(Marina Biberstein, Eitan Farchi and Shmuel Ur)

The main problem with testing concurrent programs is their non-determinism: two executions of such a program may yield different results. Most of the work in the field of concurrent testing has been focused on detecting race conditions. However, race conditions have a low probability of manifesting themselves, and even when they do it is not always an indication of a fault. Recently, good results were produced with a different approach, which generates different interleavings at runtime using embedded sleep statements. This approach has the advantage of finding more problems than the traditional one, and never generates false alarms.

This paper proposed a totally different approach for the generation of interleavings. In our technique, the execution of the application is shadowed, which allows us to calculate restrictions on the relative order of the events., for every assignment, the value is chosen randomly among the alternative values consistent with the restrictions. The restrictions are then updated based on the value chosen.

The paper presents an online technique for finding the set of possible legal substitutions. The problem is farm from simple due to the fact that past value substitutions affect future ones. Our solution is computationally intensive and, therefore, impractical as is. However, insights gained from it lead to practical heuristics for operating the embedded sleep statements.

Experiences and Lessons Learned with a Portable Interface to Hardware Performance Counters (Abstract)
(Jack Dongarra, Kevin London, Shirley Moore, Philip Mucci, Patricia Teller, Dan Terpstra, Haihang You and Min Zhou)

Over the past three years, the PAPI project has developed a portable library interface to hardware performance counters that has become widely adopted and used for measuring and tuning application performance. PAPI provides not only a standard set of routines for accessing the counters, but also a common set of metrics deemed most relevant for performance evaluation. The PAPI library may be used either directly or via a number of third-party performance analysis tools that assist the user in collecting and interpreting PAPI data. This paper discusses a number of issues that have arisen during development and deployment of PAPI. Collaborations with tool developers have helped better define library functionality and performance requirements for supporting higher level tools. Analysis of PAPI data collected from a range of applications on different platforms has raised subtle issues about exactly what events are being measured and how the data should be interpreted. The need to reduce the overhead and resulting perturbation of performance measurement has led to the development of statistical sampling approaches to collection of hardware counter data. These experiences have motivated a re-design of PAPI which will support a broader and more flexible approach to accessing hardware performance data while reducing library overhead.

Early Error Detection in Industrial Strength Cache Coherence Protocols Using SQL (Abstract)
(M. Subramaniam)

A novel, table-driven approach for designing industrial strength cache coherence protocols based on relational database technology and the query language SQL is described. The approach is fully automatic and allows protocol errors to be detected early in the protocol development cycle. The protocols are specified using several interacting multi-input, and multi-output controller state machines represented as database tables. It is shown how the relational database technology can be effectively used to automatically generate and manipulate these controller tables through the different phases of the protocol development cycle. Protocol scenarios specified using compact SQL constraints are solved to automatically generate controller tables. SQL constraints specified over controller tables are used to statically check protocol properties including absence of deadlocks and establish several other protocol invariants. The debugged controller tables are automatically mapped to hardware using SQL table operations while preserving the established protocol properties. The proposed approach is deployed at Fujitsu System Technology Division, in the design of their next generation multiprocessor system and has been highly successful in reducing protocol development time and in early discovery of several protocol design errors.

A Case Study of Selected SPLASH-2 Applications and the SBT Debugging Tool (Abstract)
(Ernesto Novillo and Paul Lu)

SBT is portable library and tool for on-line debugging and performance monitoring of shared-memory parallel programs using the single-program-multiple-data (SPMD) model of parallelism. SPMD programs often use barriers to synchronize threads of execution and to delimit the start and end of different phases of computation. Through its useful barrier constructs, dynamic performance warnings, and integration with hardware event counter libraries, SBT helps programmers localize deadlocks and performance bottlenecks in their parallel programs.

In order to demonstrate SBT's applicability and usefulness, we present a simple, case study performance analysis using three programs from the SPLASH-2 suite. In addition, we quantify the overhead incurred by the programs when they are monitored with SBT, and conclude that the cost of the instrumentation is negligible.

  About IBM  |  Privacy  |  Terms of use  |  Contact