Proceedings publication: Springer Lecture Notes in Computer Science


Haifa Verification Conference 2008
IBM Haifa Labs
October 27-30, 2008. Organized by IBM Haifa Research Lab
Keynote Talk: Hazards of Verification
Daniel Jackson, MIT
Verification technology has taken great strides recently, but there are hazards in the use of this technology, in particular by encouraging unwarranted confidence. I'll explain where some of these hazards lie, and some approaches that address them.
Efficient Decision Procedure for Bounded Integer Nonlinear Operations Using SMT(LIA)
Malay Ganai
For the verification of complex designs, one often needs to encode verification conditions containing integer nonlinear constraints. Due to the undecidability of the problem, one usually considers bounded integers and then either linearizes the formula into an SMT(LIA) problem (i.e., the theory of linear integer arithmetic with Boolean constraints) or bit-blasts it into a SAT problem. We present a novel way of linearizing those constraints and then show how this encoding to SMT(LIA) can be integrated into an incremental lazy bounding and refinement procedure (LBR) that leverages the success of state-of-the-art SMT(LIA) solvers. The most important feature of our LBR procedure is that the formula does not need to be re-encoded at every step of the procedure; only bounds on variables need to be asserted/retracted, which is very efficiently supported by recent SMT(LIA) solvers. In a series of controlled experiments, we show the effectiveness of our linearization encoding and LBR procedure in reducing the SMT solve time. We observe similar effectiveness of the LBR procedure when used in a software verification framework applied to industry benchmarks.
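The linearization idea underlying such encodings can be sketched as follows. This is a generic bit-expansion of a bounded product into linear terms, not the paper's exact encoding; the function name and the numeric check are illustrative only.

```python
# Sketch: turning bounded integer multiplication x*y into linear arithmetic,
# as commonly done when reducing nonlinear constraints to SMT(LIA).
# With y bounded to nbits, x*y = sum_i 2^i * b_i * x, where b_i are the bits
# of y. Each term 2^i * b_i * x is linear once the Boolean b_i is fixed, so
# in an SMT encoding one introduces auxiliary variables z_i = b_i * x
# constrained by standard big-M inequalities. Here we only check the
# arithmetic identity the encoding relies on.

def bit_expand_product(x, y, nbits):
    """Recompute x*y from the bit decomposition of y."""
    bits = [(y >> i) & 1 for i in range(nbits)]
    terms = [(1 << i) * b * x for i, b in enumerate(bits)]
    return sum(terms)

# The identity holds for any y representable in nbits:
assert bit_expand_product(13, 11, 8) == 13 * 11
```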
Synthesizing Test Models from Test Cases
Antti Jääskeläinen, Antti Kervinen, Mika Katara, Antti Valmari and Heikki Virtanen
In this paper we describe a methodology for synthesizing test models from test cases. The context of our approach is model-based graphical user interface (GUI) testing of smartphone applications. To facilitate the deployment of model-based testing practices, existing assets in test automation should be utilized. While companies are interested in the benefits of new approaches, they may have already invested heavily in conventional test suites. The approach presented in this paper enables using such suites to create complex test models that should have better defect detection capability. The synthesis is illustrated with examples from two small case studies conducted using real test cases from industry. Our approach is semi-automatic, requiring user interaction. We also outline planned tool support to enable an efficient synthesis process.
A Meta-Heuristic for Effectively Detecting Concurrency Errors
Neha Rungta and Eric Mercer
Mainstream programming is migrating to concurrent architectures to improve performance and facilitate more complex computation. The state-of-the-art static analysis tools for detecting concurrency errors are imprecise, generate a large number of false error warnings, and require manual verification of each warning. In this paper we present a meta-heuristic to help reduce the manual effort required in the verification of warnings generated by static analysis tools. We manually generate a small sequence of program locations that represent points of interest in checking the feasibility of a particular static analysis warning; then we use a meta-heuristic to automatically control scheduling decisions while guiding the program, in a greedy depth-first search, along the input sequence to test the feasibility of the warning. The meta-heuristic automatically controls thread schedules based on a two-tier ranking system that considers first the number of program locations already observed from the input sequence and then the perceived closeness to the next location in the input sequence. The error traces generated by using this technique are real and require no further manual verification. We show the effectiveness of our approach by detecting feasible concurrency errors in benchmarked concurrent programs and the JDK 1.4 concurrent libraries, based on warnings generated by the Jlint static analysis tool.
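The two-tier ranking can be pictured as a lexicographic key over candidate states. The sketch below is a hypothetical reading of that description; the state fields and the distance estimate are illustrative assumptions, not taken from the paper's tool.

```python
# Two-tier ranking sketch: candidate exploration states are ordered first by
# how many locations of the input sequence they have already reached, and
# ties are broken by an estimated distance to the next location.

def rank_key(state):
    # Python compares tuples lexicographically; negate the coverage count so
    # that more observed locations ranks as "smaller" (better), and prefer
    # smaller distances to the next location.
    return (-state["locations_observed"], state["distance_to_next"])

def pick_next(frontier):
    """Greedily choose the most promising state to explore next."""
    return min(frontier, key=rank_key)

frontier = [
    {"name": "s1", "locations_observed": 2, "distance_to_next": 7},
    {"name": "s2", "locations_observed": 3, "distance_to_next": 9},
    {"name": "s3", "locations_observed": 3, "distance_to_next": 4},
]
best = pick_next(frontier)  # s3: ties s2 on coverage, but is closer
```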
Evaluating Workloads Using Comparative Functional Coverage
Yoram Adler, Dale Blue, Thomas Conti, Richard Prewitt and Shmuel Ur
In this paper we introduce comparative functional coverage, a technique for comparing the coverage of multiple workloads, and the tool in which it was implemented. Neither the need to compare workloads nor the use of functional coverage as a technique to explore data is new. However, this "long felt need" for comparing workloads using functional coverage had not been addressed. We show how to augment the functional coverage tool to handle multiple data sources, how to present the data, and an experiment that shows the tool's usefulness.
Iterative Delta Debugging
Cyrille Valentin Artho
Automated debugging attempts to locate the reason for a failure. Delta debugging minimizes the difference between two inputs, where one input is processed correctly while the other input causes a failure, using a series of test runs to determine the outcome of applied changes. Delta debugging is applicable to inputs or to the program itself, as long as a correct version of the program exists. However, complex errors are often masked by other program defects, making it impossible to obtain a correct version of the program through delta debugging in such cases. Iterative delta debugging extends delta debugging and removes a series of defects step by step, until the final unresolved defect alone is isolated. The method is fully automated and managed to localize a bug in several real-life examples.
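The minimization step that iterative delta debugging builds on is the classic ddmin algorithm of Zeller and Hildebrandt. The textbook sketch below illustrates that underlying step, not the paper's tool: it shrinks a failing input while preserving the failure.

```python
# Classic ddmin sketch: minimize a failing input (here a list) so that the
# test still reports "FAIL". Iterative delta debugging applies this kind of
# minimization repeatedly, stepping through program versions, until a single
# unresolved defect is isolated.

def ddmin(failing_input, test):
    n = 2
    inp = list(failing_input)
    while len(inp) >= 2:
        chunk = len(inp) // n
        subsets = [inp[i:i + chunk] for i in range(0, len(inp), chunk)]
        reduced = False
        for i in range(len(subsets)):
            # Try removing one subset: keep the complement.
            complement = [x for j, s in enumerate(subsets) if j != i for x in s]
            if test(complement) == "FAIL":
                inp, n = complement, max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n == len(inp):      # already at finest granularity
                break
            n = min(len(inp), 2 * n)
    return inp

# Toy example: the failure is triggered whenever both 3 and 7 are present.
result = ddmin([1, 2, 3, 4, 5, 6, 7, 8],
               lambda xs: "FAIL" if 3 in xs and 7 in xs else "PASS")
```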
Statistical Model Checking of Mixed-Analog Circuits with an Application to a Third-Order Delta-Sigma Modulator
Edmund Clarke, Alexandre Donzé, and Axel Legay
In this paper, we consider verifying properties of mixed-signal circuits, i.e., circuits for which there is an interaction between analog (continuous) and digital (discrete) quantities. We follow a statistical model checking approach that consists of evaluating the property on a representative subset of behaviors, generated by simulation, and answering the question of whether the circuit satisfies the property with a probability greater than or equal to some value. The answer is correct up to a certain probability of error, which can be pre-specified. The method automatically determines the minimal number of simulations needed to achieve the desired accuracy, thus providing a convenient way to control the trade-off between precision and computational cost. We propose a logic adapted to the specification of properties of mixed-signal circuits, in the temporal domain as well as in the frequency domain. Our logic is unique in that it allows us to compare the Fourier transform of two signals. We also demonstrate the applicability of the method on a model of a third-order Delta-Sigma modulator for which previous formal verification attempts were too conservative and required excessive computation time.
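One standard way to derive such a minimal number of simulations is the Chernoff-Hoeffding bound; the paper's exact statistical procedure may differ, so the sketch below should be read as a generic illustration of the precision/cost trade-off.

```python
import math

# Chernoff-Hoeffding sketch: after n >= ln(2/delta) / (2 * epsilon^2)
# independent simulation runs, the empirical satisfaction probability is
# within epsilon of the true one with confidence at least 1 - delta. Tighter
# precision (smaller epsilon) or higher confidence (smaller delta) drives
# the required number of simulations up.

def required_samples(epsilon, delta):
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))

def estimate_probability(simulate_once, epsilon, delta):
    """Monte Carlo estimate of the probability that a property holds,
    where simulate_once() returns True iff one simulated trace satisfies it."""
    n = required_samples(epsilon, delta)
    successes = sum(1 for _ in range(n) if simulate_once())
    return successes / n
```

Note how the bound makes the cost explicit: halving epsilon roughly quadruples the number of required simulation runs.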
Tool Presentation: SeeCode - A Code Review Plugin for Eclipse
Moran Shochat, Orna Raz and Eitan Farchi
It is well known that code reviews are among the most effective techniques for finding bugs [2, 3, 4]. In this paper, we describe a code review tool, SeeCode, which supports the code review process. SeeCode is an Eclipse plugin and thus naturally integrates into the developer's working environment. It supports a distributed review environment and the various roles used in a review meeting. Reviewers can review the code at the same time, through either a virtual or a face-to-face meeting, or at different times. Review comments and author navigation through the code are visible to all reviewers. Review comments are associated with line numbers, and the association is maintained when the code is changed by the developer. Integration with the Eclipse [8] Integrated Development Environment (IDE) enables easy code navigation, which is especially required when object-oriented code is reviewed. SeeCode also supports a quantitative feedback mechanism that reports the effectiveness of the ongoing review effort. This feedback is updated as the review progresses, and can be utilized by the review moderator to keep the review process on track. SeeCode has been piloted by several IBM groups, with good feedback. Users especially highlight the distributed review feature and the integration with the IDE as key features.
Tool Presentation: Progress in Automated Software Defect Prediction
Elaine Weyuker and Thomas Ostrand
We have designed and implemented a tool that predicts files most likely to have defects in a future release of a large software system. The tool builds a statistical regression model based on the version and defect history of the system, and produces a list of the next release's most probable fault-prone files, sorted in decreasing order of the number of predicted defects. Testers can use this information to decide where to focus the most resources, and to help determine how much effort to allocate to various parts of the system. Developers can use the tool's output to help decide whether files should be rewritten rather than patched. A prototype version of the tool has been integrated with AT&T's internal software change management system, providing the tool with seamless access to the system's version and defect information, and giving users a simple interface to the tool's output.
Significant Diagnostic Counterexamples in Probabilistic Model Checking
Miguel E. Andrés, Pedro R. D'Argenio and Peter van Rossum
This paper presents a novel technique for counterexample generation in probabilistic model checking of Markov Chains and Markov Decision Processes. (Finite) paths in counterexamples are grouped together in witnesses that are likely to provide similar debugging information to the user. We list five properties that witnesses should satisfy in order to be useful as debugging aids: similarity, accuracy, originality, significance, and finiteness. Our witnesses contain paths that behave similarly outside strongly connected components. We then show how to compute these witnesses by reducing the problem of generating counterexamples for general properties over Markov Decision Processes, in several steps, to the easy problem of generating counterexamples for reachability properties over acyclic Markov Chains.
A Uniform Approach to Three-Valued Semantics for µ-Calculus on Abstractions of Hybrid Automata
Kerstin Bauer, Raffaella Gentilini and Klaus Schneider
Abstraction/refinement methods play a central role in the analysis of hybrid automata, which is rarely decidable. Due to the introduction of unrealistic behaviors, soundness (of evaluated properties) is a major challenge for these methods. In this paper, we consider the definition of a three-valued semantics for the µ-calculus on abstractions of hybrid automata. Our approach relies on two steps: First, we develop a framework that is general in the sense that it provides a preservation result that holds for several possible semantics of the modal operators. In a second step, we instantiate our framework to two particular abstractions. To this end, a key issue is the consideration of both over- and under-approximated reachability analysis, while classic simulation-based abstractions for hybrid automata rely only on over-approximations and limit preservation to the universal fragment of the µ-calculus. To specialize our general result, we consider (1) modal abstractions, based on the approximation of hybrid dynamics via may/must transitions, and (2) so-called discrete bounded bisimulations, suitably designed for the abstraction of hybrid automata.
HVC Award Winner: Proofs, Interpolants, and Relevance Heuristics
Ken McMillan
The technique of Craig interpolation provides a means of extracting from a proof about a bounded execution of a system the necessary information to construct a proof about unbounded executions. This allows us to exploit the relevance heuristics that are built into modern SAT solvers and SATbased decision procedures to scale model checking to larger systems.
This talk will cover applications of Craig interpolation in various domains, including hardware verification using propositional logic, and software verification using firstorder logic.
Invited Talk: Is Verification Getting Too Complex?
Yoav Hollander, Cadence
Verification of HW and HW/SW systems is becoming more and more complex. This presentation will look at the reasons for the increased complexity and what can be done about it (sneak preview: quite a lot).
Invited talk: Can Mutation Analysis Help Fix Our Broken Coverage Metrics?
Brian Bailey
The semiconductor industry relies on coverage metrics as its primary means of gauging both quality and readiness of a chip for production, and yet the metrics in use today neither measure quality nor provide an objective measure of completeness. This talk will explore the problems with existing metrics and why they are not proper measures of verification. Mutation analysis looks like a promising technology to help bridge the divide between what we have and what we need in terms of metrics, and may also be able to help bridge the divide between static and dynamic verification. The talk will conclude with some of the remaining challenges that have to be overcome, such as its correct fit within a verification methodology and the standardization of a fault model.
Automatic Boosting of Cross-Product Coverage Using Bayesian Networks
Dorit Baras, Avi Ziv, and Laurent Fournier
Closing the feedback loop from coverage data to the stimuli generator is one of the main challenges in the verification process. Typically, verification engineers with deep domain knowledge manually prepare a set of stimuli generation directives for that purpose. Bayesian-network-based coverage-directed generation (CDG) systems have been successfully used to assist the process by automatically closing this feedback loop. However, constructing these CDG systems requires manual effort and a certain amount of domain knowledge from a machine learning specialist. We propose a new method that boosts coverage at early stages of the verification process with minimal effort, namely a fully automatic construction of a CDG system that requires no domain knowledge. Experimental results on a real-life cross-product coverage model demonstrate the efficiency of the proposed method.
Simulation-Based Verification of System-on-Chip Designs through Automated Specification-Based Testcase Generation
Christoph M. Kirchsteiger, Christoph Trummer, Christian Steger, Reinhold Weiss and Markus Pistauer
70% of the entire design effort for System-on-Chip (SoC) designs is spent on functional verification to verify that the design fulfills the specification. Whereas simulation-based testcase execution is rather fast, manually finding the "right" testcases to check that the design corresponds to the usually informal specification document is a time-intensive and error-prone process. In this work, we present a novel methodology to reduce the time for functional verification significantly by automatically generating testcases from the specification document. The specification is expressed as a set of semi-formal textual use cases, which is a widely accepted and easy-to-use document-based description format for requirements specification. During simulation-based verification we use advanced verification mechanisms such as the constrained-random verification technique to select the generated testcases randomly and apply them to the System-under-Verification. We use a sample RFID SoC to demonstrate the benefits of our methodology. We show that it significantly reduces the time for functional verification, removes errors in the specification, and detects a number of discrepancies between the RFID SoC and the RFID protocol specification.
Keynote Talk: Automata-Theoretic Model Checking Revisited
Moshe Vardi, Rice University
In automata-theoretic model checking we compose the design under verification with a Büchi automaton that accepts traces violating the specification. We then use graph algorithms to search for a counterexample trace. The theory of this approach originated in the 1980s, and the basic algorithms were developed during the 1990s. Both explicit and symbolic implementations, such as SPIN and SMV, are widely used.
A Framework for Inherent Vacuity
Orna Kupferman, Dana Fisman, Moshe Vardi and Sarai Sheinvald
Vacuity checking is traditionally performed after model checking has terminated successfully. It ensures that all the elements of the specification have played a role in its satisfaction by the design. Vacuity checking gets both design and specification as input, and is based on an in-depth investigation of the relation between them. Vacuity checking has been proven to be very useful in detecting errors in the modeling of the design or the specification. The need to check the quality of specifications is even more acute in property-based design, where the specification is the only input, serving as a basis for the development of the system. Current work on property assurance suggests various sanity checks, mostly based on satisfiability, non-validity, and realizability, but lacks a general framework for reasoning about the quality of specifications.
Contemporary Post-Silicon Verification Mechanisms
Warren Hunt, UT Austin
We present a general description of the post-silicon verification problem. We differentiate the post-silicon verification problem from pre-silicon verification and manufacturing testing. We describe the in-use mechanisms for performing post-silicon verification, such as internal tracing and external interfaces to record internal events. We demonstrate the scale at which this problem needs to be addressed for commercial microprocessors by using the Centaur 64-bit, x86-compatible microprocessor as an example.
A Mechanized Framework for Applying Formal Analysis in Post-Silicon Verification
Warren Hunt, UT Austin
We present a framework to facilitate post-silicon verification of hardware designs. We make use of execution trace monitors to transfer pre-silicon verification results to the post-silicon verification problem. Our framework makes it possible to bring formal reasoning techniques, namely theorem proving, bounded model checking, and SAT solving, to bear upon the post-silicon verification problem. The framework also makes explicit the bottlenecks induced by limited observability at the post-silicon stage, and the trade-offs between the quality of logical guarantee and the overhead of additional hardware.
The framework is being mechanized in the ACL2 theorem prover, and we illustrate its use in the analysis of a cache system.
Application of Formal Technology in Post-Silicon Verification and Debugging
Jamil Mazzawi, Jasper
The exponential growth in size and complexity of modern SoC and chip design products is a major concern in the electronic design and manufacturing market. Verification of these complex systems is always one of the most daunting and costly challenges. Pre-silicon verification traditionally takes the lion's share of the verification effort; however, it is not always effective or complete enough. Therefore, unfortunately, post-silicon bugs sometimes slip into silicon products and impact time to market. Furthermore, one design has various manufacturing configurations, so verifying each post-silicon configuration is time- and effort-consuming. Formal verification methods are traditionally applied in the pre-silicon verification phase and are effective in complementing dynamic verification. This raises the questions: can formal verification also be applied in the post-silicon verification and debugging phase with a similar impact? What are the challenges, and what methodologies are needed to adopt formal methods in the post-silicon phase? In this talk, we address these questions, outline the ingredients of a successful application of formal verification, and showcase two real-life examples using a state-of-the-art formal verification system. We demonstrate the impact of formal methods where it was relatively easy to find several logic bugs much faster than with simulation methods, delineate the logic area where a fix was needed, and prove the fix to increase confidence in the design.
Multithreading Post-Silicon Exerciser - ThreadMill
Amir Nahir, IBM HRL
Pre-silicon functional verification techniques provide excellent dynamic observability, and static analysis of design models can dramatically increase coverage in limited areas. Yet, only a tiny fraction of the huge reachable state space can be sampled and verified. Although substantial effort is invested in controlling and intelligently directing the verification resources, state-of-the-art pre-silicon techniques cannot cope with the increasing complexity of modern high-end designs. More bugs escape into the silicon. Should we reset our quality expectations for the pre-silicon design? Can we better exploit silicon-casting samples as yet another verification platform? Which pre-silicon verification methodologies can be adapted to silicon?
In this talk we will discuss trade-offs between the different platforms and will point out opportunities to bridge methodologies. We will describe an architecture of a pre-silicon test generator that can be successfully adapted to silicon exercisers, to enable systematic implementation of functional-coverage-oriented verification plans.
Invited talk: Practical Considerations Concerning HL-to-RTL Equivalence Checking
Carl Pixley, Synopsys
We will discuss several years' experience with commercial HL-to-RTL equivalence checking with the Hector technology. We will also discuss several considerations based upon the reality that our company is an EDA vendor. This is quite different from the position of a semiconductor company, which can concentrate on a very specific methodology and design type.
Our observations will include some case studies from customers about the methodology and designs on which they are using Hector. Most of the development of Hector was based upon solutions to problems presented by our customers. We will also discuss the general architecture of Hector and some technological information about the engines that underlie Hector.
Tool Presentation: User-Friendly Model Checking: Automatically Configuring Algorithms with RuleBase/PE
Ziv Nevo
Model checking is known to be computationally hard, meaning that no single algorithm can efficiently solve all problems. A possible approach is to run many algorithms in parallel until one of them finds a solution. This approach is sometimes called a state-of-the-art (SOTA) model checker. However, hardware resources are often limited, forcing some selection. In this paper we present an automatic decision system, called Whisperer, which generates an optimized set of configured algorithms for a given model-checking problem. The system weighs the advice of advisors, each predicting the fitness of a different algorithm for the problem. Advisors also monitor the progress of currently running algorithms, allowing the replacement of ineffective algorithms. Whisperer is built into the formal verification platform RuleBase/PE, and allows novice users to skip the delicate task of algorithm selection. Our experiments show that Whisperer, after some training, performs nearly as well as SOTA.
Tool Presentation: DTSR: Parallelizing SMT-based BMC Using Tunnels over a Distributed Framework
Malay Ganai and Weihong Li
We present a tool, DTSR, for parallelizing SMT-based BMC over a distributed environment, targeted at checking safety properties in low-level embedded (sequential) software. We use a tunneling- and slicing-based reduction (TSR) approach combined with verification-friendly model transformation to decompose a BMC instance disjunctively (at a given depth) into simpler and independent subproblems. We exploit such a decomposition to cut down the communication cost and the idle time of the CPUs during synchronization while solving BMC instances. Our approach scales almost linearly with the number of CPUs, as demonstrated in our experimental results.
Linear-Time Reductions of Resolution Proofs
Omer Bar-Ilan, Oded Fuhrmann, Shlomo Hoory, Ohad Shacham and Ofer Strichman
DPLL-based SAT solvers progress by implicitly applying binary resolution. The resolution proofs that they generate are used, after the SAT solver's run has terminated, for various purposes. The most notable uses in formal verification are: extracting an unsatisfiable core, extracting an interpolant, and detecting clauses that can be reused in an incremental satisfiability setting (the latter uses the proof only implicitly, during the run of the SAT solver). Making the resolution proof smaller can benefit all of these goals. We suggest two methods for doing so that are linear in the size of the proof. Our first technique, called Recycle-units, uses each learned constant (unit clause) x to simplify resolution steps in which x was the pivot, prior to when it was learned. Our second technique, called Recycle-pivots, simplifies proofs in which there are several nodes in the resolution graph, one of which dominates the others, that correspond to the same pivot. Our experiments with industrial instances show that these simplifications reduce the core by ~5% and the proof by ~13%. This reduces the core less than competing methods such as run-till-fix, but whereas our algorithms are linear in the size of the proof, the latter and other competing techniques are all exponential, as they are based on SAT runs. If we consider the size of the proof graph as being polynomial in the number of variables (which is not necessarily the case in general), this gives our method an exponential time reduction compared to existing tools for small core extraction. Our experiments show that this result is evident in practice, more so for the second method: it rarely takes more than a few seconds, even when competing tools time out, and hence it can be used as a cheap post-processing procedure for proofs.
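The Recycle-units idea can be illustrated on a single resolution step. The sketch below uses frozensets of signed integer literals and is a minimal illustration under that representation, not the paper's implementation: once a unit clause {x} is learned, an earlier step that resolved on pivot x can use the unit clause in place of its larger premise, yielding a stronger (subsuming) resolvent.

```python
# Recycle-units sketch over clauses represented as frozensets of signed
# literals (e.g. 1 means x1, -1 means NOT x1).

def resolve(c1, c2, pivot):
    """Binary resolution: c1 must contain pivot, c2 must contain -pivot."""
    assert pivot in c1 and -pivot in c2
    return (c1 - {pivot}) | (c2 - {-pivot})

def recycle_units(premises, pivot, unit_clauses):
    """Redo a resolution step, substituting a learned unit clause for the
    matching premise when one is available."""
    c1, c2 = premises
    if frozenset({pivot}) in unit_clauses:
        c1 = frozenset({pivot})      # replace the fat positive premise
    if frozenset({-pivot}) in unit_clauses:
        c2 = frozenset({-pivot})     # replace the fat negative premise
    return resolve(c1, c2, pivot)

# Original step: resolve {x1, x2, x3} with {-x1, x4} on pivot x1.
old = resolve(frozenset({1, 2, 3}), frozenset({-1, 4}), 1)   # {x2, x3, x4}
# After the unit {x1} is learned, the same step produces just {x4},
# which subsumes the original resolvent.
new = recycle_units((frozenset({1, 2, 3}), frozenset({-1, 4})), 1,
                    {frozenset({1})})
```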
Structural Contradictions
Cindy Eisner and Dana Fisman
We study the relation between logical contradictions such as (p AND (not p)) and structural contradictions such as (p intersected with (p concatenated with p)). Intuitively, we expect logical and structural contradictions to be treated similarly, but it turns out that this does not hold for PSL (or for SVA). We examine the problem, and provide a solution in which logical and structural contradictions are treated in a consistent manner. The solution reveals that not all structural contradictions are created equal: we need to distinguish between them to preserve important characteristics of the logic. A happy result of our solution is that it provides the semantics over the natural alphabet of all subsets of P, as opposed to the current semantics of PSL/SVA, which give the semantics over an inflated alphabet including the cryptic letters "top" and "bottom". We show that the complexity of model checking is not affected by the changes with respect to the existing semantics of PSL.

