The complete result can be found here.

## Monthly archives for April, 2008

# Performance Study for Model Checking Optimization Techniques

The following table presents part of the experimental results of PAT (conducted on Windows XP with a 2.0 GHz Intel Core Duo CPU and 1 GB memory). The first model is the dining philosophers. The property is []<>eat.0. Without fairness assumptions, this property is false. A counterexample is quickly produced in most of the cases. Nonetheless, it may take considerably long if the trace leading to a counterexample is explored very late (e.g., for College(15) without reduction). Partial order reduction significantly reduces the time to discover a counterexample for this example. This model is then annotated with fairness. The last three columns show verification results of lCollege(N). The property becomes true and therefore a complete search is necessary. Note that partial order reduction gains little. The reason is that the model is highly coupled and heavy in communication. Manually hiding local communicating could reduce the verification time. The second model is Milner’s cyclic scheduler. Milner’s cyclic scheduler describes a scheduler for N concurrent processes. The processes are scheduled in cyclic fashion so that the first process is reactivated after the N-th process has been activated. The fairness assumptions state that a process must eventually finish its local task and then activate the next process. The property to verify is that a process must eventually be scheduled, which is true with/without the fairness assumptions. This model demonstrates the effectiveness of the partial order reduction. Without the reduction, the size of the search graph grows exponentially and thus verification soon becomes infeasible (e.g., for 15 processes).With partial order reduction, we are able to verify 400 processes reasonably fast (using less than 2 minutes). Notice that the computational overhead for handling fairness annotations are negligible, e.g., same amount of time is taken to verify the model with/without fairness. The third models the classic readers/writers problem. The readers/writers model describes a protocol for coordination of N identical readers and N identical writers accessing a shared resource. The property to verify is reachability of an erroneous situation (i.e., wrong readers/writers coordination). This mode is then annotated with fairness assumptions to state that each reader/writer must eventually finish reading/writing. This model demonstrates the effectiveness of the reduction for handling identical/similar processes.

The complete result can be found here.

# Performance comparison with SPIN and FDR for Model Checking

We compare PAT with FDR and Spin. FDR is de facto model checker for CSP, which has been actively developing for years. We choose Spin over others because it is the most established explicit model checker and its input language is loosely based on CSP. Because there has not yet been bounded model checkers dedicated to event-based process algebras, the performance of the bounded model

checker in PAT is compared to FDR and Spin as well. PAT has been evaluated with a variety of models with two award wining SAT solvers, i.e., MiniSAT and RSAT. The experiments in this section are conducted in a 2.0 GHz Intel Core Duo CPU and 1 GB memory.

The following 4 figures summarizes the efficiency of PAT using three benchmark models, i.e., the dining philosopher problem, the classic readers/writers problem and Milner’s cyclic scheduler. The readers/writers problem describes a protocol for coordination of N readers and N writers accessing a shared resource. The property to verify is reachability of an erroneous situation (i.e., wrong readers/writers coordination). Milner’s cyclic scheduler describes a scheduler for N concurrent processes. The processes are scheduled in cyclic fashion so that the first process is reactivated after the N-th process has been activated. The property to verify is that a process must eventually be scheduled. Note that the experiment results on FDR should be taken with a grain of salt, since they run on different platforms (on the same machine) and that the results are obtained by showing a (failure) refinement relationship between the system model and a process capturing the property to verify.

For the dining philosopher example, our on-the-fly model checker (i.e., PAT-Exp) performs best to produce a counterexample. Our bounded model checker (i.e., PAT-SAT) outperforms Spin for 13 or more philosophers. One of the reasons is that the LTL to Buchi automata conversion in Spin suffers from large LTL formulae, i.e., takes more time and produces bigger automata. All verifiers outperforms FDR (except PAT-SAT for small number of philosophers because of the overhead of encoding), which is not feasible for more than 12 philosophers.

For the readers/writers example, all verifiers except FDR produces a counterexample efficiently. Note that for every experiment Spin takes less than a few seconds to build model-specific executables.

For the Milner’s example, the full state space (exponentially increasing) must be explored because the property to verify is true. Before using partial order reduction, PAT-Exp can only handle the size of 15, where the time goes exponentially with respect to the size. After incoperate partial order reduction, 500 milner’s example can be finished in 60 seconds. The time is increased linearly, which shows how significant the optimization techniques are.The experiments on FDR and PAT-SAT produce comparable results for 12 or more processes. Given that PAT-SAT is designed to produce counterexamples efficiently, we believe its performance can be further improved by adapting the recent developments to prove true properties more efficiently. The time taken by Spin remains constant. This should probably be credited to one of the many optimization techniques that have been implemented in Spin. In the future, sophisticated optimization techniques like partial

order reduction, symmetry reduction will be incorporated into PAT.

The following chart summarizes the performance our SAT-based verifier in terms of the size of the generated formula, the time needed for encoding and solving against the number of states of the model. The estimated number of states increase exponentially whereas the number of Boolean variables and the time need increase much slower.