All experiments with configuration information, dataset are in the process of being standardized and made available at PAT website. The data is to show that PAT’s model checking library is highly optimized.

Details about the experiments can be found in *Post to Post Links II error: No post found with slug "experiments"*.

## Monthly archives for November, 2010

# 06, Nov, 2010: PAT experiments are standardized and published

# RTS Module

In PAT, real-time systems are modeled using compositional timed processes. As to model the timed behaviors, instead of explicitly manipulating clocks, we dynamically create and prune clocks by implementing a number of timed behavior patterns such as delay, timeout, deadline, waituntil, timed interrupt, within, etc.

The following experiments shows 5 popular real time system models which is used by lots of model checkers such as KRONOS and UPPAAL. In the experiments, we reasoned about various behaviors about the real time systems such as deadlock freeness, mutual exlucsion, liveness and safety properties. Details are elaborated as follows:

—Fischer’s mutual exclusion protocol: The protocol is proposed by Fischer in 1985. Instead of using atomic test-and-set instructions or semaphores, mutual exclusion is guaranteed by carefully placing bounds on the execution time of the instructions, leading to a protocol which is simple and relies heavily on time aspects. The property is mutual exclusion.

—Railway control system: A railway control system controls trains passing a critical point such as a bridge. The idea is to use a computer to guide trains from several tracks crossing a single bridge instead of building many bridges. Obviously, a property of such a system is that a train always eventually passes the bridge.

—CSMA/CD protocol: This protocol describes a solution to the problem of assigning the bus to only one of many competing stations arises in a broadcast network with a multi-access bus. The property is that a collision is always eventually detected. The model is based on the one in KRONOS [BDM+98], extended to arbitrary number of stations.

—Fiber Distributed Data Interface (FDDI): FDDI is a high-performance fiber-optic token ring local area network. An FDDI network is composed by N identical stations and a ring, where the stations can communicate by synchronous messages with high priority or asynchronous messages with low priority. The property is that the token cannot be in two stations simultaneously which is the safety property.

— Timed Pacemaker: Pacemaker is a small device placed in the chest or abdomen to help control abnormal heart rhythms. It’s critical real time application that with a complex hierarchical structure that is deemed to be too complicated to be modeled in Timed Automata, while it is feasible for PAT to model it. A simple property is that the pacemaker is deadlock-free.

Experiment results are shown in the table below. Here Time_zeno means that the property is verified with non-Zenoness check. We also done the comparison work with UPPAAL, and here UPPAAL-r means a revised version of its original models with some τ -transitions to make them consistent with our PAT model.

From the experiments, we can see that PAT Real Time System Module can handle a rich set of properties within reasonable time. Also, a number of observations are here: Firstly, PAT currently handles in average 13, 000 states per second (i.e., the total number of visited states – not new states – divided by the total number of seconds), which is comparable to existing model checkers. Secondly, model checking with non-Zenoness has little or no computational overhead in PAT. Thirdly, PAT is considerably slower than UPPAAL. A number of reasons might account for why UPPAAL outperforms PAT. For instance, system configurations in PAT are more complicated than those in UPPAAL as Stateful Timed CSP is designed for hierarchical complex systems. The main reason is that UPPAAL has benefited from many dedicated optimization techniques such as extrapolation, however PAT has none yet. The experiment data is available here.

# TA Module

TA module in PAT is designed to verify systems which can be modeled as

networks of timed automata extended with shared variables, structured data types, user defined functions, and channel synchronization.

The following demonstrates the experimental results on embedded examples

in TA module and UPPAAL as well as the explanations of experiment

results.

We run experiments on four models, namely, Fischer protocol, railway

control systems, FDDI protocol and CSMACD protocol to verify properties

such as deadlock-freeness, liveness, reachability and so on. The

following table summarizes the verification results. Here, ‘Proc’ means

the number of interleaving processesand ‘-‘ means out of memory or more

than 2 hours. We also do the comparison work with UPPAAL, and here ‘UPPAAL-r’ means a recvised version of the original models in UPPAAL to make them consisten with the models in PAT, ‘State-s’ means the bumber of stored states,

‘State-e’ means the number of explored states. The environment of experiments ruuning on UPPALL 4.1.2 is as follows:

- Use convex-hull approximation
- Hash table size 16M
- Automatic extrapolation operator
- Depth First Search
- Conservative space consumption
- Enable sweepline method
- Enable peephole optimiser
- Enable symmetry reduction

From the above table, we can see that PAT TA module can verify a rich set of priorities with reasonable time. In addition, some abservations are here:

firstly, we can see that the number of states and

running time increase rapidly with increasing the number of processes,

e.g., 8 processes in Fischer model vs. 9 processes. Secondly, PAT is

efficient and can handle million states in less than one hour. Thirdly,

in some case, PAT is slower than UPPAAL. The main reason of UPPAAL outperforming

PAT is that UPPAAL has benefied from many dedicated optimization techniques such

as extrapolation, while PAT has none yet. The experiment data is available here.

# PRTS Module

The example we use in PRTS module is a multi-lift system. The motivation is an

interesting phenomena: sometimes a user has requested a service in multi-lift

system, but a lift traveling in the same direction passes by that user and the

user must wait for another lift to serve it. This is actually because the

request assignment algorithm in the system and some other unpredictable

reasons such as a lift could be occupied arbitrary time because of other

users. We use PAT to model this complicated system which combines concurrency,

real-time and probability, and 2 methods of assigning the requests are

modeled: one is choosing the nearest lift and the other is choosing an

arbitrary one.

Because of the complexity of this system, now PAT could just handle 2 lifts,

otherwise there will be state explosion. But we could still get some

reasonable results from this table. Firstly, the more users are there, the

higher probability that some users will be passed by. Secondly, if we use

nearest lift assignment, we could expect a smaller probability that such bad

behavior could happen, which could be found in Lift_223_nearest and

Lift_223_random cases. There are many optimization could be implemented in

PAT, so we can expect better and better performance in this module. The experiment data is available here.

# PCSP Module

The following experiments are used to indicate what the performance that PAT

has in probabilistic model checking. The models we use include several

protocols:

model ME describes a probabilistic solution to N-process mutual exclusion

problem; model RC is a shared coin protocol of the randomized consensus

algorithm; Model DP is the probabilistic N-dining philosophers under fairness

and Model CS is the IEEE 802.3 CSMA/CD (Carrier Sense, Multiple Access with

Collision

Detection) protocol. Then we check the maximal probability(pmax) or minimal

probability(pmin) that the model could satisfy the desired properties and

compare the result with PRISM.

Table 1. Experiments on modeling

Table 2. Experiments on refinement checking

Table 3. Experiments on LTL checking

From Table 3, it is obvious that PAT has much good performance in probabilistic

model checking. Although the states and verification time increase quickly

when the parameter becomes larger and larger, PAT always has a comparable and

even better output compared with PRISM. Because our algorithm uses safety

recognition, the LTL which is safety or co-safety could be verified quite

fast. For example, in the RC(N=6, K=6) case, the co-safety(1) takes PAT 6s but

takes PRISM almost half an hour; and the co-safety(2) property cannot be

verified in PRISM within 10000 iteration but in PAT we could get the result in

several minutes. Even for liveness property, PAT could have a better

performance because our modeling language could guarantee smaller state space

in the same model compared with PRISM. However, PRISM could handle more states

per seconds, which could be indicated by the deadlock checking in the table,

so we are still improving our algorithm and data structure in this module, in

order to get better performance. The experiment data is available here.

# CSP Module

The following shows experimental results on embedded examples in CSP module to demonstrate the strength of PAT of handling various kinds of fairness constraints and verifying linearizability of concurrent data structure implementations.

#### LTL Checking under Various Kinds of Fairness Assumptions

This part shows verification statistics to show power of PAT to handle fairness in comparison to SPIN. The verified benchmark systems consist of dining philosopher problem, Milner’s cyclic scheduler algorithm and Peterson’s mutual exclusive algorithm.

Because of the deadlock state, the dining philosophers model (dp(N) for N philosophers and forks) does not guarantee that a philosopher always eventually eats ([]<> eat0) whether under no fairness or the strongest fairness. This experiment shows PAT takes little extra time for handling the fairness assumption. Notice that SPIN finds a counterexample (under no fairness) quickly. This is due to the particular order of exploration as well as its nested depth-first-search algorithm. Milner’s cyclic scheduler algorithm (ms(N) for N processes) is a showcase for the effectiveness of partial order reduction. We apply fairness in two different ways, i.e., one applying strong local fairness to the whole system (slf-whole) and the other applying only to inter-process communications (slf-action). In the latter case, partial order reduction allows us to prove the property over a much larger number of processes (e.g., 200 vs 12). Peterson’s mutual exclusive algorithm (peterson(N)) requires process-level weak fairness to guarantee bounded by-pass. The property is verified under weak fairness in PAT and process-level weak fairness in SPIN. PAT outperforms SPIN in this setting. The experiment data is available here.

#### Linearizability Checking

The experiment includes a number of

concurrent data structure algorithms, including a K-valued single-writer register, a

concurrent stack, mailbox problem and scalable non-zero indicator. The following table summarizes

part of our experiments, where ‘#Processes’ means the number of interleaving

processes, ‘#Operations’ means the number of operations a process performs, ‘N/A’ means out of memory or more than two hours, and ‘Unlimited’ means a process executes operations in a loop.

In the experiments, the number of states

and the running time increase rapidly with increasing data size or the number

of processes, e.g., 3 processes (in the Register or Stack) vs. 2

processes. Compared with the results in other method, our solution is

much more scalable to handle all the possible interleaving of operations. In

summary, PAT is effective and can handle millions of states in hours. The experiment data is available here.

However, the checking process still suffers from state space explosion problem.

We further address the challenge of state space reduction

of model checking linearizability by incorporating both symmetry

reduction and partial order reduction techniques. Our insight

emerged from our observation about characteristics of concurrent

object algorithms. First, concurrent objects are often designed

for a number of classes of similar threads. Such similarity or symmetry

often induces equivalent portions of the state space during

model checking process. Symmetry reduction targets at eliminating

such redundant equivalent states. Second, concurrent threads

are loosely coupled and different execution orders of transitions of

different threads may result in the same global state. Partial order

reduction is used to take advantage of this fact and restricts

model checking to explore only one of the equivalent execution

orders. Therefore, the features of concurrent data structures potentially

enable effective symmetry and partial order reductions, and

furthermore, we prove that these two techniques can be combined

for linearizability checking.

The new method has also been implemented in our model checker PAT

and applied to a number of real-world concurrent object algorithms,

including register-the K-valued multi-reader single-writer register, stack-a concurrent stack algorithm, buggy queue-an

incorrect queue algorithm, fixed queue-the modified correct

queue proposed, SNZI-recently-proposed complicated scalable

Non-Zero indicators espectively. For each algorithm, any

thread is constructed as a most general user to non-deterministically

execute one operation. The experiment data are presented in the following table

. In the table, ‘T’ is the total number of interleaving threads,

‘Res’ is the verification result, ‘Time’ is the running time in seconds

and ‘States’ is the number of stored states respectively for checking

with specific options (SR-symmetry reduction, POR-partial order

reduction). ‘Gain’ shows the percentage of reduced states and time

by applying two reduction techniques. ‘-’ denotes out of memory

or more than one hour.

From the table, it is shown that symmetry reduction and/or partial

order reduction reduce both memory and time consumption as

expected, and the combination of both techniques works best in

most cases. The computational overhead stems from two aspects.

One is dependency analysis between transitions of different threads

at each exploration step for partial order reduction; the other is

checking whether any equivalent state of a visiting state has been

explored for symmetry reduction. The experiment shows that our

approach can significantly save time and space for demonstrating

absence of errors and at the same time it does not sacrifice the capability

of detecting bugs.