**Overview**

Markov population models (MPMs) are a widely used modelling formalism in the area of computational biology and related areas. The semantics of a MPM is an *infinite-state *continuous-time Markov chain. The established *continuous stochastic logic *(CSL) allows to express properties of MPMs, such as probabilistic reachability, survivability, oscillations, switching times between attractor regions, and various others. *GeoInf* combines the tools *Geobound* and *Infamy* to handle complete CSL for MPMs.

For a complete description of the used algorithms we refer to our technical report “A CSL Model Checker for Markov Population Models” on arXiv.org.

**Tool Download**

The *GeoInf* tool chain is based on two tools called *Geobound* and *Infamy* that were slightly extended in order to communicate with each other, i.e. *Infamy* is used to do the model checking and transient analysis while *Geobound* provides the steady state analysis.

*Notice*: The tools were built on and should be run on recent versions of (for instance, Ubuntu) Linux.

More information about *Geobound* can be obtained on the Geobound tool homepage. More information about *Infamy* can be found on the Infamy tool homepage.

Please download *both* tools and modify the parameters in the shell scripts shipped along the case studies accordingly to the place where the tools reside. Then, run the shell scripts.

**Authors**: *Geobound* was written by David Spieler. *Infamy* was written by Ernst Moritz Hahn.

**Experimental Results – Case Studies**

Using several case studies, we assess the effectiveness of our technique. We have combined the *Geobound* tool to compute bounds on steady state probabilities for Markov population models with the infinite-state model checker *Infamy*. This way, we can effectively handle the following combination of models and properties. To show the efficiency of the approach, we applied our tool chain on a number of models from different areas. The results were obtained on an Ubuntu 10.04 machine with an Intel dual-core processor at 2.66 GHz equipped with 3 GB of RAM. Instead of the truth value for the formula under considerations, in the result tables we give intervals of the probability measure of the outer formula. We also make use of a derived operator for conditional steady state measures defined as defined in the technical report.

**Protein Synthesis [1]: ****protein-synthesis.tar.gz**

We analyze the MPM encoding protein synthesis, as depicted in Table 1. In biological cells, each protein (P, x2) is encoded by a gene (G, x1). If the gene is active (G = 1), the corresponding protein will be synthesized with rate ν = 1.0. Proteins may degenerate with rate δ = 0.02 and thus disappear after a time. The gene switches from active state to inactive (G = 0) with rate μ = 5.0 and vice versa with rate λ = 1.0.

We consider the property that on the long run, given that there are more than 20 proteins, a state with 20 or less proteins is most likely (with a probability of at least 0.9) reached withing t time units:

*S*>p( *P*>0.9(♢[0,t] P ≤ 20 | P>20 ) ).

We give results in Table 2. The shortcut stop** **was not used for the analysis. With “depth” we specify the number of states of the shortest path from the initial state to any other state of the truncation. The runtimes of *Geobound* and *Infamy* is given in seconds. The rate of decay of proteins depends on the number of proteins existing. For the states on the border of the steady-state state-space window for 1-ε of the total probability mass, we have large rates back to existing states. Because of this, for the given parameters the state space exploration algorithm does not need to explore further states, and the total number states explored *n* does not increase with the time bound. To obtain different *n*, we would have needed to choose extremely large time bounds, for which analysis would be on the one hand infeasible and on the other hand would lead to results extremely close to 1.0. The lower and upper bounds are further away than ε. This results, because we have to divide by SU (Φ2 ) for the lower and by SL(Φ2) for the lower bound. In turn, this may lead to a much larger error than ε.

**Gene Expression [2]: ****gene-expression.tar.gz**

Next, we analyze a network of chemical reactions where a gene is transcribed into mRNA (M) with rate λ = 25.0 and the mRNA is translated into proteins (P ) with rate μ = 1.0. Both populations can degrade at rates δM = 2.0 and δP = 1.0, respectively. The corresponding transition classes are listed in Table 3. The property of interest is the steady state probability of leaving a certain set of states W enclosing more than 80% of the steady state probability mass most likely within t time units, i.e.,

*S*>p( *P*>0.9( ♢[0,t] ¬W ) | W ),

where W = M>5 ∧ M<20 ∧ P>5 ∧ P<20 with S>0.8 ( W ) = ⊤.

The results are stated in Table 4. Similar to the protein synthesis case study, we see that there is no increase in the number of states, because the window size already comprises enough states for the transient analysis. In Table 5 we consider results for the sub-formula *P*>0.9( ♢[0,t] ¬W ). We compare the methods to explore states for the transient until described in our technical report (Advanced) with the finite state projection [3] (FSP) previously used in *Infamy*. We see that the time needed is comparable, but the new algorithm needs to explore less states. This is the case because with the new method when building the finite truncation we have more control into which direction we explore. In contrast, the FSP explores the model into all directions at the same time, until enough precision is reached. When we use the shortcut stop (Advanced + AP), we will not explore states further in which ¬W holds. When exploring the model, for larger time bounds there is some point at which there are almost only states of which all successors have been completely explored and states for which ¬W holds. Thus, the maximal number of states to be explored is constant with this optimization, except for very large time bounds.

For the protein synthesis, there is almost no difference in the number of states needed by the new method and FSP. Because it has only one infinite variable, there is just one direction to be explored. Thus, the new method performs worse, as it needs more effort to explore into this direction.

Using the shortcut method also allows us to handle the formula P>0.9( ♢ ¬W ), involving the unbounded until operator. Using a precision of 1e−6, we needed a total time of 1.5 seconds, reached 974 states and obtained a reachability probability almost 1.0. Using a precision of 1e−10, we needed 3.7 seconds and explored 1823 states. For the computation of unbounded until probabilities, efficient specialized algorithms were used, which explains that we needed less time than for some of the time bounded experiments.

**Exclusive Switch [4]: ****exclusive-switch.tar.gz**

The exclusive switch is a gene regulatory network with one promotor region shared by two genes. That promotor region can either be unbound (G = 1,G.P1 = 0,G.P2 = 0) or bound by a protein expressed by gene 1 (G = 0,G.P1 = 1,G.P2 = 0) or gene 2 (G = 0,G.P1 = 0,G.P2 = 1). If the promotor is unbound, both proteins are expressed, each with rate ρ = 0.05, otherwise only the protein that is currently bound to the promotor is produced at rate ρ. Proteins degrade at rate δ = 0.005. Binding happens at rate λ = 0.01 and unbinding at rate μ = 0.008. The transition class structure is given in Table 6.

This system has two attractor regions, i.e., two spatial maxima in the steady state probability distribution over the protein levels, one at P1 = 10, P2 = 0 and the other one at P1 = 0, P2 = 10. We are interested in the switching time between these two regions. For this, we estimate the time needed for a 90%-quantile of the steady state probability mass of one of the two attractors to reach the other attractor region. More precisely, let

start := sqrt((P1-10)^2 + P2^2) ≤ 4

end := sqrt(P1^2 + (P2-10)^2) ≤ 4.

Then, the formula to check is

*S*>p ( *P*>0.9( ♢[o,t] end ) | start ).

Note that since the model is symmetric we only have to check one formula from one attractor to the other. The corresponding results are depicted in Table 7.

From these results we may conclude that in half of the cases, most likely the switching time between the attractor regions is at most 7800 time units, while in almost all cases the switching time is most likely below 8000 time units, assuming the system has stabilized to a steady state.

[1] Gross,P.J.E.,Peccoud,J.:Quantitative modeling of stochastic systems in molecular biology by using stochastic Petri nets. Proc. Natl. Acad. Sci. 95, 6750–6755 (1998)

[2] Thattai, M., van Oudenaarden, A.: Intrinsic noise in gene regulatory networks. Proceedings of the National Academy of Science, USA 98(15), 8614–8619 (2001)

[3] Munsky, B., Khammash, M.: The finite state projection algorithm for the solution of the chemical master equation. Journal of Chemical Physics 124(044104) (2006)

[4] Loinger,A.,Lipshtat,A.,Balaban,N.Q.,Biham,O.: Stochastic simulations of genetics witch systems. Physical Review E 75(2), 021904 (2007)