GeoInf

Overview

Markov population models (MPMs) are a widely used modelling formalism in the area of computational biology and related areas. The semantics of a MPM is an infinite-state continuous-time Markov chain. The established continuous stochastic logic (CSL) allows to express properties of MPMs, such as probabilistic reachability, survivability, oscillations, switching times between attractor regions, and various others. GeoInf combines the tools Geobound and Infamy to handle complete CSL for MPMs.

For a complete description of the used algorithms we refer to our technical report “A CSL Model Checker for Markov Population Models” on arXiv.org.

Tool Download

The GeoInf tool chain is based on two tools called Geobound and Infamy that were slightly extended in order to communicate with each other, i.e. Infamy is used to do the model checking and transient analysis while Geobound provides the steady state analysis.

Notice: The tools were built on and should be run on recent versions of (for instance, Ubuntu) Linux.

More information about Geobound can be obtained on the Geobound tool homepage. More information about Infamy can be found on the Infamy tool homepage.

Please download both tools and modify the parameters in the shell scripts shipped along the case studies accordingly to the place where the tools reside. Then, run the shell scripts.

Authors: Geobound was written by David Spieler. Infamy was written by Ernst Moritz Hahn.

Experimental Results – Case Studies

Using several case studies, we assess the effectiveness of our technique. We have combined the Geobound tool to compute bounds on steady state probabilities for Markov population models with the infinite-state model checker Infamy. This way, we can effectively handle the following combination of models and properties. To show the efficiency of the approach, we applied our tool chain on a number of models from different areas. The results were obtained on an Ubuntu 10.04 machine with an Intel dual-core processor at 2.66 GHz equipped with 3 GB of RAM. Instead of the truth value for the formula under considerations, in the result tables we give intervals of the probability measure of the outer formula. We also make use of a derived operator for conditional steady state measures defined as defined in the technical report.

Protein Synthesis [1]: protein-synthesis.tar.gz

We analyze the MPM encoding protein synthesis, as depicted in Table 1. In biological cells, each protein (P, x2) is encoded by a gene (G, x1). If the gene is active (G = 1), the corresponding protein will be synthesized with rate ν = 1.0. Proteins may degenerate with rate δ = 0.02 and thus disappear after a time. The gene switches from active state to inactive (G = 0) with rate μ = 5.0 and vice versa with rate λ = 1.0.

Transition classes of the exclusive switch.

Table 1: Transition classes of the protein synthesis case study.

We consider the property that on the long run, given that there are more than 20 proteins, a state with 20 or less proteins is most likely (with a probability of at least 0.9) reached withing t time units:

S>p( P>0.9(♢[0,t] P ≤ 20 | P>20 ) ).

We give results in Table 2. The shortcut stop was not used for the analysis. With “depth” we specify the number of states of the shortest path from the initial state to any other state of the truncation. The runtimes of Geobound and Infamy is given in seconds. The rate of decay of proteins depends on the number of proteins existing. For the states on the border of the steady-state state-space window for 1-ε of the total probability mass, we have large rates back to existing states. Because of this, for the given parameters the state space exploration algorithm does not need to explore further states, and the total number states explored n does not increase with the time bound. To obtain different n, we would have needed to choose extremely large time bounds, for which analysis would be on the one hand infeasible and on the other hand would lead to results extremely close to 1.0. The lower and upper bounds are further away than ε. This results, because we have to divide by SU (Φ2 ) for the lower and by SL(Φ2) for the lower bound. In turn, this may lead to a much larger error than ε.

Results of the protein synthesis case study.

Table 2: Results of the protein synthesis case study.

Gene Expression [2]: gene-expression.tar.gz

Next, we analyze a network of chemical reactions where a gene is transcribed into mRNA (M) with rate λ = 25.0 and the mRNA is translated into proteins (P ) with rate μ = 1.0. Both populations can degrade at rates δM = 2.0 and δP = 1.0, respectively. The corresponding transition classes are listed in Table 3. The property of interest is the steady state probability of leaving a certain set of states W enclosing more than 80% of the steady state probability mass most likely within t time units, i.e.,

S>p( P>0.9( ♢[0,t] ¬W ) | W ),

where W = M>5 ∧ M<20 ∧ P>5 ∧ P<20 with S>0.8 ( W ) = ⊤.

Transition classes of the gene expression case study.

Table 3: Transition classes of the gene expression case study.

The results are stated in Table 4. Similar to the protein synthesis case study, we see that there is no increase in the number of states, because the window size already comprises enough states for the transient analysis. In Table 5 we consider results for the sub-formula P>0.9( ♢[0,t] ¬W ). We compare the methods to explore states for the transient until described in our technical report (Advanced) with the finite state projection [3] (FSP) previously used in Infamy. We see that the time needed is comparable, but the new algorithm needs to explore less states. This is the case because with the new method  when building the finite truncation we have more control into which direction we explore. In contrast, the FSP explores the model into all directions at the same time, until enough precision is reached. When we use the shortcut stop (Advanced + AP), we will not explore states further in which ¬W holds. When exploring the model, for larger time bounds there is some point at which there are almost only states of which all successors have been completely explored and states for which ¬W holds. Thus, the maximal number of states to be explored is constant with this optimization, except for very large time bounds.

Results of the gene expression case study.

Table 4: Results of the gene expression case study.

Results for the gene expression case study (sub-formula).

Table 5: Results of the gene expression case study (sub-formula).

For the protein synthesis, there is almost no difference in the number of states needed by the new method and FSP. Because it has only one infinite variable, there is just one direction to be explored. Thus, the new method performs worse, as it needs more effort to explore into this direction.

Using the shortcut method also allows us to handle the formula P>0.9( ♢ ¬W ), involving the unbounded until operator. Using a precision of 1e−6, we needed a total time of 1.5 seconds, reached 974 states and obtained a reachability probability almost 1.0. Using a precision of 1e−10, we needed 3.7 seconds and explored 1823 states. For the computation of unbounded until probabilities, efficient specialized algorithms were used, which explains that we needed less time than for some of the time bounded experiments.

Exclusive Switch [4]: exclusive-switch.tar.gz

The exclusive switch is a gene regulatory network with one promotor region shared by two genes. That promotor region can either be unbound (G = 1,G.P1 = 0,G.P2 = 0) or bound by a protein expressed by gene 1 (G = 0,G.P1 = 1,G.P2 = 0) or gene 2 (G = 0,G.P1 = 0,G.P2 = 1). If the promotor is unbound, both proteins are expressed, each with rate ρ = 0.05, otherwise only the protein that is currently bound to the promotor is produced at rate ρ. Proteins degrade at rate δ = 0.005. Binding happens at rate λ = 0.01 and unbinding at rate μ = 0.008. The transition class structure is given in Table 6.

Transition classes of the exclusive switch case study.

Table 6: Transition classes of the exclusive switch case study.

This system has two attractor regions, i.e., two spatial maxima in the steady state probability distribution over the protein levels, one at P1 = 10, P2 = 0 and the other one at P1 = 0, P2 = 10. We are interested in the switching time between these two regions. For this, we estimate the time needed for a 90%-quantile of the steady state probability mass of one of the two attractors to reach the other attractor region. More precisely, let

start := sqrt((P1-10)^2 + P2^2) ≤ 4
end := sqrt(P1^2 + (P2-10)^2)  ≤ 4.

Then, the formula to check is

S>p ( P>0.9( ♢[o,t] end ) | start ).

Note that since the model is symmetric we only have to check one formula from one attractor to the other. The corresponding results are depicted in Table 7.

Results of the exclusive switch case study.

Table 7: Results of the exclusive switch case study.

From these results we may conclude that in half of the cases, most likely the switching time between the attractor regions is at most 7800 time units, while in almost all cases the switching time is most likely below 8000 time units, assuming the system has stabilized to a steady state.

[1] Gross,P.J.E.,Peccoud,J.:Quantitative modeling of stochastic systems in molecular biology by using stochastic Petri nets. Proc. Natl. Acad. Sci. 95, 6750–6755 (1998)
[2] Thattai, M., van Oudenaarden, A.: Intrinsic noise in gene regulatory networks. Proceedings of the National Academy of Science, USA 98(15), 8614–8619 (2001)
[3] Munsky, B., Khammash, M.: The finite state projection algorithm for the solution of the chemical master equation. Journal of Chemical Physics 124(044104) (2006)
[4] Loinger,A.,Lipshtat,A.,Balaban,N.Q.,Biham,O.: Stochastic simulations of genetics witch systems. Physical Review E 75(2), 021904 (2007)