Biomodeling and Genetic Programming

One of my main research interests is the computational modeling and simulation of biological networks. In particular, my students and I have been applying genetic programming to the modeling of stochastic bio-networks.

1. Genetic Programming and the Stochastic Pi-Calculus

Preliminary research explored the feasibility of using the stochastic pi-calculus as a target language for genetic programming. This would enable genetic programming to evolve stochastic models denoted in the stochastic pi-calculus or, equivalently, to construct stochastic pi-calculus expressions automatically. Initial work examined simple stochastic pi-calculus models that had monotonic time-series behaviours. Later, basic circuits such as the repressilator were successfully evolved. Evolving these circuits required statistical characterizations of the time series (see Janine Imada's research in (2) below), which the genetic programming system uses during fitness evaluation. In addition, multi-objective fitness evaluation was found to be well suited to this problem, since many diverse statistics may be needed to describe a target network's behaviour.
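As a rough illustration of multi-objective fitness evaluation, the sketch below compares candidate models by Pareto dominance over a vector of per-statistic errors (lower is better). Pareto dominance is one common multi-objective scheme and is assumed here purely for illustration; the work described above does not necessarily use it. The function names and the example error vectors are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if error vector a is no worse than b on every
    objective and strictly better on at least one (lower is better)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical error vectors, one per candidate model. Each entry is the
# distance between one statistic of the candidate and of the target.
errors = [(1.0, 2.0), (2.0, 1.0), (2.0, 2.0), (0.5, 3.0)]
print(pareto_front(errors))  # prints [0, 1, 3]
```

Candidate 2 is dropped because candidate 0 matches or beats it on both statistics. The other three trade one statistic off against the other, so none of them can be ranked above the rest without weighting the objectives.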


2. Evolutionary Synthesis of Stochastic Gene Network Models using Feature-based Search Spaces

Janine Imada's MSc research involves the automatic synthesis of stochastic bio-networks using genetic programming. A gene gate language proposed by Blossey et al. is used as the target language for genetic programming. These gene circuits are implemented in the stochastic pi-calculus (Phillips 2008), which is a stochastic process algebra. The gene gate language is more amenable to evolution by genetic programming than models written in the raw stochastic pi-calculus (see (1) above). Bio-models are characterized by time-course data, which record the varying quantities of agents over time. Because the models are stochastic, each one must be simulated multiple times. The overall network behaviour is then characterized by various statistical features of the time-series output. Genetic programming uses these statistical features as objectives during evolution, in an attempt to construct a bio-model whose features are similar to those of the target model. A number of experiments successfully reconstructed target models, including repressilators and gene circuit examples.
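The idea of summarizing repeated stochastic simulations by statistical features can be sketched as follows. This is a minimal illustration under stated assumptions, not the system described above: a simple birth-death process simulated with Gillespie's algorithm stands in for a gene gate model, the three features (mean, standard deviation, late-run mean) are hypothetical choices, and all function names and rate values are invented for the example.

```python
import random
import statistics
from typing import Dict, List


def simulate_birth_death(k_prod: float, k_deg: float, n_samples: int,
                         dt: float, rng: random.Random) -> List[int]:
    """Gillespie simulation of 0 -> X (rate k_prod) and X -> 0 (rate k_deg * X),
    starting from X = 0 and sampled at times 0, dt, 2*dt, ..."""
    t, x = 0.0, 0
    samples: List[int] = []
    while len(samples) < n_samples:
        total = k_prod + k_deg * x           # total reaction propensity
        t_next = t + rng.expovariate(total)  # time of the next reaction
        # The state is constant until t_next, so record it at each grid point.
        while len(samples) < n_samples and len(samples) * dt < t_next:
            samples.append(x)
        t = t_next
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * total < k_prod:
            x += 1
        else:
            x -= 1
    return samples


def features(series: List[int]) -> Dict[str, float]:
    """Hypothetical statistical features of a single time series."""
    second_half = series[len(series) // 2:]
    return {
        "mean": statistics.fmean(series),
        "stdev": statistics.pstdev(series),
        "late_mean": statistics.fmean(second_half),
    }


def averaged_features(k_prod: float, k_deg: float, runs: int,
                      n_samples: int, dt: float, seed: int) -> Dict[str, float]:
    """Average each feature over several independent simulation runs."""
    rng = random.Random(seed)
    per_run = [features(simulate_birth_death(k_prod, k_deg, n_samples, dt, rng))
               for _ in range(runs)]
    return {name: statistics.fmean([f[name] for f in per_run])
            for name in per_run[0]}


def feature_errors(candidate: Dict[str, float],
                   target: Dict[str, float]) -> List[float]:
    """One objective per feature: absolute difference from the target,
    ordered by feature name."""
    return [abs(candidate[name] - target[name]) for name in sorted(target)]


# Illustrative target and candidate; the rate constants are made up.
target = averaged_features(10.0, 1.0, runs=20, n_samples=200, dt=0.1, seed=1)
candidate = averaged_features(2.0, 1.0, runs=20, n_samples=200, dt=0.1, seed=2)
print(feature_errors(candidate, target))
```

Each entry of the resulting error vector can serve as one objective for the evolutionary search, so a candidate is judged on how closely every feature matches the target rather than on a single combined score.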


3. Evolving Higher-Level Bio-Models with Genetic Programming

Kahramanogullari and Cardelli (2009) propose a higher-level bio-modeling language called PIM. PIM permits the description of biological reactions in terms familiar to biologists. PIM models are then translated into stochastic pi-calculus (SPI) expressions, and can then be simulated with a SPI interpreter.

I investigated the automatic generation of PIM models, with genetic programming using PIM as the target language for evolution. Time series of simulation behaviours are used to guide GP towards models having the desired characteristics. PIM-specific language constraints were necessary in order to sensibly constrain and optimize the search space of bio-model networks being evolved. The results were positive: a number of target PIM systems were successfully reverse engineered with genetic programming. Furthermore, the results showed that alternative models whose behaviour is similar to that of the target system can arise. In this way, genetic programming can be a tool for model invention and discovery.

Current research is investigating other bio-modeling languages, such as stochastic logic gate languages. Stay tuned!
