# CADLIVE (Computer-Aided Design of LIVing systEms) Project

# INTRODUCTION

Our goals are to explore bioalgorithms (design principles), that is, the fundamental mechanisms by which a biochemical network generates particular cellular functions, and to design cells based on such bioalgorithms at the molecular interaction level. Bioalgorithms form the basis of design engineering for cellular functions.

References regarding the CADLIVE Project

# Research

## 1 Computational tools for analyzing and designing biochemical networks (CADLIVE)

### 1.1 Definition of graphical notation for biochemical networks

We propose the CADLIVE graphical notation and compare it with other proposals.

(See Extended CADLIVE and its supplementary data)

### 1.2 Dynamic Simulator: Automatic conversion from a biochemical network to dynamic simulation

We first propose rules by which a complicated biochemical network is automatically converted into a dynamic model.
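The idea of converting a network description into a dynamic model can be illustrated with a minimal sketch. It assumes plain mass-action kinetics and a hypothetical reaction-list format; it is not the CADLIVE conversion rule set, and the names `build_rhs`, `simulate` and `network` are invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

# A reaction is (reactants, products, rate constant). Hypothetical format.
Reaction = Tuple[List[str], List[str], float]


def build_rhs(reactions: List[Reaction]) -> Callable[[Dict[str, float]], Dict[str, float]]:
    """Turn a reaction list into a function returning d[species]/dt."""
    species = sorted({s for r, p, _ in reactions for s in r + p})

    def rhs(conc: Dict[str, float]) -> Dict[str, float]:
        dxdt = {s: 0.0 for s in species}
        for reactants, products, k in reactions:
            # Mass-action rate (an assumption): k times the reactant concentrations.
            rate = k
            for s in reactants:
                rate *= conc[s]
            for s in reactants:
                dxdt[s] -= rate
            for s in products:
                dxdt[s] += rate
        return dxdt

    return rhs


def simulate(rhs, conc, dt, steps):
    """Integrate with the explicit Euler method (crude, for illustration only)."""
    conc = dict(conc)
    for _ in range(steps):
        d = rhs(conc)
        for s in conc:
            conc[s] += dt * d[s]
    return conc


# Toy network: A + B -> AB (binding) and AB -> A + B (dissociation).
network = [(["A", "B"], ["AB"], 1.0), (["AB"], ["A", "B"], 0.5)]
rhs = build_rhs(network)
final = simulate(rhs, {"A": 1.0, "B": 1.0, "AB": 0.0}, dt=0.01, steps=2000)
```

The point of the sketch is that the differential equations are never written by hand: they follow mechanically from the network description, so editing the map changes the model.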

Computational tools for optimization, system analysis, S-system and grid computing are available.

Go to a summary of CADLIVE

## 2 Dynamic model construction

### 2.1 Forward and reverse engineering

Generally, there are two approaches to building models of molecular systems: reverse engineering and forward engineering. Highly abstract models generally employ reverse engineering, whereas concrete models adopt forward engineering. Reverse engineering typically relies on simplistic parametric models of a large-scale network, e.g., Bayesian networks and Boolean networks, whose parameters are adjusted to fit real-world data. In forward engineering, a dynamic model is built from detailed molecular interactions with exact kinetic parameters to achieve biological realism. This requires extensive knowledge of the system being studied.

Ideally, we would like to gain access to the activities of all important molecular species, including complexes and modified molecules. There is a strong need for methods that can handle concrete and complicated molecular systems at an intermediate level without going all the way down to exact biochemical reactions. One solution is to combine forward engineering and reverse engineering. Forward engineering builds mathematical models with kinetic-related parameters from biochemical maps, and reverse engineering explores the kinetic parameters to fit experimental data. From this viewpoint, the model focuses on capturing the intrinsic architecture of molecular networks rather than their detailed kinetics, and gene regulatory and metabolic network maps play a central role in simulating their dynamics. Research on going from "biochemical maps to dynamics" is a promising field.

Cited from

### 2.2 Parameter estimation (Inverse problem)

#### 2.2.1 Two-phase search algorithm

Dynamic simulations are essential for understanding how biochemical networks generate properties that are robust to environmental stresses or genetic changes. However, typical dynamic modeling and analysis yield only local properties for a particular choice of plausible kinetic parameter values, because it is hard to measure the exact values *in vivo*. Global and firm analyses are needed that consider how changes in parameter values affect the results. A typical solution is to systematically analyze the dynamic behaviors over a large parameter space by searching all plausible parameter values without bias. However, a random search needs an enormous number of trials to obtain such parameter values, while ordinary evolutionary searches obtain plausible parameters swiftly but are biased. To overcome these problems, we propose a two-phase search method that consists of a random search and an evolutionary search to effectively explore all possible solution vectors of kinetic parameters satisfying the target dynamics. We demonstrate that the proposed method enables an unbiased and high-speed parameter search for dynamic models of biochemical networks through its application to several benchmark functions and to the heat shock response model.
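A minimal sketch of the two-phase idea follows, under stated assumptions. The objective (`score`), the tolerances, the log-uniform sampling range and the (1+1) evolution strategy used in the second phase are all hypothetical choices for illustration, not the published algorithm. The toy target is the steady state of dx/dt = k1 - k2*x, which is satisfied by a whole line of parameter vectors.

```python
import math
import random


def score(params):
    """Toy objective: the steady state k1/k2 of dx/dt = k1 - k2*x should equal 2.0."""
    k1, k2 = params
    return abs(k1 / k2 - 2.0)


def random_phase(rng, n_samples, loose_tol, low=1e-2, high=1e2):
    """Phase 1: sample uniformly in log-space and keep roughly plausible vectors."""
    seeds = []
    for _ in range(n_samples):
        p = [10 ** rng.uniform(math.log10(low), math.log10(high)) for _ in range(2)]
        if score(p) < loose_tol:
            seeds.append(p)
    return seeds


def evolutionary_phase(rng, seed, tight_tol, max_iter=5000, sigma=0.05):
    """Phase 2: refine one seed with a (1+1) evolution strategy in log-space."""
    best, best_score = list(seed), score(seed)
    for _ in range(max_iter):
        if best_score < tight_tol:
            break
        # Multiplicative log-normal mutation keeps parameters positive.
        child = [v * math.exp(rng.gauss(0.0, sigma)) for v in best]
        child_score = score(child)
        if child_score <= best_score:
            best, best_score = child, child_score
    return best, best_score


rng = random.Random(0)
seeds = random_phase(rng, n_samples=2000, loose_tol=1.0)
solutions = [evolutionary_phase(rng, s, tight_tol=1e-3) for s in seeds]
accepted = [p for p, sc in solutions if sc < 1e-3]
```

Because every refined solution starts from an independently drawn random seed, the accepted vectors stay spread along the solution set instead of collapsing onto one region, which is the motivation for placing a random phase in front of the evolutionary one.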

## 3 System analysis

### 3.1 Module-based analysis of robustness

Biological systems maintain phenotypic stability in the face of various perturbations arising from environmental changes, stochastic fluctuations, and genetic variations. This robustness, which seems to be an inherent property of such systems, is still poorly understood at the molecular level. At the same time, systems approaches that were used with great success in the study and design of complex engineered systems provide a unique opportunity for investigating the basic tenets of robustness in cellular mechanisms. This is motivated by the fact that at the system level, biology and engineering seem to have a large number of common features despite their extremely different physical implementations.

The heat shock response is one such robust cellular system, which interestingly achieves its seemingly simple objective of refolding or eliminating heat-denatured proteins through a complicated set of interactions. In analogy to engineering control architectures, the complex regulation strategies seem to be a specifically designed solution to generate robustness against different types of perturbations.

Cited from

**Heat shock response**

Using module-based analysis coupled with rigorous mathematical comparisons, we propose that in analogy to control engineering architectures, the complexity of cellular systems and the presence of hierarchical modular structures can be attributed to the necessity of achieving robustness in the heat shock response.

**Hierarchical modular architecture**

### 3.2 Mathematical Analysis of Robustness (MAR): Beyond parameter problems

Dynamic simulations are necessary for understanding the mechanism of how biochemical networks generate robust properties to environmental stresses or genetic changes. Sensitivity analysis of mathematical models allows the linking of robustness to network structure. However, ordinary numerical analysis yields only local properties regarding a particular choice of plausible parameter values, because it is hard to know the exact parameter values in vivo. We need global and firm results that do not depend on particular parameter values.

We propose a mathematical analysis for robustness (MAR) that combines sensitivity analysis with novel evolutionary searches that explore many solution vectors of kinetic parameters, thereby determining critical reactions. We analyzed the sensitivity of amplitudes and periods to changes in kinetic parameters in the *Drosophila* interlocked circadian clock system and clearly identified the critical reactions responsible for determining the circadian cycle. This work suggests that the circadian clock has intensively evolved, or designed, its kinetic parameters so as to create a highly robust cycle.
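The combination of sensitivity analysis with many parameter vectors can be sketched as follows. This is an illustration under assumptions, not the MAR implementation: the output function `period` is a closed-form stand-in (the small-amplitude period of a Lotka-Volterra oscillator, 2π/√(a·c)) for what would normally be measured from a simulation, and the three parameter sets are invented.

```python
import math


def period(params):
    """Stand-in for a simulated output: small-amplitude Lotka-Volterra period."""
    return 2.0 * math.pi / math.sqrt(params["a"] * params["c"])


def log_sensitivity(output, params, name, rel_step=1e-4):
    """Central finite-difference estimate of d ln(output) / d ln(parameter)."""
    up = dict(params, **{name: params[name] * (1.0 + rel_step)})
    down = dict(params, **{name: params[name] * (1.0 - rel_step)})
    return (math.log(output(up)) - math.log(output(down))) / (
        math.log(1.0 + rel_step) - math.log(1.0 - rel_step)
    )


def mean_abs_sensitivity(output, param_sets):
    """Average |sensitivity| of each parameter over many solution vectors."""
    names = list(param_sets[0])
    return {
        n: sum(abs(log_sensitivity(output, p, n)) for p in param_sets) / len(param_sets)
        for n in names
    }


# Hypothetical solution vectors, e.g. as returned by a parameter search.
param_sets = [
    {"a": 1.0, "b": 0.3, "c": 2.0},
    {"a": 0.2, "b": 5.0, "c": 0.7},
    {"a": 4.0, "b": 1.0, "c": 0.1},
]
ranking = mean_abs_sensitivity(period, param_sets)
```

Parameters whose averaged sensitivity stays high across all solution vectors are candidates for critical reactions; in this toy case `a` and `c` control the period while `b` does not, regardless of which vector is used.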

Cited from

## 4 Computer-Aided Rational Design (CARD)

The goals of systems biology are to understand the mechanisms of how biochemical networks generate particular cellular functions in response to environmental stresses or genetic changes, and to rationally design these molecular processes to meet an engineering purpose. To design biological systems at the molecular interaction level, it is essential to identify a biochemical network map, to build a dynamic model of the system, and to perform system analysis. Perturbation analysis is useful for identifying critical parameters that affect the system's performance. CAD is now a key technology to simulate or design the molecular architecture of a genetically engineered cell.

To rationally design a biochemical network, we propose a Computer-Aided Design (CAD) based strategy that consists of biochemical network design, module decomposition analysis, perturbation analysis for a dynamic model and experimental verification.

Assuming that the *E. coli* glucose phosphotransferase system (PTS) aims at controlling the glucose uptake rate, the PTS network model was decomposed into hierarchical modules in analogy to engineering control architectures, and the effect of changes in gene expression on the glucose uptake rate was simulated to plan how the gene regulatory network should be engineered. Such design and analysis predicted that the *mlc* knockout mutant with *ptsI* gene overexpression greatly increases the specific glucose uptake rate, and biological experiments validated the prediction, thereby demonstrating the feasibility of the proposed strategy.

Cited from

## 5 Mathematical tools for metabolic flux analysis

We propose various mathematical methods to construct and analyze large-scale metabolic networks.

### 5.1 Integration of proteome data into metabolic flux analysis

Network-based pathway analysis facilitates understanding or designing metabolic systems and enables prediction of metabolic flux distributions. Network-based flux analysis requires considering not only pathway architectures but also the proteome or transcriptome to predict flux distributions, because recombinant microbes significantly change the distribution of gene expressions. The current problem is how to integrate such heterogeneous data to build a network-based model.

To link enzyme activity data to flux distributions of metabolic networks, we have proposed Enzyme Control Flux (ECF), a novel model that integrates enzyme activity into elementary mode analysis (EMA). ECF presents a power-law formula describing how changes in enzyme activities between the wild type and a mutant are related to changes in the elementary mode coefficients (EMCs). To validate the feasibility of ECF, we integrated enzyme activity data into the EMCs of wild-type *Escherichia coli* and *Bacillus subtilis*. The ECF model effectively uses an enzyme activity profile to estimate the flux distribution of the mutants, and increasing the number of incorporated enzyme activities decreases the model error of ECF.
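As a rough illustration of a power-law link between enzyme activities and EMCs, the sketch below scales each wild-type EMC by the relative activities (mutant over wild type) of the enzymes that its elementary mode uses, raised to an exponent `beta`. This functional form, the exponent, the toy network and all names are assumptions made for illustration and may differ from the published ECF formula; any normalization step (for example, to a measured uptake rate) is omitted.

```python
def ecf_mutant_emcs(wt_emcs, modes, rel_activity, beta=1.0):
    """Scale each wild-type EMC by the relative activities of the enzymes its mode uses."""
    mutant = []
    for emc, mode in zip(wt_emcs, modes):
        factor = 1.0
        for reaction, coeff in mode.items():
            if coeff != 0:
                # Enzymes without measured data default to unchanged activity (1.0).
                factor *= rel_activity.get(reaction, 1.0) ** beta
        mutant.append(emc * factor)
    return mutant


def flux_distribution(emcs, modes):
    """Flux of each reaction = sum over modes of EMC times the mode coefficient."""
    fluxes = {}
    for emc, mode in zip(emcs, modes):
        for reaction, coeff in mode.items():
            fluxes[reaction] = fluxes.get(reaction, 0.0) + emc * coeff
    return fluxes


# Hypothetical toy network: one uptake reaction feeding two alternative branches.
modes = [
    {"uptake": 1.0, "branch1": 1.0},
    {"uptake": 1.0, "branch2": 1.0},
]
wt_emcs = [0.6, 0.4]
# Hypothetical mutant: branch2 enzyme knocked out, branch1 enzyme doubled.
rel_activity = {"branch1": 2.0, "branch2": 0.0}
mutant_emcs = ecf_mutant_emcs(wt_emcs, modes, rel_activity)
mutant_flux = flux_distribution(mutant_emcs, modes)
```

In this toy case the mode through the deleted enzyme is switched off entirely and all remaining flux is carried by the other mode, which is the qualitative behavior one expects from weighting elementary modes by enzyme activity.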

Cited from

### 5.2 Elementary mode-based prediction of a broad range of genetically modified mutants

Gene deletion and over-expression are critical technologies for designing or improving the metabolic flux distribution of microbes. Some algorithms including flux balance analysis (FBA) and minimization of metabolic adjustment (MOMA) predict a flux distribution from a stoichiometric matrix in the mutants in which some metabolic genes are deleted or non-functional, but there are few algorithms that predict how a broad range of genetic modifications, such as over-expression and under-expression of metabolic genes, alters the phenotypes of the mutants at the metabolic flux level.

To overcome these limitations, we developed a novel algorithm, based on elementary mode analysis, that predicts the flux distribution of mutants with a broad range of genetic modifications. It is denoted Genetic Modification of Flux (GMF) and couples two algorithms that we have developed: Modified Control Effective Flux (mCEF) and Enzyme Control Flux (ECF). mCEF is based on CEF and estimates the gene expression patterns in genetically modified mutants in terms of specific biological functions. GMF is demonstrated to predict the flux distribution not only of gene deletion mutants but also of mutants with under-expressed and over-expressed genes in *Escherichia coli* and *Corynebacterium glutamicum*. This represents a breakthrough in the a priori flux prediction of a broad range of genetically modified mutants.

Cited from:

### 5.3 Maximum entropy principle (MEP) for a new objective function

Elementary Mode (EM) analysis is potentially
effective in integrating transcriptome or proteome data into metabolic network
analyses and in exploring the mechanism of how phenotypic or metabolic flux
distribution is changed with respect to environmental and genetic
perturbations. The EM Coefficients (EMCs) indicate the quantitative
contribution of their associated EMs and can be estimated by maximizing

Cited from:

## 6 Statistical analysis of genome-scale networks

### 6.1 Spectral clustering for protein-protein interaction networks

A goal of systems biology is to analyze large-scale molecular networks including gene expressions and protein-protein interactions, revealing the relationships between network structures and their biological functions. Dividing a protein-protein interaction (PPI) network into naturally grouped parts is an essential way to investigate the relationship between topology of networks and their functions. However, clear modular decomposition is often hard due to the heterogeneous or scale-free properties of PPI networks.

To address this problem, we propose a diffusion model-based spectral clustering algorithm, which analytically solves for the cluster structure of a PPI network by treating it as a random-walk (diffusion) problem on the network. To cope with the heterogeneity of the networks, a power factor is introduced that adjusts the diffusion matrix by weighting the transition (adjacency) matrix according to a node degree matrix. This algorithm is named adjustable diffusion matrix-based spectral clustering (ADMSC). To demonstrate the feasibility of ADMSC, we apply it to the decomposition of a yeast PPI network, identifying biologically significant clusters of approximately equal size. Compared with other established algorithms, ADMSC facilitates clear and fast decomposition of PPI networks.

ADMSC is proposed by introducing the power factor that adjusts the diffusion matrix to the heterogeneity of the PPI networks. ADMSC effectively partitions PPI networks into biologically significant clusters of almost equal size, while being very fast, robust and appealingly simple.
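The role of a degree-dependent power factor in spectral clustering can be sketched as below. This is a simplified illustration, not ADMSC itself: the weighting `W = D^-beta A D^-beta`, the sign-based two-way split and the power-iteration eigen-solver are assumptions chosen to keep the example dependency-free, and the graph is assumed to have no isolated nodes.

```python
import math


def diffusion_matrix(adj, beta):
    """Weight the adjacency matrix by node degree: W = D^-beta A D^-beta."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / (deg[i] ** beta * deg[j] ** beta) for j in range(n)]
            for i in range(n)]


def top_eigvec(mat, orth=(), iters=2000):
    """Power iteration on a symmetric matrix, kept orthogonal to the `orth` vectors."""
    n = len(mat)
    v = [math.sin(i + 1.0) for i in range(n)]  # deterministic, generic start vector
    for _ in range(iters):
        for u in orth:
            dot = sum(a * b for a, b in zip(v, u))
            v = [a - dot * b for a, b in zip(v, u)]
        w = [sum(mat[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def bipartition(adj, beta=0.5):
    """Split nodes in two by the sign of the second eigenvector of W."""
    w = diffusion_matrix(adj, beta)
    n = len(w)
    # Gershgorin shift: makes all eigenvalues non-negative so that power
    # iteration converges to the algebraically largest eigenvalues.
    shift = max(sum(abs(x) for x in row) for row in w)
    shifted = [[w[i][j] + (shift if i == j else 0.0) for j in range(n)] for i in range(n)]
    first = top_eigvec(shifted)
    second = top_eigvec(shifted, orth=(first,))
    return [0 if x >= 0 else 1 for x in second]


# Toy graph: two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0.0] * 6 for _ in range(6)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1.0
labels = bipartition(adj, beta=0.5)
```

With `beta=0.5` the weighting reduces to the familiar symmetric normalization of the adjacency matrix; other values of `beta` change how strongly hub nodes are discounted, which is the kind of adjustment the power factor is meant to provide for heterogeneous networks.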

Cited from

## 7 Biological Functional Networks

In synthetic biology and systems biology, a bottom-up approach can be used to construct a complex, modular, hierarchical structure of biological networks. To analyze or design such networks, it is critical to understand the relationship between network structure and function, that is, the mechanism through which biological parts or biomolecules are assembled into building blocks or functional networks. A functional network is defined as a subnetwork of biomolecules that performs a particular function. Understanding how functional networks are built would help develop a methodology for analyzing the structure of large-scale networks and for designing a robust biological circuit that performs a target function. We propose a biological functional network database, named BioFNet, which can cover the whole cell at the level of molecular interactions. An advantage of BioFNet is that it implements simulation programs for the mathematical models of the functional networks and visualizes the simulated results. It presents a sound basis for the rational design of biochemical networks and for understanding how functional networks are assembled to create complex, high-level functions, which would reveal the design principles underlying molecular architectures.

Cited from

BioFNet: biological functional network database for analysis and synthesis of biological systems