Our goals are to explore bioalgorithms (design principles), that is, the fundamental mechanisms by which a biochemical network generates particular cellular functions, and to design cells based on such bioalgorithms at the molecular interaction level. Bioalgorithms create Design Engineering for Cellular Functions.
References regarding CADLIVE Project
We propose the CADLIVE graphical notation and compare it with other proposals
(See Extended CADLIVE and its supplementary data)
We first propose rules by which a complicated biochemical network is automatically converted into a dynamic model.
Computational tools for optimization, system analysis, S-system and grid computing are available.
Generally, there are two approaches to building models of molecular systems: reverse engineering and forward engineering. Very abstract models generally employ reverse engineering, whereas concrete models adopt forward engineering. Reverse engineering typically requires simplistic parametric models of a large-scale network, e.g., Bayesian networks and Boolean networks, whose parameters are adjusted to fit real-world data. In forward engineering, a dynamic model is built from detailed molecular interactions with exact kinetic parameters to achieve biological reality. This requires extensive knowledge of the system being studied.
Ideally, we would like to gain access to the activities of all important molecular species, including complexes and modified molecules. There is a strong need for methods that can handle concrete and complicated molecular systems at an intermediate level without going all the way down to exact biochemical reactions. One solution is to combine forward engineering and reverse engineering. Forward engineering builds mathematical models with kinetic-related parameters from biochemical maps, and reverse engineering explores the kinetic parameters to fit experimental data. From this viewpoint, the model focuses on capturing the intrinsic architecture of molecular networks rather than their detailed kinetics, and gene regulatory and metabolic network maps play a central role in simulating their dynamics. The research of "biochemical maps to dynamics" is a promising field.
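As a rough illustration of the "biochemical maps to dynamics" idea, the sketch below turns a list of reactions into rate equations and integrates them. It assumes plain mass-action kinetics and a fixed-step Euler integrator, which are simplifications chosen for brevity and not the CADLIVE conversion rules; the function names and the example reactions are hypothetical.

```python
def mass_action_rhs(reactions, conc):
    """Return d[species]/dt for a list of (reactants, products, k) reactions.

    Assumption: every reaction follows mass-action kinetics, so its rate is
    k times the product of its reactant concentrations.
    """
    rates = {species: 0.0 for species in conc}
    for reactants, products, k in reactions:
        v = k
        for species in reactants:
            v *= conc[species]
        for species in reactants:
            rates[species] -= v
        for species in products:
            rates[species] += v
    return rates


def simulate(reactions, conc, dt=0.001, steps=10000):
    """Integrate the rate equations with a fixed-step explicit Euler scheme."""
    state = dict(conc)
    for _ in range(steps):
        derivs = mass_action_rhs(reactions, state)
        for species in state:
            state[species] += dt * derivs[species]
    return state


# Hypothetical map: reversible binding A + B <-> AB.
BINDING_MAP = [
    (("A", "B"), ("AB",), 1.0),   # association
    (("AB",), ("A", "B"), 0.5),   # dissociation
]
```

Calling `simulate(BINDING_MAP, {"A": 1.0, "B": 1.0, "AB": 0.0})` relaxes toward the binding equilibrium while conserving the total amount of A. In a combined forward and reverse engineering workflow, the rate constants would then be the quantities explored to fit experimental data.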
Cited from
2.2.1
Two-phase search algorithm
Dynamic simulations are essential for understanding the mechanism of how biochemical networks generate robust properties to environmental stresses or genetic changes. However, typical dynamic modeling and analysis yield only local properties regarding a particular choice of plausible values of kinetic parameters, because it is hard to measure the exact values in vivo. Global and firm analyses are needed that consider how the changes in parameter values affect the results. A typical solution is to systematically analyze the dynamic behaviors in large parameter space by searching all plausible parameter values without any biases. However, a random search needs an enormous number of trials to obtain such parameter values. Ordinary evolutionary searches swiftly obtain plausible parameters but the searches are biased. To overcome these problems, we propose the two-phase search method that consists of a random search and an evolutionary search to effectively explore all possible solution vectors of kinetic parameters satisfying the target dynamics. We demonstrate that the proposed method enables a nonbiased and high-speed parameter search for dynamic models of biochemical networks through its applications to several benchmark functions and to the heat shock response model.
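The combination of a random search and an evolutionary search can be sketched as follows. This is a minimal illustration under stated assumptions, not the published algorithm: the objective `error` is a toy function whose solutions lie on a ring, the random phase keeps samples that meet a loose tolerance, and the evolutionary phase is a simple (1+1) evolution strategy that refines each kept sample to a strict tolerance. All names, tolerances and bounds are hypothetical.

```python
import random


def error(params):
    """Toy objective with many solutions: distance of x^2 + y^2 from 1."""
    x, y = params
    return abs(x * x + y * y - 1.0)


def random_phase(n_samples, loose_tol, bounds, rng):
    """Phase 1: unbiased uniform sampling; keep points meeting a loose tolerance."""
    lo, hi = bounds
    seeds = []
    for _ in range(n_samples):
        candidate = [rng.uniform(lo, hi) for _ in range(2)]
        if error(candidate) < loose_tol:
            seeds.append(candidate)
    return seeds


def evolutionary_phase(seed, strict_tol, rng, sigma=0.05, max_gen=200):
    """Phase 2: (1+1) evolution strategy started from one phase-1 point."""
    best, best_err = list(seed), error(seed)
    for _ in range(max_gen):
        if best_err < strict_tol:
            break
        child = [v + rng.gauss(0.0, sigma) for v in best]
        child_err = error(child)
        if child_err <= best_err:  # keep the child only if it is no worse
            best, best_err = child, child_err
    return best, best_err


def two_phase_search(n_samples=2000, loose_tol=0.2, strict_tol=1e-3, seed=0):
    """Run both phases and return (phase-1 seeds, refined solutions)."""
    rng = random.Random(seed)
    seeds = random_phase(n_samples, loose_tol, (-2.0, 2.0), rng)
    solutions = []
    for start in seeds:
        best, err = evolutionary_phase(start, strict_tol, rng)
        if err < strict_tol:
            solutions.append(best)
    return seeds, solutions
```

Because each refinement starts from an independently sampled point and uses small mutation steps, the refined solutions tend to stay spread over the solution set instead of collapsing onto one region, which is the motivation for pairing the two phases.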
Biological systems maintain phenotypic stability in the face of various perturbations arising from environmental changes, stochastic fluctuations, and genetic variations. This robustness, which seems to be an inherent property of such systems, is still poorly understood at the molecular level. At the same time, systems approaches that were used with great success in the study and design of complex engineered systems provide a unique opportunity for investigating the basic tenets of robustness in cellular mechanisms. This is motivated by the fact that at the system level, biology and engineering seem to have a large number of common features despite their extremely different physical implementations.
The heat shock response is one such robust cellular system, which interestingly achieves its seemingly simple objective of refolding or eliminating heat-denatured proteins through a complicated set of interactions. In analogy to engineering control architectures, the complex regulation strategies seem to be a specifically designed solution to generate robustness against different types of perturbations.
Cited from
Heat shock response
Using module-based analysis coupled with rigorous mathematical comparisons, we propose that in analogy to control engineering architectures, the complexity of cellular systems and the presence of hierarchical modular structures can be attributed to the necessity of achieving robustness in the heat shock response.
Hierarchical modular architecture
Dynamic simulations are necessary for understanding the mechanism of how biochemical networks generate robust properties to environmental stresses or genetic changes. Sensitivity analysis of mathematical models allows the linking of robustness to network structure. However, ordinary numerical analysis yields only local properties regarding a particular choice of plausible parameter values, because it is hard to know the exact parameter values in vivo. We need global and firm results that do not depend on particular parameter values.
We propose a mathematical analysis for robustness (MAR) that combines sensitivity analysis with novel evolutionary searches that explore many solution vectors of kinetic parameters, thereby determining critical reactions. We analyze the sensitivity of amplitudes and periods to changes in kinetic parameters in the Drosophila interlocked circadian clock system and clearly identify the critical reactions responsible for determining the circadian cycle. This work suggests that the circadian clock has intensively evolved, or designed, its kinetic parameters so as to create a highly robust cycle.
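The idea of ranking parameters by sensitivity over many parameter vectors, rather than at a single point, can be sketched as below. This is an illustrative assumption-laden example, not the MAR implementation: the `period` function is a toy harmonic-oscillator formula standing in for a simulated clock model, the parameter `c` is a deliberately inert dummy, and the relative sensitivity is estimated by a central finite difference.

```python
import math


def period(params):
    """Toy stand-in for a simulated clock: T = 2*pi*sqrt(m/k); 'c' has no effect."""
    return 2.0 * math.pi * math.sqrt(params["m"] / params["k"])


def relative_sensitivity(f, params, name, rel_step=1e-4):
    """Central-difference estimate of d ln f / d ln p for one parameter."""
    up = dict(params)
    up[name] = params[name] * (1.0 + rel_step)
    down = dict(params)
    down[name] = params[name] * (1.0 - rel_step)
    return (f(up) - f(down)) / (2.0 * rel_step * f(params))


def rank_parameters(f, param_sets):
    """Average |sensitivity| over many parameter vectors; most critical first."""
    names = list(param_sets[0])
    mean_abs = {}
    for name in names:
        values = [abs(relative_sensitivity(f, p, name)) for p in param_sets]
        mean_abs[name] = sum(values) / len(values)
    return sorted(mean_abs.items(), key=lambda item: item[1], reverse=True)
```

For the toy formula the relative sensitivity of the period is +0.5 for `m` and -0.5 for `k` at every parameter vector, so both rank above the inert `c`. In a real analysis the parameter sets would come from the search over solution vectors, and `f` would wrap a numerical simulation that extracts the period or amplitude.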
Cited from
The goals of systems biology are to understand the mechanisms of how biochemical networks generate particular cellular functions in response to environmental stresses or genetic changes, and to rationally design these molecular processes to meet an engineering purpose. To design biological systems at the molecular interaction level, it is essential to identify a biochemical network map, to build a dynamic model of the system, and to perform system analysis. Perturbation analysis is useful for identifying critical parameters that affect the system's performance. CAD is now a key technology to simulate or design the molecular architecture of a genetically engineered cell.
To rationally design a biochemical network, we propose a Computer-Aided Design (CAD) based strategy that consists of biochemical network design, module decomposition analysis, perturbation analysis for a dynamic model and experimental verification.
Assuming that the E. coli glucose phosphotransferase system (PTS) aims at controlling the glucose uptake rate, the PTS network model was decomposed into hierarchical modules in analogy to engineering control architectures, and the effect of changes in gene expression on the glucose uptake rate was simulated to plan how the gene regulatory network should be engineered. This design and analysis predicted that the mlc knockout mutant with ptsI gene overexpression greatly increases the specific glucose uptake rate, and biological experiments validated the prediction, thereby demonstrating the feasibility of the proposed strategy.
Cited from
We propose various mathematical methods to construct and analyze large-scale metabolic networks.
Network-based pathway analysis facilitates understanding or designing metabolic systems and enables prediction of metabolic flux distributions. Network-based flux analysis requires considering not only pathway architectures but also the proteome or transcriptome to predict flux distributions, because recombinant microbes significantly change the distribution of gene expressions. The current problem is how to integrate such heterogeneous data to build a network-based model.
To link enzyme activity data to flux distributions of metabolic networks, we have proposed Enzyme Control Flux (ECF), a novel model that integrates enzyme activity into elementary mode analysis (EMA). ECF presents a power-law formula describing how changes in enzyme activities between the wild type and a mutant are related to changes in the elementary mode coefficients (EMCs). To validate the feasibility of ECF, we integrated enzyme activity data into the EMCs of wild-type Escherichia coli and Bacillus subtilis. The ECF model effectively uses an enzyme activity profile to estimate the flux distribution of the mutants, and increasing the number of incorporated enzyme activities decreases the model error of ECF.
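One way to picture a power-law link between enzyme activities and EMCs is sketched below. The exact ECF formula is not given here, so the form used is an assumption: each wild-type EMC is scaled by the product, over the reactions its elementary mode uses, of the mutant-to-wild-type activity ratio raised to a power `beta`. The function names, the exponent and the two-mode example network are all hypothetical.

```python
def enzyme_activity_profile(mode, rel_activity, beta=1.0):
    """Product of (mutant / wild-type activity) ** beta over reactions in one mode.

    Reactions without a measured ratio are assumed unchanged (ratio 1.0).
    """
    profile = 1.0
    for reaction, coeff in mode.items():
        if coeff != 0:
            profile *= rel_activity.get(reaction, 1.0) ** beta
    return profile


def mutant_emcs(modes, wt_emcs, rel_activity, beta=1.0):
    """Scale each wild-type EMC by the activity profile of its elementary mode."""
    return [
        emc * enzyme_activity_profile(mode, rel_activity, beta)
        for mode, emc in zip(modes, wt_emcs)
    ]


def flux_distribution(modes, emcs):
    """Flux of each reaction as the EMC-weighted sum of elementary modes."""
    flux = {}
    for mode, emc in zip(modes, emcs):
        for reaction, coeff in mode.items():
            flux[reaction] = flux.get(reaction, 0.0) + emc * coeff
    return flux


# Hypothetical network: one uptake reaction feeding two alternative branches.
MODES = [
    {"uptake": 1.0, "r1": 1.0},
    {"uptake": 1.0, "r2": 1.0},
]
WT_EMCS = [0.6, 0.4]
```

With `rel_activity = {"r1": 0.0}`, which models a knockout of the enzyme for `r1`, the first mode's coefficient drops to zero and all remaining flux runs through `r2`. Intermediate ratios model under-expression or over-expression in the same way.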
Cited from
Gene deletion and over-expression are critical technologies for designing or improving the metabolic flux distribution of microbes. Some algorithms including flux balance analysis (FBA) and minimization of metabolic adjustment (MOMA) predict a flux distribution from a stoichiometric matrix in the mutants in which some metabolic genes are deleted or non-functional, but there are few algorithms that predict how a broad range of genetic modifications, such as over-expression and under-expression of metabolic genes, alters the phenotypes of the mutants at the metabolic flux level.
To overcome these limitations, we develop a novel algorithm, based on elementary mode analysis, that predicts the flux distribution of mutants with a broad range of genetic modifications. It is denoted as Genetic Modification of Flux (GMF), which couples two algorithms that we have developed: Modified Control Effective Flux (mCEF) and Enzyme Control Flux (ECF). mCEF is proposed based on CEF to estimate the gene expression patterns in genetically modified mutants in terms of specific biological functions. GMF is demonstrated to predict the flux distribution of not only gene deletion mutants but also mutants with under-expressed and over-expressed genes in Escherichia coli and Corynebacterium glutamicum. This achieves a breakthrough in the a priori flux prediction of a broad range of genetically modified mutants.
Cited from:
Elementary Mode (EM) analysis is potentially effective in integrating transcriptome or proteome data into metabolic network analyses and in exploring the mechanism of how phenotypic or metabolic flux distribution is changed with respect to environmental and genetic perturbations. The EM Coefficients (EMCs) indicate the quantitative contribution of their associated EMs and can be estimated by maximizing
Cited from:
A goal of systems biology is to analyze large-scale molecular networks including gene expressions and protein-protein interactions, revealing the relationships between network structures and their biological functions. Dividing a protein-protein interaction (PPI) network into naturally grouped parts is an essential way to investigate the relationship between topology of networks and their functions. However, clear modular decomposition is often hard due to the heterogeneous or scale-free properties of PPI networks.
To address this problem, we propose a diffusion model-based spectral clustering algorithm, which analytically solves the cluster structure of PPI networks by treating it as a problem of random walks in a diffusion process on the network. To cope with the heterogeneity of the networks, a power factor is introduced to adjust the diffusion matrix by weighting the transition (adjacency) matrix according to a node degree matrix. This algorithm is named the adjustable diffusion matrix-based spectral clustering (ADMSC). To demonstrate the feasibility of ADMSC, we apply it to the decomposition of a yeast PPI network, identifying biologically significant clusters of approximately equal size. Compared with other established algorithms, ADMSC facilitates clear and fast decomposition of PPI networks.
ADMSC is proposed by introducing the power factor that adjusts the diffusion matrix to the heterogeneity of the PPI networks. ADMSC effectively partitions PPI networks into biologically significant clusters of almost equal size, while being very fast, robust and appealingly simple.
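The role of a degree-based power factor in a spectral split can be illustrated with the sketch below. It is not the ADMSC algorithm: the matrix form D^(-alpha) A D^(-alpha), the shifted power iteration used to obtain the two leading eigenvectors, and the sign-based bisection are all assumptions chosen so the example runs with the standard library only. All function names are hypothetical, and the graph is assumed to have no isolated nodes.

```python
import math


def diffusion_matrix(adj, alpha):
    """Assumed form: M[i][j] = A[i][j] / (deg_i**alpha * deg_j**alpha)."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [
        [adj[i][j] / ((deg[i] ** alpha) * (deg[j] ** alpha)) for j in range(n)]
        for i in range(n)
    ]


def _normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def leading_vectors(m, k=2, iters=500):
    """Top-k eigenvectors of a symmetric matrix by shifted power iteration."""
    n = len(m)
    # Gershgorin shift makes every eigenvalue of (m + shift*I) non-negative.
    shift = max(sum(abs(x) for x in row) for row in m)
    found = []
    for _ in range(k):
        v = _normalize([float(i + 1) for i in range(n)])
        for _ in range(iters):
            w = [sum(a * b for a, b in zip(row, v)) + shift * vi
                 for row, vi in zip(m, v)]
            for u in found:  # deflate eigenvectors that were already found
                proj = sum(a * b for a, b in zip(w, u))
                w = [wi - proj * ui for wi, ui in zip(w, u)]
            v = _normalize(w)
        found.append(v)
    return found


def bisect(adj, alpha=0.5):
    """Split nodes by the sign of the second leading eigenvector."""
    _, v2 = leading_vectors(diffusion_matrix(adj, alpha), k=2)
    left = {i for i, x in enumerate(v2) if x >= 0}
    return left, set(range(len(adj))) - left


def two_cliques(size=4):
    """Hypothetical test graph: two cliques joined by a single bridging edge."""
    n = 2 * size
    adj = [[0] * n for _ in range(n)]
    for block in (range(size), range(size, n)):
        for i in block:
            for j in block:
                if i != j:
                    adj[i][j] = 1
    adj[size - 1][size] = adj[size][size - 1] = 1
    return adj
```

On the two-clique graph, `bisect(two_cliques())` separates the two cliques. With `alpha = 0.5` the matrix reduces to the familiar symmetrically normalized adjacency matrix; other values of `alpha` change how strongly high-degree hubs are down-weighted, which is the kind of adjustment the power factor is meant to provide for heterogeneous networks.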
Cited from