
Using topology to tame the complex biochemistry of genetic networks.
MedLine Citation:
PMID:  23277605     Owner:  NLM     Status:  MEDLINE    
Living cells are controlled by networks of interacting genes, proteins and biochemicals. Cells use the emergent collective dynamics of these networks to probe their surroundings, perform computations and generate appropriate responses. Here, we consider genetic networks, interacting sets of genes that regulate one another's expression. It is possible to infer the interaction topology of genetic networks from high-throughput experimental measurements. However, such experiments rarely provide information on the detailed nature of each interaction. We show that topological approaches provide powerful means of dealing with the missing biochemical data. We first discuss the biochemical basis of gene regulation, and describe how genes can be connected into networks. We then show that, given weak constraints on the underlying biochemistry, topology alone determines the emergent properties of certain simple networks. Finally, we apply these approaches to the realistic example of quorum-sensing networks: chemical communication systems that coordinate the responses of bacterial populations.
Mukund Thattai
Publication Detail:
Type:  Journal Article; Research Support, Non-U.S. Gov't     Date:  2012-12-31
Journal Detail:
Title:  Philosophical transactions. Series A, Mathematical, physical, and engineering sciences     Volume:  371     ISSN:  1364-503X     ISO Abbreviation:  Philos Trans A Math Phys Eng Sci     Publication Date:  2013 Feb 
Date Detail:
Created Date:  2013-01-01     Completed Date:  2013-03-07     Revised Date:  2013-07-11    
Medline Journal Info:
Nlm Unique ID:  101133385     Medline TA:  Philos Trans A Math Phys Eng Sci     Country:  England    
Other Details:
Languages:  eng     Pagination:  20110548     Citation Subset:  IM    
National Centre for Biological Sciences, Tata Institute of Fundamental Research, UAS/GKVK Campus, Bellary Road, Bangalore 560065, India.
MeSH Terms
Computer Simulation
Gene Expression Regulation / physiology*
Models, Biological*
Models, Chemical*
Proteome / chemistry*,  metabolism*
Signal Transduction / physiology*
Grant Support
500103/Z/09/Z//Wellcome Trust

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

Full Text
Journal Information
Journal ID (nlm-ta): Philos Transact A Math Phys Eng Sci
Journal ID (iso-abbrev): Philos Transact A Math Phys Eng Sci
Journal ID (publisher-id): RSTA
Journal ID (hwp): roypta
ISSN: 1364-503X
ISSN: 1471-2962
Publisher: The Royal Society Publishing
Article Information
© 2012 The Author(s) Published by the Royal Society. All rights reserved.
Print publication date: Day: 13 Month: 2 Year: 2013
pmc-release publication date: Day: 13 Month: 2 Year: 2013
Volume: 371 Issue: 1984
E-location ID: 20110548
PubMed Id: 23277605
ID: 3538440
DOI: 10.1098/rsta.2011.0548
Publisher Id: rsta20110548

Using topology to tame the complex biochemistry of genetic networks
Alternate Title: Genetic networks
Mukund Thattai
National Centre for Biological Sciences, Tata Institute of Fundamental Research, UAS/GKVK Campus, Bellary Road, Bangalore 560065, India
Correspondence: e-mail:
One contribution of 17 to a Discussion Meeting Issue ‘Signal processing and inference for the physical sciences’.

1.  Introduction

Genes are physically embodied as a string of nucleotide bases (ATGGCCCTG…) on a self-replicating DNA molecule, contained within the cytoplasm of a prokaryote or the nucleus of a eukaryote. Genes encode proteins, which in turn carry out the processes required for the maintenance of cellular life. During the process of gene expression, the genetic information is first transcribed or copied onto a short-lived messenger RNA (mRNA) molecule. This mRNA is then translated repeatedly into a protein, as specified by the genetic code: a set of three consecutive nucleotides of mRNA uniquely specifies one of twenty possible amino acids, a series of which are strung together to form the protein (the short sequence above, for example, encodes the first three amino acids of the human insulin protein).
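The codon-by-codon reading described above can be illustrated with a short sketch. The table below is a deliberately tiny subset of the standard genetic code (only the three codons in the example sequence), written in DNA coding-strand letters (T in place of the U found in mRNA). The names `CODON_TABLE` and `translate` are illustrative and not taken from the article.

```python
# Minimal subset of the standard genetic code: only the codons needed here.
# The full code has 64 codons; this table is an illustration, not a complete one.
CODON_TABLE = {"ATG": "Met", "GCC": "Ala", "CTG": "Leu"}

def translate(dna):
    """Read a coding-strand DNA sequence three bases at a time."""
    usable = len(dna) - len(dna) % 3          # ignore a trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return [CODON_TABLE[c] for c in codons]   # KeyError for codons not in the subset

peptide = translate("ATGGCCCTG")
print("-".join(peptide))
```

Each group of three nucleotides maps to exactly one amino acid, which is the sense in which a codon "uniquely specifies" its residue.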

This basic description, the ‘central dogma of molecular biology’ (figure 1a), is not the entire story however. Every cell in the human body carries the same complement of genes, yet a heart cell and a brain cell are made up of very different proteins. Even in a single-celled organism such as a bacterium, different proteins are expressed at different times. The bacterium Escherichia coli is able to assemble flagella when it needs to swim, and pili when it needs to anchor itself to a surface; it will produce a metabolic enzyme only when its substrate is present, and synthesize DNA repair proteins only when subject to shock. In short, genes can be turned on and off.

This simple but powerful idea was first proposed by Jacques Monod in the 1940s [1], and the framework he constructed remains essentially unchallenged to this day. The expression of genes is a tightly regulated process [2, ch. 7]. Central to this process is a control element known as a promoter: a short stretch of DNA that precedes every gene. The promoter contains a binding site for the RNA polymerase, the protein complex responsible for transcription. Correspondingly, mRNAs contain binding sites for ribosomes, the protein complexes responsible for translation. The rate of transcription at a promoter can be increased or decreased by proteins known as transcription factors that bind DNA in the vicinity of the promoter. In prokaryotes, transcription factors typically bind within a few tens of bases of the promoter, whereas in eukaryotes, long-distance interactions between transcription factors and the RNA polymerase can extend over megabases. Eukaryotes also have additional ‘epigenetic’ mechanisms to regulate transcription, via covalent modifications of the histone proteins on which DNA is wrapped, or modifications of the DNA itself. Once an mRNA molecule is transcribed, its rate of translation can be regulated by proteins that influence the capacity of ribosomes to bind ribosome binding sites, or by protein complexes that degrade specific mRNAs. Additionally, it has become clear that a significant fraction of transcribed RNAs do not encode proteins; rather, many of these non-coding RNAs can themselves regulate the translation of mRNAs to proteins, via the sophisticated machinery of RNA interference [3]. Taking all these effects into account, the central dogma must be modified with a few additional arrows (figure 1b).

These new arrows are loaded with implications: they permit us to assemble complex networks of transcriptional and regulatory interactions. Gene A can activate gene B and gene C, but repress gene D, and so on. There is a compelling case to be made for the existence of such networks in living cells. Consider that a bacterial genome contains about 4000 genes, whereas the human genome contains about 25 000 genes—a surprisingly modest difference at first glance, given that the human body is made up of more than 200 cell types, not to mention higher degrees of organization required to specify a complex tissue such as the brain. A deeper analysis suggests that gene number is not the correct measure of complexity: the properties of a cell are specified by the proteins contained within it; the range of possible cell types is therefore determined by the range of possible combinations of expressed genes, and grows exponentially with gene number. How are all such combinations to be accessed, however? We know that distinct external signals can drive cells to differentiate into distinct types. However, such signals do not directly interact with individual genes, turning them on or off. Once the differentiation process is triggered, various combinations of gene expression must arise through the intrinsic behaviour of the genes themselves. That is, there must be a network of genetic interactions which, based on very few external regulatory cues, is able to produce the correct expression patterns. The manifest complexity of cellular behaviour strongly implies the existence of complex regulatory networks within.

In recent times, we have been able to resolve network architecture in unprecedented detail using high-throughput biochemical experiments, or by inference from gene expression and gene knockout data [4–8]. For certain well-studied organisms such as Escherichia coli and the yeast Saccharomyces cerevisiae, there is a growing body of detailed information regarding transcriptional and regulatory interactions [9–12]. When these data are combined, what emerges is a picture of highly structured networks with rich topologies [13], containing recurring motifs or patterns [14,15], very different from randomly connected sets of genes. Just as individual proteins have been selected for function, entire networks seem to be similarly selected. So here is what one might call the central idea of network biology: that the complex behaviour of living cells must be understood as emerging not just from the properties of individual genes, but from the manner in which they are connected.

2.  The control of gene expression

For the purposes of this exposition, we focus on prokaryotic gene regulation via promoters. A promoter is a loosely defined object. We can take it to signify a stretch of DNA, upstream of every gene, which controls whether that gene is expressed or not. The properties of a promoter, like those of a gene, are determined by its DNA sequence. A survey of bacterial promoters reveals a conserved pattern of nucleotides, all variations of a particular consensus sequence. The most conserved regions are two short stretches situated −35 and −10 nucleotides from the site at which transcription begins [2, ch. 7]. These regions are thought to provide the binding site that is specifically recognized by the RNA polymerase protein (figure 2a).

There are in fact numerous proteins that, like the polymerase, are able to recognize and bind specific nucleotide sequences. Their binding sites are typically between six and 20 base pairs in length. Binding is mediated by physical interactions between residues on the protein and on the DNA molecule. Given the structure of a protein we should, in principle, be able to calculate its interaction energy with a particular DNA sequence. The result of such a calculation would be the ‘DNA-binding code’. The search for such a code is an active area of research [16–18], but for the time being we can rely on experimental measurements of binding affinities [7,8]. Various classes of DNA-binding proteins are known, grouped according to the structure of their DNA recognition domains. These proteins are often modular, having one domain that binds DNA, and another that is responsible for regulatory interactions. Once bound to DNA, a protein can recruit other proteins to its vicinity, or can prevent them from binding. In particular, a DNA-binding protein can interact with and influence the binding and transcriptional activity of the RNA polymerase. Such molecules are known as gene regulatory proteins or transcription factors. They can be classified as activators (which increase the rate of polymerase binding) or repressors (which prevent the polymerase from binding or block it from transcribing). A given protein might activate or repress transcription depending on the relative position of its binding sequence to that of the RNA polymerase.

The activity of a transcription factor can itself be modulated by the binding of small molecules or by covalent modification [2, chs 7 and 15]. For example, the E. coli lac repressor, which blocks transcription at the lac operon, contains binding sites for a sugar called allolactose; when the repressor is bound to allolactose, it is unable to bind DNA, and therefore unable to repress transcription. This type of modulation is a key mechanism by which external signals can regulate gene expression. Many small molecules in the environment can diffuse across the bacterial cell membrane to directly influence intracellular transcription factors. Other types of signalling molecules can bind the extracellular domains of transmembrane proteins known as receptors; this causes a conformational change in the receptor’s intracellular domain, which can drive the subsequent activation or inhibition of transcription factors by phosphorylation. For example, a large number of bacterial ‘two-component systems’, consisting of a membrane-bound sensor and intracellular transcriptional regulator, operate on this principle. As we show later, these types of regulatory inputs influence intracellular network dynamics, allowing cells to sense environmental conditions and respond appropriately.

We can calculate the expression level at a particular promoter from a biophysical model that incorporates the microscopic details just mentioned, using an approach pioneered by Shea & Ackers [19] in their study of the OR control system of bacteriophage λ. To do this, we first list all possible promoter configurations (the combinations in which the promoter binds various regulatory proteins or the RNA polymerase); and we specify the relative free energies of each of these states. Once this information is given, there is a well-defined thermodynamic prescription for calculating system properties. Consider a DNA region D that can bind a set of proteins Xi (i=1,…,n), each with multiplicity mi. Let the cytoplasmic protein concentrations be [Xi]. This binding event can be represented as

[Formula ID: RSTA20110548M2x1]
For simplicity in the discussion that follows, this representation clubs together what are in fact several independent binding events, and includes effective rate constants k+ and k− for this clubbed reaction. Indeed, there might be several configurations of the bound state: other combinations in which the DNA can bind these proteins. Let sj represent these various states (including the one in which the DNA is bare). The probability of occurrence of each state in thermodynamic equilibrium is then [19]
[Formula ID: RSTA20110548M2x2]
where k is Boltzmann’s constant, and T is the absolute temperature. The term ΔF is the standard free energy of the given configuration, describing the energetics of interaction between the molecules; for example, bonds between DNA and protein residues can stabilize binding by making ΔF more negative. The concentration terms arise owing to entropy or counting: the higher the concentration of a certain protein, the more ways in which one can pick a single molecule to bind the DNA.
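The thermodynamic prescription described here can be sketched numerically. The following is a minimal illustration that assumes the standard Shea and Ackers form, in which each state is weighted by a Boltzmann factor times the product of concentrations raised to their multiplicities; it is not a transcription of the article's own equation, which is not reproduced in this text. Free energies are given in units of kT, concentrations are taken relative to an assumed reference concentration, and the function name `state_probabilities` is hypothetical.

```python
import math

def state_probabilities(states, conc):
    """Equilibrium probability of each promoter state.

    states: list of (delta_F_in_kT, {protein_name: multiplicity}) tuples;
            the bare state is (0.0, {}).
    conc:   {protein_name: concentration}, in units of an assumed reference
            concentration.
    """
    weights = []
    for delta_f, occupancy in states:
        w = math.exp(-delta_f)                 # energetic (Boltzmann) factor
        for protein, m in occupancy.items():
            w *= conc[protein] ** m            # entropic (counting) factor
        weights.append(w)
    z = sum(weights)                           # partition function
    return [w / z for w in weights]

# Bare DNA versus DNA bound by a single copy of X with free energy -2 kT.
probs = state_probabilities([(0.0, {}), (-2.0, {"X": 1})], {"X": 1.0})
print(probs)
```

Making the bound-state free energy more negative, or raising the concentration of X, both shift probability towards the bound state, matching the energetic and entropic roles described in the text.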

We can give this result a kinetic interpretation, under the simplifying assumption of a clubbed multi-protein reaction. The probability that m1 molecules of X1 enter the reaction volume will be proportional to [X1]^m1. More generally, the probability per unit time that the reaction (2.1) occurs from left to right (P+) or right to left (P−) is

[Formula ID: RSTA20110548M2x3]
If these were the only possible reactions, then in equilibrium we would have P+ = P−, giving
[Formula ID: RSTA20110548M2x4]
where K is the equilibrium constant. This result is usually presented as the principle of mass action. The concentration of a given promoter state is the total DNA concentration multiplied by the probability of occurrence of that state. If we agree to measure all free energies as differences from that of the bare configuration, a comparison of (2.2) and (2.4) shows
[Formula ID: RSTA20110548M2x5]
That is, the values of the reaction rate constants are constrained by free-energy differences: their ratio must be consistent with the equilibrium prediction. There is in fact a much more basic constraint on the kinetic constants. Imagine that the DNA is involved in several complexes. In that case the condition P+ = P−, while sufficient to ensure time-invariance of probabilities, is certainly not necessary. It could be that the depletion of a certain species through one reaction is compensated for, not by the reverse reaction, but by a separate creation pathway. However, detailed balance asserts that in equilibrium such solutions are not acceptable: all forward reactions must be balanced by the corresponding reverse reactions. This fact is not at all evident from a reaction-kinetic formulation. While it will be convenient to work within the kinetic framework of rate constants, we must always bear in mind the constraints imposed by equilibrium considerations.

We can now use these general results to study a few relevant examples, where we now explicitly treat multi-step reactions. Consider a DNA region D to which the protein X can bind. For convenience, let us measure energy in units of kT, and let the free energy of the bare DNA be zero. Suppose the free energy of state DX is εX (figure 2b). The probability that the DNA is bare is given by

[Formula ID: RSTA20110548M2x6]
The concentration of bare DNA is a hyperbolic function of the protein concentration, reaching half-saturation at a value [X]=1/K (figure 2c).

Suppose now that the DNA region D represents a promoter, and that the protein X is a repressor, which acts to prevent transcription by the polymerase P. Let the free energy of the state DP be εP. If the two proteins X and P bind independently, then the free energy of the doubly bound state DXP will be the sum of the individual binding energies. (If the independent-binding assumption is not valid, the energy of the state DXP must be provided as an additional parameter.) The energies of the various bound states in this scenario are indicated in figure 3a. The only state from which transcription can proceed is the state DP. Applying the equilibrium prescription, we find that this state occurs with probability

[Formula ID: RSTA20110548M2x7]
where we have explicitly factorized the expression. This factorization is possible precisely because the proteins X and P bind independently, so the probability that state DP occurs is the probability that P is bound multiplied by the probability that X is not bound (the latter being given by (2.6)). It is instructive to see how the derivation might proceed from the kinetic framework. Applying detailed balance, we can find two expressions for the concentration of the doubly bound state, corresponding to the upper and lower binding paths:
[Formula ID: RSTA20110548M2x8]
The four dissociation constants cannot, therefore, be independently specified. (Note also that, by the independent binding property, K1 = K4 = e^(εX) and K2 = K3 = e^(εP).)
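The factorization argument for the repressor and polymerase can be checked numerically. The sketch below enumerates the four promoter states (bare, DX, DP, DXP) under the independent-binding assumption, with energies in units of kT and concentrations relative to an assumed reference concentration. The function names are illustrative, and the weights are written in the generic Boltzmann form rather than copied from the article's equation.

```python
import math

def p_transcribing(x, p, eps_x, eps_p):
    """Probability of state DP from the full four-state partition function."""
    w_x = x * math.exp(-eps_x)        # weight of DX
    w_p = p * math.exp(-eps_p)        # weight of DP
    w_xp = w_x * w_p                  # DXP: energies add when binding is independent
    z = 1.0 + w_x + w_p + w_xp        # bare state has weight 1
    return w_p / z

def p_factorized(x, p, eps_x, eps_p):
    """(probability P is bound) x (probability X is not bound)."""
    w_x = x * math.exp(-eps_x)
    w_p = p * math.exp(-eps_p)
    return (w_p / (1.0 + w_p)) * (1.0 / (1.0 + w_x))

print(p_transcribing(2.0, 0.5, -1.0, -3.0), p_factorized(2.0, 0.5, -1.0, -3.0))
```

The two expressions agree because the partition function itself factorizes as (1 + w_x)(1 + w_p); adding an interaction energy to the DXP state would break this equality.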

In many instances, transcription factors bind to multiple sites. Suppose the promoter in question contains two sites, A and B, to which X can bind in any order (figure 3b). Let the free energies of the two singly bound states be εA and εB, and that of the doubly bound state be εAB=εA+εB+ΔεAB. These assumptions correspond to the most general situation, of which the following are special cases: if the two sites are identical, then εA=εB; if X binds independently to these sites, then ΔεAB=0. The energy term ΔεAB corresponds to some interaction between the two bound copies of X. If the binding of a single molecule makes it more favourable for another to bind, a condition referred to as positive cooperativity, then ΔεAB<0. Conversely, in a situation of negative cooperativity, ΔεAB>0, and the binding of one molecule interferes with the ability of the other to bind. Positive cooperativity is the norm among transcription factors that act multiply. Let us see what effect this will have. Assume, for simplicity, that |εA|∼|εB|≪|ΔεAB|. In the kinetic framework, this corresponds to K1 ≫ K2, and K3 ≫ K4, with the detailed balance condition again as shown in (2.8). We find

[Formula ID: RSTA20110548M2x9]
where the inequalities are obtained by noticing that the concentration of any DNA configuration must be less than that of the total amount of DNA available. This shows that [DXA]≪[Dtot], and similarly, [DXB]≪[Dtot]: the singly bound configurations form a negligible fraction of the population. No sooner has one molecule of X bound DNA, than the second also binds. Therefore, the probability that the DNA is bare is given by
[Formula ID: RSTA20110548M2x10]
The cooperativity of binding gives rise to the quadratic term in the denominator. The binding curve is sigmoidal, meaning that it has an inflection point at [X]=1/K1K2 (figure 2c). In the literature, as a first approximation, binding probabilities are often parametrized as Hill equations
[Formula ID: RSTA20110548M2x11]
with n being the Hill coefficient (a measure of cooperativity) and [X0] being the half-saturation concentration. The hyperbola (2.6) (with n=1) and the sigmoid (2.10) (with n=2) can both be parametrized in this way (figure 2c). These parameters, among many others, are required to provide a detailed biochemical description of any genetic network.
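To see how a Hill form with n=2 arises from strongly cooperative two-site binding, one can compare the exact four-state probability of bare DNA with the Hill approximation. This is a sketch under stated assumptions: energies are in units of kT, the Hill expression is taken in the common form 1/(1+([X]/[X0])^n) for the unbound probability, and the parameter values are arbitrary choices for illustration.

```python
import math

def p_bare_two_site(x, eps_a, eps_b, d_eps):
    """Exact probability of bare DNA for two sites with interaction energy d_eps."""
    w_a = x * math.exp(-eps_a)                            # X bound at A only
    w_b = x * math.exp(-eps_b)                            # X bound at B only
    w_ab = x * x * math.exp(-(eps_a + eps_b + d_eps))     # both sites bound
    return 1.0 / (1.0 + w_a + w_b + w_ab)

def hill_unbound(x, x0, n):
    """Hill-type parametrization of the unbound probability (assumed form)."""
    return 1.0 / (1.0 + (x / x0) ** n)

eps_a = eps_b = 0.0
d_eps = -10.0                                             # strong positive cooperativity
x0 = math.exp(0.5 * (eps_a + eps_b + d_eps))              # half-saturation if singly bound states are negligible

for x in (0.2 * x0, x0, 5.0 * x0):
    print(x, p_bare_two_site(x, eps_a, eps_b, d_eps), hill_unbound(x, x0, 2))
```

With a strongly negative interaction energy the singly bound weights are small compared with the bare and doubly bound weights, so the exact curve and the n=2 Hill curve nearly coincide; weakening the cooperativity makes the two diverge.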

3.  Genetic networks
(a)  The network equation

Single genes are often regulated by multiple transcription factors that interact with one another. A classic example is the lac operon, which is regulated by both a repressor and an activator [20]. In eukaryotes, a single gene could be regulated by dozens of proteins. It is a remarkable fact that, using only thermodynamic constraints of the type we have considered, a promoter can be made to perform a variety of mathematical operations on its regulatory inputs. Specifically, the probability of occurrence of the transcriptionally active promoter configuration can be a complicated function of the concentration of various transcription factors [21–26]. These concentrations can themselves change over time owing to regulation of the genes encoding the transcription factors. If we wish to understand the behaviour of the system, we must therefore consider the regulatory network as a whole. We now try to arrive at a general mathematical description of such networks.

The rate of protein creation per promoter, α, is a product of the following terms: the probability that the promoter is transcriptionally active, the rate at which transcription proceeds irreversibly from the active state and the number of proteins translated per resulting transcript. Consider a cell that contains nP copies of a gene encoding protein Xi. If the protein once created does not degrade, then the number of protein molecules ni will obey

[Formula ID: RSTA20110548M3x1]
If the cell volume is V, then the protein concentration xi=[Xi] evolves as
[Formula ID: RSTA20110548M3x2]
where the negative term arises owing to dilution.

Immediately after division, a bacterial cell contains a chromosome that has already begun to replicate. Depending on its position relative to the DNA replication origin, either one or two copies of each gene will be present at this stage. Every gene will be replicated once more before the cell is ready to divide again. The term nP/V can therefore vary by as much as a factor of two over the cell cycle. We will usually ignore this variation, assuming the promoter concentration to be constant, and absorbing it into the quantity αi. We will also assume that cell volume grows exponentially, so V(t) ∝ e^(γt). The growth rate γ is related to the cell doubling time TD as γ = ln 2/TD. If the protein is subject to degradation in a first-order reaction, the rate constant of that reaction must be added to the dilution rate γ to give the net decay rate γi. Protein degradation and dilution might themselves depend on the concentrations of some subset of proteins present in the system [27]. Finally, we have seen that the expression rate αi can also depend on other protein concentrations. Taken together, these assumptions give

[Formula ID: RSTA20110548M3x3]
In many instances, network topology can be specified by sparse matrices of the form shown below, where only a few direct interactions generate non-zero matrix entries:
[Formula ID: RSTA20110548M3x4]
This apparently simple system of equations describes a typical genetic network. Of course, all the complex biochemistry is hidden within the functions α() and γ().
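A network of this kind can be explored numerically once the functions α() and γ() are chosen. The sketch below assumes the network equation has the form dx_i/dt = α_i(x) − γ_i(x) x_i, as described in the surrounding text, and integrates it with a simple forward-Euler step. The two-gene cascade, the Hill activation, the doubling time and the names `simulate`, `alpha` and `gamma` are all illustrative assumptions, not taken from the article.

```python
import math

def simulate(alpha, gamma, x0, dt=0.01, steps=5000):
    """Forward-Euler integration of dx_i/dt = alpha_i(x) - gamma_i(x) * x_i."""
    x = list(x0)
    for _ in range(steps):
        a, g = alpha(x), gamma(x)
        x = [xi + dt * (ai - gi * xi) for xi, ai, gi in zip(x, a, g)]
    return x

T_D = 1.0                        # assumed doubling time (arbitrary units)
GAMMA = math.log(2) / T_D        # dilution rate for exponential volume growth

def alpha(x):
    # Gene 0 is unregulated; gene 1 is activated by protein 0 (Hill form, n = 2).
    return [1.0, x[0] ** 2 / (1.0 + x[0] ** 2)]

def gamma(x):
    # Both proteins are stable, so decay is by dilution alone.
    return [GAMMA, GAMMA]

x_final = simulate(alpha, gamma, [0.0, 0.0])
print(x_final)
```

The topology enters only through which components of `x` each entry of `alpha` reads; here the interaction structure is a single arrow from gene 0 to gene 1. For stiff or larger systems an adaptive integrator would be a more robust choice than fixed-step Euler.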

(b)  The network equation as an extension of Boolean threshold models

Equations of the general form (3.3) were first extensively studied by computational neuroscientists in their attempts to model neural networks [28]. In the neural context, the quantity xi is the activity of a single neuron, and the function α() couples neurons to one another across synapses. The neural activity is a continuous variable, changing continuously over time, analogous to the expression level of a gene. Early models described neurons as binary units, which could perform thresholding operations (the so-called perceptrons [29]). In these models, xi is 0 or 1, and neural activity is updated discretely according to the inputs received:

[Formula ID: RSTA20110548M3x5]
Here, Θ(s) is a step function, equal to 1 if s≥0, and 0 if s<0. The weight matrix wij describes the strength of the interaction between input neuron j and output neuron i. If the weighted input to neuron i crosses the threshold μi, then the neuron is activated.
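The binary threshold rule can be written in a few lines. The sketch assumes the update has the form x_i ← Θ(Σ_j w_ij x_j − μ_i), consistent with the description of the weights and thresholds above; the two-unit mutual-inhibition example and its parameter values are hypothetical.

```python
def step(x, w, mu):
    """One synchronous update: x_i <- Theta(sum_j w[i][j] * x[j] - mu[i])."""
    return [
        1 if sum(wij * xj for wij, xj in zip(row, x)) - m >= 0 else 0
        for row, m in zip(w, mu)
    ]

# Two units that inhibit one another (illustrative weights and thresholds).
w = [[0, -1],
     [-1, 0]]
mu = [-0.5, -0.5]

for state in ([1, 0], [0, 1], [0, 0], [1, 1]):
    print(state, "->", step(state, w, mu))
```

In this example the states [1, 0] and [0, 1] map to themselves, whereas [0, 0] and [1, 1] map to each other under synchronous updating. That alternation depends on updating both units at once, which is one reason to consider asynchronous update rules.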

Starting with this binary description, we can generalize the model in many different ways. First, the synchronous update rule (‘=’) described earlier could be changed to an asynchronous update rule (‘:=’), selecting a random unit to update at each time step. Second, we could convert the binary activity variable to a continuous variable. In order to do this, we would need to select an appropriate function α() to describe how the neuron responds to its inputs. Typically, α is chosen to be a sigmoidal or threshold-like function, to which the step function is an approximation. This gives

[Formula ID: RSTA20110548M3x6]
The dynamical variable is now continuous, but the model still operates in discrete time steps. Essentially, the neurons are assumed to adopt their new activities instantly upon update. Of course, the change of activity might occur gradually, with different neurons relaxing towards the steady state prescribed by (3.6) at different rates γi:
[Formula ID: RSTA20110548M3x7]
We thus arrive at an equation of the form (3.3). Note, however, that the function α() has a very special form, thresholding a weighted sum of inputs, an approximate phenomenological description of neural behaviour.

Moving back to genetic systems, how much can we learn by analogy with neural or electronic networks? It turns out that, when groups of genes are collected into a network, the resulting architecture is markedly different from that of the generic electronic circuit to which it is often compared. In the electronic case, large numbers of simple nodes are connected in complex ways. In the genetic case, the network is likely to be much more shallow, with each node, a promoter, executing more complex operations [14,21]. A single promoter is capable of responding in intricate ways to its inputs, and indeed, it is becoming clear that real single neurons might themselves be capable of sophisticated computations [30]. The simplicity and uniformity of electronic nodes have allowed us to model large electronic circuits very effectively. It is likely that there will never be an equivalent standard framework for the study of genetic systems—too much depends on the unique characteristics of each gene or protein. This is the biochemical complexity that makes the analysis of genetic networks challenging. Nevertheless, as we discuss in §4, topology proves to be a surprisingly useful determinant of network properties.

4.  The emergent properties of networks
(a)  A biological wish-list

Imagine that we need to design a regulatory system to orchestrate one of the most intricate of all known biological processes, the development of a living embryo [31]. What are some of the tasks that need to be carried out, and some of the problems we might encounter along the way? We start with a fertilized egg that has undergone repeated divisions, thus producing a set of undifferentiated cells. Very soon, this embryo will begin to respond to maternal cues, in the form of spatial gradients of signalling molecules called morphogens, causing cells in different positions to express different sets of genes. Gene expression levels will need to vary significantly, as we move across segment boundaries: small changes in the levels of a signalling molecule must be amplified to produce large changes in expression. New transcription factors will be synthesized, triggering a subsequent round of gene expression. Cells will need to respond rapidly to these changes. At this stage, small errors in expression patterns must be avoided, as they would lead to larger and possibly lethal errors in downstream processes. The morphogen signals will eventually start to die away; the cells must nevertheless retain some memory of these signals, remaining firmly committed to their different fates. Developmental processes in different parts of the embryo will need to be synchronized: protein levels will need to oscillate periodically in time. And the list goes on.

The surprising fact is that each of the tasks on our wish-list can be achieved by small networks of interacting genes (figure 4) [32,33]. In §4b, we survey a few simple networks that are able to generate, in principle, these various biologically desirable outcomes. Over the past decade, systems such as those discussed here have been explored experimentally by synthetic biologists [34–36]: negative feedback for noise reduction [37,38]; positive feedback and the flip–flop for bistability [20,39–41]; and hysteretic and ring oscillators [26,42–44].

(b)  The dynamics of simple network topologies

Amplification by cooperative activation: consider a gene that encodes a protein Y and is regulated by an activator X (figure 5a). Cooperative interactions can result in a Hill-type dependence of the gene expression level on the activator concentration. Setting x=[X] and y=[Y],

[Formula ID: RSTA20110548M4x1]
where for notational simplicity, x is measured in units of the half-saturation concentration (compare with (2.11)) and time is measured in units such that the decay rate of y is unity (compare with (3.3)). The value of the steady-state output, y, can depend sensitively on that of the input, x:
[Formula ID: RSTA20110548M4x2]
At high or low values of x, the steady-state value of y is close to either A or zero and is insensitive to changes in the input. However, near the threshold x=1, a certain fractional change in x is amplified to produce an n/2-fold greater fractional change in y: differential input signals will be amplified.
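The n/2 amplification factor can be verified numerically as the logarithmic sensitivity d(ln y)/d(ln x) of the steady-state output at the threshold. The sketch assumes a steady-state response of the Hill form y = A x^n/(1 + x^n), with x in units of the half-saturation concentration; the function names and the step size `h` are illustrative choices.

```python
import math

def steady_output(x, A=1.0, n=4):
    """Assumed steady-state response: Hill activation with coefficient n."""
    return A * x ** n / (1.0 + x ** n)

def log_gain(x, n=4, h=1e-5):
    """Central-difference estimate of d(ln y)/d(ln x)."""
    up = math.log(steady_output(x * math.exp(h), n=n))
    down = math.log(steady_output(x * math.exp(-h), n=n))
    return (up - down) / (2.0 * h)

for n in (1, 2, 4):
    print(n, log_gain(1.0, n=n))   # compare with n / 2
```

For this form the sensitivity is n/(1 + x^n), which equals n/2 at x=1 and falls towards zero at saturating input, so only signals near the threshold are amplified.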

Rapid equilibration and noise reduction by negative feedback: consider what happens when a gene negatively regulates its own expression (figure 5b). Assume that the protein is a repressor that behaves as shown in (2.6):

[Formula ID: RSTA20110548M4x3]
The steady state of the system corresponds to that concentration x at which the rate of creation f(x) and the rate of destruction g(x) balance one another. We see from figure 6a that the negative-feedback system settles into a steady state intermediate between 0 and A (something that cannot be captured in a pure binary description). If the expression level of the system is transiently increased above this steady state, the resulting drop in the creation rate quickly restores equilibrium. In fact, the auto-repressed system equilibrates more rapidly than an unregulated system with the same steady state, as shown in figure 6a; this has the effect of suppressing stochastic fluctuations [45].
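The faster equilibration of the auto-repressed gene can be illustrated by timing the rise to half of the steady state. The sketch assumes a creation rate f(x) = A/(1 + x) and a destruction rate g(x) = x in rescaled units, which is one reading of the repressor model referred to above, and compares it with an unregulated gene whose constant creation rate is chosen to give the same steady state. The value A = 100 and the name `time_to_half` are illustrative.

```python
import math

def time_to_half(rate, x_star, dt=1e-4):
    """Time for x(t), starting from 0, to reach half of the steady state x_star."""
    x, t = 0.0, 0.0
    while x < 0.5 * x_star:
        x += dt * (rate(x) - x)    # creation minus first-order decay (rate 1)
        t += dt
    return t

A = 100.0
x_star = (-1.0 + math.sqrt(1.0 + 4.0 * A)) / 2.0    # solves A / (1 + x) = x

t_auto = time_to_half(lambda x: A / (1.0 + x), x_star)   # negative feedback
t_open = time_to_half(lambda x: x_star, x_star)          # unregulated, same steady state

print(t_auto, t_open)
```

The unregulated gene approaches its steady state exponentially and takes a time of ln 2 to reach the halfway point, whereas the auto-repressed gene starts with a much higher creation rate that is throttled as the protein accumulates, so it gets there sooner.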

Memory and bistability by positive feedback: we next allow the gene to positively regulate its own expression (figure 5b). This can be achieved by closing the loop in (4.1):

$$\frac{dx}{dt} = \frac{A\,x^{n}}{1+x^{n}} - x. \qquad (4.4)$$
We see from the binary model that this system can have multiple steady states: a gene that is active will sustain its own expression, whereas one that is inactive will never become activated (figure 5b). In the continuous model, this would correspond to having multiple values of x at which the rates of creation and destruction balance one another. For hyperbolic activation (n=1), we find just one stable expression state. However, for sigmoidal activation (n>1), the system can have two stable states, separated by an unstable state that forms a threshold (figure 6b). Trajectories that begin above this threshold are driven to the high state, whereas those that begin below the threshold are driven to the low state. The behaviour of the system therefore depends on its history, a phenomenon known as hysteresis. Suppose that we begin with a group of cells in the low expression state, then fully induce expression in some of these cells by means of an external signal such as a morphogen. Even once this signal is removed, the induced cells will maintain their high-expression levels. The positive-feedback network thus forms the basis for cellular memory, allowing cells of identical genotype to achieve different phenotypes depending on the external signals received.
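The threshold behaviour can be illustrated with a few lines of code. This is a minimal sketch assuming the closed-loop form dx/dt = A x^n/(1+x^n) − x with A=2, which matches the rates given in the caption of figure 6b; the integrator, step size and initial conditions are illustrative choices.

```python
def settle(x0, n, A=2.0, dt=1e-3, t_end=60.0):
    """Forward-Euler integration of dx/dt = A*x**n/(1 + x**n) - x from x(0) = x0.
    Returns the value reached at t_end."""
    x = x0
    for _ in range(int(t_end / dt)):
        x += dt * (A * x**n / (1.0 + x**n) - x)
    return x

# Sigmoidal activation (n = 4): the unstable fixed point x = 1 acts as a threshold.
low = settle(0.9, n=4)     # starts just below the threshold
high = settle(1.1, n=4)    # starts just above the threshold

# Hyperbolic activation (n = 1): both starting points reach the same state.
hyper_a = settle(0.1, n=1)
hyper_b = settle(3.0, n=1)
print(low, high, hyper_a, hyper_b)
```

For n=4 the two trajectories end in different states even though they start close together, which is the history dependence described above; for n=1 the final state does not depend on the starting point.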

Memory and bistability with a flip–flop: a pair of genes that repress one another is similar to a single gene that activates itself (figure 5c). In the context of electronics, such systems are known as flip–flops. The binary version of this system is capable of maintaining two distinct internal states: if we choose one gene to be active, then the other must be inactive. In terms of concentrations,

[Formula ID: RSTA20110548M4x5]
To understand system dynamics, it is useful to examine the curves u(x,y)=0, along which dx/dt=0, and v(x,y)=0, along which dy/dt=0. The fixed points or steady states of the system occur where these curves, known as nullclines, intersect. Once again, we must ask of each fixed point whether it is stable or unstable. In this case, a graphical analysis shows that, for n=1, the system has a single stable fixed point along the diagonal x=y (figure 6c). For n>1, this symmetric fixed point becomes unstable, and two asymmetric stable fixed points are created, one corresponding to high x-expression, and the other to high y-expression (figure 6d). As in the case of the positive feedback network, the flip–flop provides a mechanism for cellular memory.
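The loss of stability of the symmetric state can be reproduced numerically. The sketch below assumes a symmetric mutual-repression model, dx/dt = A/(1+y^n) − x and dy/dt = A/(1+x^n) − y, with A=5 as in figure 6c,d; this functional form is an assumption chosen to be consistent with the description above, and the integrator and initial conditions are illustrative.

```python
def flip_flop(x0, y0, n, A=5.0, dt=1e-3, t_end=50.0):
    """Forward-Euler integration of two mutually repressing genes.
    Assumed form: dx/dt = A/(1 + y**n) - x, dy/dt = A/(1 + x**n) - y."""
    x, y = x0, y0
    for _ in range(int(t_end / dt)):
        dx = A / (1.0 + y**n) - x
        dy = A / (1.0 + x**n) - y
        x, y = x + dt * dx, y + dt * dy
    return x, y

# n = 1: both initial conditions converge to the same point on the diagonal.
sym_a = flip_flop(4.0, 0.5, n=1)
sym_b = flip_flop(0.5, 4.0, n=1)

# n = 4: the final state depends on which gene starts ahead.
x_wins = flip_flop(2.0, 1.0, n=4)
y_wins = flip_flop(1.0, 2.0, n=4)
print(sym_a, sym_b, x_wins, y_wins)
```

Because the assumed equations are symmetric under exchange of x and y, the diagonal x=y separates the two basins of attraction when n=4.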

Hysteretic oscillator: we again look at a system of two genes, but now one of them is an activator, while the other is a repressor (figure 5d). In a sense, this is an extended version of a negative feedback circuit we saw previously, and the binary model predicts that it should oscillate. Importantly, because the feedback now comes with a delay, oscillations can be shown to occur in the corresponding continuous system as well. Consider the following activator–repressor pair:

[Formula ID: RSTA20110548M4x6]
The nullclines intersect at a single fixed point, and the flows suggest oscillatory behaviour. If x is slow to respond to changes in y, this fixed point is stable and any oscillations are damped (figure 7a). However, if x responds sufficiently rapidly, the fixed point becomes unstable, and the system enters a sustained limit-cycle oscillation (figure 7b). Hysteretic oscillators of this kind are known to form the molecular basis for circadian rhythms and other types of periodic phenomena in living cells [46].

Ring oscillator: finally, let us consider a system with three genes, each repressing the next in sequence (figure 5e). The binary system is clearly oscillatory. The continuous analogue may be specified as

[Formula ID: RSTA20110548M4x7]
where i=0 is identified with i=3. The system has a symmetric fixed point xi=x0. For sufficiently high n, this fixed point can become unstable, forcing the system into a limit-cycle oscillation (figure 7c).
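A numerical sketch of the ring oscillator follows. It assumes the form dx_i/dt = A/(1+x_{i−1}^n) − x_i with A=4 and n=4 (the values quoted in the caption of figure 7c); the functional form itself is an assumption. The stability check uses a standard linearization that is worked out here rather than taken from the text: if β is the magnitude of the slope of the repression function at the symmetric fixed point, the Jacobian eigenvalues are −1−βω with ω a cube root of unity, so the fixed point is unstable when β>2.

```python
def symmetric_fixed_point(A, n):
    """Solve x*(1 + x**n) = A by bisection on [0, A]."""
    lo, hi = 0.0, A
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid * (1.0 + mid**n) < A:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def repression_slope(A, n):
    """Magnitude of d/dx [A/(1 + x**n)] at the symmetric fixed point."""
    x = symmetric_fixed_point(A, n)
    return A * n * x**(n - 1) / (1.0 + x**n)**2

def simulate_ring(x0, A=4.0, n=4, dt=0.01, t_end=200.0):
    """Forward-Euler integration of the assumed three-gene ring.
    Gene i is repressed by gene i-1; index -1 wraps to the last gene."""
    x, trace = list(x0), []
    for _ in range(int(t_end / dt)):
        x = [xi + dt * (A / (1.0 + x[i - 1]**n) - xi) for i, xi in enumerate(x)]
        trace.append(x[0])
    return trace

trace = simulate_ring([1.0, 1.5, 2.0])
late = trace[len(trace) // 2:]           # discard the initial transient
amplitude = max(late) - min(late)        # stays finite if oscillations persist
print(repression_slope(4.0, 4), repression_slope(4.0, 1), amplitude)
```

Under these assumptions the slope exceeds 2 for n=4 but not for n=1, in line with the statement that a sufficiently high n is needed for sustained oscillations.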

5.  Separating biochemistry from topology
(a)  Estimating biochemical and topological complexity

Suppose we are given N distinct regulatable promoters, each of which has binding sites for up to M distinct transcription factors. In addition, we are given N_ext promoters whose transcriptional outputs can be controlled using extracellular signals. Each promoter can be made to express one or more transcription factors; the same transcription factor might be expressed by multiple promoters, in which case its total level is obtained by summing. We assume that the levels of all transcription factors can be measured. To simplify the discussion, we discretize the system so that all the inputs and outputs can take on any one of the states x∈{0,1,…,Ω−1}, with inputs saturating at the maximal level. Reasonable values of these quantities are N, N_ext ∼ 2–10, M ∼ 2–5 [47] and Ω ∼ 10.

A promoter is specified by defining its response to Ω^M distinct inputs. For each promoter i, let this information be summarized as a function αi(x1,x2,…,xn). The set {αi|i=1,…,N} represents the biochemical specification of the system. Each promoter can realize any one of Ω^(Ω^M) input–output functions, so there are Ω^(N·Ω^M) possible biochemistries (though given the continuous and slowly varying nature of a promoter’s input–output function, the accessible biochemical space will in reality be much smaller than this).

We next turn to topology, which involves specifying which of the N+N_ext promoters is driving each of the M inputs of a given promoter. The M×(N+N_ext) connectivity matrix for promoter i has the form

[Formula ID: RSTA20110548M5x1]
where the indices j and k run over inputs and promoters, respectively, and each entry can take on values 0 or 1. The set {C_i|i=1,…,N} represents the topological specification of the system, and there are approximately 2^(NM(N+N_ext)) possible topologies (ignoring degeneracies). Notice that the biochemical space explodes much more rapidly than the topological space.
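To get a feel for the two growth rates, the counts can be compared on a logarithmic scale. The sketch below assumes that the number of biochemistries is Ω^(N·Ω^M) (each of N promoters maps Ω^M input states to one of Ω outputs) and that the number of topologies is 2^(NM(N+N_ext)); the function names and the example values are hypothetical and chosen from the ranges quoted above.

```python
import math

def log10_biochemistries(N, M, Omega):
    """log10 of Omega**(N * Omega**M): N promoters, each assigning one of
    Omega outputs to every one of its Omega**M input states (assumed count)."""
    return N * Omega**M * math.log10(Omega)

def log10_topologies(N, M, N_ext):
    """log10 of 2**(N*M*(N + N_ext)): one binary entry for every element of
    the N connectivity matrices, each of size M x (N + N_ext)."""
    return N * M * (N + N_ext) * math.log10(2)

# Example values within the ranges quoted in the text (illustrative only).
N, N_ext, M, Omega = 5, 5, 2, 10
print(log10_biochemistries(N, M, Omega), log10_topologies(N, M, N_ext))
```

Working with logarithms avoids constructing the astronomically large integers directly and makes the gap between the two spaces easy to read off.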

Consider a feedback network constructed with some complicated connectivity {C_i}. Such a network will have N_ext external inputs, and therefore can be put into Ω^(N_ext) configurations. How completely can we probe the biochemistry of such a system? To get a rough idea, let us make the following simplifying assumptions: for each external configuration, the feedback system achieves a unique steady state; and as we cycle through configurations, a given promoter cycles through a random sample (with repeats) of its Ω^M possible states. The probability that a given state is missed over Ω^(N_ext) samples is $(1-\Omega^{-M})^{\Omega^{N_{\mathrm{ext}}}} \approx \exp(-\Omega^{N_{\mathrm{ext}}-M})$. Therefore, the expected number of distinct states sampled by each promoter is $\Omega^{M}\,[1-(1-\Omega^{-M})^{\Omega^{N_{\mathrm{ext}}}}]$. The depth of biochemical characterization is essentially a step function: if N_ext<M our sampling is extremely sparse; if N_ext>M we hit nearly all possible states; and with Ω^M samples (N_ext=M) our fractional coverage is (1−1/e).
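The step-like dependence of coverage on the number of external inputs can be checked directly. The sketch below assumes, as in the argument above, that each external configuration places a promoter in one of its Ω^M input states uniformly at random and independently; the function name and the example values are hypothetical.

```python
def expected_coverage(Omega, M, N_ext):
    """Expected fraction of a promoter's Omega**M input states that are visited
    at least once in Omega**N_ext uniformly random samples with replacement."""
    states = Omega**M
    samples = Omega**N_ext
    p_missed = (1.0 - 1.0 / states) ** samples   # chance a given state is never hit
    return 1.0 - p_missed

# Omega = 10 levels and M = 3 inputs per promoter; vary the number of external inputs.
for n_ext in (2, 3, 4):
    print(n_ext, expected_coverage(10, 3, n_ext))
```

Multiplying the returned fraction by Ω^M gives the expected number of distinct states sampled.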

If N_ext ≥ M, we can choose to construct a synthetic genetic network with the trivial feed-forward architecture (as reported in Rai et al. [26]):

[Formula ID: RSTA20110548M5x2]
where 0 is the zero matrix and I is the identity matrix. This allows us to perform a complete biochemical characterization, in which we determine all the functions αi, using exactly Ω^M external configurations. Having done the feed-forward characterization we can, in principle, predict the response of any other topology under all of its Ω^(N_ext) external configurations. An experimental demonstration of this feed-forward-to-feedback predictive procedure was reported by Rai et al. [26]. For N_ext>M, this type of prediction is clearly efficient: a large number of feedback responses can be predicted from a relatively small number of feed-forward measurements. However, in practice, it is often the case that even Ω^M is large in absolute terms, making a complete biochemical characterization unfeasible.

(b)  Case study: bacterial cell-to-cell communication

There are several natural contexts in which bacterial cells in a population stand to benefit by coordinating their actions [48]. Many bacterial species achieve such coordination through chemical communication channels that work on the following principle [49]. Any cell in the population can ‘issue’ a signal using an enzyme designated I; this enzyme generates a molecule known as acyl-homoserine lactone (AHL) that can diffuse freely between cells. Cells ‘receive’ this signal using a transcription factor designated R; when R is bound to AHL it functions as an activator, driving transcription at a promoter henceforth designated pX. The capability of I/R systems to issue and receive signals can have a variety of uses [50]. Because the concentration of AHL in the medium is a readout of the density of cells issuing the signal, one hypothesis is that these systems allow cells to tune their transcriptional response as a function of population density (figure 8)—hence the term ‘quorum sensing’. For example, cells infecting a host can remain quiescent until they reach a critical density, staying hidden from the host’s immune system until they are ready to launch a virulent attack [51]. Topologically, I/R quorum-sensing systems are interesting because they are invariably found in a particular positive-feedback configuration: the enzyme I is expressed downstream of the R-dependent promoter pX [26,52].

A computational and experimental characterization of I/R systems has been reported previously [26]. We revisit those results in the context of the biochemical and topological framework developed here. The key variables are (figure 8): the bacterial cell density ρ; the concentration ϕ of AHL in the medium; and the intracellular concentrations YI and YR of the enzyme I and transcription factor R. AHL levels will be proportional both to the enzyme levels and to cell density: ϕ(t)=μρ(t)YI(t). The transcriptional output of promoter pX is a function of instantaneous AHL and R levels. This biochemistry is summarized:

[Formula ID: RSTA20110548M5x3]
Given two external promoters pA and pB, the system can be wired into the following topologies:
[Formula ID: RSTA20110548M5x4]
where matrices of the format (5.1) specify which promoters are driving which of the two inputs of promoter pX. If the proteins I and R have translation rates QI, QR and decay rates γI, γR, respectively, the feedback systems are described by the following differential equations:
[Formula ID: RSTA20110548M5x5]
Here, αI and αR are control parameters: transcription rates that are constant in time but whose values can depend on external inputs; the function αX() embodies the frozen biochemical parameters; and the structure of the equations indicates the feedback topology. There are evidently two reasons why the responses of R-feedback and I-feedback systems might differ. The first is biochemical: the promoter logic αX(μρYI,YR) is an asymmetric function of its two inputs YI and YR (figure 9a). The second is structural or topological: the input YI is multiplied by the cell density, whereas the input YR is fed in directly (figure 9b,c) causing these two variables to influence the dynamics in completely distinct ways.

If cell density varies slowly compared with intracellular protein concentrations, equation (5.5) can be solved to obtain quasi-steady-state values YI and YR as functions of ρ. Under positive feedback, two distinct classes of responses can arise (figure 10a). For monostable responses (type M; mnemonic sMooth), transcription increases smoothly with cell density. For bistable responses (type B; mnemonic aBrupt), there is a range of cell densities over which two stable transcription levels coexist. For each topology, a bifurcation analysis can be used to obtain regions of parameter space that give rise to the different response types ([26], supporting information). Figure 10b shows a two-dimensional slice of the parameter space: a biochemical parameter n (the Hill coefficient of R-DNA binding, which plays a key role in determining the form of αX()) is varied along the x-axis; the control parameters αI or αR are varied along the y-axis. We see that the R-feedback topology is constrained: it is restricted to a single response type independent of the regulator level, once biochemical parameters are frozen. However, the I-feedback topology is versatile: it can be tuned between smooth and abrupt density-dependent response types by varying the regulator alone. This versatility might underlie the observed preference for I-feedback systems among diverse bacterial species: an organism that is able to rapidly modify its response in the face of an uncertain and fluctuating environment gains a crucial fitness advantage. Versatility is a purely topological property of the system: it is established without reference to specific biochemical parameter values.

6.  Conclusion

There are three types of changes that can be used to modulate the response of genetic networks, operating on completely distinct time scales (figure 11a). Control parameters (such as the transcription rates αI or αR) are the software: they can respond directly and dynamically to external inputs, and vary on time scales from minutes to hours. Biochemical parameters (such as the Hill coefficient n) are the firmware: they can be changed incrementally by mutations, infrequent events that might become fixed in a population only over hundreds of generations. Network topology is the hardware: it is possible to switch topology but this requires rare, potentially disruptive, large-scale DNA rearrangements. The topological hardware and biochemical firmware are essentially frozen, leaving only the regulated software to vary freely at short time scales.

When studying natural genetic networks, the approach to take depends on the extent of available data. If topology is known and key parameters identified, we can use experimental measurements to constrain as many parameters as feasible. Of the remaining parameters we can try to identify a few that are expected to be critical, and investigate all possible system behaviours as their values are varied. This approach is incomplete, however, because of a further unknown that is often ignored: we rarely, if ever, know the a priori distribution of parameter values that are likely to occur in nature. It is therefore impossible to estimate or compare the volumes of regions in parameter space that give rise to any set of specified behaviours (such as A or B; figure 11b). Even in this situation, topology provides a useful organizing framework. Consider the region of parameter space of some genetic network associated with some desired behaviour. If this region in the case of topology-II is completely contained within that in the case of topology-I, then we can be certain that the topology-I is more likely to generate the desired behaviour, without knowing anything about the likelihood of occurrence of parameters (figure 11b). The analysis thus generates a partial ordering among topologies independent of the actual biochemistry, and suggests a means to search the space of all possible topologies for interesting networks. Searching through topologies in this manner might be the only approach possible if the very existence of certain interactions is in doubt. For each topology, we would scan over parameter values to identify the range of possible behaviours. It could be the case that several topologies are consistent with some desired outcomes. In that case, it might be necessary to add additional biologically relevant constraints: robustness to parameter variation; adaptation to external changes; power consumption efficiency; and so on. 
The approach of searching topological space with constraints is emerging as a powerful means to understand the design principles of complex genetic networks in the absence of detailed biochemical data [53–58].
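The containment argument can be made concrete with a small sketch. Each topology is represented by the set of parameter-grid points at which it shows the desired behaviour, and two topologies are ordered only when one set contains the other. The three behaviour rules below are hypothetical placeholders for illustration, not models of any real network.

```python
from itertools import product

def behaviour_region(shows_behaviour, grid):
    """Set of parameter tuples on the grid at which the behaviour occurs."""
    return {p for p in grid if shows_behaviour(*p)}

def compare_topologies(region_1, region_2):
    """Partial ordering by set containment of behaviour regions."""
    if region_1 == region_2:
        return "equal"
    if region_2 < region_1:
        return "I contains II"
    if region_1 < region_2:
        return "II contains I"
    return "incomparable"

values = [0.5 * k for k in range(1, 9)]       # 0.5, 1.0, ..., 4.0
grid = list(product(values, values))          # two-parameter grid (a, b)

# Hypothetical behaviour rules for three hypothetical topologies.
r1 = behaviour_region(lambda a, b: a * b > 1.0, grid)
r2 = behaviour_region(lambda a, b: a * b > 4.0, grid)
r3 = behaviour_region(lambda a, b: a > 3.0, grid)

print(compare_topologies(r1, r2), compare_topologies(r2, r3))
```

The comparison needs no probability distribution over parameters: containment alone decides the ordering, and pairs without containment are simply left unordered.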

We might soon achieve a nearly complete understanding of certain simple organisms through a systematic analysis of the networks that govern their behaviour. Eventually such techniques might even give us predictive power, allowing us to guess at the inner workings of organisms based solely on the annotated sequences of their genomes. However, on very long time scales, the structure of a network must itself be dynamic: natural selection can be thought of as driving a search through topological space, converging on network architectures that generate biologically useful outcomes [59,60]. As more and more genome sequences enter the databases, we can begin to catalogue regularities in network architecture, or striking differences between different species. Once enough such patterns are known, it might be possible to shift our focus away from the question that concerned us here, of what genetic networks do, towards the broader question of how such networks came to be.


This work was supported in part by a Wellcome Trust–DBT India Alliance Intermediate Fellowship (500103/Z/09/Z). I thank Sandeep Krishna for discussions about the utility of topological descriptions, and Rajat Anand for help in preparing figures. Sections 1–4 and related figures are adapted, with permission, from ‘The dynamics of genetic networks’ by M. Thattai, © 2004 Massachusetts Institute of Technology. Portions of §5 and related figures are adapted, with permission, from ‘Prediction by promoter logic in bacterial quorum sensing’ by N. Rai, R. Anand, K. Ramkumar, V. Sreenivasan, S. Dabholkar, K. V. Venkatesh and M. Thattai, PLoS Comput. Biol. 8, e1002361, © 2012 Rai et al.

1. Monod J. 1966 From enzymatic adaptation to allosteric transition. Science 154, 475–483. (doi:10.1126/science.154.3748.475) PMID: 5331094
2. Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P. 2007 Molecular biology of the cell, 4th edn. New York, NY: Garland Science.
3. Hobert O. 2008 Gene regulation by transcription factors and microRNAs. Science 319, 1785–1786. (doi:10.1126/science.1151651) PMID: 18369135
4. Bansal M, Belcastro V, Ambesi-Impiombato A, di Bernardo D. 2007 How to infer gene networks from expression profiles. Mol. Syst. Biol. 3, 78. (doi:10.1038/msb4100120) PMID: 17299415
5. Hu Z, Killion PJ, Iyer VR. 2007 Genetic reconstruction of a functional transcriptional regulatory network. Nat. Genet. 39, 683–687. (doi:10.1038/ng2012) PMID: 17417638
6. Bonneau R. 2008 Learning biological networks: from modules to dynamics. Nat. Chem. Biol. 4, 658–664. (doi:10.1038/nchembio.122) PMID: 18936750
7. Balleza E, Martínez-Antonio A, Resendis-Antonio O, Lozada-Chávez I, Encarnación S, Collado-Vides J. 2009 Regulation by transcription factors in bacteria: beyond description. FEMS Microbiol. Rev. 33, 133–151. (doi:10.1111/j.1574-6976.2008.00145.x) PMID: 19076632
8. MacQuarrie KL, Fong AP, Morse RH, Tapscott SJ. 2011 Genome-wide transcription factor binding: beyond direct target regulation. Trends Genet. 27, 141–148. (doi:10.1016/j.tig.2011.01.001) PMID: 21295369
9. Thieffry D, Perez-Rueda E, Collado-Vides J. 1998 From specific gene regulation to genomic networks: a global analysis of transcriptional regulation in Escherichia coli. BioEssays 20, 433–440. (doi:10.1002/(SICI)1521-1878(05)20:5<433::AID-BIES10>3.0.CO;2-2) PMID: 9670816
10. Lee TI, et al. 2002 Transcriptional regulatory networks in Saccharomyces cerevisiae. Science 298, 799–804. (doi:10.1126/science.1075090) PMID: 12399584
11. Faith JJ, Hayete B, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS. 2007 Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol. 5, e8. (doi:10.1371/journal.pbio.0050008) PMID: 17214507
12. Zhu J, Zhang B, Drees B, Kruglyak L, Bumgarner RE, Schadt EE. 2008 Integrating large-scale functional genomic data to dissect the complexity of yeast regulatory networks. Nat. Genet. 40, 854–861. (doi:10.1038/ng.167) PMID: 18552845
13. Barabasi A, Albert R. 1999 Emergence of scaling in random networks. Science 286, 509–512. (doi:10.1126/science.286.5439.509) PMID: 10521342
14. Shen-Orr S, Milo R, Mangan S, Alon U. 2002 Network motifs in the transcriptional regulation network of Escherichia coli. Nat. Genet. 31, 64–68. (doi:10.1038/ng881)
15. Alon U. 2007 Network motifs: theory and experimental approaches. Nat. Rev. Genet. 8, 450–461. (doi:10.1038/nrg2102) PMID: 17510665
16. Alleyne TM, et al. 2009 Predicting the binding preference of transcription factors to individual DNA k-mers. Bioinformatics 8, 1012–1018. (doi:10.1093/bioinformatics/btn645) PMID: 19088121
17. Siddharthan R. 2010 Dinucleotide weight matrices for predicting transcription factor binding sites: generalizing the position weight matrix. PLoS ONE 5, e9722. (doi:10.1371/journal.pone.0009722) PMID: 20339533
18. Yang S, Li X, Wang J. 2011 Correlated evolution of transcription factors and their binding sites. Bioinformatics 27, 2972–2978. (doi:10.1093/bioinformatics/btr503) PMID: 21896508
19. Shea MA, Ackers GK. 1985 The OR control system of bacteriophage lambda: a physical-chemical model of gene regulation. J. Mol. Biol. 181, 211–230. (doi:10.1016/0022-2836(85)90086-5) PMID: 3157005
20. Ozbudak EM, Thattai M, van Oudenaarden A. 2004 Multistability in the lactose utilization network of Escherichia coli. Nature 427, 737–740. (doi:10.1038/nature02298) PMID: 14973486
21. Buchler NE, Gerland U, Hwa T. 2003 On schemes of combinatorial transcriptional logic. Proc. Natl Acad. Sci. USA 100, 5136–5141. (doi:10.1073/pnas.0930314100) PMID: 12702751
22. Setty Y, Alon U. 2003 Detailed map of a cis-regulatory input function. Proc. Natl Acad. Sci. USA 100, 7702–7707. (doi:10.1073/pnas.1230759100) PMID: 12805558
23. Cox RS 3rd, Surette MG, Elowitz MB. 2007 Programming gene expression with combinatorial promoters. Mol. Syst. Biol. 3, 145. (doi:10.1038/msb4100187) PMID: 18004278
24. Kaplan S, Bren A, Zaslaver A, Dekel E, Alon U. 2008 Diverse two-dimensional input functions control bacterial sugar genes. Mol. Cell 29, 786–792. (doi:10.1016/j.molcel.2008.01.021) PMID: 18374652
25. Tamsir A, Tabor JJ, Voigt CA. 2011 Robust multicellular computing using genetically encoded NOR gates and chemical wires. Nature 469, 212–215. (doi:10.1038/nature09565) PMID: 21150903
26. Rai N, Anand R, Ramkumar K, Sreenivasan V, Dabholkar S, Thattai M. 2012 Prediction by promoter logic in bacterial quorum sensing. PLoS Comput. Biol. 8, e1002361. (doi:10.1371/journal.pcbi.1002361) PMID: 22275861
27. Tan C, Marguet P, You L. 2009 Emergent bistability by a growth-modulating positive feedback circuit. Nat. Chem. Biol. 5, 842–848. (doi:10.1038/nchembio.218) PMID: 19801994
28. Hertz J, Krogh A, Palmer RG. 1991 Introduction to the theory of neural computation. Reading, MA: Perseus Books.
29. Minsky ML, Papert SA. 1969 Perceptrons. Cambridge, MA: MIT Press.
30. Arcas BA, Fairhall AL, Bialek W. 2003 Computation in a single neuron. Neural Comput. 15, 1715–1749. (doi:10.1162/08997660360675017) PMID: 14511510
31. Lawrence PA. 1992 The making of a fly. Oxford, UK: Blackwell Scientific.
32. Tyson JJ, Chen KC, Novak B. 2003 Sniffers, buzzers, toggles and blinkers: dynamics of regulatory and signaling pathways in the cell. Curr. Opin. Cell Biol. 15, 221–231. (doi:10.1016/S0955-0674(03)00017-6) PMID: 12648679
33. Alon U. 2007 An introduction to systems biology: design principles of biological circuits. Boca Raton, FL: Chapman & Hall.
34. Hasty J, McMillen D, Collins JJ. 2002 Engineered gene circuits. Nature 420, 224–230. (doi:10.1038/nature01257) PMID: 12432407
35. Purnick PEM, Weiss R. 2009 The second wave of synthetic biology: from modules to systems. Nat. Rev. Mol. Cell Biol. 10, 410–422. (doi:10.1038/nrm2698) PMID: 19461664
36. Khalil AS, Collins JJ. 2010 Synthetic biology: applications come of age. Nat. Rev. Genet. 11, 367–379. (doi:10.1038/nrg2775) PMID: 20395970
37. Becskei A, Serrano L. 2000 Engineering stability in gene networks by autoregulation. Nature 405, 590–593. (doi:10.1038/35014651) PMID: 10850721
38. Dublanche Y, Michalodimitrakis K, Kümmerer N, Foglierini M, Serrano L. 2006 Noise in transcription negative feedback loops: simulation and experimental analysis. Mol. Syst. Biol. 2, 41. (doi:10.1038/msb4100081) PMID: 16883354
39. Becskei A, Seraphin B, Serrano L. 2001 Positive feedback in eukaryotic gene networks: cell differentiation by graded to binary response conversion. EMBO J. 20, 2528–2535. (doi:10.1093/emboj/20.10.2528) PMID: 11350942
40. Isaacs FJ, Hasty J, Cantor CR, Collins JJ. 2003 Prediction and measurement of an autoregulatory genetic module. Proc. Natl Acad. Sci. USA 100, 7714–7719. (doi:10.1073/pnas.1332628100) PMID: 12808135
41. Gardner TS, Cantor CR, Collins JJ. 2000 Construction of a genetic toggle switch in Escherichia coli. Nature 403, 339–342. (doi:10.1038/35002131) PMID: 10659857
42. Atkinson MR, Savageau MA, Myers JT, Ninfa AJ. 2003 Development of genetic circuitry exhibiting toggle switch or oscillatory behavior in Escherichia coli. Cell 113, 597–607. (doi:10.1016/S0092-8674(03)00346-5) PMID: 12787501
43. Stricker J, Cookson S, Hasty J. 2008 A fast, robust and tunable synthetic gene oscillator. Nature 456, 516–519. (doi:10.1038/nature07389) PMID: 18971928
44. Elowitz MB, Leibler S. 2000 A synthetic oscillatory network of transcriptional regulators. Nature 403, 335–338. (doi:10.1038/35002125) PMID: 10659856
45. Paulsson J. 2003 Summing up the noise in gene networks. Nature 427, 415–418. (doi:10.1038/nature02257) PMID: 14749823
46. Tyson JJ, Albert R, Goldbeter A, Ruoff P, Sible J. 2008 Biological switches and clocks. J. R. Soc. Interface 5(Suppl. 1), S1–S8. (doi:10.1098/rsif.2008.0179.focus) PMID: 18522926
47. Nam J, Dong P, Tarpine R, Istrail S, Davidson EH. 2010 Functional cis-regulatory genomics for systems biology. Proc. Natl Acad. Sci. USA 107, 3930–3935. (doi:10.1073/pnas.1000147107) PMID: 20142491
48. Wingreen NS, Levin SA. 2006 Cooperation among microorganisms. PLoS Biol. 4, e299. (doi:10.1371/journal.pbio.0040299) PMID: 16968138
49. Waters CM, Bassler BL. 2005 Quorum sensing: cell-to-cell communication in bacteria. Annu. Rev. Cell Dev. Biol. 21, 319–346. (doi:10.1146/annurev.cellbio.21.012704.131001) PMID: 16212498
50. Hense BA, Kuttler C, Muller J, Rothballer M, Hartmann A, Kreft JU. 2007 Does efficiency sensing unify diffusion and quorum sensing? Nat. Rev. Microbiol. 5, 230–239. (doi:10.1038/nrmicro1600) PMID: 17304251
51. de Kievit TR, Iglewski BH. 2000 Bacterial quorum sensing in pathogenic relationships. Infect. Immun. 68, 4839–4849. (doi:10.1128/IAI.68.9.4839-4849.2000) PMID: 10948095
52. Smith D, et al. 2006 Variations on a theme: diverse N-acyl homoserine lactone-mediated quorum sensing mechanisms in Gram-negative bacteria. Sci. Prog. 89, 167–211. (doi:10.3184/003685006783238335) PMID: 17338438
53. François P, Hakim V. 2004 Design of genetic networks with specified functions by evolution in silico. Proc. Natl Acad. Sci. USA 101, 580–585. (doi:10.1073/pnas.0304532101) PMID: 14704282
54. Klemm K, Bornholdt S. 2005 Topology of biological networks and reliability of information processing. Proc. Natl Acad. Sci. USA 102, 18414–18419. (doi:10.1073/pnas.0509132102) PMID: 16339314
55. Ciliberti S, Wagner A. 2007 Innovation and robustness in complex regulatory gene networks. Proc. Natl Acad. Sci. USA 104, 13591–13596. (doi:10.1073/pnas.0705396104) PMID: 17690244
56. Avlund M, Sneppen K, Krishna S. 2009 Minimal gene regulatory circuits that can count like bacteriophage lambda. J. Mol. Biol. 394, 681–693. (doi:10.1016/j.jmb.2009.09.053) PMID: 19796646
57. Ma W, Trusina A, El-Samad E, Tang C. 2009 Defining network topologies that can achieve biochemical adaptation. Cell 138, 760–773. (doi:10.1016/j.cell.2009.06.013) PMID: 19703401
58. Burda Z, Krzywicki A, Zagorski M. 2011 Motifs emerge from function in model gene regulatory networks. Proc. Natl Acad. Sci. USA 108, 17263–17268. (doi:10.1073/pnas.1109435108) PMID: 21960444
59. Babu MM, Luscombe NM, Aravind L, Gerstein M, Teichmann SA. 2004 Structure and evolution of transcriptional regulatory networks. Curr. Opin. Struct. Biol. 14, 283–291. (doi:10.1016/
60. Oikonomou P, Cluzel P. 2006 Effects of topology on network evolution. Nat. Phys. 2, 532–536. (doi:10.1038/nphys359)


[Figure ID: RSTA20110548F1]
Figure 1. 

(a) The central dogma of molecular biology. (b) A more accurate representation of the central dogma, with filled arrows representing potential regulatory interactions.

[Figure ID: RSTA20110548F2]
Figure 2. 

Protein–DNA interactions. (a) Genes are DNA regions that are transcribed into mRNA, and eventually translated into proteins. Promoters are DNA regions upstream of genes where the RNA polymerase molecule (RNAP) binds and initiates transcription. Transcription factors (TFs) can bind near promoters and interact with the polymerase, exerting regulatory control. (b) The binding of a single protein is shown in the reaction-kinetic (left) and energetic (right) representations. (c) Free DNA as a function of protein levels. The graphs are of Hill functions, showing hyperbolic (n=1) as well as sigmoidal (n=2, n=6) binding curves. Higher Hill coefficients produce more threshold-like functions. The half-saturation concentration is one in each case.

[Figure ID: RSTA20110548F3]
Figure 3. 

Transcriptional regulation by DNA-binding proteins. (a) Independent binding of a repressor (X) and the polymerase (P). The free energy of the doubly bound state is the sum of the individual binding energies. (b) Cooperative binding. The binding of a single molecule of X increases the likelihood that a second molecule will bind.

[Figure ID: RSTA20110548F4]
Figure 4. 

The emergent properties of networks: amplification by cooperative activation; rapid equilibration and noise reduction by negative feedback; memory and bistability by positive feedback and the flip–flop; oscillations by hysteretic and ring oscillators.

[Figure ID: RSTA20110548F5]
Figure 5. 

Simple binary networks. (a) Basic interactions between binary genes. Interactions are shown in bold if the regulator is active. (b) Feedback networks. The binary negative-feedback network does not have a self-consistent steady state. The binary positive-feedback network has two steady states, either active or inactive. (c) Flip–flop. If the first gene is active, then the second is inactive, and vice versa. As in the case of positive feedback, the system has two steady states. (d) Hysteretic oscillator. The dotted arrow represents transitions in time. The system cycles between states of high activator and high repressor expression. (e) Ring oscillator. The three genes cycle through high-expression states in succession.

[Figure ID: RSTA20110548F6]
Figure 6. 

Continuous feedback networks. (a) Negative feedback. (i) Protein creation and degradation rates. (1) f(x)=2/(1+x) for the auto-repressed system. (2) f(x)=1 for the unregulated system. (3) g(x)=x. (ii) Solid lines show timecourses for the auto-repressed system; dashed lines show timecourses for the unregulated system. Negative feedback produces more rapid equilibration. (b) Positive feedback. (i) Protein creation and degradation rates. (1) For f(x)=2x/(1+x), the system has a single stable fixed point. (2) For f(x)=2x^4/(1+x^4), the system has two stable fixed points, separated by an unstable fixed point. (3) g(x)=x. (ii) Timecourses. Systems initialized at x>1 are driven to the high state, whereas those initialized at x<1 are driven to the low state. (c,d) Flip–flop. Graphs in xy space show nullclines (solid) and trajectories (dashed) for equation (4.5) with A=5. (c) For n=1, the system has one stable state. (d) For n=4, the system has two stable states, one at high-x low-y, and the other at high-y low-x.

[Figure ID: RSTA20110548F7]
Figure 7. 

Continuous oscillators. (a,b) Hysteretic oscillator. We show results for equation (4.6), with vx=0.1, vy=0.0, Ax=4.0, Ay=2.0. Panels (i) show nullclines (solid) and trajectories (dotted) in xy space; panels (ii) show y(t). (a) For γx=3.0, oscillations are damped and the system eventually reaches the fixed point. (b) For γx=5.0, the fixed point is unstable, and the system enters a limit cycle oscillation. (c) Ring oscillator. We show results for equation (4.7), with A=4 and n=4. The graph shows the values of x1, x2 and x3 over time. The system eventually enters a limit cycle.
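The ring oscillator in panel (c) can be illustrated with a short simulation. Equation (4.7) is not reproduced in this section, so the sketch below assumes a common three-repressor form, dx_i/dt = A/(1 + x_j^n) − x_i, where x_j is the gene that represses gene i. That form, the initial condition, the Euler step and the function name are all assumptions made here; only A=4 and n=4 come from the caption.

```python
def simulate_ring(A=4.0, n=4, x0=(1.0, 1.2, 1.4), t_end=100.0, dt=0.01):
    """Forward-Euler simulation of an assumed three-gene repressor ring.

    Returns a list of (x1, x2, x3) tuples, one per time step.
    """
    x1, x2, x3 = x0
    history = []
    for _ in range(int(round(t_end / dt))):
        # Assumed wiring: x3 represses x1, x1 represses x2, x2 represses x3.
        d1 = A / (1.0 + x3**n) - x1
        d2 = A / (1.0 + x1**n) - x2
        d3 = A / (1.0 + x2**n) - x3
        x1, x2, x3 = x1 + dt * d1, x2 + dt * d2, x3 + dt * d3
        history.append((x1, x2, x3))
    return history


if __name__ == "__main__":
    history = simulate_ring()
    late_x1 = [state[0] for state in history[-3000:]]  # last 30 time units
    print("late-time range of x1:", min(late_x1), max(late_x1))
```

If the assumed form is close to equation (4.7), x1 should keep swinging over a finite range at late times rather than settling to a single value, in line with the limit cycle described in the caption. The asymmetric starting point matters: a perfectly symmetric start keeps all three genes equal and never leaves the unstable fixed point.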

[Figure ID: RSTA20110548F8]
Figure 8. 

Schematic of an I/R quorum-sensing system. Cells have number density ρ. The intracellular enzyme I synthesizes the chemical signal AHL, which diffuses into the medium and subsequently into other cells. The transcription factor R, when bound to AHL, activates transcription of mRNA at the promoter pX. For clarity, we have separated the ‘issuing’ and ‘receiving’ of the chemical signal, but these processes happen simultaneously within each cell.

[Figure ID: RSTA20110548F9]
Figure 9. 

I/R feedback systems. (a) The input–output function of pX: the output transcription rate as a function of YI and YR at a fixed cell density ρ. The contour plot shows the value of αX(μρYI,YR), as measured in Rai et al. [26]. (b,c) Feedback topologies. Either R or I is controlled externally, while the other protein is expressed from the promoter pX with transcription rate αX(μρYI,YR). The same promoter can also drive further outputs. The two topologies are different because the function αX() is asymmetric, and because it is only the term YI that is multiplied by the cell density ρ. (b) R-feedback. (c) I-feedback.

[Figure ID: RSTA20110548F10]
Figure 10. 

Density-dependent responses. (a) Four types of responses: (M) monostable, where transcription smoothly increases with cell density; (B+) bistable, with a threshold density at which transcription abruptly increases; (B±) bistable and hysteretic at the terminal density, where high and low transcription states coexist; (B−) bistable but uninduced even at the terminal density, since the potentially bistable region is never reached. (b) Regions of {α,n} space that generate each response type; α represents the external control parameter, whereas n represents the Hill coefficient based on a parametrization of the input–output function αX(μρYI, YR) [26].

[Figure ID: RSTA20110548F11]
Figure 11. 

Using topology to tame biochemistry. (a) Regulation, biochemistry and topology can each be used to modulate the response of a genetic network, but on successively longer time scales. (b) We show a slice of biochemical space in which two network topologies (I and II) can potentially generate two different types of responses (A and B) within the regions indicated. Grey dots represent the unknown, a priori distribution of parameter values. Although region I-B appears larger than region I-A, topology-I is much more likely to generate type A responses compared with type B responses because of the increased density of dots in region I-A. However, because region I-A completely contains region II-A, we can say that topology-I is more likely to generate type A responses than topology-II is, regardless of the density of dots.

Article Categories:
  • Articles
  • Review Article

Keywords: synthetic biology, feedback loops, Boolean threshold models.
