Why is the sky blue? Any scientist will answer this question with a statement of
mechanism: Atmospheric gas scatters some wavelengths of light more than others. To answer
with a statement of purpose—e.g., to say the sky is blue in order to make people
happy—would not cross the scientific mind. Yet in biology we often pose “why” questions in
which it is purpose, not mechanism, that interests us. The question “Why does the eye have
a lens?” most often calls for the answer that the lens is there to focus light rays, and
only rarely for the answer that the lens is there because lens cells are induced by the
retina from overlying ectoderm.
It is a legacy of evolution that teleology—the tendency to explain natural phenomena in
terms of purposes—is deeply ingrained in biology, and not in other fields (Ayala 1999).
Natural selection has so molded biological entities that nearly everything one looks at,
from molecules to cells, from organ systems to ecosystems, has (at one time at least) been
retained because it carries out a function that enhances fitness. It is natural to equate
such functions with purposes. Even if we can't actually know why something evolved, we care
about the useful things it does that could account for its evolution.
As a group, molecular biologists shy away from teleological matters, perhaps because
early attitudes in molecular biology were shaped by physicists and chemists. Even
geneticists rigorously define function not in terms of the useful things a gene does, but
by what happens when the gene is altered. Molecular biology and molecular genetics might
continue to dodge teleological issues were it not for these fields' remarkable recent
successes. Mechanistic information about how a multitude of genes and gene products act and
interact is now being gathered so rapidly that our inability to synthesize such information
into a coherent whole is becoming more and more frustrating. Gene regulation, intracellular
signaling pathways, metabolic networks, developmental programs—the current information
deluge is revealing these systems to be so complex that molecular biologists are forced to
wrestle with an overtly teleological question: What purpose does all this complexity
serve?
In response to this situation, two strains have emerged in molecular biology, both of
which are sometimes lumped under the heading “systems biology.” One strain, bioinformatics,
champions the gathering of even larger amounts of new data, both descriptive and
mechanistic, followed by computer-based data “mining” to identify correlations from which
insightful hypotheses are likely to emerge. The other strain, computational biology, begins
with the complex interactions we already know about, and uses computer-aided mathematics to
explore the consequences of those interactions. Of course, bioinformatics and computational
biology are not entirely separable entities; they represent ends of a spectrum, differing
in the degree of emphasis placed on large versus small data sets, and statistical versus
deterministic analyses.
Computational biology, in the sense used above, arouses some skepticism among
scientists. To some, it recalls the “mathematical biology” that, starting from its heyday
in the 1960s, provided some interesting insights, but also succeeded in elevating the term
“modeling” to near-pejorative status among many biologists. For the most part, mathematical
biologists sought to fit biological data to relatively simple mathematical models, with the
hope that fundamental laws might be recognized (Fox Keller 2002). This strategy works well
in physics and chemistry, but in biology it is stymied by two problems. First, biological
data are usually incomplete and extremely imprecise. As new measurements are made, today's
models rapidly join tomorrow's trash heaps. Second, because biological phenomena are
generated by large, complex networks of elements, there is little reason to expect to
discern fundamental laws in them. To do so would be like expecting to discern the
fundamental laws of electromagnetism in the output of a personal computer.
Nowadays, many computational biologists avoid modeling-as-data-fitting, opting instead
to create models in which networks are specified in terms of elements and interactions (the
network “topology”), but the numerical values that quantify those interactions (the
parameters) are deliberately varied over wide ranges. As a result, the study of such
networks focuses not on the exact values of outputs, but rather on qualitative behavior,
e.g., whether the network acts as a “switch,” “filter,” “oscillator,” “dynamic range
adjuster,” “producer of stripes,” etc. By investigating how such behaviors change for
different parameter sets—an exercise referred to as “exploring the parameter space”—one
starts to assemble a comprehensive picture of all the kinds of behaviors a network can
produce. If one such behavior seems useful (to the organism), it becomes a candidate for
explaining why the network itself was selected, i.e., it is seen as a potential purpose for
the network. If experiments subsequently support assignments of actual parameter values to
the range of parameter space that produces such behavior, then the potential purpose
becomes a likely one.
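To make this concrete, consider the following sketch (a deliberately artificial example, not taken from any of the studies discussed here). It scans two parameters of the Brusselator, a textbook two-species reaction scheme that stands in for whatever toy network one wishes to probe, and labels each parameter combination by the qualitative behavior it produces; the grid values and the amplitude threshold used to call a solution an "oscillator" are arbitrary choices.

```python
# A toy "exploring the parameter space" exercise: simulate a small reaction
# network (the textbook Brusselator, standing in for any hypothetical circuit)
# across a grid of parameter values and classify the qualitative behavior of
# each solution rather than its exact numerical output.
from scipy.integrate import solve_ivp

def brusselator(t, z, a, b):
    x, y = z
    return [a - (b + 1.0) * x + x**2 * y,   # production, decay, autocatalysis
            b * x - x**2 * y]               # conversion of y back into x

def classify(a, b):
    """Label one parameter set as 'oscillator' or 'steady state'."""
    sol = solve_ivp(brusselator, (0.0, 200.0), [1.0, 1.0],
                    args=(a, b), max_step=0.5)
    x_late = sol.y[0, sol.t > 150.0]        # discard the transient
    swing = x_late.max() - x_late.min()
    return "oscillator" if swing > 0.1 else "steady state"

for a in (0.5, 1.0, 2.0):                   # coarse grid over two parameters
    for b in (1.0, 3.0, 8.0):
        print(f"a={a:3.1f}  b={b:3.1f}  ->  {classify(a, b)}")
```

Even in this cartoon, the spirit of the exercise is visible: the question being asked is not what the exact concentrations will be, but in which regions of parameter space the circuit behaves as an oscillator at all.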
For very simple networks (e.g., linear pathways with no delays or feedback and with
constant inputs), possible global behaviors are usually limited, and computation rarely
reveals more than one could have gleaned through intuition alone. In contrast, when
networks become even slightly complex, intuition often fails, sometimes spectacularly so,
and computation becomes essential.
For example, intuitive thinking about MAP kinase pathways led to the long-held view that
the obligatory cascade of three sequential kinases serves to provide signal amplification.
In contrast, computational studies have suggested that the purpose of such a network is to
achieve extreme positive cooperativity, so that the pathway behaves in a switch-like,
rather than a graded, fashion (Huang and Ferrell 1996). Another example comes from the
study of morphogen gradient formation in animal development. Whereas intuitive
interpretations of experiments led to the conclusion that simple diffusion is not adequate
to transport most morphogens, computational analysis of the same experimental data yields
the opposite conclusion (Lander et al. 2002).
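The Huang and Ferrell conclusion rests on a detailed kinetic model of the kinases themselves, but the flavor of the result (that stacking modestly cooperative steps yields a switch-like response) can be conveyed by a much cruder sketch. Below, three Hill-type steps are simply composed, and the steepness of each dose-response curve is summarized by an apparent Hill coefficient estimated from its 10%-to-90% rise; the per-step constants are arbitrary and are not meant to represent real MAP kinase parameters.

```python
# A toy illustration (not the Huang and Ferrell model): composing three
# modestly cooperative Hill-type steps gives a steeper overall dose-response
# than any single step. Steepness is summarized by the apparent Hill
# coefficient n_eff = ln(81) / ln(EC90 / EC10).
import numpy as np
from scipy.optimize import brentq

K, n = 0.1, 1.5                        # arbitrary per-step constants for this sketch

def hill(x):
    return x**n / (K**n + x**n)        # one mildly cooperative step

def cascade(x):
    return hill(hill(hill(x)))         # three sequential tiers

def apparent_hill(response, x_max=10.0):
    """Estimate n_eff from the inputs giving 10% and 90% of maximal response."""
    top = response(x_max)
    ec10 = brentq(lambda x: response(x) - 0.1 * top, 1e-9, x_max)
    ec90 = brentq(lambda x: response(x) - 0.9 * top, 1e-9, x_max)
    return np.log(81.0) / np.log(ec90 / ec10)

print(f"single step : n_eff = {apparent_hill(hill):.2f}")
print(f"three tiers : n_eff = {apparent_hill(cascade):.2f}")
```

The composed curve rises from 10% to 90% of maximum over a much narrower range of input than a single step does, which is exactly what "switch-like rather than graded" means.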
As the power of computation to identify possible functions of complex biological
networks is increasingly recognized, purely (or largely) computational studies are becoming
more common in biological journals. This raises an interesting question for the biology
community: In a field in which scientific contributions have long been judged in terms of
the amount of new experimental data they contain, how does one judge work that is primarily
focused on interpreting (albeit with great effort and sophistication) the experimental data
of others? At the simplest level, this question poses a conundrum for journal editors. At a
deeper level, it calls attention to the biology community's difficulty in defining what,
exactly, constitutes “insight” (Fox Keller 2002).
In yesterday's mathematical biology, a model's utility could always be equated with its
ability to generate testable predictions about new experimental outcomes. This approach
works fine when one's ambition is to build models that faithfully mimic particular
biological phenomena. But when the goal is to identify all possible classes of biological
phenomena that could arise from a given network topology, the connection to experimental
verification becomes blurred. This does not mean that computational studies of biological
networks are disconnected from experimental reality, but rather that they tend, nowadays,
to address questions of a higher level than simply whether a particular model fits
particular data.
The problem this creates for those of us who read computational biology papers is
knowing how to judge when a study has made a contribution that is deep, comprehensive, or
enduring enough to be worth our attention. We can observe the field trying to sort out this
issue in the recent literature. A good example can be found in an article by Nicholas
Ingolia in this issue of
PLoS Biology (Ingolia 2004), and an earlier study from Garrett Odell's
group, upon which Ingolia draws heavily (von Dassow et al. 2000).
Both articles deal with a classical problem in developmental biology, namely, how
repeating patterns (such as stripes and segments) are laid down. In the early fruit fly
embryo, it is known that a network involving cell-to-cell signaling via the Wingless (Wg)
and Hedgehog (Hh) pathways specifies the formation and maintenance of alternating stripes
of gene expression and cell identity. This network is clearly complex, in that Wg and Hh
signals affect not only downstream genes, but also the expression and/or activity of the
components of each other's signaling machinery.
Von Dassow et al. (2000) calculated the behaviors of various embodiments of this network
over a wide range of parameter values and starting conditions. This was done by expressing
the network in terms of coupled differential equations, picking parameters at random from
within prespecified ranges, solving the equation set numerically, then picking another
random set of parameters and obtaining a new numerical solution, and so forth, until
240,000 cases were tried. The solutions were then sorted into groups based on the predicted
output—in this case, spatial patterns of gene expression.
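A skeletal version of such a sampling exercise might look like the sketch below, in which a toy lateral-inhibition ring of six cells stands in for the real segment-polarity network (which involves far more species, interactions, and parameters). Parameters are drawn at random from assumed log-uniform ranges, the equations are solved numerically, and each solution is sorted by whether an alternating, stripe-like pattern of expression survives; every detail of the toy model and its ranges is an arbitrary choice made for illustration.

```python
# A skeletal random-sampling workflow: draw parameters from prespecified
# ranges, solve the coupled ODEs numerically, and sort each solution by its
# output pattern. The six-cell lateral-inhibition ring used here is a stand-in,
# not the von Dassow segment-polarity model.
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)
N = 6                                                 # cells arranged in a ring

def rhs(t, x, alpha, K, h):
    neighbors = (np.roll(x, 1) + np.roll(x, -1)) / 2.0
    return alpha / (1.0 + (neighbors / K) ** h) - x   # neighbor repression, linear decay

def preserves_stripes(alpha, K, h):
    x0 = np.where(np.arange(N) % 2 == 0, 1.0, 0.1)    # start near an alternating pattern
    sol = solve_ivp(rhs, (0.0, 200.0), x0, args=(alpha, K, h))
    x = sol.y[:, -1]
    return x[::2].min() > 3.0 * x[1::2].max()         # crude "stable stripes" criterion

trials = 500
hits = sum(preserves_stripes(10 ** rng.uniform(-1, 2),   # alpha: 0.1 .. 100
                             10 ** rng.uniform(-2, 1),   # K:     0.01 .. 10
                             rng.uniform(1, 4))          # h:     1 .. 4
           for _ in range(trials))
print(f"parameter sets preserving the striped pattern: {hits} / {trials}")
```

The final tally is, in effect, a crude measure of how much of the sampled parameter space supports the target behavior, which is the quantity that turns out to matter in what follows.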
When they used a network topology based only upon molecular and gene-regulatory
interactions that were firmly known to take place in the embryo, they were unable to
produce the necessary output (stable stripes), but upon inclusion of two molecular events
that were strongly suspected of taking place in the embryo, they produced the desired
pattern easily. In fact, they produced it much more easily than expected. It appeared that
a remarkably large fraction of random parameter values produced the very same stable
stripes. This implied that the output of the network is extraordinarily robust, where
robustness is meant in the engineering sense of the word, namely, a relative insensitivity
of output to variations in parameter values.
Because real organisms face changing parameter values constantly—whether as a result of
unstable environmental conditions, or mutations leading to the inactivation of a single
allele of a gene—robustness is an extremely valuable feature of biological networks, so
much so that some have elevated it to a sort of sine qua non (Morohashi et al. 2002).
Indeed, the major message of the von Dassow article was that the authors had uncovered a
“robust developmental module,” which could ensure the formation of an appropriate pattern
even across distantly related insect species whose earliest steps of embryogenesis are
quite different from one another (von Dassow et al. 2000).
There is little doubt that von Dassow's computational study extracted an extremely
valuable insight from what might otherwise seem like a messy and ill-specified system. But
Ingolia now argues that something further is needed. He proposes that it is not enough to
show that a network performs in a certain way; one should also find out why it does so.
Ingolia throws down the gauntlet with a simple hypothesis about why the von Dassow
network is so robust. He argues that it can be ascribed entirely to the ability of two
positive feedback loops within the system to make the network bistable. Bistability is the
tendency for a system's output to be drawn toward either one or the other of two stable
states. For example, in excitable cells such as neurons, depolarization elicits sodium
entry, which in turn elicits depolarization—a positive feedback loop. As a result, large
depolarizations drive neurons to fully discharge their membrane potential, whereas small
depolarizations decay back to a resting state. Thus, the neuron tends strongly toward one
or the other of these two states. The stability of each state brings with it a sort of
intrinsic robustness—i.e., once a cell is in one state, it takes a fairly large
disturbance to move it into the other. This is the same principle that makes electronic
equipment based on digital (i.e., binary) signals so much more resistant to noise than
equipment based on analog circuitry.
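Stripped of molecular detail, bistability arising from positive feedback can be written in a single equation: a species that activates its own synthesis cooperatively and is degraded linearly. The sketch below is a generic toy with arbitrary parameter values, not the equations of either paper; it simply shows the two stable states and the threshold-like response to perturbations described above.

```python
# A minimal bistable switch built from positive feedback: X activates its own
# synthesis cooperatively and decays linearly, giving two stable resting states
# separated by an unstable threshold (at x = 0.25 for these arbitrary values).
from scipy.integrate import solve_ivp

basal, beta, K, gamma = 0.05, 1.0, 0.5, 1.0

def dxdt(t, x):
    return basal + beta * x**2 / (K**2 + x**2) - gamma * x

def settle(x0, t_end=100.0):
    """Integrate from x0 and return the state the system relaxes into."""
    return solve_ivp(dxdt, (0.0, t_end), [x0]).y[0, -1]

low, high = settle(0.0), settle(2.0)
print(f"two stable states: low ~ {low:.2f}, high ~ {high:.2f}")

# A small kick decays back; a larger kick crosses the threshold and flips the switch.
print(f"low state + 0.10 -> {settle(low + 0.10):.2f}")   # relaxes back to the low state
print(f"low state + 0.40 -> {settle(low + 0.40):.2f}")   # switches to the high state
```

This is intrinsic robustness in miniature: once the system is sitting in one state, only a disturbance large enough to carry it past the unstable threshold can move it to the other.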
Ingolia not only argues that robustness in the von Dassow model arises because positive
feedback leads to network bistability, he further claims that such network bistability is a
consequence of bistability at the single cell level. He strongly supports these claims
through computational explorations of parameter space that are similar to those done by von
Dassow et al., but which also use stripped-down network topologies (to focus on individual
cell behaviors), test specifically for bistability, correlate results with the patterns
formed, and ultimately generate a set of mathematical rules that strongly predict those
cases that succeed or fail at producing an appropriate pattern.
At first glance, such a contribution might seem no more than a footnote to von Dassow's
paper, but a closer look shows that this is not the case. Without mechanistic information
about why the von Dassow network does what it does, it is difficult to relate it to other
work, or to modify it to accommodate new information or new demands. Ingolia demonstrates
this by deftly improving on the network topology. He inserts some new data from the
literature about the product of an additional gene,
sloppy-paired, in Hh signaling, removes some of the more tenuous
connections, and promptly recovers a biologically essential behavior that the original von
Dassow network lacked: the ability to maintain a fixed pattern of gene expression even in
the face of cell division and growth.
Taken as a pair, the von Dassow and Ingolia papers illustrate the value of complementary
approaches in the analysis of complex biological systems. Whereas one emphasizes simulation
(as embodied in the numerical solution of differential equations), the other emphasizes
analysis (the mathematical analysis of the behavior of a set of equations). Whereas one
emphasizes exploration (exploring a parameter space), the other emphasizes the testing of
hypotheses (about the origins of robustness). The same themes can be seen in sets of papers
on other topics. For example, in their analysis of bacterial chemotaxis, Leibler and
colleagues (Barkai and Leibler 1997) found a particular model to be extremely robust in the
production of an important behavior (exact signal adaptation), and subsequently showed that
bacteria do indeed exhibit such robust adaptation (Alon et al. 1999). Although Leibler and
colleagues took significant steps toward identifying and explaining how such robustness
came about, it took a subsequent group (Yi et al. 2000) to show that robustness emerged as
a consequence of a simple engineering design principle known as “integral feedback
control.” That group also showed, through mathematical analysis, that integral feedback
control is the only feedback strategy capable of achieving the requisite degree of
robustness.
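For readers unfamiliar with the engineering term, the logic of integral feedback can be seen in a toy far simpler than the chemotaxis network itself: if a controller variable accumulates the error between an output and its setpoint, the system can only come to rest when that error is zero, no matter how large the constant disturbance, which is exactly what exact adaptation demands. The first-order process, gain, and disturbance values below are arbitrary illustrative choices, not taken from the cited work.

```python
# A minimal sketch of integral feedback control (not the Barkai-Leibler model):
# the controller u integrates the error between output y and its setpoint, so
# any constant disturbance d is eventually cancelled exactly.
from scipy.integrate import solve_ivp

y_set, k = 1.0, 0.5                  # setpoint and (arbitrary) integral gain

def controlled_system(t, state, d):
    y, u = state
    dy = -y + u + d                  # a simple first-order process pushed by disturbance d
    du = -k * (y - y_set)            # integral feedback: u accumulates the error
    return [dy, du]

for d in (0.5, 2.0, 10.0):
    sol = solve_ivp(controlled_system, (0.0, 200.0), [y_set, 0.0], args=(d,))
    print(f"disturbance d = {d:5.1f}  ->  steady-state output y = {sol.y[0, -1]:.4f}")
    # the output settles back at the setpoint regardless of d: exact adaptation
```

A purely proportional controller, by contrast, would leave a residual error that grows with the disturbance, which gives some intuition for why the Yi et al. result singles out integral feedback.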
From these and many other examples in the literature, one can begin to discern several
of the elements that, when present together, elevate investigations in computational
biology to a level at which ordinary biologists take serious notice. Such elements include
network topologies anchored in experimental data, fine-grained explorations of large
parameter spaces, identification of “useful” network behaviors, and hypothesis-driven
analyses of the mathematical or statistical bases for such behaviors. These elements can be
seen as the foundations of a new calculus of purpose, enabling biologists to take on the
much-neglected teleological side of molecular biology. “What purpose does all this
complexity serve?” may soon go from a question few biologists dare to pose, to one on
everyone's lips.