New Research In
Physical Sciences
Social Sciences
Featured Portals
Articles by Topic
Biological Sciences
Featured Portals
Articles by Topic
 Agricultural Sciences
 Anthropology
 Applied Biological Sciences
 Biochemistry
 Biophysics and Computational Biology
 Cell Biology
 Developmental Biology
 Ecology
 Environmental Sciences
 Evolution
 Genetics
 Immunology and Inflammation
 Medical Sciences
 Microbiology
 Neuroscience
 Pharmacology
 Physiology
 Plant Biology
 Population Biology
 Psychological and Cognitive Sciences
 Sustainability Science
 Systems Biology
Computation with biomolecules
Related Article
 Molecular computation: RNA solutions to chess problems Feb 15, 2000
In 1994 Leonard Adleman used DNA strands to encode cities and flights to show that itineraries satisfying special conditions could be constructed and isolated in the laboratory (1). The abstract of this seminal paper concludes, “This experiment demonstrates the feasibility of computation at the molecular level.” Within months Richard Lipton argued (2) that fundamental problems concerning the truth of logic statements also could be addressed in this new way. In this issue of PNAS, Faulhammer et al. (3) use laboratory techniques with relatively low error rates in an experiment realizing Lipton's earlier design.
However, within a year after Adleman's article, it already was known such approaches can require unrealistic quantities of materials, leading Stemmer (4) to suggest using existing laboratory procedures to evolve realistically sized molecular populations by inducing variation and selection. Such evolutionary computation is an established paradigm for conventional computers. Interest is beginning to focus on DNAbased implementations (5–7) of evolutionary computation because of its alleged robustness in the presence of errors and its ability to exploit the massive parallelism and memory inherent at the molecular level.
By its success in the laboratory, Adleman's work (1) sparked intense excitement and marked the birth of a new field, DNA computation. Adleman's insight was that the ability of singlestranded DNA to seek and bind to a complementary strand allows DNA to carry out massively parallel computation. Although it seems doubtful that biologically based computers will be suitable for generalpurpose applications, there are some especially difficult problems where conventional computers lack the massive parallelism and huge memory capacity inherent in molecular computation. For example, the socalled “NPcomplete” problems that apparently require exponentially increasing computing time with a linear increase in problem size are notoriously difficult for silicon computers to solve. Thus, are there special applications where biomolecular computation can outperform siliconbased computers? During the past few years there have been many proposals seeking new methods for using DNA to solve NPcomplete problems or to construct programmable computers (8–10). Examples include: implementing blocked cellular automata by selfassembling DNA tiles (11), carrying out DNA computation on solid support surface rather than in solution (12), breaking the Data Encryption Standard (13), and boolean circuits (14). Guarnieri et al. (15) have developed a procedure to implement the addition of two nonnegative binary numbers through a chain reaction, and Klein et al. (16) have demonstrated the implementation of logically reversible computation. Unfortunately, there are still few experimental results.
Before Adleman's article, molecular computation had been extensively studied (see ref. 17 for a survey). Also, DNA and associated laboratory techniques had been specifically proposed to implement computations of formal language theory (18). However, Adleman's computations for the Hamiltonian path problem (traveling salesman problem) seem to have caught most people's imagination. The Hamilton path problem is a famous intractable problem in computer science (19). This problem can be paraphrased as “find a flight itinerary for visiting each of the cities shown on an airline map exactly once” (Fig. 1).
Briefly, Adleman's experiments followed these six steps:
1. Encode cities and flights into DNA.
2. Assemble itineraries at random.
3. Select itineraries from initial city to final city.
4. The correct number of cities must be visited.
5. No city can be left out.
6. Is anything left?
These experiments were performed in two phases. First, randomly generate all potential solutions to the problem at hand. Second, isolate the correct solution through repeated removal or dilution of incorrect solutions. At the end of this series of separation steps it is only necessary to detect whether there are any DNA strands left. These necessarily represent correct solutions.
In 1995, Lipton (2) suggested a method for generating the set of all binary strings of DNA with length n. Lipton's procedure involves first creating a graph such that a path through the graph is a realization of a binary string of length n. The graph is constructed by the Adleman protocol and is constrained so that all paths start at a_{1}, end at a_{n+1}, and encode an n bit binary string (Fig. 2). At each stage a path has exactly two choices: If it takes the vertex with an unbarred label, it will encode a 1; if it takes the vertex with a barred label, it will encode a 0. Therefore, the path [a_{1} X_{1} a_{2} X̄_{2} a_{3} … a_{n} X_{n} a_{n+1}] encodes the binary string [1, 0, … ,1] (as shown in Fig. 2A).
Lipton proposed that such binary encoding of DNA could be used to solve logic satisfaction problems. Such a problem has the form, “If a given logic statement contains n variables, is there a way to make the whole statement true by choosing the n variables to each be true or false?” Clearly such a choice corresponds to some binary string of length n. The only method known to verify logic satisfaction problems in all cases is simply to try all possibilities. This would quickly become intractable even for supercomputers when there are hundreds of variables. Lipton pointed out that in theory one could create strands of DNA (trillions or more molecules) representing all possible solutions of satisfiability problems having some tens of variables and then search through them all in parallel (2).
In the paper appearing in this issue of PNAS (3), implementing Lipton's earlier proposal, Faulhammer et al. show they can solve a 9bit instance of a “knight problem,” which is a type of logic satisfaction problem, by constructing a 10bit binary RNA library that encodes all of the 1,024 possible 10bit binary strings, then systematically performing operations that destroy strings that fail to satisfy some parts of the problem (3). Faulhammer et al. successfully indicate that Adleman's approach to finding a solution to the Hamiltonian path problem could be generalized to solve the problem of finding a satisfying assignment for a general logic statement. It is notable that with a destructive algorithm (20), introducing complementary DNA oligonucleotides to selectively mark RNA strands for RNase H digestion (this enzyme requires a RNA–DNA duplex as substrate), Faulhammer et al. could destroy strands that did not fit the constraints of the desired solution. Thus, instead of a method that depends on an extraction procedure with potentially high error rates as used by Adleman (1), Faulhammer et al. show that their “mark and destroy” method can recover a set of “winning” molecules representing all of the solutions of the “knight problem” with low error rates. Fortyfour distinct candidate solutions were isolated from the 10bit RNA binary library, one of which was incorrect (3).
Both Adleman (1) and Faulhammer et al. (3) demonstrate the feasibility of carrying out computation at the molecular level. In this new field where the theoretical foundation has been strong, yet very few experimental results have been reported, the paper by Faulhammer et al. certainly has moved the molecular computational field to another milestone. However, it is important to notice that even though the experimental designs were elegant, both experiments have addressed only very small instances of two different problems. Their methods are not suitable for solving large instances of NPcomplete problems. For example, Hartmanis (21) estimates that the same method Adleman used would, on a 200node graph, require an amount of DNA weighing more than the Earth. Therefore, both demonstrations mentioned above are important not because they “solve” hard problems, but rather because they show a way to use the immense parallelism of molecular interactions.
Conventional computers often solve problems by using the paradigm of evolutionary computation, based on analogies with natural evolution. Evolution computation is particularly used for solving problems involving many interacting variables such as timetable construction, machine learning, and graph drawing layout (22). For example, running an airline involves a complex timetable construction problem. The most stable aspects involve scheduling of flights, arrival gates, crews, aircraft, maintenance, training, etc. Timevarying aspects are even more difficult to incorporate: weather delays, equipment breakdowns, strikes, etc.
With conventional computers, evolution computation for a particular problem generally maintains a population of candidate solutions, often represented as a string of binary bits. In each generation, less promising candidates are likely to be eliminated. Their replacements are likely to be generated from relatively promising candidates by cloning and/or mutation and/or recombination. Less frequently, other genetic analogies may be used: inversion, gene hopping, multiple parents, etc. Evolutionary computation comes in many, many different varieties (see ref. 23 and http://alife.santafe.edu/∼joke/encore/www/). Most are variations and/or elaborations on the following very loose outline also illustrated in Fig. 3.
Begin with an initial population of candidate solutions, perhaps chosen randomly.
Repeat the following steps until convergence:
1. Evaluate fitness of candidates.
2. Select among more fit candidates to breed and among less fit to be replaced.
3. (Optional) Induce variation by breeding.
DNA molecules seem particularly suited to certain versions of evolutionary computation, because they can encode populations of bit strings suitable for both crossover and pointwise mutation. Also, there are known molecular biological protocols (24–27) for manipulating DNA in ways likely to be useful for implementing evolutionary computation with DNA. Three other advantages of using DNA to implement evolutionary computation are: First, there is the massive information storage potential of DNA molecules. For example, a gram of DNA contains about 10^{21} bases. The information content is approximately 2 × 10^{21} bits, greatly exceeding the 200petabyte storage of all digital magnetic tape produced in 1 year (R. Williams, http://www.ccsf.caltech.edu/∼roy/dataquan). Second, with DNA molecules one might process, in parallel, populations billions of times larger than is usual for conventional computers. One may process grams of DNA per day, with each gram encoding about 10^{21} bits of information. A stateoftheart teraflop supercomputer performs about 10^{12} operations per sec, so it would need about 100 days to process 10^{21} bits in 100bit blocks. Third, biolaboratory operations on DNA molecules inherently involve errors. Such errors are sometimes tolerable when executing evolutionary computation (28, 29), unlike when executing deterministic algorithms.
Laboratory work requires effort and money. At the current state of DNA computing, despite recent encouraging theoretical advances, there are still very few biochemical experiments reported. However, most would agree that the future of DNA computing certainly will depend on whether the experimental challenges can be met. Is DNA computing doomed to be useless? We believe this is unlikely. Even though it may be impossible to implement a programmable DNA computer capable of running most applications faster than a conventional computer, the massively parallel nature of DNA computing could be essential for some practical problems, such as evolutionary computations. Therefore, we believe that molecular computation could complement rather than replace silicon as a computational medium in the future.
Acknowledgments
Our research is partially supported by Defense Advanced Research Planning Agency/National Science Foundation Grant 9725021 and National Science Foundation Grant 9980092.
Footnotes
 Copyright © 2000, The National Academy of Sciences
References
 ↵
 Adleman L M
 ↵
 Lipton R J
 ↵
 Faulhammer D,
 Cukras A R,
 Lipton R J,
 Landweber L F
 ↵
 Stemmer W P C
 ↵
 Deaton R,
 Murphy R C,
 Rose J A,
 Garzon M H,
 Franceschetti D R,
 Stevens S E Jr

 Landweber L,
 Wintree E,
 Lipton R,
 Freeland S
 Bäck T,
 Kok J N,
 Rozenberg G
 ↵
 Gifford D,
 Wintree E
 Wood D H,
 Chen J,
 Antipov E,
 Lemieux B,
 Cedeño W
 ↵
 Puaun G,
 Rozenberg G,
 Salomaa A
 ↵
 Pool R
 ↵
 ↵
 ↵
 Lipton R,
 Baum E
 Boneh D,
 Dunworth C,
 Lipton R J
 ↵
 ↵
 ↵
 ↵
 Conrad M
 ↵
 ↵
 Garey M R,
 Johnson S D
 ↵
 Baum E B,
 Lipton R J
 Amos M,
 Gibbons A,
 Hodgson D
 ↵
 Hartmanis J
 ↵
 Zbigniew M
 ↵
 Bäck T,
 Fogel D B,
 Michalewicz Z
 ↵
 Beaudry A A,
 Joyce G E
 ↵
 ↵
 ↵
 Goldberg D E,
 Deb K,
 Clark J M