One Hand Clapping, page 2
Let’s unpack. The “information” to which Crick refers is genetic information or, simply, genes: the heritable sequence of “letters” of DNA.* This information gets passed from one generation to the next by copying said DNA. But then, in each generation, the same information is also “passed into protein.” Proteins are life’s principal molecular machines, whimsical nanorobots with which everything in a living organism is done. Genes are basically blueprints for these proteins: the sequence of DNA letters is a code that is used to assemble them. This is what “information passing into protein” means. And here’s the dogma: once the code is in, it can’t get out. You can’t extract the blueprint from the protein and make another protein based on that.
Note that the statement that Crick chose to encapsulate the central dogma is not about what DNA can do (pass its contents from generation to generation and serve as a blueprint for proteins), but about what proteins cannot do: proteins cannot copy themselves. They are an informational dead end. Every protein eventually gets destroyed, either because it is no longer needed or simply due to wear and tear. When that happens and a protein falls apart, the information inside of it dies, and a new protein must be made again using a DNA blueprint. A protein by itself cannot carry a gene onward into the future. This is so significant, and so puzzling, because in a living organism, proteins do nearly everything else.
A protein is not just one specific substance; it is, rather, a type of molecule. Proteins are all the different chains that can be made from the same set of much smaller molecules called amino acids, which are strung together in sequence, like beads. There are a total of twenty different amino acids that occur in proteins (think twenty colors of beads). Amino acids are all simple molecules but quite different from each other in their chemical properties. A typical protein is made of several hundred of them—assembled in a particular sequence, it contorts into an intricate three-dimensional shape spattered with chemical groups operating as the gears and cogs of a machine. Humans have roughly twenty to twenty-five thousand different proteins in total,2 and there is a gene for each one, a corresponding region of DNA that serves as an instruction for assembling this specific sequence of amino acids. Each cell decides for itself which subset of these twenty thousand to produce, in what quantities, and at what time according to its needs.
These diverse proteins rule the living organism. Like workers of different professions, they do anything and everything there is to be done. We digest food using proteins, breathe in oxygen using proteins, and move using proteins. Proteins identify viruses, proteins synthesize the cell membrane, and when long-term memories are formed, proteins in the hippocampus are using proteins to send protein signals to other proteins in the cerebral cortex.
And in the most intriguing twist, proteins are also in charge of copying DNA—that very thing they cannot do for themselves.
Like proteins, DNA is a chain of molecules, in this case called nucleotides, strung together in a sequence. DNA chains are larger and clunkier than protein chains, and although they contain instructions for making proteins, they cannot do very many things on their own. Most of the time, DNA just floats there, while proteins climb all over it, making stuff happen—reading the sequence, repairing the sequence, copying the sequence. Without these proteins, DNA is almost helpless and certainly could not organize its own replication. So proteins are responsible for propagating DNA, which in turn holds the key to the existence of proteins. DNA needs proteins so that it gets copied, but proteins need DNA so they are re-created in each generation. That’s the ultimate chicken-and-egg.
Why this bizarre arrangement? If proteins are such universally capable molecules, if they are so much better as molecular devices than DNA, why can’t everything, including inheritance, just run on proteins? Protein-based inheritance was, by the way, the predominant theory until the 1950s, when the role of DNA was definitively proven. The reason inheritance runs on DNA instead is a key feature that DNA possesses and proteins lack. It is called complementarity, and it is the axis on which the circle of life spins.
DNA has only four different nucleotides, which is a whole lot fewer building blocks than in proteins. These blocks, nucleotides, are also not as diverse as different amino acids are—chemically, they are more or less similar molecules. What they have instead is this key property, complementarity, also known as base pairing. The four DNA nucleotides, known as A (adenine), T (thymine), C (cytosine), and G (guanine), are organized into pairs that stick to each other: A sticks to T; C sticks to G. This doesn’t seem like much, but it means everything.
Because each nucleotide has a complementary counterpart, any sequence of nucleotides also has a complementary version—for example, ATTCG is complementary to TAAGC, like a positive and a negative. If you have one sequence, you can create the other using the first as a template, and vice versa. A typical DNA molecule carries both a “positive” strand and a “negative” strand—two complementary chains stuck to each other and wound into a double helix. Unwind the helix—and you have two complementary chains. A special DNA-weaving protein, or DNA polymerase, comes along and rebuilds the missing strands—a positive to a negative, and a negative to a positive, according to the same simple rule: A to the T, C to the G. Voilà—you end up with two identical double helixes. This is how DNA replicates, and it is only possible thanks to complementarity.
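The pairing rule is simple enough to write down as a few lines of Python. What follows is only a toy sketch: the names `PAIRS`, `complement`, and `replicate` are made up for illustration, and the model ignores real-world details, such as the fact that the two strands of an actual helix run in opposite directions.

```python
# Pairing rule from the text: A sticks to T, C sticks to G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Build the partner strand, one letter at a time, from the pairing rule."""
    return "".join(PAIRS[base] for base in strand)


def replicate(helix):
    """Unwind a (positive, negative) pair and rebuild each missing partner."""
    positive, negative = helix
    # The positive strand templates a new negative, and vice versa.
    return (positive, complement(positive)), (complement(negative), negative)


original = ("ATTCG", complement("ATTCG"))
copy_one, copy_two = replicate(original)
```

Both `copy_one` and `copy_two` come out equal to `original`: one helix in, two identical helixes out, with nothing but the lookup table doing the work.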
It is humbling to think that the continuity of generations, sustained for billions of years, connecting each living creature to our common ancestors and the very origin of life on Earth, hinges on four small molecules, the letters of the genetic alphabet, sticking to each other in pairs. In a way, the complementary chains of DNA represent the very essence of life. Think about it: it is only in biology that multiplication and division are the same thing, thanks to DNA. To multiply, living beings divide. That doesn’t happen when, say, snowflakes multiply in the air or when dirty dishes multiply in the sink. But new living organisms always, in some way, bud off from already existing ones, ultimately because DNA is copied by splitting the original in two.
Proteins don’t have anything comparable. Amino acids don’t come in complementary pairs, so there is no way to make a replica of an already existing protein: information contained within them “cannot get out,” per Crick’s formulation of the central dogma.
In other words, proteins don’t have access to eternity. That—eternity—is why they need DNA, whose paired nucleotides provide just that.
On the other hand, DNA without proteins is inert and lifeless. It is only thanks to their extraordinary abilities that DNA can take advantage of its complementarity, replicate, and impose its genetic will on the living organism. So what DNA needs proteins for is their nanorobot-like chemical versatility—which ultimately boils down to their tool kit of diverse amino acids.
So, in their very chemical nature, nucleotides and DNA embody continuity, whereas amino acids and proteins embody functionality. These essences can be separated, but one cannot exist without the other, and life as we know it cannot exist without either of the two.
One might think of DNA and proteins as equal partners in the industry of life. Actually, the relationship between these two great molecules of nature, and between the essences they embody, is more complex. DNA and proteins are not equals. A random change in DNA, as in a mutation, means a change in every copy of the protein it encodes, a change that could persist forever. But a random defect in a protein is as short-lived as the protein itself. Once the defective molecule falls apart, it does not affect DNA, or future generations, which continue to produce the same protein without any alteration. At the end of the day, it is DNA that controls proteins, not vice versa.
That’s really quite tragic. Proteins, these marvelous molecular machines capable of almost anything except self-replication, are forced, because of this deficiency, to labor for the benefit of the genes, controlled by their needs, subject to their whims. That which reproduces holds the power. Later in this book, we see this rule play out again and again: in the relationship between worker ants and their queen, in the relationship between the body and its sex cells, even in the relationship between individual experience and culture. Here, in the mutual arrangement of a few atoms in amino acids and nucleotides, the same essence is embodied in its purest, primordial form.
The World before the Dogma
The central dogma—the rule that genes “flow” from DNA to proteins and not vice versa—is usually represented in biology classrooms with a flowchart containing an extra level in the middle: DNA, to RNA, to proteins. RNA is DNA’s cousin, a similar molecule made out of slightly different nucleotides—the “NA” in both acronyms stands for “nucleic acid,” and the first letters—“R” for “ribo-,” “D” for “deoxyribo-”—refer to these small differences in the chemistry of RNA and DNA’s building blocks. RNA sits in the middle of the central dogma flowchart for no obvious reason. DNA is for inheritance; proteins do the jobs. You could imagine DNA directly converting into proteins. But that’s not how it happens. In reality, DNA is converted into proteins through the medium of RNA. First, a region of DNA must be transcribed (essentially, printed out) into its RNA equivalent. Then the printout must be translated—the sequence of RNA converted into a sequence of amino acids.
This two-step conversion, in itself, does not make a lot of sense. It seems like one of those things that you are just required to accept in a science class without asking “why.” That’s just how it is! But I can’t think of a better place to ask “why” and to look for a reason RNA exists, because the answer explains not just how molecules work; it tells us how our story on this planet begins. The thing is, RNA might have been the original form of life. The reason it’s still there is that everything else grew around it.
Let’s back up a little. To get from a gene to a protein, first, you have to transcribe the gene into an equivalent string of RNA. This is straightforward because RNA and DNA are so chemically similar. Just like you can copy DNA using complementarity (A to the T, C to the G), you can transcribe it into RNA in the same way (except RNA has a slightly different nucleotide, U instead of T, though the two are functionally equivalent). All you have to do is find a gene you are interested in on the long meandering helix of DNA, unwind that part of the helix, and duplicate one of the strands using RNA nucleotides, like a photocopy of a page in a book. The photocopy—RNA—then peels off the book—DNA—and prepares to be converted into protein.
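The photocopying step can be sketched the same way as DNA copying, with one entry of the lookup table changed. This is a simplified illustration, not a model of the real machinery: the names `DNA_TO_RNA` and `transcribe` are invented here, and the function simply pairs every letter of a DNA template strand with its RNA partner.

```python
# Same pairing logic as DNA copying, except RNA pairs U (not T) with A.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template: str) -> str:
    """Read a DNA template strand and build the complementary RNA strand."""
    return "".join(DNA_TO_RNA[base] for base in template)


# The "negative" strand TAAGC serves as the template here.
printout = transcribe("TAAGC")
```

The resulting `printout` reads like the opposite, “positive” DNA strand (ATTCG) with every T swapped for a U, which is why the RNA can stand in for the gene.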
This next step, however, is much more complicated. Proteins and nucleic acids are totally different. There’s nothing like “A to the T, C to the G” to guide the assembly of one molecule based on another. You have to convert a sequence of nucleotides into a sequence of completely unrelated molecules: the amino acids that make up proteins. It is like translating one language into another, and it requires some high-grade molecular trickery. It happens at an all-important cellular factory called the ribosome, a large, oddly shaped molecular machine that takes in the RNA printout of a gene and converts it, three letters at a time, into the amino acid sequence of a protein.
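To see why this step needs a dictionary rather than a pairing rule, here is a rough sketch of translation. In real cells the RNA is read in triplets of letters called codons, and each codon stands for one amino acid or for “stop.” The table below is a small excerpt of the standard genetic code (the full table has 64 entries), and the names `CODON_TABLE` and `translate` are invented for this example. As a further simplification, reading starts at the first letter rather than at a start signal.

```python
# A small excerpt of the standard genetic code. None marks a stop codon.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly",
    "GCU": "Ala", "UGG": "Trp", "AAA": "Lys",
    "UAA": None, "UAG": None, "UGA": None,
}


def translate(rna: str) -> list:
    """Convert an RNA sequence into a list of amino acids, codon by codon."""
    protein = []
    # Step through complete triplets only; leftover letters are ignored.
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # KeyError if outside this excerpt
        if amino_acid is None:  # stop codon: the chain is finished
            break
        protein.append(amino_acid)
    return protein


chain = translate("AUGUUUGGCUAA")
```

Here `chain` is the three-bead string Met, Phe, Gly. Notice that nothing about the letters AUG resembles the amino acid Met: the correspondence is an arbitrary lookup, which is exactly why translation is harder than copying.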
This is where it gets especially interesting. Here we have the ribosome, a critical element of a living organism, a wondrous protein-making factory that interconverts two molecular languages. Virtually all molecular machines in nature are proteins: proteins do all the jobs, including even copying DNA, which can’t achieve replication on its own. You would think that the job of creating proteins would also belong to proteins. And yet, surprisingly, this is not so. Instead, the ribosome employs RNA—hence the “ribo” in its name.
This is extremely unusual. Generally speaking, nucleic acids are not good at doing jobs—any jobs. DNA is especially inert and would never consider anything as flamboyant as being part of a protein-making machine. RNA is not that much better—it is also made of nucleotides, and like any nucleotide chain, it is big, awkward, and nowhere near as versatile as a protein. Nucleic acids are almost always used as information carriers, not as components of functional devices, which, in turn, are almost always protein based.
But there are exceptions to the rule. Sometimes RNA acts surprisingly like a protein: it doesn’t just carry a code in its sequence, but actually does something as a molecule, a clumsy visitor from the high-society nucleic acids, getting its hands dirty with protein-esque manual labor. Some RNA molecules even look like proteins: instead of long formless strands of the DNA kind, they fold into compact three-dimensional shapes, very much in the protein fashion. The ribosome is a perfect example—the most notable enterprise organized by such protein-like RNAs. This protein-making factory is actually a conglomerate of different molecules, but RNA runs the show: it does the most critical jobs of selecting the appropriate amino acids and connecting them to one another during the protein assembly process.
What is going on with RNA, this protein wannabe in the nucleic acid family? You can imagine the central dogma flowchart with DNA and proteins alone, and yet RNA stubbornly intervenes and in fact holds the key to the entire “flow of information into protein.” Why is it there at all? The reason, say biologists who favor the “RNA world” theory, is that RNA was the original, most ancient life-form that ever existed on our planet and, simultaneously, the prototype of both proteins and DNA, which evolved later as more specialized extensions of RNA’s abilities. In other words, RNA is a relic of the origin of life, much like the cosmic microwave background is a relic of the Big Bang.
At first glance, RNA compares unfavorably with both DNA and proteins. It’s not great for storing genes in the long run because it is less stable than DNA. In today’s world, only some viruses with very simple genomes are able to store their genes in the medium of RNA, and those viruses (the ones behind COVID and flu, to name a couple) mutate and evolve a lot faster than viruses that opt for DNA (for example, smallpox, which is why a single vaccination gave long-lasting protection). RNA is also not as good as proteins at doing jobs because its nucleotides are chemically inferior to proteins’ amino acids. So DNA is a better archive, and proteins are better machines than RNA.
But what is profoundly unique about RNA is that it can be an archive and a machine at the same time. It can embody the essences of DNA and proteins—continuity and functionality—within a single physical unit.
It is for this reason that evolutionary biologists love RNA as much as they do. Since RNA can both do things and be replicated, the easiest way to imagine the advent of the “central dogma world”—today’s world with its codependent trio of DNA, RNA, and proteins—is to start with self-sufficient RNA that replicates itself.
Maybe it exists alone, multiplying only its own sequence, and its many copies gradually diversify. Maybe it altruistically replicates every random RNA it can find. In either case, over time, many different RNAs are created and replicated together. They take on a variety of molecular jobs that aid their collective reproduction. Then, eventually, comes the greatest milestone in the history of this “RNA world”: the invention of proteins.3 The advent of the ribosome—in its original form, a complex and folded RNA machine—enables RNAs themselves to transform their sequences—today known as genes—into protein sequences, producing an unlimited number of amino acid–based nanorobots. This invention opens a new world of possibilities for RNAs to create novel functions and optimize already existing ones. Almost all work is then relegated from RNA to proteins, save for a few rare instances (such as in the ribosome itself). A great variety of new jobs is created: proteins learn to replicate RNA, mint nucleotides, harvest and store energy, and eventually to create cell membranes and all the other essential components of a living organism. Finally, proteins create a new, superior, highly stable archive for storing genes: double-stranded DNA. The “central dogma world” as we know it is complete.
All in all, if we accept that the RNA world is how things started, then given a billion years or so, plus some imagination, we can probably get from there to everything else.
But how do you get to this presumed starting point—a self-sufficient, self-replicating RNA? Is it even possible? The answer appears to be yes. Scientists have been able to artificially create an RNA system that can self-replicate indefinitely without any help from proteins.4 The nuance is that this system is not just a single self-sufficient RNA molecule making copies of itself, but rather several molecules that achieve their replication collectively. To simplify somewhat, molecule A replicates molecule B, B replicates C, and C replicates A, so no single molecule is sufficient, but the combination is locked in a loop that replicates all its members.
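The logic of such a loop can be played out in a toy simulation. To be clear, this is not the chemistry of the actual experiment: the doubling rule, the absence of decay, and the names `REPLICATES`, `run_round`, and `simulate` are all assumptions made for illustration. The one rule taken from the text is that each molecule is copied only when its designated replicator is present.

```python
# Who replicates whom: A copies B, B copies C, C copies A.
REPLICATES = {"A": "B", "B": "C", "C": "A"}


def run_round(counts: dict) -> dict:
    """One round: each product doubles only if its replicator is present."""
    new_counts = dict(counts)
    for maker, product in REPLICATES.items():
        if counts[maker] > 0:
            # Copying needs existing templates, so zero copies stay zero.
            new_counts[product] = counts[product] * 2
    return new_counts


def simulate(counts: dict, rounds: int) -> dict:
    """Apply run_round repeatedly and return the final molecule counts."""
    for _ in range(rounds):
        counts = run_round(counts)
    return counts


full_loop = simulate({"A": 1, "B": 1, "C": 1}, 3)
broken_loop = simulate({"A": 1, "B": 1, "C": 0}, 3)
```

With all three members present, every count grows together. Remove C, and A never multiplies again, because nothing is left to copy it: the loop only works as a whole.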
This scenario actually seems even more realistic vis-à-vis the origin of life. In nature, almost everything begins with an accident, but almost every accident ends in nothing. For there to be anything useful, there must have also been many RNAs doing many random things that never took off. Rather than imagining that one day among them a single Promethean benefactor started replicating everybody else, we are invited to imagine a soup of random and diverse RNA chains minding their business and going about their own agenda set by their randomly assembled sequences, until one day this soup finds itself interconnected through mutual replication. One day just the right combination of biochemical activities falls into place and forms something like the self-replicating collective the scientists created: A happens to make more B, B to make more C, and C to make more A—and there could have been thousands more RNAs involved in this collective cycle. Call this the “loop in the soup” model. Once the replication loop is formed, the circle of life starts spinning and continues to this day.
But just because this is achievable deliberately in the lab does not necessarily mean it is achievable spontaneously in real life. What could this primordial RNA brewery possibly be, in a real, physical sense? In search of the answer, most experts look to the deep sea.
Warm Little Vent
