Four nucleotides just aren't enough these days.

31 January 2008

DNA, the molecule underlying every form of life on this planet, is in essence a very long chain of sugar and phosphate molecules connected end to end ('long' being a relative term, of course - a molecule 5 centimeters long is gargantuan when you take into account the fact that it's only about 2.4 billionths of a meter in diameter). Each link in the chain is called a nucleotide, and carries one of four possible bases: adenine, thymine, cytosine, or guanine. Adenine bonds with thymine and cytosine with guanine; each pairing has two possible orientations, for example A-T and T-A. Seems pretty simple at first glance - you could liken it to a four-symbol encoding scheme for information (two bits per position) if you wanted to.

Yesterday, however, chemical biologist Floyd Romesberg of the Scripps Research Institute in California announced that he and his team had figured out how to create two new nucleotides, called dSICS and dMMO2, which means a new pairing is possible in both orientations (dSICS-dMMO2 and dMMO2-dSICS). As if that weren't enough, they didn't have to re-work the replication or transcription mechanisms of DNA to do this, because the processes already implemented in the cells of carbon-based life can manipulate the new nucleotides normally (no word yet on how well existing error correction mechanisms would work with the new base pairs, though). The success comes from a decade's worth of work at Scripps Research involving nearly two hundred tweaks of existing nucleotides, most of which weren't compatible with cellular mechanisms.
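To make the pairing rules concrete, here is a minimal Python sketch that treats them as a lookup table. The single-letter codes S (for dSICS) and M (for dMMO2) are shorthand invented for this example, not standard notation, and the function pairs bases position by position without modeling strand direction or any real chemistry.

```python
# Pairing rules as described in the text: A-T, C-G, plus the new dSICS-dMMO2 pair.
# "S" and "M" are made-up one-letter codes for dSICS and dMMO2 (an assumption
# for this sketch only).
PAIRS = {
    "A": "T", "T": "A",
    "C": "G", "G": "C",
    "S": "M", "M": "S",
}

def complement(strand):
    """Return the partner base for each position of the given strand."""
    return "".join(PAIRS[base] for base in strand)

print(complement("ATCG"))    # TAGC
print(complement("ATCGSM"))  # TAGCMS
```

Because every pairing works in both orientations, applying complement twice gives back the original strand.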

It should be noted in all fairness that none of this work was done in living cells; it was done in vitro, with enzymes and an environment replicating the nuclei of cells. Actually implementing these new base pairs inside of cells is something that hasn't been done yet, though other recent advances in genetic engineering make such a thing an attractive prospect, to be sure. The upshot is that this is a way to add two new values to the character set of DNA, which means that more complex 'concepts' or informational expressions can be encoded in a smaller physical space, because each 'letter' can take one of six values instead of four. While we don't have a use for it just yet, this represents an essential step toward manipulating DNA to specific ends in the future.
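As a rough back-of-the-envelope illustration of the 'smaller physical space' point, the Python sketch below compares a four-letter alphabet with a six-letter one. It treats DNA purely as an idealized string of symbols and ignores biology entirely; the function names and the 1000-bit figure are made up for this example.

```python
import math

def bits_per_base(alphabet_size):
    """Information carried by one position, in bits (idealized)."""
    return math.log2(alphabet_size)

def bases_needed(n_bits, alphabet_size):
    """Smallest strand length whose possible sequences cover 2**n_bits values."""
    length = 0
    capacity = 1  # number of distinct sequences of the current length
    while capacity < 2 ** n_bits:
        capacity *= alphabet_size
        length += 1
    return length

print(bits_per_base(4))            # 2.0
print(round(bits_per_base(6), 3))  # 2.585
print(bases_needed(1000, 4))       # 500
print(bases_needed(1000, 6))       # 387
```

Under these assumptions, going from four letters to six shortens the strand needed for the same amount of information by a bit less than a quarter.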