Genetic jiggery-pokery.

12 July 2016

It's long been known that DNA encodes information in a four-letter alphabet (two bits per letter) which can be read and processed like any other bitstream. Four different nucleotides, paired two by two, each pair arranged in one of two configurations, sit side by side in a long string of letters, many times longer than the cell that contains the full DNA strand. Every cell in a given organism carries essentially the same DNA sequence, regardless of what the cell actually does. So how, many have asked, does a cell know if it should help produce hair, or skin, or pigments, or something else? As it turns out, there is more than one layer of information encoding at work in DNA - the way in which DNA is folded in three dimensions also encodes information used by the cell. Inside every cell the DNA is tightly wound around clusters of eight proteins called histones, which provide a superstructure to support the roughly two-meter-long molecule (in humans). Each stretch of DNA wrapped around one such cluster is called a nucleosome. The question then becomes, how are the specific parts of the DNA molecule directly involved in what a given cell does kept accessible to the rest of the cellular machinery? Hypotheses to this effect have been going around since the 1980s, but only recently has computational simulation become feasible enough to put them to the test. As it turns out, the loops, twists, bends, curves, and folds that DNA undergoes around the histone octamers keep those functional regions exposed so that they can be acted upon. The simulations randomly pushed, pulled, prodded, and twisted virtual DNA strands to see what would happen, and the researchers noted that nucleosome positioning was in fact affected. Those simulation results were then verified through laboratory observation of two species of common yeasts. It was also confirmed that point mutations can influence the folding of DNA, which can change how often proteins are synthesized because the accessibility of those regions changes.
The (highly technical) paper (it gave me a headache on the first readthrough, okay?) is available in its entirety on PLOS ONE under a Creative Commons Attribution 4.0 International license.
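
To make the "DNA as a bitstream" idea a little more concrete: four nucleotides need only two bits each, so four bases fit in a byte. Here's a minimal sketch in Python. The particular base-to-bits mapping and the function names pack and unpack are my own assumptions for illustration; nothing in the paper prescribes them, and real sequence formats handle ambiguous bases that this ignores.

```python
# Two bits per base. This mapping is an arbitrary choice for illustration.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def pack(sequence: str) -> bytes:
    """Pack a DNA string into bytes, four bases per byte (zero-padded at the end)."""
    out = bytearray()
    for i in range(0, len(sequence), 4):
        chunk = sequence[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | BASE_TO_BITS[base]
        # A short final chunk is shifted left so its bases sit in the high bits.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BITS_TO_BASE[(byte >> shift) & 0b11])
    return "".join(bases[:length])


packed = pack("GATTACA")
print(packed.hex(), unpack(packed, 7))
```

The length has to be carried separately because the zero padding is indistinguishable from a run of A's. Note that this only captures the first layer of encoding; the folding layer the paper is about has no place in a flat bitstream like this.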

And now a bit more about my favorite genetic engineering gizmo, CRISPR. We already know that CRISPR can be used to make amazingly precise edits to DNA, effectively acting like a hex editor for the genetic code of living cells. What has just been confirmed by a research team at Harvard University is that bacteria use CRISPR/Cas proteins (Cas1 and Cas2, to name two of them) to snip out pieces of DNA, called oligomers, from attacking viruses and incorporate them into their own genomes to act as a memory of what they've gone up against in the past and how they can counteract the incursion. Because bacteria are, all things considered, relatively simple organisms (they consist of one cell only, after all), they pass on their entire library of genetic memory (I know, I know, that phrase has a lot of baggage attached to it and almost none of it accurate, but in this context I think it works decently well) every time they reproduce through binary fission. This is, incidentally, how antibiotic resistance became such a problem: All it takes is one or two survivors with a gene that ensured their survival to rebuild the population. Additionally, the incorporation of oligomers does not appear to be scattershot; quite the opposite, actually. Genetic sequencing showed that oligomers are added sequentially, in historical order. So, by analyzing a bacterium's DNA you can in a literal sense read the history of everything it's been through from start to finish (though not with any precision as humans measure time; the best you'd be able to get is "This happened after that"). The hypothesis was tested by exposing bacterial cultures to specific snippets of DNA, waiting for them to incorporate the oligomers, and sequencing their DNA. As predicted, the DNA sequences showed parts of the test DNA snippets they were exposed to, in chronological order.
This implies that genetically engineered bacteria could be deployed as sensors of a sort, recording chemical patterns in their DNA as legible sequences, then later sampled and analyzed to read the stored information. This also has implications for using DNA as an information storage medium.
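
If you squint, what's being described is an append-only log that gets copied on every cell division. Here's a toy model in Python of that idea. Everything in it is an assumption made for illustration: the RecorderCell class, the SEPARATOR string standing in for whatever sits between stored snippets, and the choice to append new snippets at the end of the list. It is not the research team's method, just a sketch of "ordered record, inherited by daughters, readable later."

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the sequence separating stored snippets.
# The toy reader below assumes no snippet contains it.
SEPARATOR = "GTTTTAGA"


@dataclass
class RecorderCell:
    """A toy bacterium that logs DNA snippets in the order it meets them."""
    snippets: list[str] = field(default_factory=list)

    def expose(self, oligomer: str) -> None:
        # Newer snippets go after older ones (an assumption of this model).
        self.snippets.append(oligomer)

    def divide(self) -> "RecorderCell":
        # Binary fission: the daughter gets a full copy of the record.
        return RecorderCell(list(self.snippets))

    def genome_region(self) -> str:
        # What a sequencer would see for the recording region.
        return SEPARATOR.join(self.snippets)


def read_history(region: str) -> list[str]:
    """Recover the ordered list of snippets from a sequenced region."""
    return region.split(SEPARATOR) if region else []


parent = RecorderCell()
for oligo in ("AAAC", "CCGT", "TTGA"):
    parent.expose(oligo)

daughter = parent.divide()
daughter.expose("GGCA")
print(read_history(daughter.genome_region()))
```

As with the real thing, the record only gives you ordering ("this happened after that"), not timestamps.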

A few months back I wrote about a cross-institution R&D project that develops a programming language called Cello, which makes it possible to code digital logic circuits into bacterial DNA that process information in a practical manner. As it turns out, they're not the only R&D team working on something like this. A little known fact: Information can be processed with things other than digital circuitry; prior to the invention of the transistor, analog electrical computing was used in early electrical computers, and for a good period of time after. For the curious, here's a brief textbook on the subject. A research team at MIT recently developed a technique which implements both digital and analog computing elements in living cells and uses the two paradigms together to carry out very complex operations. Analog measurement and comparison elements are used to take readings in the cell's environment, and the readings are acted upon in a decidedly digital fashion. If this-and-such is between foo and bar, one thing happens (such as the release of a hormone); if that-and-so is between bar and baz, a different hormone is released. You get the idea. The upshot of this is that analog mechanisms can be implemented with far fewer components than corresponding purely digital versions. One way to look at it is the difference between a strain gauge and a couple of hundred transistors. Or in this case, a few receptor sites on a cell's membrane that control the expression of a gene that synthesizes a recombinase, which in turn causes another DNA sequence to be snipped out, flipped end-for-end, and reinstalled, sort of like a switch. Presto: You now have a bit. Next step: Figure out how to use this technique to turn on and off more complex genes, such as those that synthesize insulin, glucagon, or one of the thyroid hormones, and regulate their levels automatically.
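
The two halves of that description can be sketched in a few lines of Python. The first function is the "analog reading, digital decision" part: a continuous measurement is sorted into bands. The second is the recombinase switch, modeled here as replacing a segment with its reverse complement, which is my assumption about what "flipped end-for-end" amounts to on double-stranded DNA. The names classify, flip, and the hormone labels are all hypothetical, and I've simplified to a single reading where the description above uses two different signals.

```python
from typing import Optional

# Translation table for complementing bases (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def classify(reading: float, foo: float, bar: float, baz: float) -> Optional[str]:
    """Turn a continuous reading into a discrete action, by band."""
    if foo <= reading < bar:
        return "hormone_a"   # hypothetical label
    if bar <= reading < baz:
        return "hormone_b"   # hypothetical label
    return None              # outside both bands: do nothing


def reverse_complement(segment: str) -> str:
    """Complement each base, then reverse the order."""
    return segment.translate(COMPLEMENT)[::-1]


def flip(dna: str, start: int, end: int) -> str:
    """Snip out dna[start:end], turn it end-for-end, and put it back."""
    return dna[:start] + reverse_complement(dna[start:end]) + dna[end:]


original = "AAACCGTTT"
flipped = flip(original, 3, 6)      # the bit is now "on"
restored = flip(flipped, 3, 6)      # flipping again turns it "off"
print(original, flipped, restored)
print(classify(4.2, foo=1.0, bar=5.0, baz=9.0))
```

Flipping twice returns the original sequence, which is what makes the segment usable as a one-bit switch: its orientation is the stored value.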