Calculating entropy with Python.

Sep 13 2020

Fun fact: There is more than one kind of entropy out there.

If you've been through high school chemistry or physics, you might have learned about thermodynamic entropy, which is (roughly speaking) the amount of disorder in a closed system.  Alternatively, and a little more precisely, thermodynamic entropy can be described as the tendency of heat in a volume of space to equalize throughout that volume.  But that's not the kind of entropy that I'm talking about.

Information theory has its own concept of entropy.  One way of explaining information theory is that it's the mathematical study of messages as they travel through a communications system (which you won't need to know anything about for the purposes of this article).  In the year 1948.ev Claude Shannon (the father of information theory) wrote a paper called A Mathematical Theory of Communication in which he proposed that the amount of raw information in a message could be thought of as the amount of uncertainty (or perhaps novelty) in a given volume of bits (a message) in a transmission.  So, Shannon entropy could be thought of as asking the question "How much meaningful information is present in this message?"  Flip a coin and there's only one bit - heads or tails, zero or one.  Look at a more complex message and it's not quite so simple.  However, let's consider a computational building block, if you will:

One bit has two states, zero or one, or 2^1 states.  Two bits have four possible states: 00, 01, 10, and 11, or 2^2 possible states.  n bits have 2^n possible states, which means that they can store up to n bits of information.  Now we bring in logarithms, which we can think of in this case as "what number foo would we need in 2^foo to represent the number of bits in a message?"
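
As a rough sketch of the arithmetic above, here is how the relationship between bits, states, and logarithms might look in Python. The function names (possible_states, bits_needed, shannon_entropy) are made up for illustration, and the entropy function uses the standard textbook Shannon formula (the sum over symbols of p * log2(1/p)), which is an assumption on my part rather than something spelled out in the text above.

```python
import math
from collections import Counter


def possible_states(n_bits):
    """Number of distinct values that n_bits bits can take: 2^n."""
    return 2 ** n_bits


def bits_needed(n_states):
    """The 'foo' in 2^foo = n_states, i.e. the base-2 logarithm."""
    return math.log2(n_states)


def shannon_entropy(message):
    """Average bits of information per symbol, estimated from symbol counts.

    Uses the standard formula: sum of p * log2(1/p) over each distinct symbol,
    where p is that symbol's relative frequency in the message.
    """
    counts = Counter(message)
    total = len(message)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


print(possible_states(1))        # 2
print(possible_states(2))        # 4
print(bits_needed(4))            # 2.0
print(shannon_entropy("HT"))     # 1.0 (a fair coin: one bit per flip)
print(shannon_entropy("HHHH"))   # 0.0 (no uncertainty at all)
```

The last two lines echo the coin example: a message that is evenly split between heads and tails carries one bit per symbol, while a message that never varies carries none.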

Neologism: Onboarding suppository

Aug 11 2018

onboarding suppository - noun complex - The massive volume of data that a new hire has to assimilate and comprehend before they can understand what they're supposed to be working on to any meaningful extent.

Memetic warfare in America.

Dec 04 2016

The current state of anyone's capacity to get any useful information in the United States these days, which is to say next to impossible due to the proliferation of fake news sites and pro-trolls doing their damndest to lower the signal-to-noise ratio to epsilon, is the logical end result of the following progression of cliches:

"You can't believe everything people tell you."

"You can't believe everything you read in books."

"You can't believe everything you see on TV."

"You can't believe everything your friends tell you."

"You can't believe everything your teachers tell you."

"You can't believe everything you read in magazines."

"You can't believe everything in your textbooks; they're written by people with agendas."

"You can't believe anything in newspapers."

"You can't believe everything you read on the Internet."

Genetic jiggery-pokery.

Jul 04 2016

It's long been known that DNA encodes information in a four-letter pattern which can be read and processed like any other bitstream. Four different nucleotides, paired two by two, arranged in one of two configurations side by side by side in a long string of letters, many times longer than the size of the cell containing the full DNA strand. Every cell in a given lifeform contains the same DNA sequence, regardless of what the cell actually does. So how, many have asked, does a cell know if it should help produce hair, or skin, or pigments, or something else? As it turns out, there is more than one layer of information encoding at work in DNA - the way in which DNA is folded in three dimensions also encodes information used by the cell. Inside of every cell the DNA is tightly wound around clusters of eight proteins called histones, which provide a superstructure to support the two-meter-long molecule. The question then becomes, how are the specific parts of the DNA molecule directly involved in what a given cell does, called nucleosomes, kept accessible to the rest of the cellular machinery? Hypotheses to this effect have been going around since the 1980s, but only recently has computational simulation been feasible enough to put them to the test. As it turns out, the loops, twists, bends, curves, and folds that DNA undergoes around the histone octamers keep those functional nucleosomes exposed so that they can be acted upon. The simulations randomly pushed, pulled, prodded, and twisted virtual DNA strands to see what would happen, and they noted that nucleosomal configurations were in fact impacted. Those simulation results were then verified through laboratory observation of two species of common yeasts. It was also confirmed that point mutations can influence the folding of DNA, which can result in changes in the frequency of synthesis of proteins due to changes in the accessibility of those nucleosomes.
The (highly technical) paper (it gave me a headache on the first readthrough, okay?) is available in its entirety on PLOS ONE under a Creative Commons Attribution 4.0 International license.