Some time ago, I found myself using a Kryoflux interface and a couple of old floppy drives that had been kicking around in my workshop for a while to rip disk images of a colleague's floppy disk collection. It took me a day or two of screwing around to figure out how to use the Kryoflux's software to make it do what I wanted. Of course, I took notes along the way so that I would have something to refer back to later. Recently, I decided that it would probably be helpful to people if I put those notes online for everyone to use. So, here they are.
UPDATE: 20170131 - The Eventbrite page for this event has gone live! Sign up!
I haven't had time to write about #datarefuge yet, in part because people a lot closer to the matter have been doing so, and much better than I could at the moment. An entire movement has arisen around scientific data being 451'd because it's politically inconvenient, and not many of us know if it's being erased or just taken offline. We also don't know for certain if it's being copied elsewhere for safekeeping, so we're doing it ourselves. To do my part, I've been communicating with some of the organizers and having Leandra suck down data as fast as my home link will permit to store it on her RAID array. But, the important thing:
On 11 February 2017, the Datarescue SF Bay event will be held at the Berkeley Institute for Data Science from 0900 PST until 1500 PST. That day, everybody at the event will identify data sets at risk of vanishing, work out how best to mirror them, and download them as fast as possible so they can be archived elsewhere. Bring your drives, bring your boxen, and get ready to burn up bandwidth.
UPDATE - 20170302 - Added Firefox plugin for the Internet Archive.
UPDATE - 20170205 - Added Chrome plugin for the Internet Archive.
Note: This article is aimed at people all across the spectrum of experience with computers. You might see a lot of stuff you already know; then again, you might learn one or two things that hadn't shown up on your radar yet. Be patient.
In George Orwell's novel 1984, one of the plot points was something called the Memory Hole. Memory holes were slots all over the building in which Winston Smith worked, into which documents that the Party considered seditious or merely inconvenient were deposited for incineration. Anything that the Ministry of Truth decided had to go because it posed a threat to the party line was destroyed. This meant that if anyone wanted to go back and double check to see what history might have been, the only things they could get hold of were "officially sanctioned" documents written to reflect the revised Party policy. Human memory's funny: If you don't have any static representation of something to refer back to periodically, eventually you come to think that whatever people have been telling you is the real deal, regardless of what you just lived through. No mind tricks are necessary, just repetition.
The Net's a lot like that. There are piles and piles of information everywhere you look, but most of it resides on systems that aren't yours. This blog is running on somebody else's server, and it wouldn't take much to wipe it off the face of the Net. All it would take is a DMCA takedown notice, even one with no evidence behind it (historically speaking, this is usually the case). This has happened a number of times in the past, including to an archive maintained by Project Gutenberg and to documents explicitly placed into the public domain, all so that somebody could try to make a buck off of them. This is a common enough thing that the IETF has standardized an HTTP status code for it (RFC 7725): 451 Unavailable For Legal Reasons.
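If you want to see what a 451 looks like from the client side, here is a minimal sketch in Python that uses only the standard library. The function names (describe_status, check_url) and the four labels are my own inventions for illustration, not part of any standard, and the check assumes the server answers HEAD requests honestly. It needs Python 3.8 or later, which is when the 451 constant was added to the http module.

```python
from http import HTTPStatus
import urllib.error
import urllib.request


def describe_status(code: int) -> str:
    """Map an HTTP status code to a rough label (the labels are hypothetical)."""
    if code == HTTPStatus.UNAVAILABLE_FOR_LEGAL_REASONS:  # 451
        return "blocked for legal reasons"
    if code in (HTTPStatus.NOT_FOUND, HTTPStatus.GONE):  # 404, 410
        return "missing"
    if 200 <= code < 300:
        return "available"
    return "other"


def check_url(url: str, timeout: float = 10.0) -> str:
    """Send a HEAD request to url and label the response status."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return describe_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx and 5xx responses, including 451.
        return describe_status(err.code)
```

Calling check_url() on a page you care about tells you whether it is still up, gone, or explicitly blocked. Keep in mind that plenty of takedowns just return a 404 or an empty page, so a missing 451 doesn't prove anything.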