Happy "Oh, gods, I have to go back to work?!" day, everyone.

31 January 2007

Wait a minute... ex-president Gerald Ford died?!

Lyssa pointed me at an article that brought up something that had never occurred to me - how libraries manage the limited amount of space they have for all of their materials. That is to say, they keep track of how often each book is checked out (much easier to do since card catalogues and patron records went digital in the mid-1990s) and if a book isn't touched for longer than a certain time, they either throw it out (dumpster diving at the local library is how I got most of my books when I was a kid) or put it up for sale during the yearly fundraisers. At the very least, there is a chance that someone will buy a copy and keep it, which keeps the information inside the text available in some fashion, but a lot of books get thrown in the trash and are lost. When you think about how rare (or how expensive) some books are, this is a painful waste.

I think this would be an ideal place for e-books to fit into the informational ecosystem - in the same physical space taken up by a hardback textbook, you can fit about a dozen DVD-ROM disks containing several thousand texts in .pdf, .chm, or .html format apiece. The problem there, of course, lies in converting the text into electronic format. OCR (optical character recognition) software is good, but far from perfect. Also, scanning books can be a long and involved process. The easiest thing to do, by far, is to scan every pair of facing pages into .tiff or .jpg files, cut them in half with a graphics programme, and then stitch them into a .pdf file. This has its drawbacks (such as lack of searchability, and the fact that most people don't take the time to develop a table of contents for them), but it does preserve the information inside the text.
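For what it's worth, the weeding rule from the top of this entry (track the last checkout, flag anything idle for too long) fits in a few lines of Python. This is only a rough sketch: the function name, the five-year cutoff, and the sample titles are all made up by me for illustration, and I have no idea how any real library system actually stores this.

```python
from datetime import date, timedelta

# Hypothetical cutoff; a real library would pick its own number.
IDLE_LIMIT = timedelta(days=5 * 365)


def weeding_candidates(last_checkout, today, idle_limit=IDLE_LIMIT):
    """Return titles whose most recent checkout is older than idle_limit.

    last_checkout maps a title to the date it was last checked out,
    or to None if it has never circulated at all.
    """
    candidates = []
    for title, last_seen in last_checkout.items():
        # A book that has never been checked out counts as idle.
        if last_seen is None or today - last_seen > idle_limit:
            candidates.append(title)
    return sorted(candidates)


# Made-up sample records, purely for illustration.
SAMPLE = {
    "Obscure Monograph": date(1998, 3, 14),
    "Popular Novel": date(2006, 11, 2),
    "Never Borrowed": None,
}

if __name__ == "__main__":
    print(weeding_candidates(SAMPLE, today=date(2007, 1, 31)))
```

The depressing part is how little it takes: one date comparison per book, and anything on the resulting list is headed for the sale table or the dumpster.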

Then there's the whole matter of copyright. Fair use says that you can scan a book into an electronic document for personal use only, but you can't legally redistribute it, and libraries potentially do just that (just look at the bank of photocopiers in your average library for more information).

There isn't a fast or easy way to preserve texts that are being thrown out and possibly lost for all time. Painful to admit, but true. You also can't rely upon the goodness of most people to help out if one happened to say, "Hey, scan in that book to preserve it before you throw it out," because most people will not have the time or the energy after work, or simply won't care enough, to OCR a book before they get rid of it. It's simply not feasible.

It seems that the brains and brawn of Los Alamos National Labs are getting fed up with the random drug and polygraph testing policies going into action, and they're not going to take it. It's been proven in labs several times over the years (since their inception, actually - the FBI has released files on the research) that polygraphs aren't a very good way to figure out what kind of person someone is (which is part of the purpose of taking a poly for a clearance), and the prospect of being roped into one at any time for no good reason doesn't really make for a good work environment. This says nothing of how it can derail your train of thought when you're on a roll in the lab...

Note to self: Be very, very careful when using the --newuse option of emerge. This is going to take a while...

And, at this rate, migrating to a real weblogging system is going to kill me. Does anyone know of a good package that won't force me to rip out my entire website and re-enter everything (over five years of work)?

The UK is having kittens over the new traveller information sharing directives that the United States demanded. The way the laws were written, if someone wants to fly to the US, their credit card information and e-mail traffic, if available, have to be turned over for inspection and archival. Failure to turn over this information can cause a traveller to be refused entry to the United States. It remains to be seen how they're going to send the US the e-mail traffic that goes to and from the e-mail address you gave to your travel agency, but considering that the European Union has passed laws that require ISPs to record all network traffic to and from every IP address they issue, this might prove easier than expected.

The obvious way around this is to use a webmail address only for buying things, thus separating the information that is necessary for a company's bookkeeping from your personal life.

To quote Bugs Bunny, "Did you ever get the feeling that you wuz bein' watched?"

The Month of Apple Bugs has begun.

Gee... ya think he's a browncoat?