I'm still alive. No, I didn't party too much on my birthday. Just about all of last week consisted of twelve-hour days of nothing but meetings with several times the number of people I'm accustomed to handling simultaneously. Additionally, I was working on a music review for Vampire Step-Dad, which required a pair of studio-grade noise-cancelling headphones and listening to tracks repeatedly. I seem to have given myself a case of sensory overload, because now I feel numb all over... I also attended Pantheacon last weekend, which did a number on me. I realize that I could (and should) have holed up in my hotel room with a pair of earplugs in to recuperate, and there was no shortage of signs on Saturday morning that I should have done so. Signs, I hasten to add, that I disregarded in a perhaps inadvisable attempt to push my capabilities a bit farther than normal.
Minor repairs are required for parts of my exocortex as a result of pushing myself too far.
I have a timed post or two set to go up this week, but I'll be spending as much time as I can offline to recuperate.
"It implements new policy designed to deter illegal immigration and facilitate the detection, apprehension, detention, and removal of aliens who have no lawful authority to enter or remain in the United States."
"Additional agents are needed to ensure operational control of the border. Accordingly, the Commissioner of CBP shall immediately begin the process of hiring 5,000 additional Border Patrol agents and to take all actions necessary to ensure that such agents enter on duty and are assigned to appropriate duty stations as soon as practicable."
"Section 287(g) of the Immigration and Nationality Act authorizes me to enter into an agreement with a state or political subdivision thereof, for the purpose of authorizing qualified officers or employees of the state or subdivision to perform the functions of an immigration officer."
"... I am directing the Director of ICE to engage with all willing and qualified law enforcement jurisdictions for the purpose of entering into agreements under section 287(g) of the INA. Additionally, I am directing the Commissioner of CBP and the Director of ICE to immediately engage with the Governors of the States adjacent to the land border with Mexico and those States adjoining such border States for the purpose of entering into agreements under section 287(g) of the INA to authorize qualified members of the State National Guard, while such members are not in federal service, or qualified members of a state militia or state defense force under the command of the Governor, to perform the functions of an immigration officer in relation to the investigation, apprehension, and detention of aliens in the United States."
Has this memo been implemented yet? No.
Is it true that the White House did not draft such an order? No. Here it is, with appropriate citations. It exists and was timestamped late January of 2017, but it's not operational yet.
Let's say that you want to mirror a website chock full of data before it gets 451'd - say it's epadatadump.com. You've got a boatload of disk space free on your Linux box (maybe a terabyte or so) and a relatively stable network connection. How do you do it?
--continue - If you have to re-run the command, pick up where you left off (including the exact location in a file).
-e robots=off - Ignore robots.txt because it will be in your way otherwise. Many archive owners use this file to prevent web crawlers (and wget) from riffling through their data. Assuming this is sufficiently important, this is what you want to use.
--wait 30 - Wait 30 seconds between downloads.
--random-wait - Actually wait for 0.5 * (value of --wait) to 1.5 * (value of --wait) seconds in between requests to evade rate limiters.
http://epadatadump.com/ - The URL of the website or archive you're copying.
If the archive you're copying requires a username and password to get in, you'll want to add the --user=<your username> and --password=<your password> options to the above command line.
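If you'd rather drive this from a script, the options described above can be assembled into an argument list. This is only a rough sketch: build_wget_command and random_wait_bounds are names made up for illustration, and the --mirror flag is an assumption. It is a real wget option that turns on recursive downloading, but it is not one of the options described above, so swap in whatever recursion flags you actually use (or pass recurse_flag=None).

```python
import shlex


def build_wget_command(url, wait=30, user=None, password=None,
                       recurse_flag="--mirror"):
    """Assemble a wget argument list from the options described above."""
    cmd = ["wget"]
    if recurse_flag:
        # ASSUMPTION: a recursion flag such as --mirror. It is not among
        # the options described in the post.
        cmd.append(recurse_flag)
    cmd += [
        "--continue",         # pick up where a previous run left off
        "-e", "robots=off",   # ignore robots.txt
        "--wait", str(wait),  # base delay between downloads, in seconds
        "--random-wait",      # vary that delay so requests are less regular
    ]
    if user is not None:
        cmd.append(f"--user={user}")
    if password is not None:
        cmd.append(f"--password={password}")
    cmd.append(url)
    return cmd


def random_wait_bounds(wait):
    """Shortest and longest delay --random-wait picks for a given --wait."""
    return 0.5 * wait, 1.5 * wait


if __name__ == "__main__":
    command = build_wget_command("http://epadatadump.com/")
    print(shlex.join(command))
    print(random_wait_bounds(30))
```

The list can be handed straight to subprocess.run(command, check=True) if you want the script to launch wget itself. With --wait 30, the pause between requests works out to somewhere between 15 and 45 seconds.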
Happy mirroring. Make sure you have enough disk space.
As I've mentioned a few times in the past, diverse parts of my exocortex monitor many different aspects of the world. One of them, called Ironmonger, constantly data mines the global stock markets looking for anomalies. Ordinarily, Ironmonger only triggers when stock trading events greater than three standard deviations hit the market. On Monday, 6 Feb at 14:50:38 hours UTC-0800 (PST), Ironmonger did an acrobatic pirouette off the fucking handle. Massive trades of three different tech companies (Intel, Apple, and Facebook) hit the US stock market within the same thirty second period. By "massive," I mean that 3,271,714,562 shares of Apple, 3,271,696,623 shares of Intel, and 2,030,897,857 shares of Facebook all hit the market at the same time. The time_t datestamps of the transactions were 1486421438 (Intel), 1486421431 (Apple), and 1486421442 (Facebook) (I use time.is to convert them back into organic-readable time/date specifiers). I grabbed some screenshots from the Exocortex console at the time - check them out:
The tall blue slivers at the far right-hand edges of each graph represent the stock trades. I waited a couple of hours and took another set of screenshots (Intel, Apple, Facebook) because the graph had moved on a bit and the transaction spikes were much more visible. While my knowledge of the stock market is limited, I have to admit that I've never seen multi-billion share stock trades happen before. Out of curiosity, I took a look at the historical price per share of each of those stocks to see what those huge offers did to them. The answer, somewhat surprisingly, was "not much." Check out these extracts from Ironmonger's memory: Facebook, Intel, and Apple.
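For anyone who wants to check the time_t values quoted earlier without a web service, here is a small sketch using only Python's standard library. The timestamps are the ones given above; the helper names (to_pst, spread_seconds) are made up for illustration, and the fixed UTC-0800 offset is hard-coded rather than looked up from a timezone database.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-0800 offset, as used in the post (no DST handling).
PST = timezone(timedelta(hours=-8), "PST")

# time_t values quoted in the post.
TRADES = {
    "Intel": 1486421438,
    "Apple": 1486421431,
    "Facebook": 1486421442,
}


def to_pst(epoch_seconds):
    """Convert a Unix time_t value into an aware datetime at UTC-0800."""
    return datetime.fromtimestamp(epoch_seconds, tz=PST)


def spread_seconds(timestamps):
    """Seconds between the earliest and latest timestamp."""
    timestamps = list(timestamps)
    return max(timestamps) - min(timestamps)


if __name__ == "__main__":
    # Print the trades in the order they hit the market.
    for name, ts in sorted(TRADES.items(), key=lambda item: item[1]):
        print(f"{name:9s} {to_pst(ts):%Y-%m-%d %H:%M:%S %Z}")
    print("spread:", spread_seconds(TRADES.values()), "seconds")
```

The Intel timestamp comes out to 14:50:38 on 6 February 2017, matching the time given above, and all three trades fall within an eleven second window.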
Because I am a paranoid and curious sort, I immediately wondered if there was a correlation with the large spike in the Bitcoin transaction fee earlier that day (at 13:19:16 UTC-0800, to be precise). The answer is... probably not. A transaction fee of 2.35288902 BTC (approximately $2510.93us as of 22:32 hours UTC-0800 on 7 February 2017, as I write this article ahead of time) is a sizeable sum that would certainly guarantee that someone's transaction made it into a block at that very instant, but that does not mean that it was involved. There just isn't enough data, but it stands on its own as another anomaly that day. I wish I knew who put those huge blocks of stock up for sale all at once. The only thing they seem to have in common is that they're all listed on the Singularity Index, which is mildly noteworthy.
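As an aside, the "greater than three standard deviations" trigger mentioned at the top of this post can be pictured with a few lines of Python. This is a guess at the general technique, not Ironmonger's actual code: the function name is_anomalous is hypothetical, and the "ordinary" daily volumes below are invented purely so the example has something to chew on.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """True if `latest` lies more than `threshold` sample standard
    deviations away from the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to estimate a deviation
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold


if __name__ == "__main__":
    # INVENTED numbers: a made-up run of ordinary trading volumes.
    ordinary_volumes = [30_000_000, 32_000_000, 28_000_000,
                        31_000_000, 29_000_000]
    # The Apple figure quoted in the post.
    print(is_anomalous(ordinary_volumes, 3_271_714_562))
    # A value inside the usual range, for contrast.
    print(is_anomalous(ordinary_volumes, 31_500_000))
```

Against any plausible baseline, a multi-billion share event is so far past three standard deviations that the exact threshold hardly matters.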
A couple of weeks ago I ran into some of the functional limits of my web search bot, a bot that I wrote for my exocortex which accepts English-like commands ("Send me top 15 hits for HAL 9000 quotes.") and runs web searches in response using the Searx meta-search engine on the back end. This is to say that I gave my bot a broken command ("Send hits for HAL 9000 quotes.") and the parser got into a state where it couldn't cope, threw an exception, and crashed. To be fair, my command parser was very brittle and it was only a matter of time before I did something dumb and wrecked it. At the time I patched it with a bunch of if..then checks for truncated and incorrect commands, but if you look at all of the conditionals and ad-hoc error handling I probably made the situation worse, as well as much more difficult to maintain in the long run. Time for a rewrite.
I knew from comp.sci classes long ago that compilers use things called parsers and grammars to interpret code so that it can be converted into an executable. I also knew that the parser Infocom used in its interactive fiction was widely considered to be the best anyone had come up with in a long time, and it was efficient enough to run on humble microcomputers like the C-64 and the Apple II. For quite a few years I also ran and hacked on a MOO, which for the purposes of this post you can think of as a massive interactive fiction environment that the players can modify as well as play in; a MOO's command parser does pretty much the same thing as Infocom's IF parser but is responsive to the changes the users make to their environments. I also recalled something called a parse tree, which I sort-of-kind-of remembered from comp.sci but because I'd never actually done anything with them, I only recalled a low-res sketch. At least I had someplace to start from so I turned my rebooted web search bot loose with a couple of search terms and went through the results after work. I also spent some time chatting with a colleague whose knowledge of the linguistics of programming languages is significantly greater than mine and bouncing some ideas off of him (thanks, TQ!).
But how do I do something with all this random stuff?
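To make the idea concrete, here is a minimal sketch of what a small grammar for the search command could look like. It is not the bot's actual parser: it leans on a single regular expression from Python's standard library rather than a real parser generator, the names (SearchCommand, parse_command) are made up, and the default of 10 hits for a command that omits the count is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Optional

DEFAULT_COUNT = 10  # ASSUMPTION: fallback when "top N" is left out

# Grammar, roughly: send [me] [top N] hits for <terms>[.!?]
COMMAND = re.compile(
    r"""^\s*send
        (?:\s+me)?
        (?:\s+top\s+(?P<count>\d+))?
        \s+hits\s+for\s+
        (?P<terms>.+?)
        [.!?]?\s*$""",
    re.IGNORECASE | re.VERBOSE,
)


@dataclass
class SearchCommand:
    count: int
    terms: str


def parse_command(text: str) -> Optional[SearchCommand]:
    """Parse a search command, returning None instead of raising
    when the text doesn't fit the grammar."""
    match = COMMAND.match(text)
    if match is None:
        return None
    count = int(match.group("count") or DEFAULT_COUNT)
    return SearchCommand(count=count, terms=match.group("terms").strip())


if __name__ == "__main__":
    print(parse_command("Send me top 15 hits for HAL 9000 quotes."))
    print(parse_command("Send hits for HAL 9000 quotes."))
    print(parse_command("Send hits"))
```

The point of the exercise is that optional pieces of the command live in the grammar itself, so the truncated command from earlier parses with a default count and anything unrecognizable comes back as None rather than as an exception.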
The Magick Poke - noun - When you touch a failing appliance, light bulb, or other gizmo in just the right way as you're replacing it, and it spontaneously starts working again. This usually saves it from the trashcan or dumpster. Comes from the POKE command in Commodore BASIC which could let you do some pretty strange things by putting just the right value into just the right memory location, usually by fat-fingering a value.
UPDATE - 20170205 - Added Chrome plugin for the Internet Archive.
Note: This article is aimed at people all across the spectrum of levels of experience with computers. You might see a lot of stuff you already know; then again, you might learn one or two things that hadn't shown up on your radar yet. Be patient.
In George Orwell's novel 1984, one of the plot points was something called the Memory Hole. Memory holes were slots all over the building in which Winston Smith worked, into which documents that the Party considered seditious or merely inconvenient were deposited for incineration. Anything that the Ministry of Truth decided had to go because it posed a threat to the party line was destroyed. This meant that if anyone wanted to go back and double check to see what history might have been, the only things they could get hold of were "officially sanctioned" documents written to reflect the revised Party policy. Human memory's funny: If you don't have any static representation of something to refer back to periodically, eventually you come to think that whatever people have been telling you is the real deal, regardless of what you just lived through. No mind tricks are necessary, just repetition.
The Net's a lot like that. There are literally piles and piles of information everywhere you look, but most of it resides on systems that aren't yours. This blog is running on somebody else's server, and it wouldn't take much to wipe it off the face of the Net. All it would take is a DMCA takedown notice with no evidence (historically speaking, this is usually the case). This has happened in the past a number of times, including to an archive maintained by Project Gutenberg and documents explicitly placed into the public domain so somebody could try to make a buck off of them. This is a common enough thing that the IETF has made a standard HTTP error code to reflect it, Error 451 - Unavailable for legal reasons.
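If you keep a list of URLs you care about, a script can at least tell a legal takedown apart from an ordinary missing page. The sketch below uses the status code table that ships with Python (3.7 or later); the function names are made up for illustration, and it only classifies a status code you already have in hand rather than fetching anything.

```python
from http import HTTPStatus


def is_legal_takedown(status_code: int) -> bool:
    """True if the status code is HTTP 451."""
    return status_code == HTTPStatus.UNAVAILABLE_FOR_LEGAL_REASONS


def describe(status_code: int) -> str:
    """Human-readable label for a status code, if Python knows it."""
    try:
        status = HTTPStatus(status_code)
    except ValueError:
        return f"{status_code} (unrecognized)"
    return f"{status.value} {status.phrase}"


if __name__ == "__main__":
    for code in (200, 404, 451):
        print(describe(code), "takedown" if is_legal_takedown(code) else "")
```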