GSCA - acronym, verb - Using grep, sed, cut, and awk on a Linux or UNIX box to chop up, mangle, or otherwise process data on the command line prior to doing anything serious with it. This is not to preclude the use of additional tools (such as sort).
If you've been squirreling away information for any length of time, chances are you tried to keep it all organized for a certain period of time and then gave up the effort when the volume reached a certain point. Everybody has therir limit to how hard they'll struggle to keep things organized, and past that point there are really only two options: Give up, or bring in help. And by 'help' I mean a search engine of some kind that indexes all of your stuff and makes it searchable so you can find what you need. The idea is, let the software do the work while the user just runs queries against its database to find the documents on demand. Practically every search engine parses HTML to get at the content but there are others that can read PDF files, Microsoft Word documents, spreadsheets, plain text, and occasionally even RSS or ATOM feeds. Since I started offloading some file downloading duties to yet another bot my ability to rename files sanely has... let's be honest... it's been gone for years. Generally speaking, if I need something I have to search for it or it's just not getting done. So here's how I fill that particular niche in my software ecosystem.
Some time ago I wrote an article of suggestions for archiving web content offline, at the very least to have local copies in the event that connectivity was unavailable. I also expressed some frustration that there didn't seem to be any workable options for the Chromium web browser because I'd been having trouble getting the viable options working. After my attempt at fixing up Firefox fell far short of my goal (it worked for all of a day, if that) I realized that I needed to come up with something that would let me do what I needed to do. I installed Chromium on Windbringer (I'm not a fan of Chrome because Google puts a great deal of tracking and monitoring crap into the browser and I'm not okay with that) and set to work. Here's how I did it:
First I spent some time configuring Chromium with my usual preferences. That always takes a while, and involved importing my bookmarks from Firefox, an automated process that took several hours to run. I also exported everything I had cached in Scrapbook, which wound up taking all night. I then installed the SingleFile Core plugin for Chrome/Chromium, which does the actual work of turning web pages open in browser tabs into a cacheable single file. I restarted Chromium, which I probably didn't need to do but I really wanted a working solution so I opted for caution and then installed PageArchiver from the Chrome store and restarted Chromium again. This added the little "open file folder" icon to the Chromium menu bar. The order the add-ons are installed in seems to matter, add SingleFile Core first if you do nothing else.
Now get ready for me to feel stupid: If you want to store something using PageArchiver, click on the file folder icon to open the PageArchiver pop-up, click "Tabs" to show a list of tabs you have open in Chromium/Chrome, click the checkboxes for the ones you want to save, and then hit the save button. For systems like Windbringer which have extremely high resolution screens, that save button may not be visible. You can, however, scroll both horizontally and vertically in the PageArchiver pop-up panel to expose that button. I didn't realize that before so I never found that button. That's all it took.
Here's what didn't work:
I can't import my Scrapbook archives because they're sitting in a folder on Windbringer's desktop as a couple of thousand separate subdirectories, each of them containing all of the web content for a single web page. I need to figure out what to do there. It may consist of writing a utility that turns directories full of HTML into SQL commands to inject them into PageArchiver's SQLite database which, by default, resides in the directory $HOME/.config/chromium/Default/databases/chrome-extension_ihkkeoeinpbomhnpkmmkpggkaefincbn_0 (the directory name is constant; the jumble of letters at the end is the same as the one in the Chrome Store URL) and has the filename 2 (yes, just the number 2). You can open it up with the SQLite browser of you choice if you wish and go poking around. Somebody may have come up with a technique for it and I just haven't found it yet, I don't know. I may not be able to add them in any reasonable way at all and have to resort to running an ad-hoc local web server with Python or something if I want to access them, like this:
[drwho@windbringer ~]$ python2 -m SimpleHTTPServer 8000
I've been promising myself that I'd do a series of articles about tools that I've incorporated into my exocortex over the years, and now's as good a time as any to start. Rather than jump right into the crunchy stuff I thought I'd start with something that's fairly simple to use, straightforward, and endlessly useful for many purposes - a wiki.
Usually, when somebody brings up the topic of wikis one either immediately thinks of Wikipedia or one of the godsawful corporate wikis that one might be forced to use on a daily basis. And you're not that off the mark, because ultimately they're websites that let one or more people create, modify, and delete articles about just about anything one might be inclined to by using only a web browser. Usually you need to set up or be given an account to log into them because wiki spam is to this day a horrendous problem to fight (I've had to do it as parts of previous jobs, and I wouldn't wish it on my worst enemy). If you've been around a while, when you think of having a wiki you might think of setting up something like WikiWikiWeb or Mediawiki, which also means setting up a server, a database, web server software, the wiki software, configuring everything... and unless you have a big, important project that necessitates it, it's kind of overkill and you go right back to a text file on your desktop. And I don't blame you.
There are other options out there that require much less in the way of overhead that are also nicer than the ubiquitous notes.txt file. For the past couple of years (since 2012.ev at least) I've been using a personal wiki called Tiddlywiki for most of my projects which requires just a fairly modern web browser (if you're using Internet Explorer you need to be running IE 10 or later) and some room on your desktop for another file.