Another site migration

29 November 2020

It seems that I still can't leave well enough alone (as anyone who's known me for a while can attest). While on Thanksgiving break I found myself needing to tinker more once I'd gotten my other projects out of the way. So, I decided to do something about upgrading my website.

As much as I've enjoyed using Bolt to manage my site over the last couple of years, the v4 series is going in a direction that I'm not entirely sure I can work with. My knowledge of PHP is, to be honest, minimal at best, and one of the things on the Bolt development roadmap is to stop supporting flattened versions of their CMS - the specifics of this elude me, but basically it has to do with how it's deployed at Dreamhost and how the other content on my site is made available. I don't have any particular faith in my ability to figure out how to make it work, and I've been getting a little frustrated with the lack of responsiveness of PHP applications at Dreamhost for a while. So, I decided to raid the Awesome CMS list to look at some of my options for a new system.

After some reading (and taking a lot of notes) I eventually settled on using Pelican for the latest iteration of my site. It's written in Python (like a lot of stuff I like to use), has a bunch of plugins that seem handy, good documentation, and for the most part uses tools that I already use for managing and backing up other data. This means that I'll (eventually) be able to set up an automated workflow for posting articles.

Yeah, that's one thing I'm giving up by moving to a static site generator - I can't write a post online, post-date it, and have a background task automatically take it live when the post-date lines up with the actual date. I'll have to figure out how to do that soon, but I have some ideas. On the other hand, it didn't take much work to figure out how to configure file locations and suchlike, Pelican supports RSS and Atom feeds, and it lets you build your site locally for testing.
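One possible shape for that post-dating idea, as a rough sketch rather than anything that's actually running: a scheduled job could read the Date: line at the top of each Markdown file and only publish posts whose date has arrived. The function name is_due and the "YYYY-MM-DD HH:MM:SS" date format are assumptions made for illustration.

```python
from datetime import datetime


def is_due(post_text, now=None):
    """Return True if the post's Date: header is not in the future.

    Hypothetical helper: assumes dates look like "2020-11-29 12:00:00".
    """
    now = now or datetime.now()
    for line in post_text.splitlines():
        if not line.strip():
            break  # A blank line ends the metadata header.
        key, _, value = line.partition(":")
        if key.strip().lower() == "date":
            published = datetime.strptime(value.strip(), "%Y-%m-%d %H:%M:%S")
            return published <= now
    return False  # No Date: header found, so treat the post as not ready.


# Made-up sample post for demonstration.
sample = (
    "Title: Another site migration\n"
    "Date: 2020-11-29 12:00:00\n"
    "Status: draft\n"
    "\n"
    "Body text goes here."
)

print(is_due(sample, now=datetime(2020, 11, 30)))
print(is_due(sample, now=datetime(2020, 11, 28)))
```

A job like this would still need to rebuild and upload the site whenever something becomes due; that part is left out here.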

Bolt also has the advantage of letting you dump the contents of your site's database to a file for processing. This is what I did when migrating my site, so I didn't have to figure out how to hack a MySQL database dump; it's way easier to parse a JSON file:

drwho@server: $ app/nut database:export -f ~/ -vv
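Before carving up an export like this, it can help to peek at its overall shape. The sketch below assumes the export is a JSON object keyed by content type, with a list of records under each key (which is what the "pages" and "entries" keys used later in this post suggest). The summarize function and the sample data are made up for illustration.

```python
import json


def summarize(export):
    """Map each top-level content type to the number of records it holds."""
    return {content_type: len(records) for content_type, records in export.items()}


# A tiny made-up stand-in for the real export file.
sample_export = json.loads(
    '{"pages": [{"slug": "about"}], "entries": [{"slug": "a"}, {"slug": "b"}]}'
)

print(summarize(sample_export))
```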

Once in this form I was able to doodle some research code in my wiki (I'll explain in a later post) and figure out how to split that JSON document into separate Markdown files. If you've never used Markdown before but you habitually make to-do lists on paper, it's not that different. Additionally, there are so many different Markdown parsers out there that it would be fairly easy to turn it into other file formats. Pelican can use the Python Markdown module to do this. In case anyone's curious, here's what I did to split everything into separate files:

import json

# Open the exported JSON file and close it again once it's been parsed.
with open("", "rb") as infile:
    blog = json.load(infile)

pages = blog["pages"]
entries = blog["entries"]

blog_base = "/home/drwho/"
pages_base = blog_base + "pages/"

# Write the pages out into the correct directory.
for page in pages:
    print("Writing file: %s" % page["slug"])
    with open(pages_base + page["slug"] + ".md", "w") as outfile:
        print("Title: " + page["title"], file=outfile)
        print("Date: " + page["datepublish"], file=outfile)
        print("Status: " + page["status"], file=outfile)
        print("Slug: " + page["slug"], file=outfile)
        # Blank lines separate the metadata from the content.
        print("\n", file=outfile)
        if page["teaser"]:
            print(page["teaser"], file=outfile)
        if page["body"]:
            print(page["body"], file=outfile)

# Now the fun part: The blog posts.
for post in entries:
    print("Writing file: %s" % post["slug"])
    with open(blog_base + post["slug"] + ".md", "w") as outfile:
        print("Title: " + post["title"], file=outfile)
        print("Date: " + post["datepublish"], file=outfile)
        print("Status: " + post["status"], file=outfile)
        print("Slug: " + post["slug"], file=outfile)

        # Tags arrive as an array of hashes; join their names with commas.
        if post["tags"]:
            tags = ", ".join(tag["name"] for tag in post["tags"])
            print("Tags: " + tags, file=outfile)

        # Blank lines separate the metadata from the content.
        print("\n", file=outfile)
        if post["teaser"]:
            print(post["teaser"], file=outfile)
            print("\n", file=outfile)
        if post["body"]:
            print(post["body"], file=outfile)
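To see what one of the generated files looks like without touching the disk, here's a small variant that builds the text for a single post in memory. It uses the same field names as the script above, but render_post and the sample record are invented for this example, and the blank-line spacing is slightly tidier than what the original script writes.

```python
def render_post(post):
    """Build the metadata header and content for one post as a string."""
    lines = [
        "Title: " + post["title"],
        "Date: " + post["datepublish"],
        "Status: " + post["status"],
        "Slug: " + post["slug"],
    ]
    if post.get("tags"):
        lines.append("Tags: " + ", ".join(tag["name"] for tag in post["tags"]))
    lines.append("")  # A blank line separates the metadata from the content.
    for field in ("teaser", "body"):
        if post.get(field):
            lines.append(post[field])
            lines.append("")
    return "\n".join(lines)


# Made-up record shaped like the entries in the export.
sample_post = {
    "title": "Hello world",
    "datepublish": "2020-11-29 12:00:00",
    "status": "published",
    "slug": "hello-world",
    "tags": [{"name": "meta"}, {"name": "pelican"}],
    "teaser": "A short teaser.",
    "body": "The body of the post.",
}

print(render_post(sample_post))
```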

Unfortunately, there is bound to be some cruft in the migrated content (as much as I dislike using that word) but at least I can make creative use of GSCA to fix a great deal of it. More such inconsistencies will be fixed (and dumped HTML will be converted into Markdown) as time goes on.
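For the HTML-to-Markdown part, one option (a sketch only, not the tooling mentioned above) is to lean on Python's built-in html.parser module. The TinyMarkdown class and html_to_markdown function here are hypothetical names, and they only handle emphasis, bold, links, and paragraphs; any other tag is dropped while its text is kept.

```python
from html.parser import HTMLParser


class TinyMarkdown(HTMLParser):
    """Convert a handful of HTML tags to Markdown; ignore the rest."""

    MARKS = {"em": "*", "i": "*", "strong": "**", "b": "**"}

    def __init__(self):
        super().__init__()
        self.parts = []  # Output fragments, joined at the end.
        self.hrefs = []  # Stack of link targets for open <a> tags.

    def handle_starttag(self, tag, attrs):
        if tag in self.MARKS:
            self.parts.append(self.MARKS[tag])
        elif tag == "a":
            self.hrefs.append(dict(attrs).get("href") or "")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in self.MARKS:
            self.parts.append(self.MARKS[tag])
        elif tag == "a" and self.hrefs:
            self.parts.append("](" + self.hrefs.pop() + ")")
        elif tag == "p":
            self.parts.append("\n\n")  # Paragraphs become blank-line breaks.

    def handle_data(self, data):
        self.parts.append(data)


def html_to_markdown(html):
    parser = TinyMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


print(html_to_markdown(
    '<p>Some <em>old</em> post with a <a href="https://example.com/">link</a>.</p>'
))
```

Real posts will have lists, images, and nested markup that this doesn't cover, so it's a starting point at best.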

If the overall appearance of my new site looks familiar, it's because someone ported the HTML5up Striped theme to Pelican as well as Bolt in the past. I'm already familiar with how to tinker with this one in particular, which is why I decided to go with it. Maybe it'll change in the future; I haven't decided yet.

As for source code and command line formatting, I stumbled across something called prism.js late last night. Like the Bootstrap framework, you can pick and choose the parts you want and download a customized version. After that it was a simple matter to add the prism.css and prism.js files to my fork of the theme. Much to my surprise, it just worked without any messing around. But I'm getting off topic.

If you're reading this, then my workflow as it stands now is working. Nifty.

Per usual, if you notice anything broken or weird, please contact me through the usual channels and I'll see to it.