Frankenstein's ___.sh 2 - Blog

2014/08/28

Frankenstein’s ___.sh 2

Or why build a static website engine in 84 lines of bash.

Before

The previous version of the engine “powering” this website was nnmc, written in php, running on Apache 2.x and making use of Apache’s mod_php and mod_rewrite. The initial setup needed to make it run properly on my local OSX machine using Apple-provided software was a little annoying (edit /etc/httpd.conf to activate mod_php and mod_rewrite) but usage was straightforward after that: start Apache by activating the “web sharing” in the Settings.app, edit the files in a content folder and go to http://localhost/~user/nnmc/ to see the changes.

The end of the big cats

When I upgraded to Mountain Lion, two years ago, Apple changed the default Apache setup and removed the option to start “Web Sharing” from the user interface. I had to figure that out when I wanted to preview a post for the first time after the install.

Last year, Apple released the new version of OSX, Mavericks. As with the previous release, installing it broke my local Apache & nnmc setup (I think it overwrote my httpd.conf). This time I didn’t feel like fiddling with the bloody setup again, only to have to go through it all over again for the next release.

Recently

For a while now, I had also wanted to turn www.niconomicon.net into a statically generated website. Part of the reason was that “baked” sites were a fun fad, like the “fixie” bicycle, but also that statically generated sites are fundamentally simpler to deal with. They are also easier to optimize (not that nnmc has any problem handling the stray traffic it gets).

nnmc was already simple. I had removed the edit capabilities, and all the pages were based on directory listings or markdown files. However, there was still the need to deal with newer versions of PHP, locally or on the host, changing versions or names of Apache modules, etc. It was also my first “real” php project¹, as well as a heavy modification of a (now abandoned) wiki. I hadn’t updated it in years even though I kept meaning to add an RSS feed for the blog. So, past time for a full replacement.

Jekyll and Hyde

Static website generators, like Jekyll, Hyde ², Pelican or Octopress are conceptually simple. You install them on your local computer and set up a content directory of plain text files that you want to publish. Their scripts use these files to create a bunch of static HTML files, which you then upload to your webhost.

Plain text files are easy to back up, easy to edit, very cross platform and editor agnostic. Static files are the simplest thing to serve, so there is no security issue, no weird caching problems, no weird crashes because someone fed your site a strange url or your page imports the wrong module (and no need to keep up with Wordpress security advisories).

One of the costs is that it is usually very complex to update your site when you’re away from your computer³, unless you use something like github pages ⁴. If you want to see what your new content looks like, there needs to be a local webserver, and when the content changes the site needs to be re-generated, but most packages have an embedded script that deals with that.

___.sh

At the beginning, the site for my iPhone app, Displayator.com, was just 4 HTML pages. I had already decided that I didn’t need the full power of Jekyll or Hyde, mostly because I didn’t want to deal with Python and its eggs or Ruby and its gems (and I definitely did not want to have to keep a Wordpress install up to date). But I didn’t want to write full HTML pages by hand either, so I wrote a few scripts that just cat-ed and multimarkdown-ed some files together. And it worked fine, until I had some news to announce.

As using simple bash scripts worked, I decided to spend an evening or two to see if I could generate a full blog with just a simple script. After a few evenings, ___.sh 0.1 was created and powered the Displayator blog. I tweaked it a bit until I had a 1.0, and later added an Atom feed generator.

What’s in a name

After reading about the static site generators Jekyll and Hyde, two names out of the 1800’s fantastic literature, I remembered the famous novel, “Frankenstein, or the Modern Prometheus”.

At the beginning of the story (chapters 4 & 5), the scientist Frankenstein obtains dead body parts and joins them together to assemble his creature, before infusing it with life. He is then horrified by (at first) the appearance of his creation ⁵.

There is an easy parallel with ___.sh, which I created by joining together bits of 70’s, 80’s and 00’s tech, in a way that might appear unsightly, but turns out to be quite effective (it also uses macros, assembling commands before ~~bringing them to life~~ executing them)⁶. Further parallels can also be drawn between the story and the way the script itself assembles a bunch of ~~dead~~ static files and brings them together into a (potentially) live website, much like the crazed scientist does to the bits he gathered.

I looked for the name of the creature and found out it was unnamed, variously referred to as “it”, “the creature” or “the monster”. I thus decided to name my script ___.sh. I tried to name the whole project ___ but it turns out Google won’t find you if you do, so the project was named “Frankenstein’s”.

___.sh 2.0

When I was ready to replace nnmc, I realized that ___.sh was only 42 lines and already had a blog with a feed. And surely, the blog was the most complex part of the website. I just needed to add a few more tricks⁷ to ___.sh and I’d be set with a fully home-grown statically generated website!

So I did! It doubled the line count⁸ and took a few more evenings than anticipated, but it is now replacing ten times as much PHP⁹ “code”. It fulfills the same function but with what feels like less hassle, and it definitely has better¹⁰ separation between code and HTML¹¹. As I also wanted the blog feed to work nicely with Flipboard, I changed it from Atom to RSS. Another change from 1.0, made to simplify the code, was to use absolute URLs in the links, both for generated ones and for the CSS import.

The only functionality of nnmc that I missed was the next/previous navigation in the blog, but I added that with a bit of vanilla JavaScript after I released 2.0. I also miss the clean URLs provided by mod_rewrite a little, but so far not enough to try to deal with it again¹². To avoid losing my web traffic, I added redirect directives in the .htaccess to point the old clean URLs to the new dirty ones.

An anatomy

There are 4+1 types of content on my site: the blog posts, the notes, the projects, the free pages + the homepage:

the blog is the usual list of articles organized in reverse chronological order.
notes are just a set of articles sorted by folder and name.
projects are a set of short and long descriptions arranged thematically.
free pages are unsorted articles.
the homepage is a combination of a free page and the latest blog post (plus the titles of the last few posts).

All these different types of content are written in markdown and stored in individual files.

___.sh transforms each markdown file into a set of HTML files that appropriately link to each other to form a website, without using loops.

The basic idea is to apply multiple kinds of transformation to each file and directory according to the desired output. Most transformations involve multimarkdown and various bits of HTML which get concatenated together to form a web page. The common HTML bits are stored in _[part].html files that you have to customize to get a pretty output¹³ (There is also _feed-top.xml for the RSS feed).
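
A rough sketch of that concatenation, under stated assumptions: the part names _top.html and _bottom.html and the build_page helper are hypothetical (they only follow the _[part].html naming, they are not taken from ___.sh), and the converter command is a parameter so that the sketch can run without multimarkdown installed.

```shell
#!/usr/bin/env bash
# Sketch: build one page by concatenating a shared top part, the converted
# markdown, and a shared bottom part.
# Assumptions: _top.html / _bottom.html are hypothetical part names, and
# build_page is an illustrative helper, not a function from ___.sh.

# usage: build_page <content_dir> <markdown_file> [converter]
# The converter defaults to multimarkdown, which reads a markdown file and
# writes HTML to stdout.
build_page() {
  local dir=$1 md=$2 convert=${3:-multimarkdown}
  cat "$dir/_top.html"
  "$convert" "$md"
  cat "$dir/_bottom.html"
}

# Demo in a throwaway directory; 'cat' stands in for the converter here.
page_dir=$(mktemp -d)
echo "<html><body>"   > "$page_dir/_top.html"
echo "</body></html>" > "$page_dir/_bottom.html"
echo "Hello"          > "$page_dir/hello.md"

build_page "$page_dir" "$page_dir/hello.md" cat > "$page_dir/hello.html"
cat "$page_dir/hello.html"
```

With multimarkdown on the path, dropping the third argument would run the real conversion instead of the cat stand-in.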

To avoid (explicit) ‘for’ loops, it uses pipes (|) and sed to transform the output of ls or find into a set of commands that is then evaluated by bash. Each of the content categories requires variations, but the principle is the same:

[list files] | [create a set of commands] | bash
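
As a minimal sketch of that pattern (not the actual ___.sh code): the snippet below lists the markdown files in a hypothetical posts directory, uses sed to rewrite each file name into a shell command, and pipes the resulting commands into bash. The directory names, the gen_commands helper and the use of cp as a stand-in for the markdown conversion are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch of the "[list files] | [create a set of commands] | bash" pattern.
# Assumptions: hypothetical posts/ and out/ directories, and 'cp' standing in
# for the real markdown conversion so the example has no dependencies.

# Read file names on stdin and print one shell command per name:
# "x.md" becomes "cp posts/x.md out/x.html".
gen_commands() {
  sed -E 's|^(.*)\.md$|cp posts/\1.md out/\1.html|'
}

# Set up a throwaway directory with two sample posts.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/posts" "$demo_dir/out"
echo "# First"  > "$demo_dir/posts/first.md"
echo "# Second" > "$demo_dir/posts/second.md"

# list files | create a set of commands | bash
(cd "$demo_dir" && ls posts | gen_commands | bash)

ls "$demo_dir/out"
```

Removing the final `| bash` prints the generated commands instead of running them, which is a handy way to debug this kind of pipeline. Note that this sketch does no quoting, so file names containing spaces or shell metacharacters would break it.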

I’ll explain the details in a series of posts over the coming months. If you want to read them as soon as they’re published, subscribe to the brand new RSS feed, or follow me on Twitter or App.net.

Usage

If you want to test “Frankenstein’s ___.sh”, installing and using it requires familiarity with the command line, but should be otherwise straightforward. Usage is better described in the README but here is the “tl;dr” overview.

The non-OS prerequisites are git and multimarkdown somewhere on your path. The OS prerequisites are the usual shell environment commands (bash, cat, cut, echo, find, grep, head, ls, sort, uniq) and, if you want to use the blog feed generation, the BSD version of date (running date -j -f "%Y/%m/%d %H%M" "2014/08/27 0633" "+%a, %d %b %Y %H:%M:%S %z" should work), which means you have to run a *BSD or Mac OS X.
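
One way to check these prerequisites before going further is a short script like the one below. It is only a sketch: the missing_commands and has_bsd_date helper names are made up for illustration, and the command list is copied from the paragraph above.

```shell
#!/usr/bin/env bash
# Sketch of a prerequisite check. The helper names are illustrative
# assumptions; the command list comes from the prerequisites above.

# Print the name of every given command that is not found on the PATH.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" > /dev/null 2>&1 || echo "$cmd"
  done
}

# Succeed only if 'date' accepts the BSD-style -j -f options.
has_bsd_date() {
  date -j -f "%Y/%m/%d %H%M" "2014/08/27 0633" \
    "+%a, %d %b %Y %H:%M:%S %z" > /dev/null 2>&1
}

missing=$(missing_commands git multimarkdown bash cat cut echo find grep \
  head ls sort uniq)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose, to print the names on one line.
  echo "missing commands:" $missing
else
  echo "all commands found"
fi

if has_bsd_date; then
  echo "BSD date found: feed generation should work"
else
  echo "no BSD date: feed generation will not work"
fi
```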

If you’ve cleared the above hurdles, open a Terminal or console and follow these instructions:

1. Check out the “Frankenstein’s” project on your computer:

   git clone https://github.com/nicolasH/frankensteins.git

2. Go into the freshly checked out directory and run ___.sh init to create the empty subdirectories and copy all the default page components into the content directory.
3. Execute ___.sh new "Some Title" to generate a new, mostly empty, blog post to fill. Fill it, with Lorem Ipsum if you must.
4. In another terminal, start a webserver in the content directory:

   python -m SimpleHTTPServer 9001

   or

   ruby -rwebrick -e'WEBrick::HTTPServer.new(:Port => 9001, :DocumentRoot => Dir.pwd).start'

5. In the first terminal, run ___.sh gen¹⁴. This will generate the site. There will be error messages because the only content is a single blog post, but at least the index and blog pages will work.
6. Navigate in a web browser to http://localhost:9001.
7. Recoil horrified at the wretchedness you have brought forth upon this earth.
8. With all due haste, edit the _*.html, _*.xml and _*.css files in the content directory to remove the scaffolding and add something more to your taste (you might want to create pages/home.md to liven up the home page, see README).
9. GOTO 5 until 7 does not apply anymore.

Once you are satisfied with the results, you just need to upload the contents of the content directory to your site (an exercise left to the reader). Happy tweaking!

Once again, feel free to subscribe to the brand new RSS feed, or follow me on Twitter or App.net if you want to be informed of new posts, not all of which will be about ___.sh.

1. I didn't have much experience with php best practices.
2. Hyde seems to be dead, but Pelican seems to be alive and well.
3. An option is to have an always-on computer that monitors a Dropbox folder and automatically runs a script that regenerates and republishes the site when the folder changes.
4. Github pages hosts your Jekyll site off a Github git repo. Anytime you push to the repository, Github regenerates your Jekyll site. You can edit your repository from the Github web interface, so you can update your site from any connected computer.
5. This is but half a chapter of the book.
6. Hopefully I won't be hunted down by this particular creation of mine.
7. A better home page, support for notes, projects listing and unclassified pages.
8. From 42 to 84. 36 lines are printouts, empty or comments. I'll have to see if I can shrink the total further. There are also 154 lines in template files that get cat-ed by ___.sh.
9. 948 at last count. The php files contain all the html, strewn all over the place (it was my first “big” php project after all).
10. There is a little bit of html & xml in ___.sh.
11. Readability is about the same though (it is a bash script using sed to write bash scripts after all), but maintainability of the html part is much better, and all the logic is in one file.
12. I wrote a bunch of redirect directives from the clean to the .html urls.
13. Most proper static website generators use one or more templating engines, but with the goal of only one dependency (multimarkdown), Frankenstein's only uses raw bits of html.
14. For information, generating the niconomicon.net website with ___.sh gen, with 70+ pages, takes 1.7s on my MacBook Air.
