Sigurdhsson

I: The Great Reset

A new way of living in the blogosphere

The internet is evolving. New technology enables us to do things we never thought possible only five or ten years ago. But as it evolves, our old habits and methods are pushed to the limit. Poor caching principles that bypass the built-in behaviour of the web servers make bandwidth an issue. Unoptimised server scripts bring servers to their knees during unexpected traffic surges. Sometimes, pages are even regenerated for no good reason at all, other than updating third-party content (which may not have changed). More recently, even good old URIs are being thrown out the window.

In this modern environment, where content is expected to appear instantly (or suffer from the notorious impatience of the average visitor), caching and proper optimisation of dynamic content are key. With new technology, there is little need to serve pages dynamically, much less in a monolithic manner. Most pages, barring the need for third-party content or user contributions, need no dynamic generation at all, and even where such content is needed, dynamic generation can be kept to a minimum.

We should be entering a new era here. Be not afraid.

Old ways and new technology

Back in the early days of internet fame, before Google, before the dot-com bubble and before blogging became an industry, dynamic pages were something high-tech that few felt the need to touch. Most content was statically published, each page rewritten from scratch. Of course, this wasn’t very convenient, which is why more convenient methods were invented. These methods were soon followed by CGI, and the popularity of dynamic content generation was a fact.

But the truth is, most of our web pages could do nicely without server-side scripting. In fact, most of our web pages (at least those in the blogosphere) only utilise server-side scripting for a few basic tasks: including dynamic (often third-party) content, receiving comments, and templating.

But do we really need server-side scripts to do these things? What if there were technologies that enabled us to go back to the old ways of publishing static files while still being able to do all the aforementioned things? Well, guess what: there are!

Including (non-essential) dynamic content can easily be done using asynchronous JavaScript-based requests now that practically all browsers in use provide adequate support, provided that you use some kind of library to do the grunt work for you (for bonus points, load these libraries from a huge corporate CDN to save bandwidth). Make a request using a well-defined RESTful API, fiddle around with the response and insert it into the DOM. Suddenly, we’ve eliminated one need for server-side scripts.
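As a rough sketch of that idea, the snippet below requests a JSON list and inserts it into the page. It uses the browser's built-in `fetch` rather than a library, and the response shape (`title` and `url` fields), the function names and the endpoint in the usage note are all assumptions made for illustration.

```javascript
// Escape text before putting it into markup, so third-party data cannot inject HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Turn a list of { title, url } objects (an assumed response shape) into an HTML list.
function renderLinks(items) {
  const rows = items.map(
    (item) => `<li><a href="${escapeHtml(item.url)}">${escapeHtml(item.title)}</a></li>`
  );
  return `<ul>${rows.join("")}</ul>`;
}

// Fetch the JSON and insert the rendered list into the given element.
// fetchFn defaults to the browser's fetch; it is a parameter only to ease testing.
async function loadLinks(endpoint, container, fetchFn = fetch) {
  const response = await fetchFn(endpoint, { headers: { Accept: "application/json" } });
  if (!response.ok) {
    return; // non-essential content: leave the static page as it is
  }
  container.innerHTML = renderLinks(await response.json());
}
```

On a real page this might be called as `loadLinks("/api/links.json", document.getElementById("links"))`, where both the URL and the element id are placeholders.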

Receiving comments can be done in a similar way. Of course, one cannot avoid server-side scripting here (unless the commenting system is outsourced, which frankly is a fully adequate solution for most blogs), but using those asynchronous requests we can at least avoid regenerating the entire blog post in the event that someone posts a comment. Generate only what is absolutely necessary on the server, and let the browser do most of the work.
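A minimal sketch of such a comment submission might look like the following. The `/api/posts/<id>/comments` endpoint, the JSON payload and the assumption that the server echoes the saved comment back are all hypothetical; the point is only that a single list item is appended instead of the whole post being regenerated.

```javascript
// Build the request for a new comment. The endpoint and payload are hypothetical.
function buildCommentRequest(postId, author, body) {
  return {
    url: `/api/posts/${encodeURIComponent(postId)}/comments`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ author, body }),
    },
  };
}

// Send the comment; on success, append only the new comment to the list element.
// fetchFn defaults to the browser's fetch; it is a parameter only to ease testing.
async function submitComment(postId, author, body, list, fetchFn = fetch) {
  const { url, options } = buildCommentRequest(postId, author, body);
  const response = await fetchFn(url, options);
  if (!response.ok) {
    throw new Error(`Comment was rejected with status ${response.status}`);
  }
  const saved = await response.json(); // assumed to echo back { author, body }
  const item = list.ownerDocument.createElement("li");
  item.textContent = `${saved.author}: ${saved.body}`; // textContent avoids HTML injection
  list.appendChild(item);
  return saved;
}
```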

That leaves us with only one reason to use server-side scripting: templating. Fortunately, a plethora of tools designed to solve this very problem has emerged in the last year or so. These tools allow you to create templates separate from your content (optionally stored in more content-friendly formats than HTML), which you then use to “compile” your site into a final, static product, ready to be served by any web server.
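To make the “compile” step concrete, here is a toy illustration of the idea. It is not how any particular tool works: the `{{name}}` placeholder syntax, the `slug` field and both function names are invented for this sketch, and real static publishers do considerably more (layouts, filters, content formats).

```javascript
// Replace {{name}} placeholders with values; unknown names are left untouched.
function renderTemplate(template, values) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(values, name) ? String(values[name]) : match
  );
}

// "Compile" a site: map each page to an output path plus its finished HTML.
// Writing the results to disk is left out; the output is plain data.
function compileSite(template, pages) {
  return pages.map((page) => ({
    path: `${page.slug}/index.html`,
    html: renderTemplate(template, page),
  }));
}
```

Because the result is just a list of paths and HTML strings, everything that can go wrong does so at compile time, before anything reaches the web server.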

Combining these technologies allows us to minimise load on the server while keeping the dynamic experience we all know and love. One minor drawback, of course, is the lack of revision management that some blogging tools provide. As we shall see, this functionality is easily replicated.

Bringing it all together

So, how does one set up such a system? Well, most of it (all bits concerning asynchronous requests) will be implemented in the templates of your statically published blog. Therefore, what you’ll have to set up is the template, the static publisher (I’ve grown very fond of nanoc) and possibly some kind of automated process to push your compiled blog to the server. A rudimentary way of doing this is simply using rsync, but I find that a Mercurial-based system is more elegant, and that way I also get the aforementioned revision management of both content and template.

My approach was simple. I started by creating a new nanoc site in a new Mercurial repository. This repository was set up on my server and cloned onto my laptop, meaning I can edit and write on my laptop and then push my changes to the server. Server-side, Mercurial was set up with a hook to copy the final product to the appropriate folder on my server. Thus, my workflow when adding content to my blog is as follows (where tasks 2–4 are performed using a rakefile):

  1. Create a new file containing metadata and content for a new blog post
  2. Compile the site using nanoc co
  3. Add files and commit to Mercurial using hg ci -A -m'<message>'
  4. Push the changes to the remote repository

This procedure can’t get simpler; I edit a file (using anything but vim, of course), followed by typing rake in my terminal — and hey presto, a new blog post!
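The rakefile itself is not reproduced here. Purely as an illustration, steps 2 to 4 from the list above could be scripted along these lines in JavaScript instead of rake; the function names are made up, and the commands are the ones given in the list.

```javascript
// Steps 2-4 from the list above as [program, arguments] pairs.
function publishSteps(message) {
  return [
    ["nanoc", ["co"]],
    ["hg", ["ci", "-A", "-m", message]],
    ["hg", ["push"]],
  ];
}

// Run the steps in order. run(program, args) is expected to throw if a command
// fails, which stops the remaining steps from running.
function publish(message, run) {
  for (const [program, args] of publishSteps(message)) {
    run(program, args);
  }
}
```

Under Node, `run` could be something like `(program, args) => require("node:child_process").execFileSync(program, args, { stdio: "inherit" })`, which throws when a command exits with a non-zero status.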

The benefits of a read-only blog

Having established that it is unnecessary to use server-side scripts, and that it is remarkably easy to publish sites without them, it is time to highlight some of the benefits of having a statically published blog. They are few but significant.

Firstly, your performance will increase by an order of magnitude. When Apache (or your web server of choice) no longer has to start up CGI scripts or fiddle around with modules like mod_php, it will be able to serve your pages almost instantly. Every request is reduced to a simple file system access, and your web server can even take advantage of advanced caching techniques such as in-memory caching as well as the more conventional conditional GET methods.
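For the curious, the decision behind a conditional GET can be sketched roughly as follows. Your web server already does this for static files, so this is only an illustration; the function name and the `file` object with `etag` and `mtimeMs` fields are assumptions, and entity-tag matching is simplified.

```javascript
// Drop the weak-validator prefix so W/"abc" and "abc" compare as equal.
function stripWeak(tag) {
  return tag.trim().replace(/^W\//, "");
}

// Decide how to answer a GET for a static file, given the request headers
// (with lower-cased names) and the file's current validators.
function conditionalGet(requestHeaders, file) {
  const ifNoneMatch = requestHeaders["if-none-match"];
  const ifModifiedSince = requestHeaders["if-modified-since"];

  let fresh = false;
  if (ifNoneMatch !== undefined) {
    // If-None-Match takes precedence over If-Modified-Since when both are sent.
    const tags = ifNoneMatch.split(",").map(stripWeak);
    fresh = tags.includes("*") || tags.includes(stripWeak(file.etag));
  } else if (ifModifiedSince !== undefined) {
    const since = Date.parse(ifModifiedSince);
    // HTTP dates have one-second resolution, so compare whole seconds.
    fresh = !Number.isNaN(since) && Math.floor(file.mtimeMs / 1000) * 1000 <= since;
  }

  if (fresh) {
    return { status: 304 }; // the client's cached copy is still good; send no body
  }
  return {
    status: 200,
    headers: {
      ETag: file.etag,
      "Last-Modified": new Date(file.mtimeMs).toUTCString(),
    },
  };
}
```

Since a static file only changes when you publish, most repeat visits can be answered with the short 304 response.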

And since you’re serving static HTML files, you know that your blog won’t go down because of some unexpected scripting error, perhaps caused by an update of some sort. Static files give you stability you can only dream of when using dynamically generated content; if something breaks, it will break before you publish your files, giving you plenty of time to fix it.

Finally, static publishing gives you security. Since your visitors will only be doing safe requests with no side effects, there’s no way for anyone to exploit a security hole in any of your pages. Security risks are reduced to those posed by your web server (hopefully none) and third-party services such as Disqus (which won’t affect your data). Unrivalled security.

Final thoughts

Of course, this method of publishing doesn’t suit everyone. I find that it’s especially suited to blogs and project sites (you know, the ones you get your Mac apps from), maybe even the odd magazine, but would argue that more advanced web experiences (web-apps and those social network things) and communities (forums, image boards and circle-jerks) have needs that extend far beyond the current and, most likely, future possibilities of static publishing.

What I’m really trying to tell you here is to use the right tool for the right job — there’s no need to bring out the sledgehammer when you’re trying to put up a painting. Use the hammer. Or your shoe.
