The Partially Dynamic Web

8 Dec 2024

Tags: web, ocaml

Background

I have three websites (this one, my personal site, and one for my luthiery endeavours), and despite each starting out with a different technology stack, over the last few years I’d migrated them all to the Hugo static site generator, as a way of making them easier for me to mess around with. Without a fixed database, I could more readily structure the content as I wanted it, I had more freedom over templating, and ultimately it’s less resource intensive to compile the site occasionally and just serve static files than to keep dynamic infrastructure running for what is a set of low traffic websites. At least in theory; we’ll come back to this last point.

Like most static site generators, Hugo uses a system called front matter: you store each page as a markdown file of content, with some YAML at the top of that file to hold metadata, such as page title and publication date, which markdown itself has no way to express. From these two parts Hugo generates your website: a template system turns your markdown into HTML files, filled in with the appropriate bits from the front matter, with the output roughly following the structure of the folders you store the markdown files in.
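To make the two-part file concrete, here is a minimal sketch of splitting a page into front matter and body. It assumes the front matter sits between two `---` lines and only contains flat `key: value` pairs; real YAML is much richer, and this is neither Hugo's parser nor the one Webplats uses. The function name `split_front_matter` is made up for illustration.

```ocaml
(* Split a page into its front matter (flat "key: value" lines only,
   not full YAML) and its markdown body. Returns None when the text
   does not start with a "---" fence or the fence is never closed. *)
let split_front_matter (text : string) : ((string * string) list * string) option =
  let is_fence line = String.trim line = "---" in
  (* Turn "title: Hello" into ("title", "Hello"); None for lines without a colon. *)
  let parse_field line =
    match String.index_opt line ':' with
    | None -> None
    | Some i ->
      let key = String.trim (String.sub line 0 i) in
      let value = String.trim (String.sub line (i + 1) (String.length line - i - 1)) in
      Some (key, value)
  in
  match String.split_on_char '\n' text with
  | first :: rest when is_fence first ->
    (* Collect fields up to the closing fence; the remainder is the body. *)
    let rec collect fields = function
      | [] -> None
      | line :: body when is_fence line ->
        Some (List.rev fields, String.concat "\n" body)
      | line :: more ->
        (match parse_field line with
         | Some field -> collect (field :: fields) more
         | None -> collect fields more)
    in
    collect [] rest
  | _ -> None
```

Given a file whose front matter holds a `title` and a `date`, this returns those two pairs plus the untouched markdown, which is all a template needs to treat the front matter like a small database row.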

The templates for my sites I’d made by hand myself, which I think is a key part of unlocking the power of Hugo, because not only can you decide how your markdown looks from your template, but you can also query the front matter, and so change how your page looks based on the metadata. I used this feature heavily, treating the front matter a bit like a database entry for each page. This let me add a synopsis to each page, or a title image with alt-text that becomes the thumbnail on the list views. For photos I store all the EXIF data in there too.

The final feature of Hugo that is very powerful is that it lets you go beyond the standard markdown to HTML rendering by adding shortcodes - so in addition to standard markdown notation for links or images, you can add your own. So I added some to embed YouTube videos and audio, and I made my own image tag that gave me more control over rendering and let me specify an alt text etc.
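In Hugo a shortcode is written inline in the markdown as a name plus arguments between `{{<` and `>}}`, for example `{{< youtube abc123 >}}`. As a rough sketch of what pulling those out of a page can look like, the code below scans for that one form only. It ignores Hugo's other delimiter style, closing shortcodes, and quoted arguments containing spaces, and the helper names `find_from` and `shortcodes` are invented for this example.

```ocaml
(* Find the first occurrence of [needle] in [hay] at or after index [from]. *)
let find_from (hay : string) (needle : string) (from : int) : int option =
  let n = String.length needle and h = String.length hay in
  let rec go i =
    if i + n > h then None
    else if String.sub hay i n = needle then Some i
    else go (i + 1)
  in
  go from

(* Collect every {{< name arg ... >}} shortcode as (name, args), in order.
   Arguments are split on spaces, so quoted arguments with spaces in them
   are not handled by this sketch. *)
let shortcodes (markdown : string) : (string * string list) list =
  let rec scan pos acc =
    match find_from markdown "{{<" pos with
    | None -> List.rev acc
    | Some start ->
      (match find_from markdown ">}}" (start + 3) with
       | None -> List.rev acc (* unterminated shortcode: stop scanning *)
       | Some stop ->
         let inner = String.sub markdown (start + 3) (stop - start - 3) in
         let words =
           List.filter (fun w -> w <> "")
             (String.split_on_char ' ' (String.trim inner))
         in
         let acc =
           match words with
           | [] -> acc
           | name :: args -> (name, args) :: acc
         in
         scan (stop + 3) acc)
  in
  scan 0 []
```

Each `(name, args)` pair can then be mapped to whatever HTML that shortcode should produce, such as an embedded video or an image tag with alt text.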

Between all this, Hugo worked pretty well for me, and was much lower maintenance than running a dynamic site that requires a database to store all the content in etc. But in the end I’ve replaced it with my own semi-static-but-actually-dynamic system, and I wanted to make some notes as to why.

Motivation

Firstly, let’s talk about the resource usage. In general, I still think a static site is going to have lower overall resource requirements than a dynamic website, and I think that’s still true for two of my three websites. However, for my personal site, it was demonstrably worse. My personal website has content going back over 20 years, and contains a lot of high resolution media in my photos sections. I have about 10k pages, but when you add thumbnails and display sized images and all the other things, that goes up to about 70k resources which Hugo had to prepare for display. I’m not that famous or interesting, and so most of those pages are never going to be looked at in a given unit of time, yet if I make a change to my templates, they all get recalculated, and that’s a lot of files to generate and copy to my server just for no one to look at them.

I treat the source material for my website as an archive, and I keep in it all the images and video data at the highest resolution I have and then scale it down for the website at compile time. Even though I’m keeping very high resolution primary data, my source directory is 8 GB of data, while the compiled static website is currently 12.5 GB. That’s a lot of bytes that no one is going to look at. And I have to keep both a copy of the raw site and the compiled site so I don’t need to rebuild all of it every time, so I’m over 20 GB on disk.

So, in terms of resources alone for my personal website, I think it’s safe to say that even with the sensible caching that Hugo does, the static site is somewhat wasteful at this scale.


The next motivation for change is how Hugo handles taxonomies. That is to say, alternative structures for presenting data beyond the raw “here is a list of things over time”. An easy example of that is albums of photos. I have my main feed of photos in the website, but I also like to group them into thematic albums. Hugo lets me express this, but the way it does so is constrained by the fact that it’s compiling to raw HTML files. So I can generate a page for the album, but when you click through to an item in the album, there is only a single page for the photo, so the previous and next links are just to the global feed previous and next, not the album version. This makes some sense, as otherwise it’d have to generate copies of each page, and for my photos that’d cause that 12.5 GB to shoot up even more, which is clearly not desirable. But the fact that albums can’t have forward/back buttons that keep you in the album annoys me, because this sort of arrangement is something I do a lot across all my sites. The correct solution is to know at page render time how the visitor got to a page, so you can generate the right forward/backward links, and for that you need a dynamic renderer.
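The album problem can be sketched in a few lines once the renderer knows the visitor's context. The code below is an illustration under assumptions, not the Webplats implementation: the `context` and `photo` types, the field names, and the idea that a photo lists the albums it belongs to are all hypothetical.

```ocaml
(* How the visitor reached a photo: via the global feed or via an album.
   These types and names are hypothetical, for illustration only. *)
type context =
  | Feed
  | Album of string

type photo = { slug : string; albums : string list }

(* The ordered list of photos the visitor is browsing in this context. *)
let sequence (ctx : context) (photos : photo list) : photo list =
  match ctx with
  | Feed -> photos
  | Album name -> List.filter (fun p -> List.mem name p.albums) photos

(* Previous and next slugs around [slug] within the context's sequence.
   A side is None at either end, and both are None when the slug is not
   part of the sequence at all. *)
let neighbours ctx photos slug : string option * string option =
  let rec walk prev = function
    | [] -> (None, None)
    | p :: rest when p.slug = slug ->
      let next = match rest with [] -> None | n :: _ -> Some n.slug in
      (prev, next)
    | p :: rest -> walk (Some p.slug) rest
  in
  walk None (sequence ctx photos)
```

With a feed of photos a, b, c, d where a, c and d are in the same album, the neighbours of c are b and d in the feed but a and d in the album, all from a single stored page. How the context reaches the renderer (a URL prefix, a query parameter, or something else) is left open here.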


A minor one, but I've been learning Swedish, so I have a small number of posts that are in both Swedish and English. Being a static site, I can't have one page served in either Swedish or English; I have to give each version its own URL, but then linking becomes challenging if I don't want to duplicate each page.


The final point I hit with Hugo was just that it’s built to do one style of website and do it well: one with lists of pages that you drill down into. It’s absolutely great for that, but I found that at times I wanted to, say, generate a list of items and not have a page associated with each of them, just the list, and to do that I’d still have to make the pages and not link to them. This sort of data-driven virtual page is something Hugo seems to be slowly coming to, but ultimately Hugo needs some structure, and folk like me will always find corner cases where it doesn’t work for them.

Rolling my own semi-dynamic site

So, whilst I still would recommend Hugo as an excellent static site generator, as someone who likes to play with their websites and cares about how the content is structured, I’ve decided to make my own dynamic website renderer, called simply Webplats.

The design goal is that I want a dynamically evaluated static site. This means the content will be stored just as it was for Hugo, with a series of front matter and markdown files on disk for all the content, but Webplats will do the rendering on demand, so I don’t need to generate thousands of pages and thumbnails and resized images that are never going to be viewed. It also means I can take over the mapping from content to URL, and so I can fix that problem of having many views on the same content, with the content rendering being aware of how the viewer got there.

My other goal is that I’m not trying to make something that will work for everyone. How I used Hugo was highly customised, and that’s what I’m going to support. All three of my websites use the same set of tricks and extensions I’d built using shortcodes and custom template logic on top of Hugo, so I’ll build something generic enough to support them, but I’ve no interest in maintaining some general purpose bit of software for other people. It’s open source to act as inspiration to others perhaps, but that’s it. I think the power here is that I can tailor this to just what I want, and keep the footprint small and manageable as a hobby platform.

Webplats

To my surprise, getting something up and running took a few hours, and then it took me another week or so to get it to where I deployed it, just doing an hour or so a day. As a spare time project, I'm quite amazed how fast I went from idea to having my personal site deployed with this.

It's not the cleanest of implementations: the code is in flux as I'm still figuring things out, and it'll be interesting as I move my other sites over to it, as I've hardcoded certain things for my personal website. But on the other hand that site is by far the most complicated one, so it's a good place to start. In the transition I’ve certainly still got a bunch of things that are broken, but it’s not a huge amount, and I already have improvements, such as album links now working correctly. This is the joy of doing it on my personal website, which is low traffic and low expectation - a small amount of regression really won’t be noticed, and if it is it’s not really that important.

I’m using the Dream library for OCaml, which has both built-in routing and templating. I made sure to keep the URL layout that Hugo used as best I could, so in theory the transition shouldn’t be noticed by most people, as all the content remains at the same URL as before.

Using a functional language for this kind of work actually maps very nicely: all I’m doing is taking data in one format and presenting it in another, so functional transforms are what I need. The way the website is stored on disk for a static site generator means I'm mostly doing a translation of that structure into the URIs for the website, so I was starting from a good place for this project.

Thanks to Hugo encouraging me to use shortcodes for all resources in a page (I never used the markdown image tags), it was low effort to ensure all resources in a page have their own URL to render them on demand, as I don’t ever need to parse the markdown myself beyond pulling out shortcodes. For images I’m just using Camlimages, which is quite an old library and doesn’t support all the image formats I have acquired over 20 plus years, but it’s enough to get started with. Performance-wise, this will be a regression, as images are resized and stored in a small cache the first time they’re viewed. But most people consume my site via RSS, and when I add a new page I look at it myself to check it works, which fills the cache, so most folk won’t see that cost.
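The resize-on-first-view idea can be sketched as a small on-disk cache. This is an assumption-laden illustration rather than the Webplats code: the cache file naming scheme is invented, and `resize` is a stand-in function supplied by the caller for whatever the image library actually does, so no real Camlimages call appears here.

```ocaml
(* Cache file name for a resized image. Hashing the source path and width
   together is an assumed naming scheme, chosen only for this sketch. *)
let cache_path ~cache_dir ~source ~width =
  let key = Digest.to_hex (Digest.string (Printf.sprintf "%s@%d" source width)) in
  Filename.concat cache_dir (key ^ Filename.extension source)

(* Read a whole file as a string, closing the channel even on error. *)
let read_file path =
  let ic = open_in_bin path in
  Fun.protect ~finally:(fun () -> close_in ic)
    (fun () -> really_input_string ic (in_channel_length ic))

(* Write a string to a file, closing the channel even on error. *)
let write_file path data =
  let oc = open_out_bin path in
  Fun.protect ~finally:(fun () -> close_out oc)
    (fun () -> output_string oc data)

(* Return the resized bytes for [source] at [width], calling [resize] only
   when no cached copy exists yet. [resize] stands in for the real image
   library call and is supplied by the caller. *)
let resized ~cache_dir ~resize ~source ~width =
  let path = cache_path ~cache_dir ~source ~width in
  if Sys.file_exists path then read_file path
  else begin
    let data = resize source width in
    write_file path data;
    data
  end
```

The first request for a given image and width pays for the resize, and every later request is a plain file read. This sketch never evicts anything, so keeping the cache small would need a size limit or expiry policy on top.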

The aim so far has just been to get as close to the Hugo version as I can without changing the data on disk. What I'm looking forward to doing now that I've switched is making changes to the on-disk representation to let me simplify the OCaml code, and add some new fun features.

Digital Flapjack Ltd, UK Company 06788544