Weeknotes: 8th December 2025

Last week

AOHs of some trees

Off the back of the STAR meeting the week before, I was asked if I could use my STAR pipeline to generate a set of AOH maps for some folk at the BGCI (Botanic Gardens Conservation International). They had a list of 1,500 trees that are on the IUCN Redlist, and so it was relatively easy for me to extract those and send them over.

At least in theory...

Elevation maps

For a while my STAR implementation has lagged behind the official one maintained by Chess Ridley for the IUCN: I've been using an older elevation map, based on this set of elevation data used by the original IUCN STAR workflow, whereas Chess has since moved to a newer and more accurate elevation map, FABDEM. However, whilst this map is an improvement, it has some challenges:

  1. The base tile set for FABDEM is very large and only accessible via a data portal at the University of Bristol. Even accessing it from within JANET (the network that UK universities reside on), it takes the better part of a day to download, or at least it did last time I tried. This means that, unlike the previous elevation map, which is available on Zenodo and downloads relatively quickly, I don't want this in an automated pipeline script.
  2. There are some corrupted tiles in the original dataset around Florida's Forgotten Coast, and the corrected tiles are only available via Google Drive 🤦
  3. FABDEM is based on the Copernicus GLO-30 digital elevation model, which at the time was missing tiles around Azerbaijan and its neighbours, and so there is a gap in FABDEM around that area. Chess's solution was to patch the GLO-30 data into FABDEM, but this means we have another data source to download, and FABDEM and GLO-30 are at different resolutions, so some transformations need to occur.

All of which is why I'd dragged my heels on this. The overall intent was (and is) that Chess will publish her modified FABDEM as a layer on Zenodo at some point, as an aid to reproducibility, likely alongside a manuscript she's working on currently, and I was waiting on that. But now that I needed the map to generate those AOHs for the BGCI, it was time to finally do the thing I'd been putting off.


Thankfully I had three head-starts on assembling my own version of this modified FABDEM map.

  1. I had already downloaded FABDEM a year ago, saving me a day of nursing a download script.
  2. My geospatial library Yirgacheffe has a concept of a "GroupLayer" where you can provide an ordered list of tiles and it'll take data from them as if they were a single large map, making combining the three data layers quite trivial.
  3. I have access to a machine with 768 cores and 3TB of RAM to do all the processing on, as the FABDEM map is about 2.5TB of uncompressed data.

Before I generated the full map, which is 1,296,000 x 504,000 pixels, I first did a check to make sure that my pre-processing of the GLO-30 layer to merge it into FABDEM worked well.

[Image: A screenshot of QGIS software showing a monochromatic elevation map filling the window. Most of the map is greyscale, but in the middle there is a rectangular area that is shaded green. The hill ranges visible in the elevation map seamlessly transition between the two coloured areas.]

I wrote a small script to pull the GLO-30 tiles from Open Topography, which I'd never used before; they had data that was up to date with the latest GLO-30 release and didn't require any API keys.
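The script boiled down to something like the following sketch. The base URL is a placeholder rather than Open Topography's real endpoint, and the bounding box is a rough stand-in for the area of interest; only the Copernicus tile naming convention is the genuine article.

```python
import pathlib

import requests

BASE_URL = "https://example.opentopography.org/COP30"  # placeholder endpoint

def fetch_tile(lat: int, lon: int, dest: pathlib.Path) -> None:
    # GLO-30 tiles are named for the latitude/longitude of their corner.
    ns = f"N{lat:02d}" if lat >= 0 else f"S{-lat:02d}"
    ew = f"E{lon:03d}" if lon >= 0 else f"W{-lon:03d}"
    name = f"Copernicus_DSM_COG_10_{ns}_00_{ew}_00_DEM.tif"
    response = requests.get(f"{BASE_URL}/{name}", timeout=60)
    response.raise_for_status()
    (dest / name).write_bytes(response.content)

dest = pathlib.Path("glo30_tiles")
dest.mkdir(parents=True, exist_ok=True)
for lat in range(38, 42):  # a rough bounding box around Azerbaijan
    for lon in range(44, 51):
        fetch_tile(lat, lon, dest)
```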

With that bit done, it was simply a case of merging all this data: the corrected tiles over FABDEM over the processed GLO-30 tiles, all of which is now in a PR for my STAR pipeline. But things are never this simple...
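In sketch form, the merge looks something like this; I'm writing the Yirgacheffe calls from memory, so treat the exact names as assumptions. The important bit is that GroupLayer takes an ordered list, with earlier tiles taking precedence where they overlap.

```python
from glob import glob

from yirgacheffe.layers import GroupLayer, RasterLayer

# Highest priority first: corrected tiles, then FABDEM, then processed GLO-30.
tiles = [
    RasterLayer.layer_from_file(path)
    for source in ("corrected_tiles", "fabdem", "glo30_processed")
    for path in sorted(glob(f"{source}/*.tif"))
]
elevation = GroupLayer(tiles)

# Adding zero is a no-op that gives us a saveable calculation over the group.
calc = elevation + 0.0
with RasterLayer.empty_raster_layer_like(elevation, filename="elevation.tif") as result:
    calc.save(result)
```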

Shared memory fun

I mentioned above that:

  1. I was going to use Yirgacheffe to do the merging of the three datasets into a single map.
  2. I was using a Very Big Computer™ to do so.

This all went well until I ran out of shared memory 🤦

Yirgacheffe has a parallelism option, whereby when building the output raster it'll split the job into chunks and farm them out over a number of CPU cores. You can specify how many CPUs to use if you have some insight into your workload, otherwise it'll default to using as many as it thinks it'll need, and given the size of this raster it opted to use all 768 cores on the Very Big Computer™.
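Continuing the merge sketch from earlier, the difference is just one argument (and again, the exact Yirgacheffe call here is my recollection rather than gospel):

```python
# Let Yirgacheffe pick its own worker count; here that meant all 768 cores.
calc.parallel_save(result)

# Or cap the number of workers explicitly if you know the workload will
# exhaust some other resource first, as it did here.
calc.parallel_save(result, parallelism=64)
```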

Under the hood, Python's parallelism is somewhat problematic, but Yirgacheffe does its best to hide that. Python will use separate child processes rather than in-process threads to do the work, which is technically worse, but Python has other limitations (the GIL chief among them) that make threads a bad choice for Python specifically (in Go or OCaml or Swift I'd be using in-process threads for this sort of thing). This means that you have to think carefully about how you move data around, as Python has a very narrow and simple channel over which it can move data from the parent to the child and back again, and inefficiency here could negate the benefits of the parallelism.

I solve this in Yirgacheffe by following in the footsteps of others that have had to solve the same problem (e.g., PyTorch): I allocate a shared memory region, one per child, that both the child and parent can access. Each child calculates its area of the final raster, writes it into its shared memory region, waits for the parent to say it has read it, and then goes off to do the next bit of the map (as there are typically more chunks of work than CPU cores).
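As a rough illustration of that handshake using just the standard library; the names and the per-chunk calculation are made up for the example, and the real thing has rather more moving parts:

```python
from multiprocessing import Pipe, Process, shared_memory

import numpy as np

ROWS, COLS = 512, 1024  # kept small here; the real raster is over a million pixels wide

def calculate_chunk(chunk_id: int) -> np.ndarray:
    # Hypothetical stand-in for the real per-chunk raster calculation.
    return np.full((ROWS, COLS), float(chunk_id), dtype=np.float32)

def worker(shm_name: str, conn) -> None:
    shm = shared_memory.SharedMemory(name=shm_name)
    out = np.ndarray((ROWS, COLS), dtype=np.float32, buffer=shm.buf)
    for chunk_id in iter(conn.recv, None):  # None is the shutdown signal
        out[:] = calculate_chunk(chunk_id)  # write result into shared memory
        conn.send(chunk_id)                 # tell the parent the region is ready
    shm.close()

if __name__ == "__main__":
    # One shared region per child, sized to hold one chunk of float32 pixels.
    shm = shared_memory.SharedMemory(create=True, size=ROWS * COLS * 4)
    parent_end, child_end = Pipe()
    child = Process(target=worker, args=(shm.name, child_end))
    child.start()
    for chunk_id in range(4):  # typically more chunks of work than workers
        parent_end.send(chunk_id)
        parent_end.recv()      # block until the child has written this chunk
        chunk = np.ndarray((ROWS, COLS), dtype=np.float32, buffer=shm.buf)
        # ...copy chunk into the right rows of the output raster here...
    parent_end.send(None)
    child.join()
    shm.close()
    shm.unlink()
```

The next job sent down the pipe doubles as the "parent has read it" signal, as the child blocks on recv before overwriting the region.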

So far, so good, but as I alluded to, we have a very big map, and although 768 CPU cores is a lot, the 650 billion pixels we need to calculate to build this raster is a much larger number. Thus Yirgacheffe needs to break the map down into smaller chunks; each child process then takes a list of chunks to work on, and slowly we build up the final image. Yirgacheffe's chunking algorithm is very naive, but it's taken until now for that to be a problem. It takes the target image, splits its area into 512-row chunks, and farms those out as the unit of work. The number 512 was picked a long time ago based on some performance measurements calculating AOHs of species using a much older version of the LIFE pipeline, but it clearly should be workload dependent, as here we end up with quite large chunks: this image is over a million pixels wide, so each chunk generates 2.5GB of uncompressed image data (allowing 4 bytes per pixel, as the result is a float32).
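The arithmetic that makes this bite:

```python
width = 1_296_000     # pixels per row in this output raster
rows_per_chunk = 512  # Yirgacheffe's fixed chunk height
bytes_per_pixel = 4   # results are float32

chunk_bytes = width * rows_per_chunk * bytes_per_pixel
print(f"{chunk_bytes / 1e9:.2f} GB per chunk")               # ~2.65 GB
print(f"{chunk_bytes * 768 / 1e12:.2f} TB for 768 workers")  # ~2.04 TB
```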

This is where things fell apart: 2.5GB * 768 is around 2TB of RAM. In theory that's fine, as the Very Big Computer™ has 3TB of RAM, but by default Ubuntu only makes 50% of memory available for shared memory, and so I ran out of shared memory!

In theory this should be fine: Yirgacheffe should be able to detect that as an error and scale back accordingly. But unfortunately, due to how Python manages shared memory, when we run out it fails ungracefully, just crashing with a SIGBUS, meaning Yirgacheffe can't do much to prevent a crash if it relies on the Python standard library. What happens (thanks to Anil for explaining this) is that Linux will quite happily let you set up all the shared memory regions, overcommitting as we go, and only checks the space when you write to them; at some point it notices you don't have the space required and has to crash out. If Python used a different system call to allocate the memory (fallocate rather than ftruncate as it currently does), then we'd fail at allocation time, all would be well, and we could scale back the number of child workers to fit within our means.
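To make the distinction concrete, here's a sketch of the two behaviours, assuming a Linux box where shared memory is backed by a tmpfs at /dev/shm and a size bigger than the space available there:

```python
import os

fd = os.open("/dev/shm/demo", os.O_CREAT | os.O_RDWR)
too_big = 16 * 1024**4  # more bytes than /dev/shm has free

# What CPython does today: ftruncate succeeds, because tmpfs pages are only
# claimed when written to, so the failure arrives later as a SIGBUS on write.
os.ftruncate(fd, too_big)

# What it could do instead: posix_fallocate reserves the space up front, so
# the failure is a catchable exception at allocation time.
try:
    os.posix_fallocate(fd, 0, too_big)
except OSError:
    print("No space: scale back the number of workers")

os.close(fd)
os.unlink("/dev/shm/demo")
```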

To unblock the AOH generation I simply scaled back the number of child workers Yirgacheffe generated by hand, but I now have a ticket to go fix this properly, which will probably just be me monkey-patching the standard library to do the correct thing until the ticket on Python itself for this issue is resolved.
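The patch will probably look something like this sketch. It pokes at _fd, a CPython implementation detail, and assumes create is passed as a keyword, which is exactly why it's a stopgap rather than a fix:

```python
import os
from multiprocessing import shared_memory

_original_init = shared_memory.SharedMemory.__init__

def _init_with_fallocate(self, *args, **kwargs):
    _original_init(self, *args, **kwargs)
    # Only reserve pages for segments we created ourselves.
    if kwargs.get("create", False) and self.size > 0:
        # Fails here with a catchable OSError rather than SIGBUS on write.
        os.posix_fallocate(self._fd, 0, self.size)

shared_memory.SharedMemory.__init__ = _init_with_fallocate
```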

Details in rasterisation of polygons

This wasn't the only issue that beset me generating these AOHs; I found another gnarly and interesting one to do with map projections and rasterising polygon data, but that will be its own blog post in the near future, and I think you've all read enough for one post :)

Comparing elevation maps

Having made the new elevation map layer, I did a bit of spelunking just to try to understand what had changed. My gut instinct was that there wouldn't be much difference between the two, as both are reasonably recent and the earth doesn't change that much, but I was surprised by how, in some areas, we do see meaningful differences that impacted the AOHs I was generating. Some tree species in Colombia, for instance, had quite a different AOH due to the addition of some ravines that weren't there in the original.
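A quick way to see what changed is just to subtract one map from the other and stare at the result in QGIS. In sketch form, with the Yirgacheffe calls again written from memory:

```python
from yirgacheffe.layers import RasterLayer

old = RasterLayer.layer_from_file("old_elevation.tif")
new = RasterLayer.layer_from_file("fabdem_patched.tif")

# Assumes both maps share a projection and pixel scale.
diff = new - old
with RasterLayer.empty_raster_layer_like(old, filename="elevation_diff.tif") as result:
    diff.save(result)
```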

The main difference was in Greenland, which was quite striking, though not actually that important for this use case, as the species on the IUCN Redlist in Greenland are all coastal and the coast was pretty similar between the two maps. But here is Greenland in the original elevation map I was using:

[Image: A monochromatic elevation map showing Greenland. The coasts look like one might expect, but the core of the country has very evident striping (presumably from multiple satellite passes) and noise in the data.]

And from FABDEM:

[Image: A monochromatic elevation map showing Greenland. The coasts look like one might expect, but the core of the country is very blurred, as if it has been overly smoothed.]

Both show what look to me like different processing errors: the original map shows a lot of noise and you can see the individual satellite passes in it, and the latter looks very blurred and over-smoothed.

As I say, it doesn't hugely signify: if we look at a species richness map I generated for STAR, you can see that there are no species covered in the core of Greenland:

[Image: A map showing Greenland and surrounding countries, with black areas indicating species presence and white indicating none. Most countries are solid black, but for Greenland you can see just the coastline is indicated.]

But in the future that might change as the Redlist species coverage expands.

Functional programming for atoms

I had a brief play with OpenSCAD, after I learned that it is a functional programming language designed to make 3D models, and I thought this might be a nice intersection of two areas of interest for me. Alas, it was not what I had hoped for, which I have documented on one of my other blogs.

This week

  • I've had a stack of interesting papers come across my screen that I need to read.
  • We have a meeting of people interested in AOH Validation on Monday, which I'm quite excited about. There are so many areas that either need, or would be made easier by, access to better automated assessment of AOHs, and what is currently best practice feels like it could be improved upon in many ways. How do I bring a CI-style quality-checking mentality to all these pipelines, not at the code level, but at the result level?
  • Anil asked me to write up some bits about making our pipelines run more like CI and why I can't do that today.
  • I'd like to do a full AOH validation run with the two elevation maps I mentioned earlier, as I know there are differences, and it'd be interesting to see how they show up in the current validation techniques.

Tags: copernicus, fabdem, dem, yirgacheffe