Weeknotes: 11th August 2025

Last week

LIFE

I did another LIFE run with a new base map, using the habitat map from Tom B's food paper, and this again highlighted a need for better tooling around summarisation and visualisation of these large aggregate results. When I looked at the maps generated, they looked similar but slightly different from the original maps, in a way that I felt could be explained by what I know of what Tom did for his updated base habitat map. However, when I just summed the values over the two datasets, they were markedly different, so somewhere in between the two there is something unexpected going on. Which is fine, but when you've had a computer work for a day aggregating 120K species' worth of data, sifting through that is a slow process. Indeed, I've developed a slight fear when I generate updated maps of the "why is this pixel like this" question that inevitably follows, as it seems the job of finding that out falls upon the shoulders of the person who ran the code (understandably, as it's quicker for them because they know where everything is, but it's an unfortunate coupling).

For this one I started trying to throw things into Excel to generate graphs of what changed between the two runs:

A screenshot of a line graph in Excel, with a dark blue line showing a nice straight trend, and an orange 'line' that follows it roughly but is so noisy that it fills in the area under the curve rather than showing a smooth trend.

But mostly this is not a great workflow, and that graph is still too hard to read, as the orange points under the blue curve are so dense that they appear as a filled area, and it doesn't really help with the why.
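Something I keep meaning to try instead of Excel is doing the comparison in a small script. As a rough sketch (the file names and column names here are made up, not the actual LIFE outputs), something like this would at least surface which species diverge most between two runs:

    import pandas as pd

    # Hypothetical per-species summary CSVs from the two runs; the real LIFE
    # outputs are laid out differently, this is just the shape of the idea.
    old = pd.read_csv("run_old_per_species.csv")  # columns: species_id, total
    new = pd.read_csv("run_new_per_species.csv")

    merged = old.merge(new, on="species_id", suffixes=("_old", "_new"))
    merged["delta"] = merged["total_new"] - merged["total_old"]

    # The twenty species whose totals moved the most: a starting point for
    # the inevitable "why is this pixel like this" question.
    print(merged.sort_values("delta", key=abs, ascending=False).head(20))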

I still dream of having a pipeline that generates a report of intermediary results that can be drilled down into, as a lot of the data is already there, it just needs to be found, and I have added a lot of summarisation data tables to the pipeline that tell you, for instance, why a species was rejected. This is a lot like what Patrick did for the TMF CI system, but obviously that was tied to the CI system. Jody also did something similar for TMF outputs, but not in a way that let you query the intermediary results. Somewhere in there I feel there's a way forward, but it's also something that no one has the time/priority to do.

Playing with topogmesh

UROP student Finley (whose weeknotes you can find here) pushed a release of his "raster to 3D-printed model" code, called Topogmesh, whilst I was away, so last week I had a play with it, starting by trying to reproduce the High Coast area of Sweden where I'd spent time the preceding week.

You can drive this version of Topogmesh by giving it either a single elevation raster or a set of elevation tiles, plus a GeoJSON polygon defining the area of the world you want to select. To this end I found outline polygons for the High Coast national park on the Arctic SDI site, and I used the FABDEM elevation dataset as it's fairly high resolution and I have a download of all the tiles. I don't think I really want FABDEM for this application, as one of its main points is that it's a terrain map where they've tried to remove trees and other things from the height data, and an important point of the High Coast is that it is covered in trees, but for now I just wanted to get something working.
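Working out which FABDEM tiles the polygon needs is mostly a case of rounding its bounding box out to whole degrees, as FABDEM ships as 1°×1° tiles. A quick sketch of that using shapely (the GeoJSON file name and the tile naming are my guesses for illustration, and it assumes we're in the northern/eastern hemisphere, which the High Coast is):

    import json
    import math
    from shapely.geometry import shape

    # Load the park outline downloaded from the Arctic SDI site.
    with open("hoga_kusten.geojson") as f:
        outline = shape(json.load(f)["features"][0]["geometry"])

    min_lon, min_lat, max_lon, max_lat = outline.bounds

    # Round the bounds out to whole degrees to get the 1x1 degree tiles needed.
    tiles = [
        f"N{lat:02d}E{lon:03d}_FABDEM_V1-2.tif"  # guessed naming convention
        for lat in range(math.floor(min_lat), math.ceil(max_lat))
        for lon in range(math.floor(min_lon), math.ceil(max_lon))
    ]
    print(tiles)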

So I threw it all together and I got my first 3D-printable model!

However, something is wrong with this model: it's way too long and thin based on my recollection of the maps of the High Coast I spent time poring over whilst on vacation. Looking at the features in the 3D print I could see that the area selected was correct, but that everything had been stretched and distorted. For example, if we look at the island of Trysunda and its neighbours on a local map and on the 3D print, you can see the stretching happening more clearly:

The problem here is that notoriously troublesome topic of map projections. Most datasets you find come as a plain WGS84 latitude/longitude grid, whereby each pixel in the data is ascribed a latitude and longitude position, and the relationship between the pixels is a fixed offset in each direction (i.e., the next pixel along will just add or subtract a fixed value). This is easy for humans to understand, and is broadly how the standard maps you see (at least in the Western world) are put together, but it is a terribly distorted view of the world, one that stretches things out towards the poles in the same way that makes Mercator maps infamous. This is because you have as many pixels at the pole as you do at the equator, but far less actual ground to cover, so things nearer the poles become quite distorted, which is why Greenland looks huge compared to, say, Brazil on a world map despite being much smaller.

The same thing is happening to the High Coast here. This is it on the standard WGS84 latitude/longitude grid:

And this is it if I correct it to a more locally accurate map projection:

Whilst I was aware of how bad the distortion gets towards the poles, I didn't think this would be far enough north to see it so obviously.
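The back-of-the-envelope check I should have done beforehand: on a plain latitude/longitude grid the east-west stretching is roughly 1/cos(latitude), and at the High Coast's latitude of around 63°N that's already more than a factor of two:

    import math

    latitude = 63.0  # roughly the High Coast
    stretch = 1 / math.cos(math.radians(latitude))
    print(f"East-west stretch at {latitude}N: {stretch:.2f}x")  # about 2.2x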

To get the locally accurate projection I used the UTM projection system, which Patrick and I have had to use before when adding buffer zones to polygons to ensure that the buffers were uniform on all sides (and a hat tip, I think, to Amelia for introducing Patrick and myself to that).

So, now I knew what the problem was, I needed to fix it by re-projecting both the GeoJSON for the High Coast and the FABDEM elevation raster data to the appropriate UTM zone. The GeoJSON was easy, and you've already seen the results of that, but the raster data was more problematic, because it caused me to hit some issues with my geospatial Python library Yirgacheffe.
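For reference, once you know you need it, the re-projection itself is the easy bit. A sketch of the sort of thing I mean (file names are placeholders, the UTM zone is derived from the polygon's centroid, and northern hemisphere is assumed), using pyproj and shapely for the polygon and GDAL's warp for a raster tile:

    import json

    from osgeo import gdal
    from pyproj import Transformer
    from shapely.geometry import shape
    from shapely.ops import transform as shapely_transform

    # Work out the UTM zone from the polygon's centroid (assumes northern
    # hemisphere, which holds for the High Coast).
    with open("hoga_kusten.geojson") as f:  # placeholder file name
        outline = shape(json.load(f)["features"][0]["geometry"])
    utm_epsg = 32600 + int((outline.centroid.x + 180) // 6) + 1

    # Re-project the polygon...
    to_utm = Transformer.from_crs("EPSG:4326", f"EPSG:{utm_epsg}", always_xy=True)
    outline_utm = shapely_transform(to_utm.transform, outline)

    # ...and warp a FABDEM tile onto the same CRS. The resampling onto a new
    # pixel grid is what introduces the gaps discussed below.
    gdal.Warp("tile_utm.tif", "tile_wgs84.tif", dstSRS=f"EPSG:{utm_epsg}")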

The FABDEM data comes as a series of tiles, each 1 degree of latitude high and 1 degree of longitude wide, all the same size and nicely aligned on a grid thanks to that latitude/longitude layout. However, when I projected them into the local UTM projection, I noticed some odd gaps appearing:

In that view I'd scaled the visualisation range to let you see the elevation data more clearly, but the gaps are more obvious in the original view:

Here I noted in QGIS that the dark bands have an elevation value of -9999, and that's a clue as to what's going on. I used gdalinfo to check the metadata on the re-projected tiles, and we can see that this number is in fact the NODATA value for them:

Band 1 Block=2396x1 Type=Float32, ColorInterp=Gray
  Min=-0.500 Max=442.330
  NoData Value=-9999

NODATA is a GDAL extension to the GeoTIFF specification, used to indicate areas in the image where there is just no data. That's different from, say, a zero value: the data in those pixels isn't valid and shouldn't be considered at all. Why is that occurring here? Well, it's because when I convert the FABDEM tiles from WGS84 to the local UTM projection, the resulting images are skewed relative to the pixel grid, so there are areas of the new image where it can't work out what should go there, and it has to mark them as invalid. However, when I do this for several tiles, they all overlap with each other, and the invalid areas should be filled from the other tiles. For the six tiles here, you can kinda see the NODATA zones if I add them together without considering the NODATA values:
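The fix, conceptually, is just a masked merge: wherever one tile reports NODATA, take the value from whichever overlapping tile does have real data. A minimal sketch of the idea with numpy and GDAL (not Yirgacheffe's actual implementation), assuming two overlapping tiles that have already been warped onto a common pixel grid:

    import numpy as np
    from osgeo import gdal

    def read_band(path):
        dataset = gdal.Open(path)
        band = dataset.GetRasterBand(1)
        return band.ReadAsArray(), band.GetNoDataValue()

    # Two overlapping, already-aligned tiles (placeholder file names).
    a, a_nodata = read_band("tile_a_utm.tif")
    b, b_nodata = read_band("tile_b_utm.tif")

    # Start from tile A, filling its NODATA pixels from tile B; pixels where
    # neither tile has data stay marked as NODATA.
    merged = np.where(a == a_nodata, b, a)
    merged[(a == a_nodata) & (b == b_nodata)] = a_nodata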

So once I taught Yirgacheffe about NODATA values, I could then finally get a nicely joined up set of tiles:

So now I have, in theory, everything I need to generate a better model of the High Coast using Topogmesh; only, because all the data Finley had seen to date was in WGS84, it makes certain assumptions about using latitude and longitude that don't hold in projections like UTM, which is keyed in metres rather than degrees, so I'll have another go next week!

As I say, I didn't think that we'd be far enough from the equator for the distortion to be that significant, but it does make me wonder if we need to automatically do the UTM dance in Topogmesh. Alternatively, given we're 3D-printing, why do we need to project to a flat surface at all? We could in theory map all the elevation coordinates onto the globe proper, and then take a surface cut of that globe - I suspect at any small scale that'll effectively be flat, but it would remove all the projection dancing.
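That globe idea is less exotic than it might sound: pyproj will go straight from latitude, longitude and ellipsoidal height to Earth-centred XYZ coordinates in metres, which you could in principle feed to the mesh generation without any flat projection at all. A rough sketch:

    from pyproj import Transformer

    # WGS84 lat/lon/height (EPSG:4979) to Earth-centred, Earth-fixed XYZ in
    # metres (EPSG:4978) - no flat map projection involved anywhere.
    to_ecef = Transformer.from_crs("EPSG:4979", "EPSG:4978", always_xy=True)

    lon, lat, elevation_m = 18.78, 63.14, 42.0  # roughly Trysunda, made-up height
    x, y, z = to_ecef.transform(lon, lat, elevation_m)
    print(x, y, z)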

Towards Yirgacheffe 2.0

I have to point out that Finley wasn't the only one to have made WGS84-based assumptions: in this process I found a couple of places where Yirgacheffe made the same incorrect assumption, which I've now fixed whilst I was under the hood solving the NODATA thing.

The other thing that I did was merge a bunch of my work in progress for the forthcoming 2.0 release of Yirgacheffe, for two reasons. Firstly, I knew that I'd made a lot of changes to the unit test side of things that would have clashed with the new tests I added for the NODATA change, and I didn't want to have to rewrite the NODATA tests down the line; secondly, the newer 2.0 API calls on that branch are so much nicer than the current ones that I didn't want to write new code with the old APIs. I guess this is a sign that the newer APIs are doing something right.

These changes are currently just additive - adding newer nicer APIs, and the old APIs haven't yet been removed. I do expect 2.0 to break backwards compatibility for some portion of old code as I tidy up the public API and make it more like other data-science libraries (e.g., pandas) rather than being more OO heavy, but I felt happy adding some of the new APIs as a WIP to a 1.5 release so long as I didn't stop old code working.

That said, after I merged them I did spot a way to make the APIs even cleaner, and so I suspect they might change again in 2.0. Sad, but given I'd already made up my mind that the APIs would break in 2.0, not the end of the world, and they should only change to become simpler.

Experimenting with Slurm

Mark and I have been wanting to experiment with Slurm for a while now, given the increasing number of problems we get with our compute cluster being a free-for-all; that worked when it was only a few users, but in my opinion it's becoming more untenable as time goes on, and so Mark set up a trial Slurm instance for me to dogfood.

Some quick first impressions:

  • It's a lot more lightweight than I expected - running a job and collecting results is quite easy (a minimal job script is sketched after this list), and avoids the screen session etc. that I'd normally have to set up. I think I am biased against things like Slurm in that it's yet another thing in my way, but in practice I suspect the overhead is quite minimal given what you get for it.
  • It feels a little messy in terms of having to build the tools you want yourself if they're not installed by the admin. It works, but feels a little fragile if there's, say, a heterogeneous pool of worker nodes.
  • We rely on scratch space on our worker nodes for the large in-progress results, and I worry about how I'd access that if necessary for debugging. Lacking the immediate visibility made me feel somewhat naked, but perhaps that says more about how I fail to make my code sufficiently robust than it does about Slurm?
  • Mark and I don't really understand what stops users being greedy and asking for all the resources "just in case", effectively making each node have serialised access. I guess we just rely on social pressures for that?
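For the record, the "quite easy" bit amounts to something like the following job script (the resource numbers and script name are made up for illustration), submitted with sbatch job.sh and checked on with squeue:

    #!/bin/bash
    #SBATCH --job-name=life-aoh
    #SBATCH --cpus-per-task=32
    #SBATCH --mem=64G
    #SBATCH --time=24:00:00
    #SBATCH --output=life-aoh-%j.log

    # The actual workload; stdout/stderr end up in the --output file above,
    # so there's no screen/tmux session to babysit.
    python calculate_aoh.py --species-list species.csv --output results/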

I only did a small amount of testing last week, but I need to do some more LIFE number crunching this week, so I'll hopefully get to kick the tires some more.

Next week

  • More Slurm testing
  • Look into AoH edge effects a little more
  • See if I can do something with Tessera
  • Add a full-history posts feed to my websites for Anil (probably based on supporting next/prev links on the Atom feed, as per RFC 5023).

Tags: 3D-printing, life, yirgacheffe, slurm