Weeknotes: Working at the edges

15 Apr 2026

Slightly catching up on weeknotes, mostly as this was something that should have been simple, but was instead a little frustrating due to hitting issues with GPUs.

Last year in the LIFE team we started trying to consider edge-effects on species habitats. The theory is that if you have a species that, say, likes forests, we typically calculate its area of habitat assuming it will occupy all the forest pixels within its range (assuming they're also within its elevation preferences). However, it's documented, for some species at least, that if that forest borders, say, an urban area, the species will avoid going within a certain distance of the edges of the forest. This means that we can't just assume that all the forest pixels within a species' range are actually habitable: we need to discount those that are on the interface between habitats the species likes and habitats it doesn't.

This is important when preserving land. You might say, well, we'll convert so much of a natural area to farmland, let's say 50%. If we don't consider this edge effect, then regardless of whether we convert it into one single contiguous area of farm and one single contiguous area of natural land, or we do a checkerboard of 1 metre squares of farmland and natural land, we'd get the same result for habitable area. But if we do consider these edge effects, there's a good chance that in the (admittedly impractical, but you get the idea) checkerboard pattern there would be no habitable area for some species, vs close to 50% in the contiguous blocks version.
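As a rough illustration of that comparison, here is a minimal sketch on a toy grid. It is not how LIFE or Yirgacheffe models edge effects: the function name `habitable_pixels`, the one-pixel `edge_distance`, and the choice to ignore anything beyond the grid boundary are all assumptions made for the example.

```python
def habitable_pixels(grid, edge_distance=1):
    """Count habitat pixels with no non-habitat pixel within
    `edge_distance` pixels (in any direction, diagonals included).

    Assumption for this toy: pixels beyond the grid boundary are ignored,
    so the outer border of the grid does not itself count as an edge.
    """
    rows, cols = len(grid), len(grid[0])
    reach = range(-edge_distance, edge_distance + 1)
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            near_edge = any(
                0 <= r + dr < rows
                and 0 <= c + dc < cols
                and not grid[r + dr][c + dc]
                for dr in reach
                for dc in reach
            )
            if not near_edge:
                count += 1
    return count


size = 8
# Left half natural land, right half farmland.
contiguous = [[c < size // 2 for c in range(size)] for r in range(size)]
# Alternating single-pixel squares of natural land and farmland.
checkerboard = [[(r + c) % 2 == 0 for c in range(size)] for r in range(size)]

for name, grid in (("contiguous", contiguous), ("checkerboard", checkerboard)):
    naive = sum(map(sum, grid))  # every habitat pixel counts
    print(name, naive, habitable_pixels(grid))
```

Both layouts have 32 habitat pixels out of 64 if edges are ignored. With a one-pixel edge effect the contiguous layout keeps 24 of them and the checkerboard keeps none; on a larger grid the contiguous figure gets closer to the naive 50%.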

One of the challenges here is that we don't have good data on what is a reasonable assumption on edge effects per species, and whether it applies to all habitat boundaries or not (e.g., forest to urban might be less tolerated than forest to grasslands). However, given that land use changes do tend to lead to a fragmented landscape, it's something we want to investigate a little.

To that end, after talking about various options, we decided to do an analysis of the fragmentation of the landscape of Brazil, using the more detailed Mapbiomass land cover maps. Normally we operate at a global scale with maps at around 100m downsampled to 1km, but because edge effects are typically measured in tens to hundreds of metres, we wanted to use a known high quality and high resolution land cover map for our initial investigation into this.

In theory, this was meant to be a simple task for me to generate all the AOHs for the species that overlap with Brazil, as I generate AOHs day in day out - it's probably the third thing I'm most popularly known for, after guitars and meese. Even the fact that Mapbiomass works at 30m didn't concern me that much, as normally we're operating globally, and so I felt that the reduction in area to just cover Brazil would counter that. But in the end it was more "interesting" work than I'd expected.

Firstly, the polygon rasterization of the species ranges was generally slower, and I actually had to spend some time looking into ways to improve the performance of GDAL's rasterization of the polygons. What I believe was the cause here is that because we're zoomed in, the size of the ranges is generally larger as a percentage of the area of interest, so there are more pixels to fill. Brazil is also problematic because it has a complex coastline and a lot of species ranges follow the coastlines. I've written (or ranted) in the past about how I think this is bad practice, so I won't go into it again here, but those familiar with me will know that I get wound up over this one :)

The other problem was that Yirgacheffe is a little naive when it comes to working out if it needs to do any work with a range. I'd initially just decided to not filter the species for those that overlap with Brazil, and just rely on Yirgacheffe to not do any work when asked to calculate the AOH for a habitat map that just covered Brazil. But it turned out that I was getting bit a lot by this:

Image: A black and white image showing bits of the world highlighted. On the top left is the shape of Alaska and to the right is most of Europe, Russia, and down into India. The bottom left of the image, which would be over most of North and South America, is blank.

Here you can see that the bounding box for this species happens to overlap with Brazil (which should cover the lower left of the image) but not in a useful way. Now, there's enough information for Yirgacheffe to work out that this is the case, but it's not been a problem before so I've never done so. My guess is that with central and northern parts of South America being a hot spot biodiversity wise, I was hitting more cases of this than I'd assumed. So I had to actually filter the species beforehand, which meant the 34K species of terrestrial vertebrates dropped down to 7K.
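To make the bounding box problem concrete, here is a small sketch with made-up, very rough coordinates. The `BBox` type and the `intersects` and `union` helpers are hypothetical names for the example, not part of Yirgacheffe's API, and the boxes only loosely approximate the regions in the image.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Axis-aligned box in degrees: (west, south, east, north)."""
    west: float
    south: float
    east: float
    north: float


def intersects(a: BBox, b: BBox) -> bool:
    """True if the two boxes share any area."""
    return (a.west < b.east and b.west < a.east
            and a.south < b.north and b.south < a.north)


def union(boxes: list[BBox]) -> BBox:
    """Smallest single box that contains every box in the list."""
    return BBox(
        min(b.west for b in boxes),
        min(b.south for b in boxes),
        max(b.east for b in boxes),
        max(b.north for b in boxes),
    )


# Illustrative numbers only, not real range data.
brazil = BBox(-74.0, -34.0, -34.0, 5.0)
range_parts = [
    BBox(-168.0, 55.0, -140.0, 71.0),  # roughly Alaska
    BBox(-10.0, 0.0, 140.0, 72.0),     # roughly Europe across to south Asia
]

# Testing the whole range as one box reports an overlap...
whole_range_hits = intersects(union(range_parts), brazil)
# ...but no individual part of the range comes anywhere near Brazil.
any_part_hits = any(intersects(part, brazil) for part in range_parts)

print(whole_range_hits, any_part_hits)
```

The single box spanning both parts reaches across the Americas, so it reports an overlap even though neither part does. A per-part test is still only a conservative filter, since a part's own box can overlap Brazil without the polygon inside it doing so.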

I will address this in a future Yirgacheffe update, but now wasn't the time as I'm already in the middle of a re-write for Yirgacheffe 2.0. Having Yirgacheffe aware of non-contiguous regions actually would make sense for group layers, where Yirgacheffe lets you take a bunch of rasters and treat them like tiles - that will have similar inefficiency problems to this also.

The other thing that bit me was that I've started having serious issues with MLX, the numpy-like framework that I use on Apple Silicon to take advantage of the GPU. I've known about this for a little while, but it really came to the fore doing this task.

In the past I've been somewhat protected from stressing out MLX too much due to limitations within Yirgacheffe, both down to the API design and how I processed things internally. But now that Yirgacheffe has grown more expressive and optimised, it's putting more pressure on the MLX layer, as more of the underlying calculation is getting to MLX unfiltered. I also suspect, though can't confirm this, that at the same time MLX has got better in some ways, as the issues I'm about to go into are less frequent in older versions of MLX.

Ultimately, the main issue I've been hitting until now is watchdog timeouts from the operating system when a GPU task takes too long. The reason I'm hitting this is that I'm now able to pass more work to MLX, and it seems MLX may now be building bigger kernels. So whereas before Yirgacheffe was feeding the GPU a lot of small tasks, it's now doing fewer, larger tasks. But the GPU doesn't really expect a single job to take five or more seconds, and I get exceptions from the OS that then take out my process.

And it's in one of the most simple calculations that I'm getting bit by this issue, the species richness calculation. In this we take all the Area Of Habitat maps we've generated, one per species, and add them together. In older Yirgacheffe I just kept an in-memory raster that I accumulated the result to, one AOH at a time, which meant MLX only saw a single image addition, which is a bit wasteful really given how much a GPU can do. But now Yirgacheffe can just be told to add up seven thousand rasters, and it tries to do this, but because MLX is lazily evaluating the work that Yirgacheffe is also doing lazily, we end up with MLX trying to do too much at once and us timing out.

The MLX documentation does tell you that you should perhaps keep to under a thousand operations in a single expression, and I'd just never paid attention to this as it had never been a problem before, but now Yirgacheffe is doing a better job at taking the user's problem as its own, rather than making the user work harder, MLX is getting given too much to do at once.
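One way to picture keeping each expression under a budget like that is to force an evaluation every so many additions. The sketch below is a toy, not Yirgacheffe's code and not MLX's API: the `Lazy` class is a made-up stand-in for a lazily evaluated array, each "raster" is a single number to keep it short, and `evaluate` is a hypothetical hook that would be replaced by whatever forces computation in the real backend.

```python
class Lazy:
    """Toy lazy value: records work to do instead of doing it."""

    def __init__(self, thunk, ops=0):
        self.thunk = thunk  # callable that produces the concrete value
        self.ops = ops      # number of pending additions in this expression

    @classmethod
    def leaf(cls, value):
        """Wrap an already-computed value; it has no pending work."""
        return cls(lambda: value, 0)

    def __add__(self, other):
        return Lazy(lambda: self.thunk() + other.thunk(),
                    self.ops + other.ops + 1)


def chunked_sum(rasters, max_ops, evaluate):
    """Sum lazy rasters, forcing evaluation every `max_ops` additions."""
    total = None
    pending_ops = 0
    for raster in rasters:
        if total is None:
            total = raster
            continue
        total = total + raster
        pending_ops += 1
        if pending_ops >= max_ops:
            total = evaluate(total)  # collapse the graph built so far
            pending_ops = 0
    return None if total is None else evaluate(total)


graph_sizes = []


def evaluate(expr):
    """Hypothetical hook: compute the value and note how big the graph was."""
    graph_sizes.append(expr.ops)
    return Lazy.leaf(expr.thunk())


# Seven thousand one-pixel "AOH rasters", each contributing 1.
aohs = [Lazy.leaf(1) for _ in range(7000)]
richness = chunked_sum(aohs, max_ops=200, evaluate=evaluate)
print(richness.thunk(), max(graph_sizes), len(graph_sizes))
```

The result is the same as a single big sum, but no single evaluation ever sees more than `max_ops` pending additions. The batch size here is arbitrary; picking a good one for a real GPU backend is exactly the hard part.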

I took a quick stab at trying to have Yirgacheffe checkpoint the expressions, but I think I then hit another problem: I ended up spamming the system with too many different shapes of calculation, as I divided up problems into blocks based on a weak heuristic of cost I'd implemented. The OS got so upset with so many different GPU kernels being requested that it actually caused my machine to become unusable (as I guess the GPU is also handy for drawing the macOS interface :). In the end I reverted this attempt to solve the problem, as I was just rushing it given I was already juggling the edge-effect task and the 2.0 re-write and my lidar work. The other thing here is CPU parallelisation: because MLX was being underutilised in the past I could mix more CPU and GPU parallelism, but now that MLX is carrying more of the work I need to pick one approach and stick with it, as by using CPU parallelism to hand stuff to the GPU I'm able to prepare more work than the GPU can consume.

In the end I just accepted the slowness of having to do things on the CPU for now and my species richness calculations took an embarrassingly long time to calculate. I should really have just reverted to my old code and an older version of Yirgacheffe, but I'd had enough messing around, and just left it to run over the weekend.

I guess an example of two rights (Yirgacheffe and MLX both getting better) making a wrong.

Now that the edge-effect results are sent off for closer inspection by my ecology colleagues and Yirgacheffe 2.0 is pretty much wrapped up, I'll have to turn my attention back to how I schedule work in Yirgacheffe to better balance that.

Tags: weeknotes, yirgacheffe, mapbiomass, edge effects

Tech notes by Michael Winston Dales
