Weeknotes: 24th November 2025
Last week
I continued to be unwell for all of last week, my brain feeling like it was made from cotton wool and only really clearing up now, so it wasn't the most productive of weeks. This coming week is busy and I'm behind on where I wanted to be by now, so I find myself somewhat on the back foot, and these notes shall again be brief.
Hybrid habitat maps
I've talked about the hybrid habitat maps I've been analysing these last couple of weeks, and I finally shipped my report to the team in the middle of last week (news that was met with the sound of crickets chirping, so much for me feeling under pressure to get this out). The tl;dr is that the new maps perform better based on a mix of intuition, comparison with global metrics, and AOH validation techniques, but I did show that they are unstable due to the random number generator used, with small rare species coming and going depending on how you roll the dice. This is why you should always seed your random number generators and publish that seed with your work. In fact, I'd go as far as to say that random number generators for data-science languages should not come with a default seed, but that's another discussion.
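As a minimal sketch of what I mean (this isn't the actual pipeline code, and the seed value here is made up for illustration):

```python
import numpy as np

# Pick a seed once, write it down, and publish it with the results,
# rather than relying on whatever the generator defaults to.
SEED = 20251124  # hypothetical value; the point is that it's recorded

rng = np.random.default_rng(SEED)

# Any dice-rolling over small rare species is now reproducible on re-run.
print(f"seed={SEED} -> {rng.random(3)}")
```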
The main outcome for me has been that this analysis let me use the Dahal et al. validation method for AOHs in anger, and now I want to look at how we might improve upon it so we can build this sort of analysis into future pipelines. To that end I've scheduled a meeting with Stuart Butchart, one of the original authors, and a few other interested folk for early December.
Yirgacheffe off by one, or the deep refactoring of geospatial
I got a report from Tom Ball that he was trying to use my hybrid map code (unrelated to the analysis), and he'd spotted it was generating an odd intermediate layer, where all the pixels were shifted one down and one across from where they should be.
Not really what I wanted to hear after doing all that analysis, though thankfully it doesn't change the overall message, and obviously I'm very glad it was found now, before we did more work on this pipeline.
I had a look into why this was, and the root cause is down to Yirgacheffe's attempts to deal with the fact that geospatial layers are often not aligned properly. For example, in this instance we have two input raster layers that are of the same pixel dimensions but have slightly different origins:
Layer 1:
- Size is 4320, 2160
- Origin = (-180.000823370733258,90.000411685366629)
- Pixel Size = (0.083333333333333,-0.083333333333333)
Layer 2:
- Size is 4320, 2160
- Origin = (-180.000000000000000,90.000000000000000)
- Pixel Size = (0.083333333333333,-0.083333333333333)
You can see that although notionally of the same extent and resolution, one of these layers is "ideal" on the EPSG:4326 grid and the other is not, with the offset being just a small fraction of the actual pixel size, so the right thing to do here is nearest neighbour matching. Yirgacheffe does try to resolve this, but the maths doing the rounding was optimising for a different alignment challenge, and so got this one wrong :/
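To make that concrete, here's a small sketch of the rounding decision involved, using the origins above (an illustration only, not Yirgacheffe's actual code):

```python
import math

pixel_size = 0.083333333333333
origin_ideal = -180.0               # the layer on the ideal EPSG:4326 grid
origin_off = -180.000823370733258   # the layer that is fractionally off

# How far apart are the two origins, measured in pixels?
offset = (origin_off - origin_ideal) / pixel_size
print(offset)              # ~-0.0099, i.e. under 1% of a pixel

# Nearest neighbour matching snaps to the closest whole pixel:
print(round(offset))       # 0 - treat the grids as already aligned

# But floor that same value and the tiny sliver becomes a whole pixel:
print(math.floor(offset))  # -1 - everything shifts one pixel over
```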
At least I know that I'm not alone in this sort of issue, as this tale of a weirdness in the Half-Life 2 re-release shows!
Anyway, I'm two days into the fix for this, and writing it up will have to wait until next week, when 1) I'm hopefully finished and 2) I have more time.
I will leave you with another example of floating point being something computers do poorly, which I stumbled over whilst trying to deal with the aforementioned issue: what is the missing answer here when you ask a computer?
>>> 20 % 10
0
>>> 20 % 1
0
>>> 20 % 0.5
0.0
>>> 20 % 0.1
????
And it's not just Python: both OCaml and Excel do the same thing, which to me is unexpected, but I assume is a result of the CPU implementation of IEEE floating point numbers (0.1 has no exact binary representation, so the value actually stored is a sliver over 0.1, and the remainder falls just short of where you'd expect).
Oh, and if I ask Python in a different way, which I can't use in my code, I will get the expected answer 🤦 In general I do think the types we use in computers have gotten us a long way, but they're not well suited to a lot of the things we use them for, be it data science or video games, as the HL2 example shows. We've thrown more and more silicon at solving other issues with CPUs, yet we seem happy to stick with a level of accuracy that was a struggle for computers thirty years ago, and GPUs have, if anything, made it slightly worse. If I struggle with this as someone who understands the hardware involved, how on earth are vernacular developers in other domains meant to cope?
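By way of illustration (and without spoiling the missing answer above), doing the arithmetic in base ten, where 0.1 is exactly representable, gives the remainder you'd expect on paper; Python's decimal module is one such route, though not one you can use in float-based raster code:

```python
from decimal import Decimal

# In base ten 0.1 is exact, so the remainder is what you'd compute
# on paper, unlike the binary floating point version above.
print(Decimal("20") % Decimal("0.1"))  # 0.0
```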
Webplats
I did a bunch of interface tidying on the OCaml framework I use to host this and my other sites, which was good brain-off work. This was meant to be a prelude to adding search, but I never got there due to brain fog. I implemented my own search system in Swift for the sites back when they were static, and the plan is just to port that to OCaml.
This week
- This week I have a two-day workshop with some folk from the IUCN Red List, trying once more to help them run my STAR implementation, and to chat about other method refinements.
- Finish the fixup to that issue in Yirgacheffe.
Tags: yirgacheffe