Weeknotes: 3rd February 2025

Last week

STAR

I finally got time to sit down and catch up with trying to recreate STAR, the IUCN's Species Threat Abatement and Restoration metric, which is similar to, but different from, the LIFE metric that I work on. Before Christmas I'd been working with Chess Ridley who maintains and runs the current STAR pipeline for the IUCN, trying to match up my species selection process with hers. Since then I'd found out some more nuances to both the IUCN redlist data and to my ability to write SQL that queries JSON fields in Postgres, and I'm now sitting closer to Chess's results, but not quite there yet:

  • AMPHIBIA
    • Total Chess: 3249
    • Total Michael: 3260
    • In Chess data, not in Michael data: 0
    • In Michael data, not in Chess data: 11
  • AVES
    • Total Chess: 2180
    • Total Michael: 2214
    • In Chess data, not in Michael data: 2
    • In Michael data, not in Chess data: 19
  • MAMMALIA
    • Total Chess: 1674
    • Total Michael: 1694
    • In Chess data, not in Michael data: 0
    • In Michael data, not in Chess data: 20
  • REPTILIA
    • Total Chess: 2090
    • Total Michael: 2212
    • In Chess data, not in Michael data: 0
    • In Michael data, not in Chess data: 123
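The numbers above boil down to set differences on the species identifiers in each list. A minimal Python sketch, assuming each pipeline's selection is available as a collection of taxon IDs (the `compare_selections` helper and the IDs below are made up for illustration and are not part of either pipeline):

```python
def compare_selections(chess_ids, michael_ids):
    """Summarise how two species selections differ, using set arithmetic."""
    chess = set(chess_ids)
    michael = set(michael_ids)
    return {
        "total_chess": len(chess),
        "total_michael": len(michael),
        # Species present in one selection but missing from the other.
        "only_chess": sorted(chess - michael),
        "only_michael": sorted(michael - chess),
    }


# Hypothetical taxon IDs, purely for illustration.
report = compare_selections([101, 102, 103], [102, 103, 104, 105])
print(report)
```

Listing the IDs in each difference, rather than just counting them, is what makes it possible to go and inspect the individual species that disagree.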

I suspect I'm still not filtering quite correctly on threats in the redlist data, which means I'm including more species than I should. In STAR you only need to deal with species that have active threats on them, and early on I had an issue where I was still counting threats that had a timing field value to indicate the threats had gone away. My suspicion is I'm still including too many threats, but I need Chess to help guide me as to where. The two bird species that Chess has that I don't are there because in the proper STAR evaluation they supplement the redlist data with some data direct from BirdLife International that isn't yet ingested into the redlist, so for me those two species don't have ranges.
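As a rough illustration of the kind of threat filtering involved, here is a small Python sketch that keeps a species only if at least one of its threats is still active. The record layout, the `has_active_threat` helper, and the choice of which timing values count as inactive are all assumptions for the sake of the example, not the actual STAR rules:

```python
# Assumption: timing values that mean a threat is no longer active. The real
# STAR pipeline may draw this line differently.
INACTIVE_TIMINGS = {"Past, Unlikely to Return", "Past, Likely to Return"}


def has_active_threat(threats):
    """Return True if any threat has a timing outside the inactive set.

    A threat with no timing recorded is treated as active here, which is
    itself an assumption.
    """
    return any(t.get("timing") not in INACTIVE_TIMINGS for t in threats)


# Hypothetical records: species name -> list of threat dicts.
species_threats = {
    "species_a": [{"code": "2.1", "timing": "Ongoing"}],
    "species_b": [{"code": "5.1", "timing": "Past, Unlikely to Return"}],
    "species_c": [],
}

selected = [name for name, threats in species_threats.items()
            if has_active_threat(threats)]
print(selected)
```

A species with no threats at all, or only past ones, drops out of the selection, so being too generous about what counts as active would inflate the totals in exactly the direction seen above.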

PostGIS on Docker on Apple Silicon

As I suspect I'll need to work a bit more peripatetically over the next few months, I wanted to duplicate the PostGIS I have on the AMD EPYC Linux server I use onto my M3 MacBook Pro - my vague aim is to be able to run both the LIFE and STAR pipelines locally. I'm using docker for running PostGIS on the AMD server, so I thought I'd do the same on the Mac, but there are still no ARM docker images for PostGIS. Whilst the x86 image will run on my Mac using translation, given I'm working with very large datasets, I ideally wanted a native image.

Thankfully, as per this post in the issue thread, there are images for ARM being built by others, which you can get by fetching from ghcr.io/baosystems/postgis.

Related to this, homebrew hides the client-only version of postgres in the libpq package, and because that package has a subset of the binaries that are in the full server install, it doesn't put the psql binary and friends on your path. You can thankfully force that by doing:

$ brew install libpq
$ brew link --force libpq

GeoMob talk

I gave my talk at GeoMob London, which probably went well, but they'd packed in many speakers so there was no time for questions/reactions, and so I've no real idea whether my message of needing to build datasets and tooling with ecologists in mind resonated with anyone, or if anyone knew of things I was missing. There was a post-event pub trip where all that was deferred to, but needing to get back to Cambridge I couldn't stick around for that, so I'm not sure it was worth the trip in the end, as I didn't get to have any conversations with folk.

Still, there were some interesting other talks. I definitely want to find time to check out Fused.io which seems to let you run snippets of python against datasets, where it's doing some clever optimisation of datasets in the middle to speed up queries? The talk wasn't super clear (again, the speakers were against the clock with so many packed in), but enough to make me want to learn more.

EEG seminar on fungi

Even if my talk was a wash, there was a great talk the following day in the EEG seminar series by Toby Kiers from SPUN, the Society for the Protection of Underground Networks, which looks at the importance of fungi in our ecosystem. I discovered that fungi live in a symbiotic relationship with plants, taking carbon from their roots and delivering things the plants need in return, like phosphorus, and they do so in a way where the fungi optimise what they do to get the best return on phosphorus delivery to the plants around them. It was quite a fascinating topic, and I've now got myself a copy of the book Entangled Life to learn more about this hidden world going on below the ground.

Weeknotes

Thanks to Anil I managed to get all of my old weeknotes that were in the 4C Notion system, which I can now slowly migrate to this blog. The technical porting is relatively painless, though it did make me expand my OCaml website framework to handle images using markdown syntax, and not just the shortcode syntax I'd developed from when I used Hugo for the website. However, the notes will need some light editing to remove names of students etc., which was fine for a group internal blog, but not appropriate for this more public forum, as they've not consented to being here.


I started using Patrick's library Hilite to try to add syntax highlighting back to the website, something that worked out of the box under Hugo, but which I now need to manage myself. Out of the box Hilite supports OCaml and related syntaxes, but I've had to add my own for other languages covered by this blog regularly: Go, Swift, and Python. Thankfully Hilite uses the old TextMate format for syntax definitions, of which there are many language definitions out there already.


I had a chat or two with Anil about our storage schemes, as he's also been doing similar but interestingly different things to what I've been doing with moving my websites over to my own backend. He's been doing a lot of interesting things with special slugs for different topics and contacts that I'd like to try to bring over to this website.

This week coming

Project ideas

I need to write up some student project ideas I have around Yirgacheffe, in terms of extracting better performance. Whilst I chip away at that when I can, doing both that and actual pipelines means more hands would always be welcome, and I think there are some interesting bits to be picked out and worked on.

LIFE

I'm still doing some spelunking on individual species in the species data for the LIFE maps, trying to understand why species are represented the way they are. I'll be doing some more of that, and talking later in the week with Daniele Baisero from the KBA Secretariat on their scoping tool, as they would like to include LIFE layers in that, and I think it has a nice UI that would be good for making inspection of the LIFE maps easier for people who aren't familiar with the code.

Yirgacheffe

I did some little bits of tidying with Yirgacheffe, but did not get around to making the Metal and CUDA backends swappable.

I also want to write a small how-to, using perhaps the AoH data or a species richness map as an example workflow. For that, though, I need to work out if blogging counts as publishing, as the IUCN data I use for things like this is open access, but requires permission for publishing.

Tags: weeknotes, star, postgis, geomob