Weeknotes: 21st August 2023

The week in review

Tropical Moist Forest Evaluation Method Implementation

We had a meeting last week to review the results from the TMFEMI for our standard test project, and the main outcome (for me) was that I had misunderstood how this work would be evaluated. I had expected there to be counterparts to Patrick and me on the ecology side to help check the results, but it turns out that Patrick and I need to do this ourselves, which at least explains why things had stalled.

Tom is going to provide us with some of his data so Patrick can look into the additionality calculations. There’s not currently much I can do to assist, but I stand/sit poised to help if I can.

We’ve had some issues running other projects as we hit corner cases in the project data. One problem is that it turns out Python multiprocessing will just restart a crashed child and pretend nothing has happened, which meant that when one worker in the find_potential_matches.py section crashed, the collector would wait forever for lemon soaked paper napkins^W^W^W^Wdata that would never arrive. Thankfully you can implement a workaround for this, though I’d not call it obvious enough that someone without a computer science background would come up with it, which for a tool aimed at generalists is a letdown.
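To give a flavour of what such a workaround can look like, here is a minimal sketch. It is not the code in find_potential_matches.py: the names worker and collect, the doubling "work", and the polling interval are all made up for illustration, and it assumes the POSIX-only "fork" start method to keep things short. The idea is that the collector never blocks indefinitely on the queue; it waits with a timeout and, whenever nothing arrives, checks whether any child has exited with a non-zero code.

```python
import multiprocessing as mp
import os
import queue

# Assumption: the "fork" start method (POSIX only) keeps this sketch short.
CTX = mp.get_context("fork")


def worker(task, results):
    """Hypothetical worker: 'crashes' on a negative task, else reports a result."""
    if task < 0:
        os._exit(1)  # simulate a hard crash: no exception, no result sent
    results.put(task * 2)


def collect(tasks, poll_seconds=0.2):
    """Run one process per task and gather results, failing fast if a child dies."""
    results = CTX.Queue()
    procs = [CTX.Process(target=worker, args=(t, results)) for t in tasks]
    for p in procs:
        p.start()

    collected = []
    try:
        while len(collected) < len(tasks):
            try:
                # Wait with a timeout rather than blocking forever.
                collected.append(results.get(timeout=poll_seconds))
            except queue.Empty:
                # Nothing arrived: see whether any child exited abnormally.
                crashed = [p.exitcode for p in procs if p.exitcode not in (None, 0)]
                if crashed:
                    raise RuntimeError(f"worker exited with code(s) {crashed}")
    finally:
        # Clean up any children that are still running.
        for p in procs:
            if p.is_alive():
                p.terminate()
            p.join()
    return sorted(collected)
```

With this, collect([1, 2, 3]) returns the results as normal, whereas collect([1, -1]) raises a RuntimeError shortly after the simulated crash instead of hanging.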

Julia

I continued to explore using Julia as an alternative to Python after the previous week’s success, but progress was a little slower this time round.

First up, I tried to implement more of find_pairs.py in Julia, trying to get it to do all 100 rounds of finding pairs at once, as we do in the Python version (last week I just implemented a single pass). I suspect it’s just me doing something naive, but when I used the suggested Threads approach, the script seemed to use as many cores as I asked for, yet the run time was about the same as doing the work on one core in a serialised loop, and sometimes it was even worse! For it to be this bad, I feel it really must be a user problem, so I need to dig into why.

On the flip side, I did just run the single iteration version with littlejohn for comparison, and I got the time for find_pairs.jl to be about three minutes, plus or minus a minute depending on your random seed (this is compared to around six minutes for the Python version).

I also looked into dealing with rasters and shapefiles in Julia, and here things feel a lot further behind the world of Python and R. Whilst there are a bunch of libraries built upon GDAL etc., I struggled to find direct equivalents of the kinds of operations we do with geopandas/shapely/yirgacheffe.

I’ve signed up to the Julia Slack, which seems reasonably active, and there is a geo channel in there, so I suspect I’ll put together some examples and see if I can find a way this might work.

Biodiversity

I did a bit of data processing for Alison to help find examples of cells where there was a negative impact from restoration, so she could see if this was due to an error in the analysis or actually correct.

I also helped Tom B diagnose a rounding issue in some code I’d written for him a while ago, such that he was then able to make it work.

Related to this, Anil asked me to look into some visualisation tools for the biodiversity data. Thus far we’ve been using kepler.gl as a tool to visualise the hex tile data, but that has limitations, and it’d be nice if we could express queries to it. We did some work earlier in the year shoving all this data into a Postgres database, and that seemed fine, so putting a web frontend on this is a logical next step.

However, I took a look at kepler, and whilst you can add it as a dependency for a website, it is entirely React based, which isn’t a framework I’ve used in the past and, given all the other things we’re doing, isn’t one I feel I have the time or energy to become an expert in. Unfortunately I suspect we need to deliver something here to help with funding efforts, so I’ll continue to look for alternatives.

This coming week

I’m away for most of this week, attending a conference up in Hebden Bridge, so not much will get done. But my priorities are:

Assist with TMFEMI work where possible
Look into data visualisation possibilities