Weeknotes: 11th April 2023

The previous week

I was in Snowdonia reminding myself of natural habitats for a Michael (I grew up in Scotland, so Cambridge is a bit flat to me :)

We also went to the Dyfi Osprey Project and saw the ospreys at their nest.

The previous work week

Swift-GDAL alternative code

As a follow-on from the prior week’s work writing Swift code to parse GeoTIFF files, I wrote a small Swift library to parse and decode GPKG files, which means I can pull out vectors and rasterise them into GeoTIFFs, all in Swift.

I’m using a cross-platform graphics library to do the low-level rasterising. It maps onto Cairo on Linux, which means this code works on both macOS (as you’d expect) and Linux. I had to make a few changes, which I’ll try upstreaming, but I need to do more Linux testing of these so they make a nice PR rather than just “it worked for Michael”. I also need to add some missing features, like holes in polygons.

Getting to this stage was useful, as it meant I had to not only write code to parse GPKG files and the WKB polygon format, but also understand the GeoTIFF format well enough to generate a file that QGIS would accept as a valid GeoTIFF.
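For a flavour of what the WKB side involves, here is a minimal sketch in Python (not the Swift library itself) that decodes a polygon into its rings. The function name parse_wkb_polygon is made up for illustration. It assumes a plain 2D WKB polygon, and it does not handle the extra GeoPackage header that sits in front of the WKB in a GPKG geometry blob.

```python
import struct

WKB_POLYGON = 3  # WKB geometry type code for a 2D polygon


def parse_wkb_polygon(data: bytes) -> list[list[tuple[float, float]]]:
    """Decode a plain 2D WKB polygon into a list of rings of (x, y) points.

    Hypothetical helper for illustration only: no Z/M coordinates, no SRID,
    and no GeoPackage header handling.
    """
    # The first byte is the byte order flag: 0 = big endian, 1 = little endian.
    if data[0] not in (0, 1):
        raise ValueError(f"unexpected byte order flag: {data[0]}")
    endian = "<" if data[0] == 1 else ">"

    # Next come two uint32 values: the geometry type and the ring count.
    geom_type, ring_count = struct.unpack_from(endian + "II", data, 1)
    if geom_type != WKB_POLYGON:
        raise ValueError(f"not a 2D polygon, got geometry type {geom_type}")

    offset = 9  # 1 byte order flag + 4 byte type + 4 byte ring count
    rings = []
    for _ in range(ring_count):
        # Each ring is a uint32 point count followed by that many x, y doubles.
        (point_count,) = struct.unpack_from(endian + "I", data, offset)
        offset += 4
        coords = struct.unpack_from(f"{endian}{2 * point_count}d", data, offset)
        offset += 16 * point_count
        rings.append(list(zip(coords[0::2], coords[1::2])))
    return rings
```

The first ring returned is the exterior of the polygon and any further rings are holes, which is the part a rasteriser then has to deal with.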

The next step is to get it to run the persistence calculator AoH algorithm, so I can do some performance comparisons with our Python workflow before adding in some GCD magic to parallelise it (which is why I went down this rabbit hole to begin with).

Ark

I caught up with Patrick on Ark things, and he acted as a useful anchor to stop me worrying about the mess that is Docker storage options, and just try to get some of our problems into a working Ark proof of concept.

ChatGPT

Anil and I tried to replace me with ChatGPT, but alas we were unsuccessful, and so 4C is stuck with me for a bit longer. We basically asked it to try and code up their AoH calculation stage from the persistence pipeline, on the grounds that it’s a reasonably simple problem to communicate. I have to confess to being generally unexcited by “AI”, which to me has always just been a way to scale up “if” statements beyond human comprehension, but I had to change my mind after this, because what we found was a tool that could enable people to build better things, and that sort of thing is something I am very interested in.

Looking at the code ChatGPT produced, as a computer scientist I can see that it would have issues with memory footprint (saving me from looking for a new job, for now), but it otherwise did a good job of generating understandable code that mostly did the right thing. I don’t mind that it didn’t give the perfect answer: it was close enough that I’m convinced any of the ecologists on the team could use it as a basis to build on. Anil is thankfully prompt savvy enough that he structured the conversation so that the code generated was very concise and readable.

I think what excites me about this is that whilst I enjoy helping the ecologists improve their code, I don’t scale well, but ChatGPT does. I’d be very interested to see, if we gave a few of the ecologists paid ChatGPT accounts today to act as co-pilots, whether it’d be an enabler or a hindrance. I’d like to think the former. And if it isn’t, I wonder how much better it would be if, rather than writing Yirgacheffe or other nice libraries, I just helped ChatGPT understand the domain more.

For example, when pushed on concurrency, ChatGPT did point us to Dask, a common way to scale Python on servers, but it’s not something I’d expect an ecologist to want to invest time in learning how to set up and deploy. So could I teach it about Yirgacheffe? Could I find another way that worked well with ChatGPT advice?

In general, I’m still not excited about ChatGPT as a technology, but as a tool to enable those without compsci training to leverage technology properly, given how poorly programming languages and operating systems scale to meet their needs, that I do find interesting.

I sorted myself out with a ChatGPT account but have yet to do much with it due to hours in the day being fixed.

This week coming

Our main compute server went down again, due to memory overcommitment. This is a good example of where operating systems just give up these days. I guess it’s not a topic exciting enough for the OS research community, but it feels like we’re happy to have virtual memory as a way to give multiple processes distinct memory spaces, yet we don’t really try to help applications manage their memory access well. Thankfully, despite the Python Yirgacheffe library not doing concurrency yet, it does provide memory management support, so I’m going to chat to Tom B about how we might move him over to this so he can stop worrying about flooding sherwood’s memory. This also gets me yet another use case example, so hopefully everyone wins.
I need to read through Keshav’s methodology document
Implement the AoH calculation with my Swift tooling
Come up with an Ark proof of concept plan with Patrick for when Anil returns

Interesting links

Funktal: A minimal functional language designed to have a low carbon footprint https://limited.systems/articles/funktal/

Tags: weeknotes, swift, chatgpt

Tech notes by Michael Winston Dales