Nordic-RSE 2025 event blog

31 May 2025

Tags: conference, nordic-rse

This is a summary of last week's 2025 Nordic-RSE conference, held in Gothenburg, Sweden. Whilst I'm not technically a Research Software Engineer (RSE), a lot of my role involves essentially the same activities, working on ecology pipelines like LIFE, STAR, and so on; indeed, I'm a member of the UK Society of Research Software Engineering. Not only do I effectively act as an RSE for a good amount of my time, but it's also a part of my job I enjoy: collaborating with experts in other fields whilst getting to use my own expertise and learning something along the way is often quite satisfying.

My role at the conference was twofold: to learn more about how others work in the domain so I can pick things up for when I'm acting as an RSE, but also, wearing the other hat of my role as someone trying to build tools to support reproducible/repeatable scientific pipelines, to look at how our work to date on things like Shark might connect with that.

Disclaimer: all these summaries are filtered through my own thoughts, so what I put here isn't necessarily the opinion of the speaker, but rather my interpretation. If you want a simpler summary of just the facts, you can look at the group notes from the event. Apologies to speakers if I've misinterpreted their words - please do correct me if so!

A group photo of about forty research software engineers, standing or kneeling, inside a building.

(Thanks to the organisers for taking a picture of us all!)

Day 1

Intro by Matteo Tomasini, Nordic RSE

One of the things I loved about the conference was that it was still small enough that I got to know a good proportion of the attendees. In the introduction Matteo Tomasini revealed that there were 45 people this year, up from 30 last year, which was also the first year.

There was a bit about what makes an RSE, particularly as in most institutions in the Nordics (except Aalto) there is no official RSE job (unlike in UK universities, where RSE is now an officially recognised role). Generally in the RSE community, both in the UK and in the Nordics, it is recognised that a lot of people act as de facto RSEs without having the term in their job title, and as such I've found both communities to be welcoming to those of us who self-identify as RSEs, and so it was with this conference. Matteo defined it as:

  • If you develop software for research
  • You're the go-to in your group for software work/questions
  • You support the other researchers in your group
  • If you feel like one

I liked this broad definition in the opening, as it made it clear that everyone was welcome here.

Matteo also touched on what Nordic-RSE does:

  • This conference
  • A community Zulip chat for members
  • A weekly online coffee meet (9am CET on Thursdays)
  • A bi-weekly online community meeting

It's clear the group has ambitions to help foster the RSE role in the Nordics, and throughout the conference the UK's Society of Research Software Engineering (of which I'm a member, though I couldn't make their conference last year) was cited as being about five years ahead of where this group wants to be.

Keynote: Clarity, Not Consensus: Rethinking Unity in Open Science by Rebecca Willén, IGDORE

This was an interesting keynote on the quest for "open science". Rebecca Willén is the founder of IGDORE, the Institute for Globally Distributed Open Research and Education, which she founded at the end of her PhD as a champion for reproducible science.

She started by explaining that there was a revolution in psychology in 2012, with research looking at the field and questioning the reproducibility of its results and how selective people were being about what they presented. This isn't necessarily scientific misconduct, but with the push to get published people might slip into what are defined as Questionable Research Practices (QRPs). Examples of this were:

  • P-hacking or data torture (selective results)
  • HARKing - finding a thing of interest in the data and then pretending that this was your hypothesis all along

The QRP framing is meant to go beyond deliberate misleading, and as a computer scientist interested in tools for reproducibility who has worked with many vernacular programmers, I think that computers amplify QRPs by making it hard to do a good job of understanding lineage/provenance. I need to dig more into QRPs, and I think the citations for this are:

I also found this more recent 2021 book, The Problem with Science: The Reproducibility Crisis and What to do About It by R. Barker Bausell (specifically chapter 3), which seems to cover the topic in detail. Lots of interesting things to follow up on.

Back to the talk. Out of this 2012 epiphany in the psychology research community spun an attempt to do better - a theme we'll see repeated later in Ina Pöhner's talk about the pharmacy community - and a push towards open science.

Rebecca then presented what she felt were the five tenets of open science that people talk about, each of which had many subcategories that I didn't manage to record, but at a high level they were:

  • Open access to knowledge and resources
  • Access to infrastructure and enabling tools
  • Reproducibility and scientific quality
  • Research culture and work environment
  • Diversity and inclusion

The first two were listed as accepted requirements in the open science world, at least within IGDORE, while the last three are still being debated.

Rebecca made a comparison at this point to the open source software movement, giving a historic overview and pointing out how over time that movement went from being a moral one (people should have the right to examine and modify the code they run) to being more of a quality bar (aka libre vs gratis).

"The Free Software movement and the Open Source movement are today separate movements with different views and goals, although we can and do work together on some practical projects." - https://www.gnu.org/philosophy/free-software-for-freedom.html

Rebecca identified this same theme in the timeline of open science:

  • Open access, arXiv, Creative Commons - late 1990s
  • Protocols for clinical trials made mandatory in 2005 - open and version controlled
  • Work showing QRPs are common - 2011
  • The Open Science Framework - developed for psychology, now used across the social sciences. It describes the process of pre-registration: saying what you're going to do before doing the research
  • Reproducibility added to open science with the intent that it prevents QRPs
  • The Center for Open Science - around the same time as the Open Science Framework, but this starts the shift from morality to quality, similar to that shift in the OSS world
  • The UNESCO open science toolkit factsheet (2022), specifically its enumeration of tenets - the quality shift is now appearing here

My personal opinion is that tech culture did lose track of the morality of openness in favour of the "open speeds up the tech sector" argument - part of the enshittification we see today, I guess, though some of that is also just unchecked capitalism catching up with the naive tech optimism of prior decades. But I digress.

At this point I got a little confused as to which tenets Rebecca was advocating for - I wasn't sure which parts of the original five-tenet list and of the UNESCO definition of open science she saw as being about the moral purpose of open science versus box-ticking open science while carrying on doing what you were going to do anyway. But what was clear was that in IGDORE they'd had a loss of momentum because of this pull in different directions over what it means to be open science; they hadn't realised the split was happening, consensus was lost in the organisation, and as a result little useful got done for several years.

So I'm not sure I agree on which tenets should be in or out of a definition of open science, but I do see that the split that happened in the tech community around libre/gratis could also be a challenge for the science community. For me, though, the main takeaway was learning about QRPs, as this has given a name to a whole bunch of things I've thought about but never had a way to tie together.

Design Patterns: The secret to smarter, reproducible research by Marine Guyot, CodingResearcher

The next talk was by Marine Guyot, a freelance RSE, on using design patterns in building software for research. The motivation for the talk was what I feel must be a very common pattern, which she told via the persona Anna:

  • Anna makes a script to save time for her own research
  • Others use it
  • Other users ask for small modifications....
  • Now Anna is trying to juggle hacking on this script vs her own work - quality suffers due to time pressures, etc.

Then either at some point the script will be recognised as critical and a team will form around it, or Anna will carry on trying to maintain it alone and burn out.

I feel there is another option, which is that the software is abandoned and something is lost, but I guess that's not part of the narrative for a talk on how to design better software.

The rest of the talk focussed on design patterns in software, a topic I won't try to reiterate here as there are good books on it. The premise is that if you make something useful, others will want changes, and unless you put structure in place to manage those changes early on, you'll pay for it later. Something I suspect most people know (at least by the time they write software a second time :), but I suspect few people think of their software as being anything other than a quick thing they do to try to get a result for their work. It's like the old question about when a number of things becomes "many".
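
That said, to make the general idea concrete for the Anna scenario, here's a minimal, entirely hypothetical Python sketch of one such pattern (Strategy): each variant of an analysis step sits behind a common interface, so the core script doesn't grow a new if-branch every time someone asks for a tweak.

```python
from abc import ABC, abstractmethod


class Normaliser(ABC):
    """Interface that each normalisation strategy implements."""

    @abstractmethod
    def normalise(self, values: list[float]) -> list[float]:
        ...


class MinMaxNormaliser(Normaliser):
    def normalise(self, values: list[float]) -> list[float]:
        lo, hi = min(values), max(values)
        return [(v - lo) / (hi - lo) for v in values]


class ZScoreNormaliser(Normaliser):
    def normalise(self, values: list[float]) -> list[float]:
        mean = sum(values) / len(values)
        std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
        return [(v - mean) / std for v in values]


def run_analysis(values: list[float], normaliser: Normaliser) -> list[float]:
    # The core pipeline stays unchanged when a new normalisation is requested;
    # users just pass in a different strategy object.
    return normaliser.normalise(values)


if __name__ == "__main__":
    data = [3.0, 7.0, 10.0]
    print(run_analysis(data, MinMaxNormaliser()))
    print(run_analysis(data, ZScoreNormaliser()))
```

The names here are made up for illustration; the point is only that the structure absorbs the "small modifications" requests that otherwise land on Anna.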

The best nugget was in the Q&A at the end:

Audience Q: What's the best thing I should do for the handover (from RSE to researcher)?

Marine A: Documentation.

In the modern era, what can we learn from the history of free and open software? by Richard Darst, Aalto University

Richard Darst gave a talk on the history of open source software, looking at how it has evolved over time, and at how to deal with some of the challenges in opening up code (and maybe data or science?) today. Richard's slides are quite readable and available here, so I won't attempt to recap them.

I enjoyed the talk, and learned a bunch about how Debian views things via his overview of the Debian Free Software Guidelines, including the tests they use to help decide if a thing is truly open, such as the desert island test and the dissident test.

One note that struck a chord, after some recent experiences we've had with primary data sources:

"In short term closed may be better, but more people will improve the open option long-term"

In our case, a group producing open digital elevation maps we've used in the past has switched to a restrictively licensed open version and a paid version if you want to avoid those restrictions, which feels quite short-sighted, particularly given we're in the midst of a climate emergency.

Tutorial: 3D visualisation and manipulation of scientific data in static web applications by Joakim Bohlin, InfraVis, Chalmers University of Technology

This talk by Joakim Bohlin was on building static web sites for visualising science data. The code examples he used are here.

In the EEG group we have quite a strong static-site, self-hosting theme (this website is currently hosted on a Raspberry Pi and running its own static-site generator!), and I also have close to zero interest in building frontends for our work that involve working in React, Vue, or any of the larger contemporary JavaScript frameworks that a lot of geospatial visualisation libraries assume you're using. Indeed, I think this is somewhat a point of contention within the group: there's a clear need for communicating what we do, but because we're mostly people who work at the bottom of the stack, no one wants to take the time to learn those frameworks, and so we've been poor at communicating our work.

I guess this is another RSE thing - we write software, but we can't write all software individually.

So with that context, I was interested to learn what Joakim had to share: although he can't solve the problem of geospatial visualisation libraries requiring React etc., it was good to know that people are having success delivering usable visualisations with a minimal stack, and if more people do that, hopefully we'll eventually see more tooling supporting this approach.

Some particularly interesting bits of tooling to me were:

  • Pyodide - this lets you run Python in the browser. Joakim pointed out it isn't the best solution, but if your group works in Python they might already have things that use plotting libraries to generate graphs, and as a first cut at getting those in front of more people it can be an easy way to get started. You can combine this with micropip to include Python packages from the JavaScript wrapper you use to load Pyodide (see the sketch below).
  • Vega-Lite - a native JavaScript interactive graphing library, which I pronounce so the first half of the name rhymes with "Sega", but I fear it is a pun on Vegemite :) In the past I've used C3.js for this sort of thing, but Vega-Lite looked a little easier for making the data interactive.

There were more, so if this sort of thing catches your interest, do check out the linked examples.
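
For what it's worth, here's a minimal sketch of the Python side of the Pyodide approach, assuming the JavaScript wrapper has already loaded Pyodide (plus the micropip package) and is executing this with runPythonAsync, which allows top-level await; the package chosen is just an illustrative pure-Python one.

```python
# Runs inside Pyodide in the browser; the surrounding JavaScript wrapper is
# assumed to have loaded Pyodide and the "micropip" package, and to run this
# via runPythonAsync so that top-level await works.
import micropip

# Fetch a pure-Python package from PyPI at page load (illustrative choice).
await micropip.install("snowballstemmer")

import snowballstemmer

stemmer = snowballstemmer.stemmer("english")
print(stemmer.stemWords("running reproducible analyses".split()))
```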

Donated data and what Google already knows by Jarno Rantaharju, Aalto University

The premise of this talk was that collecting data in studies of people is hard:

  • Takes time, expensive
  • Requires participant effort
  • Impacts subject behaviour
  • Data is only collected after study starts

That last one might seem obvious, but it's a valid point if you wanted to, say, study how the COVID pandemic changed behaviours. Jarno Rantaharju's point was that for a lot of studies the data you might want could already exist in the various cloud services we use, knowingly or not: Google or Netflix already have a lot of data on your behaviours, and thanks to GDPR you can get access to that data as a participant. This is being worked on by, amongst others, the DigiTraces Lab at Aalto University, and is referred to as Facilitating Donated Data.

He pointed to an example publication made using this data-gathering technique on Netflix data.

Jarno then went on to walk through how Google's "takeout" service works to facilitate extracting user data, how to filter it, and so forth, all of which can be quite complicated. So Jarno showed a browser extension they'd made that automates much of the "takeout" process, shows the user what it has, and then talks to a data-collection website they were hosting for an experiment (all of which is open source, I believe).

There are also other tools out there, such as PORT, which are designed to let the user do some introspection and filtering of the donated data before uploading it: "takeout", for instance, doesn't make it easy to restrict data to a time range, so you end up giving the science team a lot of data they don't necessarily want, and that you might not want them to have.
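
As a rough illustration of the kind of time-window filtering involved (this is not the extension or PORT, just a sketch; the file name and the ISO-8601 "time" field are assumptions, as real Takeout exports vary by product):

```python
import json
from datetime import datetime, timezone

# Hypothetical study window: only share records from within it.
START = datetime(2020, 3, 1, tzinfo=timezone.utc)
END = datetime(2020, 6, 1, tzinfo=timezone.utc)


def within_window(record: dict) -> bool:
    # Records without a usable timestamp are dropped rather than over-shared.
    timestamp = record.get("time")
    if timestamp is None:
        return False
    try:
        when = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    except ValueError:
        return False
    return START <= when < END


# Hypothetical file names for a Takeout-style export of activity records.
with open("MyActivity.json") as f:
    records = json.load(f)

kept = [r for r in records if within_window(r)]

with open("MyActivity-filtered.json", "w") as f:
    json.dump(kept, f, indent=2)

print(f"Kept {len(kept)} of {len(records)} records")
```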

I noted Jarno was using Niimpy in his demo of what was in the "takeout" data; it's a Python package for working with behavioural data that looks quite useful if you're into that sort of thing.

Unreviewed and Unrunnable? On the role of peer review in publishing reproducible research software by Ina Pöhner, School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland

This talk was one of the highlights for me in terms of how it related to existing work we've done on the topic in our group here, e.g., our Undone paper on how CS hinders climate research.

Ina Pöhner started with context that echoed the opening keynote, looking at how in their domain there are papers from over a decade ago flagging issues with reproducibility of work, and then a large survey in 2016 calling it a "reproducibility crisis". Since then there has been an increased requirement to provide code alongside publications, but the question is: does a code requirement really equate to reproducibility?

Ina and her group examined 1200 articles published between 2020 and 2024, checking how many had code and then how many of those could actually be used. Some headline figures: of those articles, only 481 had code repositories associated with them. Of the ones they tried to run, only around 10% worked; some repositories no longer exist at all, having been deleted after publication. They also dug into the ones that didn't run and worked out why, looking at lack of documentation, missing dependencies, and so forth. I made a lot of notes here, but given the paper for this is still in review I feel it's best to wait for it to emerge.

One of the more interesting comments was about how this is handled in the review process. Of 75 journals surveyed, 65% mandate that code be published and 34% ask for it (I assume without it blocking publication if unavailable), but only 4% give reviewers any guidelines on how to review the code itself, so effectively very little is done beyond checking that code is present. Some reviewers interviewed did say they looked for a README or similar, but others said "we'd not try to rerun wet-lab experiments, so why would we try to run the code?"

I think this is a great survey, and the fact that the group did the actual grind of checking all these papers is valuable compared with the gut instinct (which the entire audience shared) that published code isn't runnable. I think there's a second question here that would also cover data availability, but I don't want that to detract from this work, which I appreciated.

Ina went through a list of possible things publishers could do to address this, the most interesting of which I thought was drafting in early career researchers to help with code review for papers (and ensuring they get credit for it, obviously). I rather like this idea: though it might be hard to get a perfect match, it's a great way not only to get review done, but also to build up code-for-publication as a habit in new researchers.

As a final note to this session, Ina mentioned FAIR (and here), which I'd not come across before: a set of guiding principles for scientific data management, which Ina was advocating should be applied to code as well.

RSE Career panel

Day 1 closed with a group discussion on RSE careers; the notes for this are online. Common themes mostly stemmed from the fact that in the Nordics this isn't, for most people, their full-time role - they work in other departments (e.g., the university HPC group) - and so there was talk of how to get such work funded and how to ring-fence time for it.

Day 2

Day 2 was mostly short lightning talks of about ten minutes each, with a couple of longer talks and two panels thrown in as well.

Panel: How to finance your RSE group - Jørn Dietze, HPC group at UiT The Arctic University of Norway

Jørn Dietze is a member of the RSE group at UiT, but it's not really funded: they are part of the HPC/central admin group for the university. The RSE side is done as slack time, roughly 20% per person, and they hold two hours of office hours a week where students come along with problems.

This was then held in contrast to the Research Software Engineers service at Aalto, which is part of the computing department and was represented by Richard Darst (who gave the previous day's talk on what we can learn from the history of open source). It started with no funding, just helping out, then began helping projects that had funding, where in theory they could bill hours. Finance pushed back, saying nothing under a month of work is worth doing the billing for. Then a centre was set up for AI, which funded a research engineer; in theory they work for the centre, but any spare time is used for general RSE work. This group also grew out of the university HPC group originally, so it has experience of working with other departments.

Their funding breaks down as:

  • Big enough (more than a month): on the grant
  • Small projects: out of the department's general funding

Inspiration from UK:

Another topic was acknowledgements for work, so as to try to show the group's value.

  • Some RSE groups require acknowledgements in papers (rather than co-authorship)
  • At Aalto they collate the publications they assisted with every year to show contribution to department

This section is a bit disjointed, but we covered a lot of topics in an hour!

CodeRefinery: Where Research Software Engineers can begin and grow by Samantha Wittke, CSC - IT Center for Science

Samantha Wittke talked about CodeRefinery, which is a collaborative project that:

  • Provides hands-on training in coding for research
  • Focuses on "good enough" practices
  • Supports Open Science and FAIR software development

The teaching sits between introductory programming basics and high-performance/GPU training. They're not the only ones doing this, and it sounds like they exchange ideas with other groups, e.g., The Carpentries' FAIR Research Software course. The courses are openly licensed under CC-BY.

CodeRefinery runs workshops twice a year with global access, both online and in some in-person classrooms. Currently they serve about 500 students per year and have 30 instructors/speakers.

They also run a Zulip channel to go alongside the course and provide networking (it's the same Zulip used by Nordic-RSE).

Ways to get involved:

  • Become a co-instructor
  • Contribute to lesson materials
  • Join as an observer

They have had people become RSE types after completing CodeRefinery courses.

LUMI AI Guide by Gregor Decristoforo, UiT The Arctic University of Norway and Oskar Taubert, CSC - IT Center for Science

Gregor Decristoforo and Oskar Taubert talked about the Finnish national supercomputer LUMI, in particular how it can be used for AI work despite not being designed for that sort of workload. Apparently 29% of users now use it for AI and 41% for machine learning, most of it done with PyTorch.

The main challenges users face when using LUMI:

  • Software installation: how do people get their software onto LUMI? Options include Singularity containers and EasyBuild modules, typically set up by the support team.
  • LUMI uses AMD GPUs, so there's no CUDA support, which is what most users are used to
  • It uses the Lustre file system, which isn't well suited to many small files - something common in Python environments
  • Helping people scale training jobs to multiple nodes
  • Monitoring and profiling

To this end they've put together a LUMI AI Guide on how to go from laptop to LUMI, and Gregor and Oskar walked us through select parts of that.

It uses Slurm for job access, which I chatted to Gregor about over lunch, and which will crop up again in a later talk. I'll put some notes on Slurm and what we do in the EEG group below.

Harnessing power and synergy – Using game engines to produce visualisation and simulation by Filip Berendt, KTH Royal Institute of Technology

Filip Berendt gave a talk on using game engines in research software, which is something I've seen used in our group before, but it was interesting to see a broader appraisal of how they can be applied.

The top level was that game engines can be quite useful: though not ideally matched to research workloads, they are great for prototyping and for putting visualisation at the core of your work. Licensing can be an issue, though - for example the Unity controversy from last year - since ultimately the game engine developers structure their payment model around games, not science!

The engines covered in the discussion were Unity 3D (which Filip did use until the licensing issue), Unreal Engine, and Godot (which is open source).

Filip showed an example he'd built using Unity implementing a pedestrian-steering model, compared against an established algorithm. Related works developed their own testing environment and then did the visualisation after the fact using a second environment; a game engine lets you do both in one place. I think this last point is generally under-appreciated - visualisation of results matters for spotting issues in large data sets - so I like this a lot.

How Alarming by Callum Rollo, Voice of the Ocean Foundation

Callum Rollo works for Voice of the Ocean, who have several autonomous underwater drones in the waters around Sweden: you can see a live feed of their location online. They occasionally surface to report back data over an Iridium back-haul, and if they need support they will stay on the surface until given more instructions. This is the most dangerous time for the drones, as they can get hit by boats, whereas lower down in the water they're relatively safe; so when a drone surfaces and requests assistance, the right expert must be fetched quickly.

Callum had to build a system to handle emergency calling of people in the company, with redundancy, and slow escalation up the staff hierarchy if calls aren't handled. Building a reliable system like this is hard - it's not a job I'd relish taking on given that a false positive is going to annoy a lot of people, and a false negative can be very expensive.

It was a nice reminder that RSE software isn't just about data-processing pipelines or HPC, or perhaps even embedded sensor software. The tooling Callum put in place here is essential to the science work, but, not being on the data collection or processing path, it probably isn't what we usually think of RSEs doing. The tooling itself can be quite similar though, as Callum pulled all this together using Python.
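
To make the escalation idea concrete, here's a toy Python sketch of the pattern - emphatically not Callum's actual system; the notify/acknowledged functions, names, and timings are hypothetical placeholders for whatever calling or paging service is really used:

```python
import time

# Hypothetical call list, ordered from first responder up the staff hierarchy.
CALL_LIST = ["pilot-on-duty", "operations-lead", "head-of-engineering"]
ACK_TIMEOUT_S = 60   # how long each person gets to acknowledge (toy value)
POLL_INTERVAL_S = 10


def notify(person: str, alert: str) -> None:
    # Placeholder: a real system would place a call or send an SMS here.
    print(f"Calling {person}: {alert}")


def acknowledged(person: str) -> bool:
    # Placeholder: a real system would poll an acknowledgement service here.
    return False


def escalate(alert: str) -> bool:
    """Work up the call list until someone acknowledges the alert."""
    for person in CALL_LIST:
        notify(person, alert)
        deadline = time.monotonic() + ACK_TIMEOUT_S
        while time.monotonic() < deadline:
            if acknowledged(person):
                return True
            time.sleep(POLL_INTERVAL_S)
    return False  # nobody answered: fall through to whatever last resort exists


if __name__ == "__main__":
    escalate("Drone surfaced and is requesting assistance")
```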

The RSE experience at the northernmost university in the world by Gregor Decristoforo, HPC group at UiT The Arctic University of Norway

Gregor Decristoforo gave a follow-up to a talk last year about the nascent RSE group forming within the HPC group at UiT. They are now 6 part-time RSEs, having started 2.5 years ago, growing out of the HPC group, which is part of the IT department at UiT.

Mostly they collaborate with groups that they were part of before they joined (as in, the individuals were in a particular other discipline research group, and they now help those groups under this new banner).

Challenges of forming an RSE group:

  • Visibility of the group to researchers
  • Convincing higher-ups RSE is valuable
  • Mostly working as HPC engineers, so time is limited for RSE jobs
  • People come with R problems, but it's often a stats problem, and so not their area of expertise

That last one is an interesting one that hasn't come up for us in our cluster in the EEG group, but perhaps that's because everyone knows I'm not good at R or stats :)

It's not in my notes, but IIRC they hold an office hour once a week to help people, which rotates between members.

The periodic table: R package and visualized with ggplot2 by Taha Ahmed

Another data-visualisation talk, this time from Taha Ahmed on an R package he built to make a customisable version of the periodic table. There was a lack of freely licensed periodic tables suitable for customising, so he made his own.

The internal data store is YAML for both data and metadata, which are split into two parts. That's flexible, but gives rise to data-hygiene issues when reading YAML into R (the usual JSON/YAML issues around lack of typing).
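
This is a general YAML problem rather than anything specific to R or to this package; here's a quick Python illustration of the same kind of surprise (PyYAML follows YAML 1.1, where unquoted yes/no/on/off parse as booleans):

```python
import yaml  # PyYAML

doc = """
symbol: No          # Nobelium's symbol... or is it?
atomic_number: 102
mass: 259
"""

data = yaml.safe_load(doc)
print(data)
# -> {'symbol': False, 'atomic_number': 102, 'mass': 259}
# The unquoted "No" parses as a boolean, so values need quoting (or schema
# validation) before they're safe to feed onwards.
```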

It works nicely in a notebook: you can set values per element and visualise them on the table.

Experiences from 'High-performance R' course by Heli Juottonen, CSC - IT Center for Science

The next talk was by Heli Juottonen, on how at CSC they try to teach people to use R in an HPC context and the training course they run. The slides for this talk are here. The course was made by a pair of people: one a biologist looking at R as a tool, and the other a computer scientist.

Heli maintains R on a supercomputer; they use r-env for running R on the HPC machines.

Common question: "Blah in R is taking too long, they run out memory, what now?" This certainly echos the questions we get on our computer cluster, and its frustrating (to Michael) that it's so hard to answer such seemingly simple questions - (though not unexpected: yay, halting problem).

The course aims:

  • Understanding resource usage, finding bottlenecks
  • Parallel and distributed computing

Audience:

  • RStudio users on the supercomputer who don't know how to utilise its resources well

It's a two-day course, with the first day about measurement and the second about batch jobs and dealing with distribution over multiple cores/nodes.

One problem they hit was with users bringing their own data: it needs cleaning before use, which slows down the course.

N-min-ct - a research software to tell how sustainable is the neighbourhood. by Ruslan Zhuravchak, Norwegian University of Science and Technology

Ruslan Zhuravchak gave a talk on how he helped implement a project to facilitate the interactive analysis of urban form and mobility in smart (15-minute) cities, against various relevant performance metrics as suggested by FME/ZEN - Zero Emission Neighbourhoods.

The project was to assess how well a city is meeting the ZEN KPIs based on sensor data.

Unfortunately the project is only available internally, so whilst we got a demo, which was quite interesting, I can't link to it, alas. I had a chat with Ruslan afterwards, and he hopes to get it published. In the EEG group we have a few people working on smart city metrics, and Ruslan seemed keen to chat to those interested.

Modern HPC-access alternatives for the bash averse by Viktor Rehnberg, Chalmers University of Technology

This session was quite interesting to me, as I semi-manage a compute "cluster" of machines shared by a set of ecologists. This is not an HPC setup, rather a set of very large regular computers (256 cores per machine, 1TB RAM, etc.). We've deliberately taken a hands-off approach to access, just leaving it as ssh access to a particular machine, and whilst we've got away with that, I'd like to see what else we could do here.

One of the themes I've seen consistently at this conference is the adoption of Slurm, as was the case here. This talk wasn't about Slurm per se, but it did show me different ways our compute cluster could be presented, even if this talk was about HPC and we're just a bag of big computers (BBoC :).

Viktor Rehnberg gave this talk, and he started by trying to define what an HPC cluster is:

  • Users access it via a login node
  • It contains many nodes
  • It is managed by a scheduler (typically Slurm)
  • It has shared storage for the nodes (which enables the scheduler to distribute jobs)

By that measure, perhaps our BBoC does count as HPC - all it's missing is the login node and the scheduler. I usually think of HPC as having slightly more GPUs or other specialist hardware.

The typical way you'd access an HPC system is to ssh to the login node, use the command line to submit jobs via the scheduler, and then it runs your work at some point as resources allow. Currently we run a social scheduler (aka a Slack channel that people rarely use), and quite often I have to go and nag people about it.

The other thing that came up, in a lunch discussion (I think with Gregor and Maria), was the realisation that by not using Slurm, which is the de facto standard for HPC, we're not preparing our users for when they migrate up to proper HPC. There will always be a lot of learning needed when moving from a big normal computer to dedicated HPC, but if we made our environment a little more like that it might both make things run smoother and help our users prepare for the wider world of scientific computing. In the past I've been resistant to using Slurm as it adds overhead, but now we have more help on the server-management side, perhaps it's time to reconsider.

Anyway, back to the talk! The main thrust of Viktor's talk was about what if you don't want to use ssh, can you use other tools to access the HPC? Graphical tools for instance. The answer was yes, and the options he presented were:

  1. Use X-forwarding - as an old person I love that this is still considered an option
  2. Remote desktop - ThinLinc is the most common option, but it's commercial. When you connect you are still using sbatch etc. from a terminal to launch jobs, but MATLAB etc. can also X-forward from compute nodes.
  3. Gfxlauncher - runs on login node
  4. IDEs like Visual Studio Code or PyCharm, using ssh remote. I suspect VSCode is what most of our ecologists use to access the BBoC.
  5. Language environments like Jupyter (for Python), RStudio Server, and matlab-proxy, which can be tunnelled over ssh.
  6. Web portal that sets up the above, like Open OnDemand
  7. Language libraries that let you code up job submission, e.g., submitit and nextflow (see the sketch just after this list)
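
Of those, submitit was the one I hadn't come across before; here's a minimal sketch of what using it looks like, based on its documented AutoExecutor interface (the partition name and resource numbers are made up for illustration):

```python
import submitit


def add(a: int, b: int) -> int:
    # Any picklable function can be shipped off as a Slurm job.
    return a + b


# Writes the generated sbatch scripts and job logs into this folder.
executor = submitit.AutoExecutor(folder="submitit_logs")
executor.update_parameters(
    timeout_min=10,
    slurm_partition="cpu",  # hypothetical partition name
    cpus_per_task=1,
)

job = executor.submit(add, 2, 3)
print(job.job_id)
print(job.result())  # blocks until the Slurm job completes, then returns 5
```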

Training Triumphs: Maximizing Impact with External Domain Experts by Yonglei Wang, Linköpings Universitet

Yonglei Wang works at the national supercomputer centre at Linköpings Universitet, and gave an overview of the various bits of training that are available, specifically from ENCCS - EuroCC National Competence Centre Sweden - which aims to enable HPC, AI, and high-performance data analytics (HPDA).

They run domain-specific events: bio-molecular, fluid dynamics, quantum chemistry, quantum computing - no ecology! They have had 3600 participants: 80% academic, 8% public sector, 7% large companies, and 5% SMEs. The gender breakdown was 23% female, 73% male.

There was a long list of the training, but alas too much for me to note here - check out ENCCS lessons list for more - but there's definitely some I want to check out.

My discussion session on lineage in scientific data-processing pipelines

This I'll write up and link to shortly as an independent post! But it was (at least from my perspective) a success, with many interesting tools and techniques I'd not been aware of before. A proper post on that, so everyone can share the results, is coming soon!

Misc other notes

  • Oxford has a dedicated RSE group, hat tip to Gregor Decristoforo for pointing them out to me.
  • AI Factories seemed to be a somewhat contentious term I'd not come across before; it appears to be an EU initiative to power AI projects. I suspect it's seen as hype that is draining money, and people don't quite know what it means.
  • The discussion/panel sessions were run via the audience collaboratively editing a markdown document shown on the projector, with the moderator calling out interesting things and asking whoever wrote them to speak a little more. As a technique it worked really well with an audience this size, both live and in leaving everyone with notes afterwards!
  • Nordic-RSE 2026 will be in Tromsø, Jun 9-10!