GitHub CI/CD workflows with EESSI
In a previous blog post, “An example CI workflow that leverages EESSI CI tools”, Pedro Santos Neves explained how to set up a GitLab CI workflow. This post will focus on GitHub CI workflows and show how to access the development repository of EESSI.
Using the CI component in GitHub
We will use the pyMBE [1] and
SwarmRL [2] projects as examples.
They both rely on the molecular dynamics simulation package
ESPResSo [3], which is available in EESSI.
SwarmRL requires features only available in the development version of ESPResSo,
while pyMBE supports both the latest stable release and the development branch of ESPResSo.
EESSI can satisfy both communities: software.eessi.io
provides stable releases of scientific software identified by their version number,
while dev.eessi.io provides development snapshots identified by a commit hash.
Historically, both SwarmRL and pyMBE had to build ESPResSo from source in every CI job. This added 15 min of build time and required extra steps to properly install build dependencies (pyMBE) or configure a custom Docker image (SwarmRL). Both projects migrated to the EESSI GitHub Action to reduce the complexity and execution times of their CI/CD workflows.
Quickstart
The SwarmRL project uses a compact CI/CD workflow [5] that loads project dependencies from EESSI, installs extra Python dependencies in a virtual environment, runs the testsuite, and uploads a code coverage report. It is reproduced here, simplified for clarity, with annotations:
- Run the testsuite in a virtual machine with the latest Ubuntu operating system, using an AMD Zen3 host machine.
- Locally clone the repository and check out the branch that triggered the workflow.
- Set up the EESSI software stack version 2023.06 for the automatically-detected microarchitecture of the host machine.
- Set up project dependencies:
  - activate the dev.eessi.io development repository using version 2023.06 and microarchitecture AMD Zen2 (line 17 is not required when only software.eessi.io is needed)
  - load the ESPResSo package from commit hash dc87ede
  - create a Python virtual environment and install project dependencies not available in EESSI
- Reload the software environment and run the testsuite with code coverage enabled.
- Publish the code coverage report.
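The steps above can be sketched as follows. This is a minimal illustration rather than the SwarmRL file itself, so its line numbers do not match the ones cited in the text. The action versions, the two placeholder values in `env`, the `requirements.txt` file, and the pytest-based test command are assumptions.

```yaml
name: testsuite
on:
  push:
  pull_request:

jobs:
  test:
    # standard GitHub-hosted runner with the latest Ubuntu image
    runs-on: ubuntu-latest
    env:
      # ASSUMPTION: placeholders, to be replaced by the real module tree
      # and module name published in dev.eessi.io
      DEV_MODULEPATH: "/cvmfs/dev.eessi.io/<path to the 2023.06 x86_64/amd/zen2 modules>"
      ESPRESSO_MODULE: "ESPResSo/<version string for commit dc87ede>"
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Set up EESSI
        uses: eessi/github-action-eessi@v3
        with:
          eessi_stack_version: "2023.06"
      - name: Set up project dependencies
        shell: bash
        run: |
          # make the development repository visible to Lmod
          # (not needed when only software.eessi.io is used)
          module use "${DEV_MODULEPATH}"
          module load "${ESPRESSO_MODULE}"
          # install whatever EESSI does not provide in a virtual environment
          python3 -m venv --system-site-packages venv
          source venv/bin/activate
          python3 -m pip install -r requirements.txt
      - name: Run testsuite
        shell: bash
        run: |
          # every step starts a fresh shell: reload the environment
          module use "${DEV_MODULEPATH}"
          module load "${ESPRESSO_MODULE}"
          source venv/bin/activate
          # ASSUMPTION: hypothetical test command based on coverage.py and pytest
          python3 -m coverage run -m pytest
          python3 -m coverage xml
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v4
        with:
          files: coverage.xml
          token: ${{ secrets.CODECOV_TOKEN }}
```

Keeping the placeholders in a job-level `env` block means the module path and module name are written once and reused by both shell steps.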
This workflow is transferable to other software projects with minimal changes. For a Python project, it is only a matter of replacing lines 18 and 29 with the appropriate commands. For projects in other programming languages, Python-specific commands can be safely removed.
When a workflow grows in complexity, it might be desirable to decouple the dependency installation step from the testsuite step. One can leverage the modular architecture of GitHub Actions workflows and delegate dependency management to a self-contained action that can be called from the CI/CD workflow. We will explore this strategy in the next section.
A custom action to manage dependencies
Actions are reusable building blocks that are called from a workflow.
We already saw the EESSI action and the Codecov action in the previous section.
Custom actions
are written in YAML like workflows and have a similar structure, but are stored in a different folder:
actions go to .github/actions/my_action/action.yml,
while workflows go to .github/workflows/my_workflow.yml.
Here is how pyMBE defines a custom action to manage dependencies with EESSI and pip [6]:
A workflow that calls this action will load software from an EESSI stack (argument modules)
and install Python packages from PyPI (argument extra-python-packages).
Packages are managed by an Lmod module collection
and by a Python virtual environment.
The steps can be broken down as follows:
- source the dev.eessi.io development repository, which extends the software.eessi.io production repository already sourced by the EESSI GitHub Action
  - line 14 can be removed when only the software.eessi.io repository is needed
  - microarchitecture x86_64/amd/zen2 is selected, which is forward-compatible with the Zen3 microarchitecture available on standard GitHub-hosted runners
  - Ubuntu ARM64 runners have a microarchitecture compatible with the aarch64/nvidia/grace software stack that EESSI ships
- load the list of EESSI modules provided by the workflow and stash them to a collection called espresso
- create a Python virtual environment that gives priority to Python packages available in EESSI (venv option --system-site-packages, other package managers have a different syntax)
- source the Python virtual environment
- paste the list of Python package requirements from the workflow into the project's requirements.txt, pip install them, and restore requirements.txt
- clean up the session
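As a rough sketch, the steps above could be written as a composite action like the one below. Only the input names `modules` and `extra-python-packages` and the collection name `espresso` come from the text. The module path placeholder, the `venv` folder name, and the backup-file approach to restoring `requirements.txt` are assumptions, and the line numbers do not match the pyMBE file.

```yaml
name: dependencies
description: Load EESSI modules and install extra Python packages
inputs:
  modules:
    description: EESSI modules to load
    required: true
  extra-python-packages:
    description: Additional Python packages to install from PyPI
    required: false
    default: ""
runs:
  using: composite
  steps:
    - name: Set up dependencies
      shell: bash
      env:
        MODULES: ${{ inputs.modules }}
        EXTRA_PACKAGES: ${{ inputs.extra-python-packages }}
      run: |
        # ASSUMPTION: placeholder path to the dev.eessi.io module tree
        # (drop this line when only software.eessi.io is needed)
        module use "/cvmfs/dev.eessi.io/<path to the 2023.06 x86_64/amd/zen2 modules>"
        # unquoted on purpose: the input may list several modules
        module load ${MODULES}
        # stash the loaded modules in an Lmod collection
        module save espresso
        # packages already provided by EESSI stay visible inside the venv
        python3 -m venv --system-site-packages venv
        source venv/bin/activate
        # temporarily extend requirements.txt, install, then restore it
        cp requirements.txt requirements.txt.orig
        printf '\n%s\n' "${EXTRA_PACKAGES}" >> requirements.txt
        python3 -m pip install -r requirements.txt
        mv requirements.txt.orig requirements.txt
        # clean up the session
        deactivate
        module purge
```

Passing the inputs through `env` rather than interpolating them directly into the script keeps the shell code free of workflow expressions.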
This action allows us to "hide" the logic for package installation, and should only be edited by project maintainers. Project developers can ignore this file and focus on the workflow, described in the next section.
A modular CI/CD workflow
We will define a CI/CD workflow that sets up project dependencies, runs the testsuite, uploads a code coverage report, and builds the software documentation for deployment to GitHub Pages. Since the file is rather long, we will break it down into reusable code fragments and go through them with the help of the workflow syntax for GitHub Actions reference page.
We start by defining a workflow called testsuite that is automatically triggered
when pushing a commit to a branch (push event) or updating a pull request (pull_request event):
Listing: Workflow general properties
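A minimal sketch of such a header, using only the workflow name and the two events named above:

```yaml
# workflow name shown in the GitHub Actions tab
name: testsuite
on:
  # runs when a commit is pushed to any branch
  push:
  # runs when a pull request is opened or updated
  pull_request:
```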
We then define a job to run in an Ubuntu 24.04 virtual machine:
Listing: Virtual machine settings
Since details about the host machine hardware can be fuzzy in a virtual machine,
we need to help OpenMPI select a suitable communication mechanism with environment variables (env field).
Other MPI implementations and virtual machines may require their own workarounds.
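For illustration, a job definition with such environment variables could look like this. The job name `ubuntu` and the two MCA values are assumptions, not the settings used by pyMBE. OpenMPI reads any variable prefixed with `OMPI_MCA_` as an MCA parameter, so the same pattern applies to whichever parameters a given runner needs.

```yaml
jobs:
  ubuntu:
    runs-on: ubuntu-24.04
    env:
      # ASSUMPTION: illustrative values that restrict OpenMPI to the
      # loopback and TCP transports with the ob1 messaging layer
      OMPI_MCA_btl: "self,tcp"
      OMPI_MCA_pml: "ob1"
```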
Next, we define a matrix strategy to automatically create jobs with different input parameters:
Listing: Job matrix configuration
In this case, we simply define a flat list with two elements, one for each ESPResSo version, and give the job a name that reflects the combination of input parameters. The syntax also accepts lists of values, in which case the n-fold Cartesian product would be evaluated.
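A flat two-element matrix of this kind can be sketched as below. The key names `espresso`, `version` and `eessi`, the labels, and the module name placeholders are hypothetical.

```yaml
jobs:
  ubuntu:
    strategy:
      matrix:
        espresso:
          # ASSUMPTION: placeholder labels and module names
          - version: "stable"
            eessi: "ESPResSo/<stable release module>"
          - version: "dev"
            eessi: "ESPResSo/<development snapshot module>"
    # job name built from the matrix parameters
    name: ubuntu - ESPResSo ${{ matrix.espresso.version }}
```

Each list element is a mapping, so one matrix entry carries both a human-readable label for the job name and the module string to load.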
We can now define the steps that will be executed by every job of the job matrix:
We start by cloning the git repository and checking out the branch that triggered the workflow. We then set up the EESSI 2023.06 stack version (it needs to match the one defined in our custom action in the previous section). Finally, we use our custom action to install dependencies, using the job matrix parameters to select a specific ESPResSo version and providing a list of extra packages for code linting, code coverage and documentation generation.
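These three steps could be sketched as follows. The local action path is the one given in reference [6]. The matrix key `matrix.espresso.eessi` and the package list are assumptions for illustration.

```yaml
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Set up EESSI
        uses: eessi/github-action-eessi@v3
        with:
          # must match the stack version used by the custom action
          eessi_stack_version: "2023.06"
      - name: Install dependencies
        # local actions are referenced by their folder, relative to the repository root
        uses: ./.github/actions/dependencies
        with:
          # ASSUMPTION: hypothetical matrix key holding the module name
          modules: ${{ matrix.espresso.eessi }}
          # ASSUMPTION: illustrative tools for linting, coverage and documentation
          extra-python-packages: |
            pylint
            coverage
            pdoc
```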
We now have a complete environment to run the actual testsuite:
Listing: Testing the code
This step:
- reloads the module collection called espresso (which contains ESPResSo and its dependencies, such as Python and NumPy) and sources the Python virtual environment (which contains tools for code linting and coverage)
- runs static code analysis (make pylint) to detect anti-patterns
- runs the testsuite (make unit_tests)
  - -j4 lets the test driver run the tests concurrently on all 4 CPU cores
  - COVERAGE=1 is a pyMBE-specific Makefile variable to enable code coverage collection (injects -m coverage run --parallel-mode in the Python invocation of ESPResSo)
- generates the software documentation in HTML format (make docs)
- generates the code coverage report in XML format (make coverage_xml)
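Put together, the commands listed above fit in a single step, sketched here. The `venv` folder name is an assumption. The make targets and variables are the ones named in the list.

```yaml
- name: Run testsuite
  shell: bash
  run: |
    # restore the Lmod collection saved by the dependencies action
    module restore espresso
    # ASSUMPTION: the virtual environment lives in ./venv
    source venv/bin/activate
    make pylint
    make unit_tests -j4 COVERAGE=1
    make docs
    make coverage_xml
```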
We now turn our attention to the continuous delivery part of the workflow:
Listing: Saving the software documentation as an artifact
The software documentation is uploaded to workflow artifacts with a 48h retention policy, but only by the job that uses the stable release of ESPResSo. When the CI/CD workflow runs on the main branch of the repository and is successful, a separate deployment workflow [8] is automatically triggered to download the artifact and publish it to pymbe-dev.github.io/pyMBE.
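An upload step with these properties might look like the sketch below. The matrix condition, the artifact name, and the documentation folder are assumptions. `retention-days: 2` corresponds to the 48h policy.

```yaml
- name: Upload documentation artifact
  # ASSUMPTION: hypothetical matrix key and label for the stable release job
  if: ${{ matrix.espresso.version == 'stable' }}
  uses: actions/upload-artifact@v4
  with:
    # ASSUMPTION: illustrative artifact name and output folder
    name: documentation
    path: documentation
    retention-days: 2
    # fail the step if the documentation was not generated
    if-no-files-found: error
```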
The code coverage report is submitted to Codecov using an upload token:
Listing: Publishing the code coverage report
Coverage reports are published to app.codecov.io/gh/pyMBE-dev/pyMBE.
The readme file of the pyMBE project also displays a code coverage badge that
dynamically refreshes every time the main branch is updated.
This step is skipped on forked projects, although fork owners can manually activate it
in a new branch by removing the if field and setting up their own token.
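One common way to express this, shown here as a sketch, is to compare `github.repository` with the upstream repository name. The action version, the report file name, and the secret name `CODECOV_TOKEN` are assumptions.

```yaml
- name: Upload coverage to Codecov
  # skipped on forks, where the repository name differs
  if: ${{ github.repository == 'pyMBE-dev/pyMBE' }}
  uses: codecov/codecov-action@v4
  with:
    # ASSUMPTION: report file name and secret name are illustrative
    files: coverage.xml
    token: ${{ secrets.CODECOV_TOKEN }}
```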
Here is the complete workflow [7]:
Conclusion
We have just learned how to configure CI/CD workflows to install dependencies, run a testsuite, upload coverage reports, and generate software documentation. The pyMBE and SwarmRL workflows are modular and can be easily adapted to fit the needs of your own project.
If you are new to the GitHub workflow syntax, don't worry: it is easy to learn and the GitHub documentation is extremely well-written. To troubleshoot any issue with your first workflow, consider using the GitHub Action tmate to remotely log into the virtual machine via SSH or web shell. To do so, add the following step after the checkout step:
Listing: Remotely logging into a virtual machine
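A minimal sketch of such a step, assuming version 3 of the action. The timeout is an optional safeguard added here so that a forgotten session does not keep the runner busy.

```yaml
- name: Set up tmate session
  uses: mxschmitt/action-tmate@v3
  # stop the debugging session after 15 minutes at most
  timeout-minutes: 15
```

The connection details are printed in the job log once the step starts.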
The EESSI GitHub Action streamlines the installation of scientific software in the cloud, reduces the complexity of CI/CD workflows, tightens the CI feedback loop for developers and saves on billable hours for cloud resources.
Another use case for the EESSI GitHub Action is executable research papers [4]. There is more than one way to design them; in the case of pyMBE, all simulation scripts from the pyMBE paper [1] were added to the code repository as code samples that run every two weeks in a samples workflow [9] to detect regressions in the development branch of the software. The workflow can also be triggered manually, for example before merging a large pull request, and takes 1 h of runtime, compared to the 10 min runtime of the CI/CD workflow.
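A workflow that runs periodically and can also be started by hand combines the `schedule` and `workflow_dispatch` triggers. The sketch below is an assumption about how such a trigger could be written, not the pyMBE file. Cron has no native "every two weeks" interval, so this sketch approximates it with two fixed days per month.

```yaml
name: samples
on:
  schedule:
    # 04:00 UTC on the 1st and 15th of each month (roughly every two weeks)
    - cron: "0 4 1,15 * *"
  # adds a button in the Actions tab to trigger the workflow manually
  workflow_dispatch:
```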
1. David Beyer, Paola B. Torres, Sebastian P. Pineda, Claudio F. Narambuena, Jean-Noël Grad, Peter Košovan, and Pablo M. Blanco. “pyMBE: the Python-based molecule builder for ESPResSo”. In: The Journal of Chemical Physics 161.2 (Modular and Interoperable Software for Chemical Physics, July 2024), p. 022502, doi:10.1063/5.0216389.
2. Samuel Tovey, Christoph Lohrmann, Tobias Merkt, David Zimmer, Konstantin Nikolaou, Simon Koppenhöfer, Anna Bushmakina, Jonas Scheunemann, and Christian Holm. “SwarmRL: building the future of smart active systems”. In: The European Physical Journal E 48.4–5 (April 2025), p. 16, doi:10.1140/epje/s10189-025-00477-4.
3. Florian Weik, Rudolf Weeber, Kai Szuttor, Konrad Breitsprecher, Joost de Graaf, Michael Kuron, Jonas Landsgesell, Henri Menke, David Sean, and Christian Holm. “ESPResSo 4.0 – an extensible software package for simulating soft matter systems”. In: The European Physical Journal Special Topics 227.14 (March 2019), pp. 1789–1816, doi:10.1140/epjst/e2019-800186-9.
4. Jana Lasser. “Creating an executable paper is a journey through Open Science”. In: Communications Physics 3.1 (August 2020), p. 143, doi:10.1038/s42005-020-00403-4.
5. SwarmRL 181edfa CI/CD workflow: SwarmRL/SwarmRL@181edfa: .github/workflows/espresso.yml
6. pyMBE v1.0.0 custom action: pyMBE-dev/pyMBE@1.0.0: .github/actions/dependencies/action.yml
7. pyMBE v1.0.0 CI/CD workflow: pyMBE-dev/pyMBE@1.0.0: .github/workflows/testsuite.yml
8. pyMBE v1.0.0 GitHub Pages workflow: pyMBE-dev/pyMBE@1.0.0: .github/workflows/deploy.yml
9. pyMBE v1.0.0 samples workflow: pyMBE-dev/pyMBE@1.0.0: .github/workflows/samples.yml