Mid-Term Update: MPI Appliance for HPC Research on Chameleon

Hi everyone! This is my mid-term blog update for the project MPI Appliance for HPC Research on Chameleon, developed in collaboration with Argonne National Laboratory and the Chameleon Cloud community. This post follows up on my earlier one, which you can find here.
June 15 – June 29, 2025
I created and configured images on Chameleon Cloud for three sites: CHI@UC, CHI@TACC, and KVM@TACC.
Key features of the images:
- Spack: Pre-installed and configured for easy package management of HPC software.
- Lmod (Lua-based environment modules): Installed and configured for environment module management.
- MPI Support: Both MPICH and Open MPI are pre-installed, so users can run distributed applications out of the box.
These images are now publicly available and listed in the Chameleon Appliance Catalog under the title MPI and Spack for HPC (Ubuntu 22.04).
I also wrote example Jupyter notebooks that show how to get started with these images.
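Because the image ships both MPICH and Open MPI, one detail worth illustrating is that the two launchers expect different hostfile formats. The sketch below is not taken from the notebooks; it is a minimal illustration, and the function name `write_hostfile`, the node names, and the `./hello_mpi` binary are all hypothetical. It writes a hostfile in the format each implementation expects and returns the matching launch command.

```python
import tempfile
from pathlib import Path


def write_hostfile(path, hosts, slots, flavor):
    """Write an MPI hostfile and return the launcher command as a list.

    Hypothetical helper for illustration only.
    flavor is "openmpi" or "mpich" (MPICH's default Hydra launcher).
    """
    if slots < 1 or not hosts:
        raise ValueError("need at least one host and one slot per host")

    if flavor == "openmpi":
        # Open MPI hostfile lines look like: "node-0 slots=4"
        lines = [f"{host} slots={slots}" for host in hosts]
        command = ["mpirun", "--hostfile", str(path), "-np"]
    elif flavor == "mpich":
        # MPICH (Hydra) hostfile lines look like: "node-0:4"
        lines = [f"{host}:{slots}" for host in hosts]
        command = ["mpiexec", "-f", str(path), "-n"]
    else:
        raise ValueError(f"unknown MPI flavor: {flavor!r}")

    Path(path).write_text("\n".join(lines) + "\n")
    # Launch one process per slot across all hosts.
    return command + [str(len(hosts) * slots)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        hostfile = Path(tmp) / "hosts"
        cmd = write_hostfile(hostfile, ["node-0", "node-1"], 4, "openmpi")
        # "./hello_mpi" stands in for any compiled MPI program.
        print(" ".join(cmd + ["./hello_mpi"]))
```

On a real cluster the host names would be the addresses of the leased nodes, and the matching MPI implementation would first need to be loaded into the environment (for example through Spack or Lmod).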
June 30 – July 13, 2025
With the MPI Appliance now published on Chameleon Cloud, the next step was to automate the setup of an MPI-Spack cluster.
To achieve this, I developed a set of Ansible playbooks that:
- Configure both master and worker nodes with site-specific settings
- Set up seamless access to Chameleon NFS shares
- Allow users to easily install Spack packages, compilers, and dependencies across all nodes
These playbooks aim to simplify the deployment of reproducible HPC environments and reduce the time required to get a working cluster up and running.
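To give a feel for how a master/worker layout can be handed to Ansible, here is a small sketch that renders an INI-style inventory from a list of node addresses. It is an illustration under assumptions, not the project's actual tooling: the group names `master` and `workers`, the host aliases, the example IP addresses, and the default login user `cc` are all placeholders chosen for this example.

```python
def render_inventory(master_ip, worker_ips, user="cc"):
    """Return an Ansible INI inventory with one master and N workers.

    Group names, host aliases, and the default user are assumptions
    made for this sketch; adjust them to match the real playbooks.
    """
    lines = ["[master]", f"mpi-master ansible_host={master_ip}", ""]

    lines.append("[workers]")
    for index, ip in enumerate(worker_ips, start=1):
        lines.append(f"mpi-worker-{index} ansible_host={ip}")
    lines.append("")

    # Variables under [all:vars] apply to every host in the inventory.
    lines += ["[all:vars]", f"ansible_user={user}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Addresses from the documentation-only 192.0.2.0/24 range.
    print(render_inventory("192.0.2.10", ["192.0.2.11", "192.0.2.12"]))
```

The rendered text can be saved to a file and passed to `ansible-playbook` with `-i`, followed by whichever playbook configures the cluster.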
July 14 – July 28, 2025
This period began with fixing some issues in python-chi, the official Python client for the Chameleon testbed. We also discussed adding support for CUDA-based packages, which would make it easier to work with NVIDIA GPUs. We then published a new image on Chameleon, titled MPI and Spack for HPC (Ubuntu 22.04 - CUDA), and added an example to demonstrate its usage.
We compiled the artifact containing the Jupyter notebooks and Ansible playbooks and published it on Chameleon Trovi. Feel free to check it out here. The documentation still needs some work.
That's it for now! I'm currently working on the documentation, a ROCm-based image for AMD GPUs, and some container-based examples. Stay tuned for more updates in the next blog.