CommunityData:Klone
Klone is the latest version of Hyak, the UW supercomputing system. We will soon have a larger allocation of machines on Klone than on Mox. The Klone machines have 40 cores and either 384GB or 768GB of RAM.
Setup
The recommended way to manage software for your research projects on Klone is to use Singularity containers. You can build a singularity container on your local machine using the linux package manager of your choice. The instructions on this page document how to build the cdsc_base.sif singularity image, which provides Python, R, Julia, and PySpark on top of Debian 10 (Buster).
Installing singularity on your local computer
We want singularity version 3.7.1, which is the version installed on klone. Follow these instructions for installing singularity on your local linux machine.
Creating a singularity container
The file cdsc_base.def is a singularity definition file that contains instructions for installing software and configuring the environment. We just have to run:
sudo singularity build --sandbox cdsc_base_sandbox cdsc_base.def
This can take quite a while to run, as it installs a lot of software!
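You might run out of space in your temporary file path. If you do, point singularity at larger directories before running the build. Note that export is a shell builtin and cannot be run through sudo, so set the variables in your own shell and pass them through with sudo -E (assuming your sudo policy allows preserving the environment):
export SINGULARITY_TMPDIR=/my/large/tmp
export SINGULARITY_CACHEDIR=/my/large/apt_cache
export SINGULARITY_LOCALCACHEDIR=/my/large/apt_cache
sudo -E singularity build --sandbox cdsc_base_sandbox cdsc_base.def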
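Before pointing the build at a different temporary directory, it can help to confirm that the directory actually has more room. The following is a minimal sketch; free_gib is a hypothetical helper name, not part of singularity, and /my/large/tmp is the placeholder path used above.

```shell
# Hypothetical helper: print the free space, in whole GiB, on the
# filesystem that holds the given directory.
free_gib() {
    # df -P forces one line per filesystem; -k reports 1024-byte blocks.
    # Column 4 of the second line is the available space.
    df -Pk "$1" | awk 'NR==2 { printf "%d\n", $4 / (1024 * 1024) }'
}

# Example (commented out because the path is a placeholder):
# free_gib /my/large/tmp
```

If the number printed is not comfortably larger than the size of the container you expect to build, choose a different directory.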
We built this container as a sandbox container, which is mutable. However, singularity recommends using immutable containers. We can convert the mutable container to an immutable one by running build again:
sudo singularity build cdsc_base.sif cdsc_base_sandbox
Copy cdsc_base.sif to your user directory under /gscratch on klone.
You can open a shell in the container by running:
singularity shell --no-home cdsc_base.sif
The potentially confusing thing about using singularity on klone stems from the fact that you have to be root to modify the root directories of a container. This is why you have to install software into the container locally. However, once you have made the immutable cdsc_base.sif file, you can use the software installed in the container to do work outside of the container!
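As an illustration of using the container's software on files that live outside it, here is a minimal sketch built on singularity exec, which runs a single command inside an image. The function name run_in_container and the CONTAINER_RUNTIME variable are assumptions made for this example; the variable only exists so the command can be swapped out (for instance with echo) for a dry run.

```shell
# Path to the immutable image; adjust to wherever you copied it.
CONTAINER="${CONTAINER:-cdsc_base.sif}"
# Hypothetical override hook: set CONTAINER_RUNTIME=echo to print the
# command instead of running it.
CONTAINER_RUNTIME="${CONTAINER_RUNTIME:-singularity}"

# Run an arbitrary command with the software installed in the container.
run_in_container() {
    "$CONTAINER_RUNTIME" exec "$CONTAINER" "$@"
}
```

For example, run_in_container python3 my_script.py would run a (hypothetical) script from the current directory using the Python installed in the container.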
The cdsc_base_sandbox is mutable, so we can continue working on that environment and installing more software as we like. We just need to build it as a .sif file to use it on klone. It's also possible to convert the container back into sandbox mode and then modify non-root parts of the container on klone, but this requires running the container in a way that makes the outside klone system invisible! This is useful for installing R or Python packages in userspace within the container, but not for working with data outside of the container.
So in summary, the workflow is:
- Install software into a sandbox container on your local machine.
- Keep the cdsc_base.def file up to date so your container is reproducible.
- Convert the sandbox container to an immutable .sif container.
- Copy the immutable container to klone.
- Run programs in the container to work with files outside of it (possibly including other packages, allowing us to use Debian to bootstrap klone-optimized binaries).
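The build-and-copy part of this workflow can be sketched as a small script to run on your local machine. This is only an outline under stated assumptions: the login host name in KLONE_HOST and the destination in DEST_DIR are placeholders rather than values given on this page, and the run and build_and_copy helpers are hypothetical names. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
# Placeholders (assumptions): replace with your login host and your own
# user directory under /gscratch.
KLONE_HOST="${KLONE_HOST:-klone.hyak.uw.edu}"
DEST_DIR="${DEST_DIR:-/gscratch/path/to/your/user/dir}"

# Print each command; only execute it when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

build_and_copy() {
    # 1. Build the mutable sandbox from the definition file.
    run sudo singularity build --sandbox cdsc_base_sandbox cdsc_base.def || return 1
    # 2. Convert the sandbox into an immutable .sif image.
    run sudo singularity build cdsc_base.sif cdsc_base_sandbox || return 1
    # 3. Copy the immutable image to klone.
    run scp cdsc_base.sif "${KLONE_HOST}:${DEST_DIR}/"
}
```

Calling build_and_copy with DRY_RUN unset lets you review the three commands before committing to a long build.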
Initial .bashrc
Before we start using our singularity container on klone, we need a .bashrc that sets the umask so that other members of the group can edit your files, and that loads singularity.
# .bashrc
# Stuff that's in there already that you need for working with the cluster.
# Add the following lines
umask 007
module load singularity/3.7.1
## this makes it so singularity can see /gscratch/comdata
export SINGULARITY_BIND="/gscratch/comdata:/gscratch/comdata"
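To see what the umask setting does, the following sketch creates a file and a directory under umask 007 and prints their permission bits. It assumes GNU stat (as found on linux systems), and check_umask is a hypothetical name. On a typical filesystem it should print 660 for the file and 770 for the directory, meaning the owner and the group can read and write while everyone else has no access.

```shell
# The function body runs in a subshell, so the umask change does not
# leak into the calling shell.
check_umask() (
    umask 007
    tmp=$(mktemp -d)
    touch "$tmp/file"
    mkdir "$tmp/dir"
    # GNU stat: %a prints the octal permission bits.
    stat -c '%a' "$tmp/file" "$tmp/dir"
    rm -r "$tmp"
)

check_umask
```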