CommunityData:Hyak migration

This page is a list of things that we want to do to migrate from Ikt to Mox.


 * 1) Copy data (only raw data, data that we are using in current and future projects)
 * 2) Backup other data?
 * 3) Copy code (everyone copies their own user directory)
 * 4) Create a shared .bashrc that everyone will load. This will provide us with a shared environment (python, R, other packages, useful aliases).
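
A minimal sketch of what the shared .bashrc from step 4 might contain. The location of the shared file, the `CDSC_LOCAL` variable, and the `path_prepend` helper are assumptions made for illustration; only the `/gscratch/com/local/bin` directory comes from our current Ikt setup, and it may live somewhere else on Mox.

```shell
# Sketch of a shared bashrc. Each user would load it from their own
# ~/.bashrc with a line such as (hypothetical path):
#   source /gscratch/com/shared/bashrc

# Root of our group-maintained software (assumption: same layout as on Ikt)
CDSC_LOCAL="${CDSC_LOCAL:-/gscratch/com/local}"

# Prepend a directory to PATH only if it exists and is not already listed
path_prepend() {
    if [ -d "$1" ]; then
        case ":$PATH:" in
            *":$1:"*) ;;                # already on PATH, do nothing
            *) PATH="$1:$PATH" ;;
        esac
    fi
}

path_prepend "$CDSC_LOCAL/bin"
export PATH

# Example alias; the real list is still to be decided
alias ll='ls -lh'
```

Guarding on the directory's existence keeps the file safe to source on machines where the shared software is not mounted.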

Hyak Migration Working Group
 * 1) Mako
 * 2) Kaylea
 * 3) Nate
 * 4) Sayamindu
 * 5) Jeremy
 * 6) Jim?

Shared environment design
We will use custom modules to maintain installations of the software that we use. When the Hyak team already provides a module that we need (e.g. an up-to-date R or Python), we should prefer it so that we don't have to do the work of compiling and packaging the module ourselves. But if we want to be on the cutting edge of Python and R, we'll be in the business of building those modules ourselves.
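
As a rough sketch of how custom modules could be loaded alongside the ones the Hyak team provides: `module use` and `module load` are standard commands in Environment Modules and Lmod, but the modulefile directory, the `load_cdsc_modules` helper, and the module names below are placeholders, not things that exist yet.

```shell
# Hypothetical location for our own modulefiles; adjust to the real path.
CDSC_MODULEFILES="${CDSC_MODULEFILES:-/gscratch/com/local/modulefiles}"

# Load the named modules, adding our modulefile directory to the search path.
load_cdsc_modules() {
    # 'module' only exists where Environment Modules or Lmod is installed
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; skipping" >&2
        return 0
    fi
    module use "$CDSC_MODULEFILES"
    for m in "$@"; do
        module load "$m" || echo "could not load module $m" >&2
    done
}

# Placeholder names; run 'module avail' on Hyak to see the real ones.
load_cdsc_modules r python
```

On a machine without a module system the function only prints a notice, so the same file can be sourced on a laptop.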

Since I (Nate) typically develop code on my laptop before running it on Hyak, I think it is ideal if our Hyak environment maintains versions of software that are equivalent to those included in Debian Buster whenever possible. Ideally we will even provide modules for important R and Python packages (e.g. spark, ggplot, pandas) so that we can keep versions consistent and stable over time.

We'll create a list of packages that people can expect to be loaded in their environments and load them in the shared .bashrc.

We'll also provide a shared  that provides common commands for interacting with Slurm.
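
The common Slurm commands could be small shell functions along these lines. This is only a sketch: the function names `myq` and `int_node` and the default partition name are made up for illustration, while `squeue -u` and `srun --pty` are standard Slurm options.

```shell
# Placeholder partition name; replace with our real Mox partition.
CDSC_PARTITION="${CDSC_PARTITION:-comdata}"

# List only my own jobs; extra arguments are passed through to squeue.
myq() {
    squeue -u "$USER" "$@"
}

# Start an interactive shell on a compute node.
# Usage: int_node [hours]   (defaults to 2 hours)
int_node() {
    hours="${1:-2}"
    srun -p "$CDSC_PARTITION" --time="${hours}:00:00" --pty /bin/bash
}
```

Wrapping the commands in functions rather than aliases lets them take arguments, for example `int_node 4` for a four-hour session.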

List of modules we'll maintain on Hyak (WIP)
We can get a list of packages from /gscratch/com/local/bin on Ikt.

Add packages you want below!
[X] zsh

[X] Spark 2.4

[X] Python 3.7. Installed Anaconda and created a minimal Anaconda environment to speed up startup time. This seems like the easiest way to get an optimized Python installation.

[X] R 3.6.2

[X] moreutils 0.62 (at least some of the moreutils tools seem to be broken, e.g. parallel)

[X] emacs 25

[X] p7zip 16.02

[X] htop 2.2.0

[X] pandoc 2.2.1

[X] gcc 4.9+

RStudio Server
It could be nice to run an RStudio server on the interactive node to provide a nicer IDE for working interactively on Hyak than Jupyter notebooks or editing in the terminal. If this isn't feasible, we should use kibo for this purpose instead.

Etherpad link
https://etherpad.wikimedia.org/p/cdsc_hyak_migration_todo

Scheduler Options
It might be a good idea to ask the Hyak folks to configure the scheduler for our partition so that we can request specific quantities of memory or CPUs in our jobs. (Hyak Wiki)

This might be particularly useful if we don't get more nodes soon because it allows us to chunk up nodes into smaller pieces.
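
If the scheduler were configured this way, a job script could ask for part of a node, roughly as sketched here. The `--cpus-per-task`, `--mem`, and `--time` options are standard `sbatch` options; the partition name, job name, and the specific amounts are placeholders.

```shell
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --partition=comdata      # placeholder partition name
#SBATCH --cpus-per-task=4        # ask for 4 CPUs rather than a whole node
#SBATCH --mem=16G                # ask for 16 GB of memory
#SBATCH --time=01:00:00

# Slurm sets SLURM_CPUS_PER_TASK inside the job when --cpus-per-task is
# given; fall back to 1 when the script is run outside of Slurm.
NCPUS="${SLURM_CPUS_PER_TASK:-1}"
echo "running with $NCPUS CPUs"
```

The script would be submitted with `sbatch`, and reading `SLURM_CPUS_PER_TASK` lets the job size its own parallelism to match what was requested.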

Getting more nodes
We have funding in the ecology grant (and maybe other sources of funding as well) that we can use to purchase additional Mox capacity. Let's keep track of plans around that to try to minimize the gap from the time we lose Ikt nodes until we get more Mox nodes.