Editing CommunityData:Hyak
Latest revision | Your text | ||
Line 2: | Line 2: | ||
To use Hyak, you must first have a UW NetID, access to Hyak, and a two-factor authentication token, which you will need as part of [[CommunityData:Hyak setup|getting set up]]. The following links will be useful. | To use Hyak, you must first have a UW NetID, access to Hyak, and a two-factor authentication token, which you will need as part of [[CommunityData:Hyak setup|getting set up]]. The following links will be useful. | ||
* [[CommunityData:Klone | * [[CommunityData:Klone]] (for the new hyak nodes). | ||
* [[CommunityData:Hyak setup | * [[CommunityData:Hyak setup]] | ||
* [[CommunityData:Hyak software installation]] | * [[CommunityData:Hyak software installation]] | ||
* [[CommunityData:Hyak Spark]] | * [[CommunityData:Hyak Spark]] | ||
Line 39: | Line 39: | ||
=== X11 forwarding === | === X11 forwarding === | ||
You may also want to add these two lines to your Hyak <code>.ssh/config</code> (indented under the line starting with "Host"): | You may also want to add these two lines to your Hyak <code>.ssh/config</code> (indented under the line starting with "Host"): | ||
Line 177: | Line 175: | ||
== 5 productivity tips == | == 5 productivity tips == | ||
# Find a workflow that works for you. There isn't a standardized workflow for quantitative / computational social science or social computing. People normally develop idiosyncratic workflows around the distinctive tools they know or have been exposed to and that meet their diverse needs and tastes | # Find a workflow that works for you. There isn't a standardized workflow for quantitative / computational social science or social computing. People normally develop idiosyncratic workflows around the distinctive tools they know or have been exposed to and that meet their diverse needs and tastes. | ||
# If you find yourself spending time manually rerunning code in a multistage project, learn [https://en.wikipedia.org/wiki/Make_(software) Make] or another pipeline tool. Such tools take some effort to learn but really help you organize, test, and refine your project. Make is a good choice because it is old and incredibly polished and featureful. You don't need to learn every feature, just the basics. Its interface has a different flavor than more recently designed tools, which can be a downside. Other positives are that it is language agnostic and can run shell commands. | # If you find yourself spending time manually rerunning code in a multistage project, learn [https://en.wikipedia.org/wiki/Make_(software) Make] or another pipeline tool. Such tools take some effort to learn but really help you organize, test, and refine your project. Make is a good choice because it is old and incredibly polished and featureful. You don't need to learn every feature, just the basics. Its interface has a different flavor than more recently designed tools, which can be a downside. Other positives are that it is language agnostic and can run shell commands. | ||
# [https://slurm.schedmd.com/documentation.html Slurm], the system that you use to access hyak nodes, is also very powerful. The hyak team used to maintain a tool called parallel-sql which helped with running a large number of short-running programs. This tool is no longer supported, but [https://slurm.schedmd.com/job_array.html job arrays] are a Slurm feature that is even better. | # [https://slurm.schedmd.com/documentation.html Slurm], the system that you use to access hyak nodes, is also very powerful. The hyak team used to maintain a tool called parallel-sql which helped with running a large number of short-running programs. This tool is no longer supported, but [https://slurm.schedmd.com/job_array.html job arrays] are a Slurm feature that is even better. | ||
# Use the free resources. Job arrays (mentioned above) are great in combination with the [https://wiki.cac.washington.edu/display/hyakusers/Mox_checkpoint checkpoint queue]. The checkpoint (or ckpt) queue runs your jobs on other people's idle nodes. You can access thousands of cores and terabytes of RAM on the checkpoint queue. There are limitations. If the owner of a node wants to use it, they will cancel your job. If this happens, the scheduler will automatically restart the job, but each job has a maximum total running time (restarts don't reset the clock). Therefore, the queue is best suited for jobs that can be paused (saved) and restarted. If you can design a script to catch the checkpoint signal, save progress, and restart, you will be able to make excellent use of the checkpoint queue. | # Use the free resources. Job arrays (mentioned above) are great in combination with the [https://wiki.cac.washington.edu/display/hyakusers/Mox_checkpoint checkpoint queue]. The checkpoint (or ckpt) queue runs your jobs on other people's idle nodes. You can access thousands of cores and terabytes of RAM on the checkpoint queue. There are limitations. If the owner of a node wants to use it, they will cancel your job. If this happens, the scheduler will automatically restart the job, but each job has a maximum total running time (restarts don't reset the clock). Therefore, the queue is best suited for jobs that can be paused (saved) and restarted. If you can design a script to catch the checkpoint signal, save progress, and restart, you will be able to make excellent use of the checkpoint queue. | ||
<br />There is also virtually [https://hyak.uw.edu/docs/storage/gscratch/ unlimited free storage] on hyak under <code>/gscratch/scrubbed/comdata</code> with the catch that the storage is much slower and that files will be automatically deleted after a short time (currently 21 days). | |||
# Get connected to the hyak team and other hyak users. Hyak isn't perfect and has had many recent issues related to the new Klone system. If you run into trouble and it feels like the system isn't working, you should email help@uw.edu with a subject line that starts with "hyak:". They are nice and helpful. Other good resources are the [https://mailman12.u.washington.edu/mailman/listinfo/hyak-users mailing list] and, if you are a UW student, the [https://depts.washington.edu/uwrcc/getting-started-2/getting-started/ research computing club]. The club has its own nodes, including GPU nodes that only students who join the club can use. | # Get connected to the hyak team and other hyak users. Hyak isn't perfect and has had many recent issues related to the new Klone system. If you run into trouble and it feels like the system isn't working, you should email help@uw.edu with a subject line that starts with "hyak:". They are nice and helpful. Other good resources are the [https://mailman12.u.washington.edu/mailman/listinfo/hyak-users mailing list] and, if you are a UW student, the [https://depts.washington.edu/uwrcc/getting-started-2/getting-started/ research computing club]. The club has its own nodes, including GPU nodes that only students who join the club can use. | ||
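The tips on job arrays and the checkpoint queue can be combined in a single worker script. What follows is a minimal sketch in Python, not a tested Hyak recipe. It assumes the scheduler sends <code>SIGTERM</code> before cancelling a job (check the Hyak documentation for the signal actually used on the checkpoint queue). It also assumes each array task can find its share of the work from the <code>SLURM_ARRAY_TASK_ID</code> environment variable that Slurm sets for job array tasks. The names <code>run</code>, <code>load_state</code>, <code>save_state</code>, the checkpoint file name, and the summing "work" are all hypothetical placeholders.

```python
import json
import os
import signal

# Shared flag; the signal handler only flips it so that the main loop
# can stop at a safe point instead of being interrupted mid-write.
STOP = {"requested": False}


def request_stop(signum, frame):
    """Signal handler: record that the scheduler asked us to stop."""
    STOP["requested"] = True


def load_state(path):
    """Return saved progress, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_item": 0, "total": 0}


def save_state(path, state):
    """Write to a temporary file and rename it, so that a kill during
    the write cannot leave a half-written checkpoint behind."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(items, path, save_every=100):
    """Process items, resuming from the checkpoint at `path`.
    Returns True when all items are done, False if stopped early."""
    state = load_state(path)
    while state["next_item"] < len(items):
        if STOP["requested"]:
            save_state(path, state)
            return False
        # Placeholder for real work: here we just sum the items.
        state["total"] += items[state["next_item"]]
        state["next_item"] += 1
        if state["next_item"] % save_every == 0:
            save_state(path, state)
    save_state(path, state)
    return True


if __name__ == "__main__":
    # ASSUMPTION: SIGTERM is the warning signal; adjust for your cluster.
    signal.signal(signal.SIGTERM, request_stop)
    # Slurm sets SLURM_ARRAY_TASK_ID for job array tasks; default to 0
    # so that the script also runs outside of Slurm.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    items = list(range(task_id * 1000, (task_id + 1) * 1000))
    done = run(items, f"checkpoint_{task_id}.json")
    print("finished" if done else "stopped early; progress saved")
```

Because progress is stored in one file per array task, a restarted task picks up where it left off rather than starting over, which matters since restarts do not reset the maximum running time.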