Editing CommunityData:Hyak tutorial
Latest revision | Your text | ||
Line 45: | Line 45: | ||
# Try the int_machine | # Try the int_machine | ||
# Try a shorter lease time. | # Try a shorter lease time. You can see the contents of our alias command by typing <code>which <node-alias></code>. You'll see a --time=$walltime flag. <code>echo $walltime</code> will show that the current walltime is a large number, such as 200:00:00 (200 hours). Copy and paste the alias contents (starting with srun, without the single quotes) and set the time to something smaller. | ||
# Check ourjobs to see who is using the other nodes, and ask on the IRC channel whether anyone can free up a node | # Check ourjobs to see who is using the other nodes, and ask on the IRC channel whether anyone can free up a node | ||
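The shorter-lease suggestion can be sketched as follows. The alias body shown here is a hypothetical placeholder (only <code>srun</code> and the <code>--time=$walltime</code> flag come from the text above); substitute whatever <code>which <node-alias></code> actually prints for you. The 4-hour value is likewise just an example.

```shell
# Hypothetical alias body, as printed by `which <node-alias>`; yours will differ.
alias_body='srun --time=$walltime --pty /bin/bash'

# Replace the $walltime placeholder with a shorter lease (here: 4 hours).
short_cmd=$(printf '%s\n' "$alias_body" | sed 's/--time=[^ ]*/--time=4:00:00/')

# Print the edited command so you can inspect it before pasting it at the prompt.
echo "$short_cmd"
```

Printing the command rather than running it directly lets you confirm that the remaining options still match the alias before you request a node.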
==== My job on int_machine is getting killed, or doesn't have enough memory ==== | ==== My job on int_machine is getting killed, or doesn't have enough memory ==== | ||
When you request an int_machine with srun, the default memory allocation is 24G. Try requesting more. Technically the maximum is 240G, but that would leave no memory for anyone else in the group who needs an int_machine, so ask for no more than 216G unless you're able to vacate the node right away if asked. | When you request an int_machine with srun, the default memory allocation is 24G. Try requesting more. Technically the maximum is 240G, but that would leave no memory for anyone else in the group who needs an int_machine, so ask for no more than 216G unless you're able to vacate the node right away if asked. | ||
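As a rough sketch of the memory advice above, the snippet below builds an srun command with an explicit <code>--mem</code> request and caps it at the 216G courtesy limit. The 64G figure and the trailing <code>--pty /bin/bash</code> are assumptions for illustration; keep whatever other options your own int_machine alias uses.

```shell
# Hypothetical request size in GB; pick what your job actually needs.
mem_gb=64
# Courtesy ceiling from the text above (the technical maximum is 240G).
max_polite_gb=216

# Clamp the request so other group members can still use an int_machine.
if [ "$mem_gb" -gt "$max_polite_gb" ]; then
    echo "Requested ${mem_gb}G; capping at ${max_polite_gb}G" >&2
    mem_gb=$max_polite_gb
fi

# Assumed command shape; merge --mem into the options from your own alias.
mem_cmd="srun --mem=${mem_gb}G --pty /bin/bash"
echo "$mem_cmd"
```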
=== Running a job across many cores using GNU R's parallelization features === | === Running a job across many cores using GNU R's parallelization features === | ||
Line 106: | Line 102: | ||
=== Setup for running batch jobs on Hyak (only needs to be done once) === | === Setup for running batch jobs on Hyak (only needs to be done once) === | ||
Create a users directory for yourself in / | Create a users directory for yourself in /com/users: | ||
You will want to store the output of your script in / | You will want to store the output of your script in /com/, or you will run out of space in your personal filesystem (/usr/lusers/...) | ||
$ mkdir / | $ mkdir /com/users/USERNAME # Replace USERNAME with your user name | ||
2. Create a batch_jobs directory | 2. Create a batch_jobs directory | ||
$ mkdir / | $ mkdir /com/users/USERNAME/batch_jobs | ||
3. Create a symlink from your home directory to this directory (this lets you use the /com storage from the more convenient home directory) | 3. Create a symlink from your home directory to this directory (this lets you use the /com storage from the more convenient home directory) | ||
$ ln -s / | $ ln -s /com/users/USERNAME/batch_jobs ~/batch_jobs | ||
4. Create a user in parallel SQL | 4. Create a user in parallel SQL |