CommunityData:Hyak tutorial

From CommunityData

Revision as of 20:33, 2 August 2019

This file provides a complete, step-by-step walk-through for how to parse a list of Wikia wikis with wikiq. The same principles can be followed for other tasks.

Things you should know before you start

  • Computing paradigms: HPC versus MapReduce/Hadoop
  • ikt versus mox and the transition
    • This material covers getting set up on the older ikt cluster
    • Our mox cluster is online and we will be migrating to it in late 2019/early 2020

Connecting to Hyak

Detailed information on setting up Hyak is covered in CommunityData:Hyak. Make sure you have:

  • Set up SSH
  • Connected to Hyak
  • Set up your user's Hyak environment with the CDSC aliases and tools

Setup for running batch jobs on Hyak (only need to be done once)

1. Create a users directory for yourself in /com/users

You will want to store the output of your script in /com/, or you will run out of space in your personal filesystem (/usr/lusers/...)

$ mkdir /com/users/USERNAME  # Replace USERNAME with your user name

2. Create a batch_jobs directory

$ mkdir /com/users/USERNAME/batch_jobs

3. Create a symlink from your home directory to this directory (this lets you use the /com storage from the more convenient home directory)

$ ln -s /com/users/USERNAME/batch_jobs ~/batch_jobs

4. Create a user in parallel SQL

$ module load parallel_sql
$ sudo pssu --initial
[sudo] password for USERID: <Enter your UW NetID password>
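
Steps 1 to 3 above can also be wrapped in one small shell function that is safe to re-run. This is only a sketch, not part of the CDSC tools: the function name setup_batch_jobs and its three arguments are hypothetical, chosen so the function can be tried in a scratch directory before being pointed at /com/users. It does not cover step 4 (parallel SQL).

```shell
#!/bin/sh
# Sketch of steps 1-3 as a re-runnable function.
# setup_batch_jobs and its arguments are hypothetical names, not CDSC tools.
setup_batch_jobs() {
    com_base=$1    # parent of the per-user directories, e.g. /com/users
    user_name=$2   # your user name
    link_dir=$3    # where the symlink should live, e.g. "$HOME"

    target="$com_base/$user_name/batch_jobs"

    # Steps 1 and 2: -p creates both directory levels and is a no-op
    # if they already exist.
    mkdir -p "$target" || return 1

    # Step 3: create the symlink unless something is already at that path.
    if [ -e "$link_dir/batch_jobs" ] || [ -L "$link_dir/batch_jobs" ]; then
        echo "$link_dir/batch_jobs already exists; leaving it alone" >&2
    else
        ln -s "$target" "$link_dir/batch_jobs" || return 1
    fi
}
```

On Hyak the call would look like `setup_batch_jobs /com/users USERNAME "$HOME"`, with USERNAME replaced by your user name as in step 1. Checking for an existing batch_jobs entry first avoids the error that plain `ln -s` gives when it is run a second time.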