CommunityData:Hyak tutorial
Revision as of 18:30, 2 August 2019
This file provides a complete, step-by-step walk-through for how to parse a list of Wikia wikis with wikiq. The same principles can be followed for other tasks.
Things you should know before you start
- Computing paradigms: high-performance computing (HPC) versus MapReduce and Hadoop
- ikt versus mox (https://wiki.cac.washington.edu/display/hyakusers/WIKI+for+Hyak+users#WIKIforHyakusers-HyakOverview) and the transition
  - This material covers getting set up on the older ikt cluster
  - Our mox cluster is online and we will be migrating to it in late 2019/early 2020
Setup steps (only need to be done once)
1. Create a user directory for yourself in /com/users:
Store the output of your scripts in /com/; otherwise you will run out of space in your personal filesystem (/usr/lusers/...).
$ mkdir /com/users/USERNAME # Replace USERNAME with your user name
2. Create a batch_jobs directory
$ mkdir /com/users/USERNAME/batch_jobs
3. Create a symlink from your home directory to this directory (this lets you use the /com storage from the more convenient home directory)
$ ln -s /com/users/USERNAME/batch_jobs ~/batch_jobs
4. Create a user in parallel SQL
$ module load parallel_sql
$ sudo pssu --initial
[sudo] password for USERID: <Enter your UW NetID password>
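Steps 1 through 3 above can also be wrapped in a small shell function that is safe to run more than once. This is only a sketch, not part of the Hyak tooling: the function name setup_batch_jobs, its optional second and third arguments, and the choice of `mkdir -p` and `ln -sfn` are assumptions made for illustration. Step 4 is left out because it prompts for a password.

```shell
#!/usr/bin/env bash
# Sketch of setup steps 1-3 as a rerunnable function.
# Assumptions (not from the tutorial): the function name, its arguments,
# and the use of `mkdir -p` / `ln -sfn` so that repeated runs are harmless.
setup_batch_jobs() {
    local username="$1"                 # your Hyak user name
    local com_root="${2:-/com/users}"   # storage root used in step 1
    local home_dir="${3:-$HOME}"        # directory that receives the symlink

    local jobs_dir="$com_root/$username/batch_jobs"

    # Steps 1 and 2: create the user directory and batch_jobs inside it.
    # -p creates missing parents and does not fail if they already exist.
    mkdir -p "$jobs_dir" || return 1

    # Step 3: point <home>/batch_jobs at the directory on /com.
    # -f replaces an existing link; -n stops ln from following that link.
    ln -sfn "$jobs_dir" "$home_dir/batch_jobs"
}
```

After sourcing the file, calling `setup_batch_jobs USERNAME` (with USERNAME replaced by your user name) uses the default /com/users root and your home directory. The two extra arguments exist only so the function can be tried against a scratch directory first.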