This page collects resources for Community Data Science Collective members.
If you're new to the collective, check out the Introduction to CDSC Resources.
Technical documentation and getting set up
- CommunityData:Compute Overview and Resource Matching -- What we have and what it's good for
- CommunityData:Backups (nada) — Details on what is, and what isn't, backed up from nada.
- CommunityData:Beamer — Installing/using Mako's Beamer templates
- CommunityData:Build papers — Both the TeX and Beamer templates above come along with a Makefile that makes some assumptions about your workflow. Learn about that here.
- CommunityData:Code — List of software projects maintained by the collective.
- CommunityData:Email — Information on email aliases and their management.
- CommunityData:Embedding fonts in PDFs — ggplot2 creates PDFs with fonts that are not embedded, which in turn causes the ACM to bounce our papers back. This page describes how to fix it.
- CommunityData:Exporting from Python to R
- CommunityData:Git — Getting set up on the git server
- CommunityData:Hyak — Using the Hyak supercomputer system at UW for research.
- CommunityData:Hyak setup — Getting an account and getting set up on Hyak.
- CommunityData:Hyak Spark — Documents how to use Spark on Hyak
- CommunityData:Northwestern VPN — Connecting to the Northwestern VPN
- CommunityData:Jargon — Jargon and Common Shorthand
- CommunityData:Jitsi — Some etiquette/usability tips for Jitsi, our preferred video conference tool.
- CommunityData:Knitr — Using Knitr with TeX to build graphs and tables, and to insert and format numbers in TeX documents.
- CommunityData:ORES - Using ORES with Wikipedia data
- CommunityData:Planning document — Details on producing Matsuzaki-style planning documents
- CommunityData:reveal.js — Using RMarkdown to create reveal.js HTML presentations
- CommunityData:TeX — Installing our LaTeX templates
- CommunityData:Tmux — Using tmux (terminal multiplexer) to keep a persistent session on a server.
- CommunityData:Zotero — How to use our shared Zotero directory.
- CommunityData:Wikia data — Documents information about how to get and validate wikia dumps.
- CommunityData:Message Walls -- Documents information about how to get and validate wikia dumps.
Ongoing and Future Meetings and Meetups
Meetings and Meetups of Past
- CommunityData:Meetup April 2020
- CommunityData:Meetup September 2019
- CommunityData:Meetup March 2019
- Sociotechnocanonicon Great Books Discussion Series
- CommunityData:Meetup September 2018
- CommunityData:Meetup April 2018
- CommunityData:Meetup April 2018: Organizational notes
- CommunityData:Meetup July 2017
- CommunityData:UW Weekly Meeting
- Schedule — Deadlines, events, and similar
- CommunityData:Logos — Like our visual branding, not like λόγος. Although we should always make sure we're good in that department too.
- CommunityData:Advice on writing a background section to an academic paper — Once upon a time, Mako and Aaron submitted a paper with a mediocre introduction to a journal. Mac Parks, the editor of that journal at the time, set us straight with some very clear pointers. Save yourself the trouble and learn to follow these today!
- Community Data Science Lab (UW) — Directions to the lab space at UW. This is something you can share with visitors.
University of Washington Resources
Chat on IRC
A number of us are idling in #communitydata on OFTC (irc.oftc.net). IRC is basically a chat system that is similar to Slack in many ways. In fact, it was the inspiration for Slack!
To use IRC, you'll need a client. A really good one for folks new to IRC is IRCCloud. IRCCloud has a web interface as well as good apps for iOS and Android.
One limitation of IRCCloud is that, after a 1-week trial period, the system will disconnect folks every two hours. There are a couple of ways around this. The easiest one is subscribing to IRCCloud, which costs $5/month or $50 a year. You just pay for a year and send the receipt to User:Ashaw, who will pay the bill! If you are totally new to IRC and just want something easy and straightforward, this is our recommendation.
The other options are more indirect, technical, and/or involve a bit more work or figuring stuff out:
- One option is an IRC "bouncer" such as ZNC.
- IRC Bridge/Matrix
- Another option that a few people use is connecting through a service with an IRC bridge. One popular one is the Matrix protocol. While there are numerous clients, User:Salt recommends Riot, which is free and supported on every platform. Once you are set up with Matrix, join the room #_oftc_#communitydata:matrix.org to bridge into IRC from Matrix. Get in contact with User:Salt if you want to go this way.
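The room name above follows a visible pattern: the network name, then the IRC channel, then the Matrix homeserver. Below is a minimal sketch of that pattern in Python. The function name `bridged_alias` and its parameters are hypothetical, and only the `#communitydata` alias is confirmed by this page; whether other channels or networks follow the same pattern is an assumption.

```python
def bridged_alias(channel: str, network: str = "oftc",
                  homeserver: str = "matrix.org") -> str:
    """Build a Matrix room alias for a bridged IRC channel.

    Assumption: the alias has the shape '#_<network>_<channel>:<homeserver>',
    inferred from the single example on this page.
    """
    # Accept the channel with or without its leading '#'.
    if not channel.startswith("#"):
        channel = "#" + channel
    return f"#_{network}_{channel}:{homeserver}"


if __name__ == "__main__":
    print(bridged_alias("#communitydata"))
```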
Registering your "nick" (i.e., IRC username)
Due to spam, we have on a couple of occasions in the past had to block "non-registered" users from posting to #communitydata. This helps a lot with the spam but has a big disadvantage: if you speak on the channel but are not registered, nobody else will be able to hear you!
It's a good idea to register your nickname or "nick" in any case because it essentially reserves your nickname so that nobody else can have it.
If you've already done this long ago, you can stop reading; there's nothing new to do. If you haven't gotten fully registered and verified, here are the four "easy" steps, heavily mediated by the NickServ bot:
- From IRC (irc.oftc.net), /msg NickServ register <<your password>> <<your email>>
- Verification is via the website, see: 
- Ask NickServ 'checkverify' and she'll tell you if you're verified.
- Please say something on the channel to test. If you do not get a response, your messages can still be seen by community members.
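For anyone curious what the `/msg` commands in the steps above look like on the wire, here is a minimal sketch in Python. In the IRC protocol, `/msg <target> <text>` is sent by the client as a `PRIVMSG` line. The helper names (`privmsg`, `nickserv_register`, `nickserv_checkverify`) are hypothetical, the password and email in the demo are placeholders, and the sketch only formats the lines; it does not open a connection to irc.oftc.net.

```python
def privmsg(target: str, text: str) -> str:
    """Format one raw IRC PRIVMSG line, which is what a client sends for /msg."""
    if "\r" in text or "\n" in text:
        raise ValueError("IRC messages cannot contain line breaks")
    return f"PRIVMSG {target} :{text}\r\n"


def nickserv_register(password: str, email: str) -> str:
    """Equivalent of: /msg NickServ register <password> <email>"""
    # The arguments are space-separated, so neither may contain a space.
    if " " in password or " " in email:
        raise ValueError("password and email must not contain spaces")
    return privmsg("NickServ", f"register {password} {email}")


def nickserv_checkverify() -> str:
    """Equivalent of: /msg NickServ checkverify"""
    return privmsg("NickServ", "checkverify")


if __name__ == "__main__":
    # Placeholder credentials for illustration only.
    print(repr(nickserv_register("example-password", "me@example.org")))
    print(repr(nickserv_checkverify()))
```

In practice your IRC client does this formatting for you, so typing the `/msg` commands from the list is all that is needed.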
The light in the lab at UW is funny. We have three fluorescent lights. On flipping the light switch, only two turn on. The third turns on eventually. We are studying this arcane phenomenon over at CommunityData:Light events
- CommunityData:GameIDs — A directory containing the game IDs for CDSC members to connect with each other across various gaming platforms.