Human Centered Data Science (Fall 2018)

From CommunityData
Revision as of 21:55, 2 January 2019 by Jtmorgan
Human Centered Data Science
DATA 512 - UW Interdisciplinary Data Science Masters Program - Thursdays 5:00-9:50pm in ART 003.
Principal instructor
Jonathan T. Morgan
Co-instructor
Os Keyes
Course Website
This wiki page is the canonical information resource for DATA512. All other course-related information will be linked on this page. We will use the Canvas site for announcements, file hosting, and submitting reading reflections, graded in-class assignments, and other programming and writing assignments. We will use Slack for Q&A and general discussion.
Course Description
Fundamental principles of data science and its human implications, including research ethics; data privacy; legal frameworks; algorithmic bias, transparency, fairness and accountability; data provenance, curation, preservation, and reproducibility; user experience design and research for big data; human computation; data communication and visualization; and societal impacts of data science.[1]

Overview and learning objectives

The format of the class will be a mix of lecture, discussion, in-class activities, and qualitative and quantitative research assignments. Students will work in small groups for in-class activities, and work independently on all class project deliverables and homework assignments. Instructors will provide guidance in completing the exercises each week.

By the end of this course, students will be able to:

  • Analyze large and complex data effectively and ethically with an understanding of human, societal, and socio-technical contexts.
  • Take into account the ethical, social, and legal considerations when designing algorithms and performing large-scale data analysis.
  • Combine quantitative and qualitative research methods to generate critical insights into human behavior.
  • Discuss and evaluate ethical, social and legal trade-offs of different data analysis, testing, curation, and sharing methods.

Course resources

All pages and files on this wiki that are related to the Fall 2018 edition of DATA 512: Human-Centered Data Science are listed in Category:HCDS (Fall 2018).

Office hours

  • Os Keyes: Monday (5-7pm) and Wednesday (5-7pm), Sieg 431, and by request.
  • Jonathan Morgan: Google Meet, by request


Datasets

For some examples of datasets you could use for your final project, see Human Centered Data Science/Datasets.

Lecture slides

Slides for weekly lectures will be available in PDF form on this wiki, generally within 24 hours of each course session.

Schedule

Direct link: Human Centered Data Science (Fall 2018)/Schedule



Week 1: September 27

Day 1 plan

Day 1 slides

Introduction to Human Centered Data Science
What is data science? What is human centered? What is human centered data science?
Assignments due
Agenda
  • Syllabus review
  • Pre-course survey results
  • What do we mean by data science?
  • What do we mean by human centered?
  • How does human centered design relate to data science?
  • Looking ahead: Week 2 assignments and topics


Readings assigned
Homework assigned
  • Reading reflection
Resources




Week 2: October 4

Day 2 plan


Ethical considerations
privacy, informed consent and user treatment


Assignments due
  • Week 1 reading reflection
Agenda
  • Intro to assignment 1: Data Curation
  • A brief history of research ethics
  • Guest lecture: Javier Salido and Mark van Hollebeke, "A Practitioners View of Privacy & Data Protection"
  • Guest lecture: Javier Salido, "Differential Privacy"
  • Contextual Integrity in data science
  • Week 2 reading reflection


Readings assigned


Homework assigned
Resources




Week 3: October 11

Day 3 plan

Day 3 slides

Reproducibility and Accountability
data curation, preservation, documentation, and archiving; best practices for open scientific research
Assignments due
  • Week 2 reading reflection
Agenda
  • Six Provocations for Big Data: Review & Reflections
  • A primer on copyright, licensing, and hosting for code and data
  • Introduction to replicability, reproducibility, and open research
  • Reproducibility case study: fivethirtyeight.com
  • Group activity: assessing reproducibility in data journalism
  • Overview of Assignment 1: Data curation


Readings assigned
Homework assigned
  • Reading reflection
Resources


Assignment 1 Data curation resources





Week 4: October 18

Day 4 plan

Day 4 slides

Interrogating datasets
causes and consequences of bias in data; best practices for selecting, describing, and implementing training data


Assignments due
Agenda
  • Final project: Goal, timeline, and deliverables.
  • Overview of assignment 2: Bias in data
  • Reading reflections review
  • Sources of bias in datasets
  • Introduction to assignment 2: Bias in data
  • Sources of bias in data collection and processing
  • In-class exercise: assessing bias in training data


Readings assigned (Read both, reflect on one)
Homework assigned


Resources




Week 5: October 25

Day 5 plan

Day 5 slides

Introduction to mixed-methods research
Big data vs thick data; integrating qualitative research methods into data science practice; crowdsourcing


Assignments due
  • Reading reflection


Agenda
  • Assignment 1 review & reflection
  • Week 4 reading reflection discussion
  • Survey of qualitative research methods
  • Mixed-methods case study #1: The Wikipedia Gender Gap: causes & consequences
  • In-class activity: Automated Gender Recognition scenarios
  • Introduction to ethnography
  • Ethnographic research case study: Structured data on Wikimedia Commons
  • Introduction to crowdwork
  • Overview of Assignment 3: Crowdwork ethnography


Readings assigned (Read both, reflect on one)


Homework assigned


Qualitative research methods resources
Wikipedia gender gap research resources
Crowdwork research resources





Week 6: November 1

Day 6 plan

Day 6 slides

Interrogating algorithms
algorithmic fairness, transparency, and accountability; methods and contexts for algorithmic audits
Assignments due
Agenda
  • Reading reflections
  • Ethical implications of crowdwork
  • Algorithmic transparency, interpretability, and accountability
  • Auditing algorithms
  • In-class activity: auditing the Perspective API


Readings assigned


Homework assigned
  • Reading reflection


Resources





Week 7: November 8

Day 7 plan

Day 7 slides

Critical approaches to data science
power, data, and society; ethics of crowdwork


Assignments due
  • Reading reflection
  • A3: Crowdwork ethnography


Agenda
  • Guest lecture: Rochelle LaPlante


Readings assigned (read both, reflect on one)
Homework assigned


Resources





Week 8: November 15

Day 8 plan

Day 8 slides

Human-centered algorithm design
algorithmic interpretability; human-centered methods for designing and evaluating algorithmic systems


Assignments due
  • Reading reflection


Agenda
  • Final project overview & examples
  • Guest Lecture: Kelly Franznick, Blink UX
  • Reading reflections
  • Human-centered algorithm design
      • design process
      • user-driven evaluation
      • design patterns & anti-patterns


Readings assigned
Homework assigned
  • Reading reflection
Resources





Week 9: November 22 (No Class Session)

Day 9 plan

Data science for social good
Community-based and participatory approaches to data science; Using data science for society's benefit
Assignments due
  • Reading reflection
  • A4: Final project plan
Agenda
  • Reading reflections discussion
  • Feedback on Final Project Plans
  • Guest lecture: Steven Drucker (Microsoft Research)
  • UI patterns & UX considerations for ML/data-driven applications
  • Final project presentation: what to expect
  • In-class activity: final project peer review


Readings assigned
Homework assigned
  • Reading reflection
Resources





Week 10: November 29

Day 10 plan

Day 10 slides

User experience and big data
Design considerations for machine learning applications; human centered data visualization; data storytelling


Assignments due
  • Reading reflection


Agenda
  • Reading reflections discussion
  • Feedback on Final Project Plans
  • Guest lecture: Steven Drucker (Microsoft Research)
  • UI patterns & UX considerations for ML/data-driven applications
  • Final project presentation: what to expect
  • In-class activity: final project peer review


Readings assigned
  • NONE
Homework assigned
  • A5: Final presentation
Resources





Week 11: December 6

Day 11 plan

Final presentations
course wrap up, presentation of student projects


Assignments due
  • A5: Final presentation


Agenda
  • Student final presentations
  • Course wrap-up


Readings assigned
  • none!
Homework assigned
  • A6: Final project report (due 12/9 by 11:59pm)
Resources
  • none




Week 12: Finals Week (No Class Session)

  • NO CLASS
  • A6: FINAL PROJECT REPORT DUE BY 11:59PM on Sunday, December 9
  • LATE PROJECT SUBMISSIONS NOT ACCEPTED.

Assignments

For details on individual assignments, see Human Centered Data Science (Fall 2018)/Assignments


Assignments consist of weekly in-class activities, weekly reading reflections, written assignments, and programming/data analysis assignments. Weekly in-class reading groups will discuss the assigned readings, and students are expected to have read the material in advance. In-class activities are posted to Canvas each week and may require time outside of class to complete.

Unless otherwise noted, all assignments are due before 5pm on the day of the following week's class.

Unless otherwise noted, all assignments are individual assignments.

Assignment timeline

Assignments due every week
  • In-class activities - 2 points (weekly): In-class activity output posted to Canvas (group or individual) within 24 hours of class session.
  • Reading reflections - 2 points (weekly): Reading reflections posted to Canvas (individual) before following class session.


Scheduled assignments
  • A1 - 5 points (due 10/18): Data curation (programming/analysis)
  • A2 - 10 points (due 11/1): Sources of bias in data (programming/analysis)
  • A3 - 10 points (due 11/8): Crowdwork Ethnography (written)
  • A4 - 10 points (due 11/22): Final project plan (written)
  • A5 - 10 points (due 12/6): Final project presentation (oral, slides)
  • A6 - 15 points (due 12/9): Final project report (programming/analysis, written)



Policies

The following general policies apply to this course.

Respect

Students are expected to treat each other, and the instructors, with respect. Students are prohibited from engaging in any kind of harassment or derogatory behavior, which includes offensive verbal comments or imagery related to gender, gender identity and expression, age, sexual orientation, disability, physical appearance, body size, race, ethnicity, or religion. In addition, students should not engage in any form of inappropriate physical contact or unwelcome sexual attention, and should respect each other's right to privacy with regard to their personal lives. If you feel that you (or another student) have been subject to a violation of this policy, please reach out to the instructors in whichever form you prefer.

The instructors are committed to providing a safe and healthy learning environment for students. As part of this, students are asked not to wear any clothing, jewelry, or any related medium for symbolic expression which depicts an indigenous person or cultural expression reappropriated as a mascot, logo, or caricature. These include, but are not limited to, iconography associated with the following sports teams:

  1. Chicago Blackhawks
  2. Washington Redskins
  3. Cleveland Indians
  4. Atlanta Braves

Attendance and participation

Students are expected to attend class regularly. If you run into a conflict that requires you to be absent (for example, medical issues) feel free to reach out to the instructors. We will do our best to ensure that you don’t miss out, and treat your information as confidential.

If you miss a class session, please do not ask the professor or TA what you missed during class; check the website or ask a classmate (best bet: use Slack). Graded in-class activities cannot be made up if you miss a class session.

Grading

Active participation in class activities is one of the requirements of the course. You are expected to engage in group activities, class discussions, interactions with your peers, and constructive critiques as part of the course work. This will help you hone your communication and other professional skills. Working in groups or on teams is also an essential part of all data science disciplines. As part of this course, you will be asked to provide feedback on your peers' work.


Individual assignments will have specific requirements listed on the assignment sheet, which the instructor will make available on the day the homework is assigned. If you have questions about how your assignment was graded, please see the TA or instructor.

Assignments and coursework

Grades will be determined as follows:

  • 20% in-class work
  • 20% reading reflections
  • 60% assignments
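
As a rough illustration of how these weights combine, here is a minimal Python sketch. It assumes each category is scored as the fraction of available points earned; the syllabus does not spell out the exact calculation, so the function name, dictionary keys, and sample scores below are hypothetical. The assignment point values are taken from the assignment timeline, where A1 through A6 add up to 60 points.

```python
# Category weights as listed in the syllabus.
WEIGHTS = {"in_class": 0.20, "reflections": 0.20, "assignments": 0.60}

# Point values for A1..A6 from the "Scheduled assignments" list.
ASSIGNMENT_POINTS = {"A1": 5, "A2": 10, "A3": 10, "A4": 10, "A5": 10, "A6": 15}


def course_grade(fractions):
    """Return a percentage grade from per-category fractions earned (0.0 to 1.0).

    Hypothetical helper: assumes a plain weighted sum of the three categories.
    """
    missing = set(WEIGHTS) - set(fractions)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    return 100 * sum(WEIGHTS[name] * fractions[name] for name in WEIGHTS)


# Illustrative (made-up) scores for one student.
earned_assignment_points = 54
total_assignment_points = sum(ASSIGNMENT_POINTS.values())

example = course_grade({
    "in_class": 1.0,       # full credit on in-class activities
    "reflections": 0.9,    # 90% of reading reflection points
    "assignments": earned_assignment_points / total_assignment_points,
})
print(round(example, 1))
```

Under this assumption, a zero on a late assignment lowers only the assignments fraction, in proportion to that assignment's point value.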

You are expected to produce work in all of the assignments that reflects the highest standards of professionalism. For written documents, this means proper spelling, grammar, and formatting.

Late assignments will not be accepted; if your assignment is late, you will receive a zero score. Again, if you run into an issue that necessitates an extension, please reach out. Final projects cannot be turned in late and are not eligible for any extension whatsoever.

Academic integrity and plagiarism

Students are expected to adhere to rules around academic integrity. Simply stated, academic integrity means that you are to do your own work in all of your classes, unless collaboration is part of an assignment as defined in the course. In any case, you must be responsible for citing and acknowledging outside sources of ideas in work you submit.

Please be aware of the HCDE Department's and the UW's policies on plagiarism and academic misconduct: HCDE Academic Conduct policy. This policy will be strictly enforced.

Other academic integrity resources:

Disability and accommodations

As part of ensuring that the class is as accessible as possible, the instructors are entirely comfortable with you using whatever form of note-taking method or recording is most comfortable to you, including laptops and audio recording devices. The instructors will do their best to ensure that all slides and scripts/notes are immediately available online after a lecture has concluded. In addition, if asked ahead of time we can try to record the audio of individual lectures for students who have learning differences that make audiovisual notes preferable to written ones.

If you require additional accommodations, please contact Disabled Student Services: 448 Schmitz, 206-543-8924 (V/TTY). If you have a letter from DSS indicating that you have a disability which requires academic accommodations, please present the letter to the instructors so we can discuss the accommodations you might need in the class. If you have any questions about this policy, reach out to the instructors directly.

For more information on disability accommodations, and how to apply for one, please review UW's Disability Resources for Students.

Disclaimer

This syllabus and all associated assignments, requirements, deadlines and procedures are subject to change.

References