HCDS (Fall 2017)/Schedule



Week 1: September 28
Day 1 plan


 * Assignments due
 * fill out the pre-course survey


 * Agenda


 * Readings assigned
 * Watch: Why Humans Should Care About Data Science (Cecilia Aragon, 2016 HCDE Seminar Series)
 * Read: Aragon, C. et al. (2016). Developing a Research Agenda for Human-Centered Data Science. Human Centered Data Science workshop, CSCW 2016.
 * Read: Provost, Foster, and Tom Fawcett. Data science and its relationship to big data and data-driven decision making. Big Data 1.1 (2013): 51-59.
 * Read: Kling, Rob and Star, Susan Leigh. Human Centered Systems in the Perspective of Organizational and Social Informatics. 1997.


 * Homework assigned
 * Reading reflection


 * Resources
 * IDEO.org. The Field Guide to Human-Centered Design. 2015.
 * Faraway, Julian. The Decline and Fall of Statistics. Faraway Statistics, 2015.
 * Press, Gil. Data Science: What's The Half-Life Of A Buzzword? Forbes, 2013.
 * Bloor, Robin. A Data Science Rant. Inside Analysis, 2013.
 * Various authors. Position papers from 2016 CSCW Human Centered Data Science Workshop. 2016.

Week 2: October 5
Day 2 plan

 * Ethical considerations in data science: privacy, informed consent, and user treatment


 * Assignments due
 * Week 1 reading reflection


 * Agenda


 * Readings assigned
 * Read: Markham, Annette and Buchanan, Elizabeth. Ethical Decision-Making and Internet Research. Association of Internet Researchers, 2012.
 * Read: Barocas, Solon and Nissenbaum, Helen. Big Data's End Run around Anonymity and Consent. In Privacy, Big Data, and the Public Good. 2014. (PDF on Canvas)


 * Homework assigned
 * Reading reflection


 * Resources
 * Wittkower, D.E. Lurkers, creepers, and virtuous interactivity: From property rights to consent and care as a conceptual basis for privacy concerns and information ethics.
 * National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. The Belmont Report. U.S. Department of Health and Human Services, 1979.
 * Hill, Kashmir. Facebook Manipulated 689,003 Users' Emotions For Science. Forbes, 2014.
 * Kramer, Adam D. I.; Guillory, Jamie E.; and Hancock, Jeffrey T. Experimental evidence of massive-scale emotional contagion through social networks. PNAS 111 (24): 8788-8790, 2014; published ahead of print June 2, 2014.
 * Barbaro, Michael and Zeller, Tom. A Face Is Exposed for AOL Searcher No. 4417749. New York Times, 2006.
 * Zetter, Kim. Arvind Narayanan Isn’t Anonymous, and Neither Are You. WIRED, 2012.
 * Gray, Mary. When Science, Customer Service, and Human Subjects Research Collide. Now What? Culture Digitally, 2014.
 * Tene, Omer and Polonetsky, Jules. Privacy in the Age of Big Data. Stanford Law Review, 2012.
 * Dwork, Cynthia. Differential Privacy: A survey of results. Theory and Applications of Models of Computation, 2008.
 * Green, Matthew. What is Differential Privacy? A Few Thoughts on Cryptographic Engineering, 2016.
 * Hsu, Danny. Techniques to Anonymize Human Data. Data Sift, 2015.
 * Metcalf, Jacob. Twelve principles of data ethics. Ethical Resolve, 2016.
 * Poor, Nathaniel and Davidson, Roei. When The Data You Want Comes From Hackers, Or, Looking A Gift Horse In The Mouth. CSCW Human Centered Data Science Workshop, 2016.

Week 3: October 12
Day 3 plan


 * Data provenance, preparation, and reproducibility: data curation, preservation, documentation, and archiving; best practices for open scientific research


 * Assignments due
 * Week 2 reading reflection


 * Agenda


 * Readings assigned
 * Read: Chapter 2 "Assessing Reproducibility" and Chapter 3 "The Basic Reproducible Workflow Template" from The Practice of Reproducible Research. University of California Press, 2018.
 * Read: Hickey, Walt. The Dollars and Cents Case Against Hollywood's Exclusion of Women. FiveThirtyEight, 2014. AND Keegan, Brian. The Need for Openness in Data Journalism. 2014.


 * Homework assigned
 * Reading reflection
 * A1: Data curation


 * Examples of well-documented open research projects
 * Keegan, Brian. WeatherCrime. GitHub, 2014.
 * Geiger, R. Stuart and Halfaker, Aaron. Operationalizing conflict and cooperation between automated software agents in Wikipedia: A replication and expansion of "Even Good Bots Fight". GitHub, 2017.
 * Thain, Nithum; Dixon, Lucas; and Wulczyn, Ellery. Wikipedia Talk Labels: Toxicity. Figshare, 2017.
 * Narayan, Sneha et al. Replication Data for: The Wikipedia Adventure: Field Evaluation of an Interactive Tutorial for New Users. Harvard Dataverse, 2017.


 * Examples of not-so-well documented open research projects
 * Eclarke. SWGA paper. GitHub, 2016.
 * David Lefevre. Lefevre and Cox: Delayed instructional feedback may be more effective, but is this contrary to learners’ preferences? Figshare, 2016.
 * Alneberg. CONCOCT Paper Data. GitHub, 2014.


 * Other resources
 * Press, Gil. Cleaning Big Data: Most Time-Consuming, Least Enjoyable Data Science Task, Survey Says. Forbes, 2016.
 * Christensen, Garret. Manual of Best Practices in Transparent Social Science Research. 2016.
 * Hickey, Walt. The Bechdel Test: Checking Our Work. FiveThirtyEight, 2014.
 * Chapman et al. Cross Industry Standard Process for Data Mining. IBM, 2000.

Week 4: October 19
Day 4 plan


 * Study design: understanding your data; framing research questions; planning your study


 * Assignments due
 * Reading reflection
 * A1: Data curation


 * Agenda


 * Readings assigned


 * Homework assigned
 * Reading reflection
 * A2: Bias in data


 * Resources
 * Aschwanden, Christie. Science Isn't Broken. FiveThirtyEight, 2015.

Week 5: October 26
Day 5 plan


 * Machine learning: ethical AI, algorithmic transparency, societal implications of machine learning


 * Assignments due
 * Reading reflection
 * A2: Bias in data


 * Agenda


 * Readings assigned


 * Homework assigned
 * Reading reflection
 * A3: Final project plan


 * Resources

Week 6: November 2
Day 6 plan


 * Mixed-methods research: Big data vs thick data; qualitative research in data science 


 * Assignments due
 * Reading reflection


 * Agenda


 * Readings assigned


 * Homework assigned
 * Reading reflection


 * Resources

Week 7: November 9
Day 7 plan


 * Human computation: ethics of crowdwork, crowdsourcing methodologies for analysis, design, and evaluation


 * Assignments due
 * Reading reflection
 * A3: Final project plan


 * Agenda


 * Readings assigned


 * Homework assigned
 * Reading reflection
 * A4: Crowdwork self-ethnography


 * Resources

Week 8: November 16
Day 8 plan


 * User experience and big data: prototyping and user testing; benchmarking and iterative evaluation; UI design for data science


 * Assignments due
 * Reading reflection


 * Agenda


 * Readings assigned


 * Homework assigned
 * Reading reflection


 * Resources

Week 9: November 23
Day 9 plan


 * Human-centered data science in the wild: community data science; data science for social good


 * Assignments due
 * Reading reflection
 * A4: Crowdwork self-ethnography


 * Agenda


 * Readings assigned


 * Homework assigned
 * Reading reflection


 * Resources

Week 10: November 30
Day 10 plan


 * Communicating methods, results, and implications: translating for non-data scientists


 * Assignments due
 * Reading reflection


 * Agenda


 * Readings assigned


 * Homework assigned
 * Reading reflection
 * A5: Final presentation


 * Resources

Week 11: December 7
Day 11 plan


 * Future of human centered data science: case studies from research, industry, and policy; final presentations


 * Assignments due
 * Reading reflection
 * A5: Final presentation


 * Agenda


 * Readings assigned
 * none!


 * Homework assigned
 * none!


 * Resources

Week 12: Finals Week

 * NO CLASS
 * A6: FINAL PROJECT REPORT DUE BY 11:59PM on Sunday, December 10
 * LATE PROJECT SUBMISSIONS NOT ACCEPTED.