Human Centered Data Science (Fall 2019)/Schedule



Week 1: September 26
Day 1 plan


 * Introduction to Human Centered Data Science: What is data science? What is human centered? What is human centered data science?


 * Assignments due
 * Fill out the pre-course survey
 * Read and reflect: Provost, Foster, and Tom Fawcett. Data science and its relationship to big data and data-driven decision making. Big Data 1.1 (2013): 51-59.


 * Agenda


 * Homework assigned
 * Read and reflect on both:
 * Hickey, Walt. The Dollars and Cents Case Against Hollywood's Exclusion of Women. FiveThirtyEight, 2014.
 * Keegan, Brian. The Need for Openness in Data Journalism. 2014.


 * A1: Data curation


 * Resources
 * Aragon, C. et al. (2016). Developing a Research Agenda for Human-Centered Data Science. Human Centered Data Science workshop, CSCW 2016.
 * Kling, Rob and Star, Susan Leigh. Human Centered Systems in the Perspective of Organizational and Social Informatics. 1997.
 * Harford, T. (2014). Big data: A big mistake? Significance, 11(5), 14–19.

Week 2: October 3
Day 2 plan


 * Reproducibility and Accountability: data curation, preservation, documentation, and archiving; best practices for open scientific research


 * Assignments due
 * Week 1 reading reflection
 * A1: Data curation


 * Agenda


 * Homework assigned
 * Read and reflect: Olteanu, A., Castillo, C., Diaz, F., & Kıcıman, E. (2019). Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries. Frontiers in Big Data, 2, 13. https://doi.org/10.3389/fdata.2019.00013
 * A2: Bias in data


 * Resources
 * Hickey, Walt. The Bechdel Test: Checking Our Work. FiveThirtyEight, 2014.
 * J. Priem, D. Taraborelli, P. Groth, C. Neylon (2010), Altmetrics: A manifesto, 26 October 2010.
 * Chapter 2 "Assessing Reproducibility" and Chapter 3 "The Basic Reproducible Workflow Template" from The Practice of Reproducible Research. University of California Press, 2018.

Week 3: October 10
Day 3 plan


 * Interrogating datasets: causes and consequences of bias in data; best practices for selecting, describing, and implementing training data


 * Assignments due
 * Week 2 reading reflection


 * Agenda


 * Homework assigned
 * Read both, reflect on one:
 * Wang, Tricia. Why Big Data Needs Thick Data. Ethnography Matters, 2016.
 * Ford, D., Smith, J., Guo, P. J., & Parnin, C. (2016). Paradise unplugged: Identifying barriers for female participation on Stack Overflow. Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering, 13-18-Nove, 846–857. https://doi.org/10.1145/2950290.2950331


 * Resources
 * Bender, E. M., & Friedman, B. (2018). Data Statements for NLP: Toward Mitigating System Bias and Enabling Better Science. To appear in Transactions of the ACL.
 * Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J. W., Wallach, H., Daumé III, H., & Crawford, K. (2018). Datasheets for datasets. arXiv preprint arXiv:1803.09010.

Week 4: October 17
Day 4 plan


 * Introduction to mixed-methods research: Big data vs thick data; integrating qualitative research methods into data science practice; crowdsourcing


 * Assignments due
 * Week 3 reading reflection
 * A2: Bias in data


 * Agenda


 * Homework assigned
 * Read and reflect: Barocas, Solon and Nissenbaum, Helen. Big Data's End Run around Anonymity and Consent. In Privacy, Big Data, and the Public Good. 2014.
 * A3: Crowdwork ethnography


 * Qualitative research methods resources
 * Ladner, S. (2016). Practical ethnography: A guide to doing ethnography in the private sector. Routledge.
 * Spradley, J. P. (2016). The ethnographic interview. Waveland Press.
 * Spradley Participant Observation FIXME
 * Eriksson, P., & Kovalainen, A. (2015). Ch 12: Ethnographic Research. In Qualitative methods in business research: A practical guide to social research. Sage.
 * Usability.gov, System usability scale.
 * Nielsen, Jakob (2000). Why you only need to test with five users. nngroup.com.


 * Wikipedia gender gap research resources


 * Crowdwork research resources
 * WeAreDynamo contributors. How to be a good requester and Guidelines for Academic Requesters. Wearedynamo.org

Week 5: October 24
Day 5 plan


 * Research ethics for big data: privacy, informed consent and user treatment


 * Assignments due
 * Week 4 reading reflection


 * Agenda


 * Homework assigned
 * Read and reflect: Mary Gray, Ghost Work FIXME
 * Final project proposal FIXME


 * Resources
 * National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. The Belmont Report. U.S. Department of Health and Human Services, 1979.
 * Bethan Cantrell, Javier Salido, and Mark Van Hollebeke (2016). Industry needs to embrace data ethics: Here's how it could be done. Workshop on Data and Algorithmic Transparency (DAT'16). http://datworkshop.org/
 * Javier Salido (2012). Differential Privacy for Everyone. Microsoft Corporation Whitepaper.
 * Markham, Annette and Buchanan, Elizabeth. Ethical Decision-Making and Internet Research. Association of Internet Researchers, 2012.
 * Hill, Kashmir. Facebook Manipulated 689,003 Users' Emotions For Science. Forbes, 2014.
 * Adam D. I. Kramer, Jamie E. Guillory, and Jeffrey T. Hancock. Experimental evidence of massive-scale emotional contagion through social networks. PNAS 2014 111 (24) 8788-8790; published ahead of print June 2, 2014.
 * Barbaro, Michael and Zeller, Tom. A Face Is Exposed for AOL Searcher No. 4417749. New York Times, 2006.
 * Zetter, Kim. Arvind Narayanan Isn’t Anonymous, and Neither Are You. WIRED, 2012.
 * Gray, Mary. When Science, Customer Service, and Human Subjects Research Collide. Now What? Culture Digitally, 2014.
 * Tene, Omer and Polonetsky, Jules. Privacy in the Age of Big Data. Stanford Law Review, 2012.
 * Dwork, Cynthia. Differential Privacy: A survey of results. Theory and Applications of Models of Computation, 2008.
 * Hsu, Danny. Techniques to Anonymize Human Data. Data Sift, 2015.

Week 6: October 31
Day 6 plan


 * Data science and society: power, data, and society; ethics of crowdwork


 * Assignments due
 * Reading reflection
 * A3: Crowdwork ethnography


 * Agenda


 * Homework assigned
 * Read both, reflect on one:
 * Baumer, E. P. S. (2017). Toward human-centered algorithm design. Big Data & Society.
 * Amershi, S., Cakmak, M., Knox, W. B., & Kulesza, T. (2014). Power to the People: The Role of Humans in Interactive Machine Learning. AI Magazine, 35(4), 105.


 * A4: Project proposal


 * Resources

Week 7: November 7
Day 7 plan


 * Human centered machine learning: algorithmic fairness, transparency, and accountability; methods and contexts for algorithmic audits


 * Assignments due
 * Reading reflection
 * A4: Project proposal


 * Agenda


 * Homework assigned
 * Read and reflect: TBD
 * A5: Final project plan


 * Resources

Week 8: November 14
Day 8 plan


 * User experience and data science: algorithmic interpretability; human-centered methods for designing and evaluating algorithmic systems


 * Assignments due
 * Reading reflection
 * A5: Final project plan


 * Agenda


 * Homework assigned
 * Read and reflect: TBD (data science ethics survey paper)
 * A6: Final project presentation


 * Resources
 * Ethical OS Toolkit and Risk Mitigation Checklist. EthicalOS.org.
 * Morgan, J. 2016. Evaluating Related Articles recommendations. Wikimedia Research.
 * Morgan, J. 2017. Comparing most read and trending edits for the top articles feature. Wikimedia Research.
 * Michael D. Ekstrand, F. Maxwell Harper, Martijn C. Willemsen, and Joseph A. Konstan. 2014. User perception of differences in recommender algorithms. In Proceedings of the 8th ACM Conference on Recommender systems (RecSys '14).
 * Sean M. McNee, John Riedl, and Joseph A. Konstan. 2006. Making recommendations better: an analytic model for human-recommender interaction. In CHI '06 Extended Abstracts on Human Factors in Computing Systems (CHI EA '06).
 * Sean M. McNee, Nishikant Kapoor, and Joseph A. Konstan. 2006. Don't look stupid: avoiding pitfalls when recommending research papers. In Proceedings of the 2006 20th anniversary conference on Computer supported cooperative work (CSCW '06).
 * Michael D. Ekstrand and Martijn C. Willemsen. 2016. Behaviorism is Not Enough: Better Recommendations through Listening to Users. In Proceedings of the 10th ACM Conference on Recommender Systems (RecSys '16).
 * Jess Holbrook. Human Centered Machine Learning. Google Design Blog. 2017.
 * Anderson, Carl. The role of model interpretability in data science. Medium, 2016.

Week 9: November 21
Day 9 plan


 * Data science in organizations: TBD


 * Assignments due
 * Reading reflection


 * Agenda


 * Homework assigned
 * Read and reflect: TBD
 * A6: Final project presentation
 * A7: Final project report


 * Resources

Week 10: November 28 (No Class Session)

 * Assignments due
 * Reading reflection


 * Readings assigned
 * NONE


 * Homework assigned
 * NONE


 * Resources

Week 11: December 5

 * Final presentations: presentation of student projects, course wrap-up


 * Assignments due
 * Reading reflection
 * A6: Final project presentation


 * Readings assigned
 * NONE


 * Homework assigned
 * NONE


 * Resources
 * NONE

Week 12: Finals Week (No Class Session)

 * NO CLASS
 * A7: FINAL PROJECT REPORT DUE BY 5:00PM on Tuesday, December 10 via Canvas
 * LATE PROJECT SUBMISSIONS NOT ACCEPTED.