Human Centered Data Science (Fall 2019)/Schedule
From CommunityData
Revision as of 23:46, 8 September 2019 by Jtmorgan (→Week 12: Finals Week (No Class Session))
This page is a work in progress.
Week 1: September 26
- Introduction to Human Centered Data Science
- What is data science? What is human centered? What is human centered data science?
- Assignments due
- fill out the pre-course survey
- Read: Provost, Foster, and Tom Fawcett. Data science and its relationship to big data and data-driven decision making. Big Data 1.1 (2013): 51-59.
- Agenda
- Readings assigned
- Hickey, Walt. The Dollars and Cents Case Against Hollywood's Exclusion of Women. FiveThirtyEight, 2014.
- Keegan, Brian. The Need for Openness in Data Journalism. 2014.
- Homework assigned
- Reading reflection
- A1: Data curation
- Resources
- Aragon, C. et al. (2016). Developing a Research Agenda for Human-Centered Data Science. Human Centered Data Science workshop, CSCW 2016.
- Kling, Rob and Star, Susan Leigh. Human Centered Systems in the Perspective of Organizational and Social Informatics. 1997.
- Harford, T. (2014). Big data: A big mistake? Significance, 11(5), 14–19.
Week 2: October 3
- Reproducibility and Accountability
- data curation, preservation, documentation, and archiving; best practices for open scientific research
- Assignments due
- Week 1 reading reflection
- A1: Data curation
- Agenda
- Readings assigned
- Homework assigned
- Reading reflection
- A2: Bias in data
- Resources
- Hickey, Walt. The Bechdel Test: Checking Our Work. FiveThirtyEight, 2014.
- J. Priem, D. Taraborelli, P. Groth, C. Neylon (2010), Altmetrics: A manifesto, 26 October 2010.
- Assignment 1 Data curation resources
- Chapter 2 "Assessing Reproducibility" and Chapter 3 "The Basic Reproducible Workflow Template" from The Practice of Reproducible Research University of California Press, 2018.
- sample code for API calls (view the notebook, download the notebook).
- See the datasets page for examples of well-documented and not-so-well documented open datasets.
Week 3: October 10
- Interrogating datasets
- causes and consequences of bias in data; best practices for selecting, describing, and implementing training data
- Assignments due
- Week 2 reading reflection
- Agenda
- Readings assigned (Read both, reflect on one)
- Wang, Tricia. Why Big Data Needs Thick Data. Ethnography Matters, 2016.
- Shilad Sen, Margaret E. Giesel, Rebecca Gold, Benjamin Hillmann, Matt Lesicko, Samuel Naden, Jesse Russell, Zixiao (Ken) Wang, and Brent Hecht. 2015. Turkers, Scholars, "Arafat" and "Peace": Cultural Communities and Algorithmic Gold Standards. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing (CSCW '15)
- Homework assigned
- Reading reflection
- Resources
- Olteanu, A., Castillo, C., Diaz, F., & Kiciman, E. (2016). Social data: Biases, methodological pitfalls, and ethical boundaries.
- Brian N Larson. 2017. Gender as a Variable in Natural-Language Processing: Ethical Considerations. EthNLP, 3: 30–40.
- Bender, E. M., & Friedman, B. (2018). Data Statements for NLP: Toward Mitigating System Bias and Enabling Better Science. To appear in Transactions of the ACL.
- Isaac L. Johnson, Yilun Lin, Toby Jia-Jun Li, Andrew Hall, Aaron Halfaker, Johannes Schöning, and Brent Hecht. 2016. Not at Home on the Range: Peer Production and the Urban/Rural Divide. CHI '16. DOI: https://doi.org/10.1145/2858036.2858123
- Leo Graiden Stewart, Ahmer Arif, A. Conrad Nied, Emma S. Spiro, and Kate Starbird. 2017. Drawing the Lines of Contention: Networked Frame Contests Within #BlackLivesMatter Discourse. Proc. ACM Hum.-Comput. Interact. 1, CSCW, Article 96 (December 2017), 23 pages. DOI: https://doi.org/10.1145/3134920
- Cristian Danescu-Niculescu-Mizil, Robert West, Dan Jurafsky, Jure Leskovec, and Christopher Potts. 2013. No country for old members: user lifecycle and linguistic change in online communities. In Proceedings of the 22nd international conference on World Wide Web (WWW '13). ACM, New York, NY, USA, 307-318. DOI: https://doi.org/10.1145/2488388.2488416
Week 4: October 17
- Introduction to mixed-methods research
- Big data vs thick data; integrating qualitative research methods into data science practice; crowdsourcing
- Assignments due
- Week 3 reading reflection
- A2: Bias in data
- Agenda
- Homework assigned
- Read and reflect: Barocas, Solon and Nissenbaum, Helen. Big Data's End Run around Anonymity and Consent. In Privacy, Big Data, and the Public Good. 2014.
- A3: Crowdwork ethnography
- Qualitative research methods resources
- Ladner, S. (2016). Practical ethnography: A guide to doing ethnography in the private sector. Routledge.
- Spradley, J. P. (2016). The ethnographic interview. Waveland Press.
- Spradley Participant Observation FIXME
- Eriksson, P., & Kovalainen, A. (2015). Ch 12: Ethnographic Research. In Qualitative methods in business research: A practical guide to social research. Sage.
- Usability.gov, System usability scale.
- Nielsen, Jakob (2000). Why you only need to test with five users. nngroup.com.
- Wikipedia gender gap research resources
- Crowdwork research resources
- WeAreDynamo contributors. How to be a good requester and Guidelines for Academic Requesters. Wearedynamo.org
Week 5: October 24
- Research ethics for big data
- privacy, informed consent and user treatment
- Assignments due
- Week 4 reading reflection
- Agenda
- Homework assigned
- Read and reflect: Mary Gray, Ghost Work FIXME
- Final project proposal FIXME
- Resources
- National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. The Belmont Report. U.S. Department of Health and Human Services, 1979.
- Bethan Cantrell, Javier Salido, and Mark Van Hollebeke (2016). Industry needs to embrace data ethics: Here's how it could be done. Workshop on Data and Algorithmic Transparency (DAT'16). http://datworkshop.org/
- Javier Salido (2012). Differential Privacy for Everyone. Microsoft Corporation Whitepaper.
- Markham, Annette and Buchanan, Elizabeth. Ethical Decision-Making and Internet Research. Association of Internet Researchers, 2012.
- Hill, Kashmir. Facebook Manipulated 689,003 Users' Emotions For Science. Forbes, 2014.
- Adam D. I. Kramer, Jamie E. Guillory, and Jeffrey T. Hancock. Experimental evidence of massive-scale emotional contagion through social networks. PNAS 2014 111 (24) 8788-8790; published ahead of print June 2, 2014.
- Barbaro, Michael and Zeller, Tom. A Face Is Exposed for AOL Searcher No. 4417749. New York Times, 2006.
- Zetter, Kim. Arvind Narayanan Isn’t Anonymous, and Neither Are You. WIRED, 2012.
- Gray, Mary. When Science, Customer Service, and Human Subjects Research Collide. Now What? Culture Digitally, 2014.
- Tene, Omer and Polonetsky, Jules. Privacy in the Age of Big Data. Stanford Law Review, 2012.
- Dwork, Cynthia. Differential Privacy: A survey of results. Theory and Applications of Models of Computation, 2008.
- Hsu, Danny. Techniques to Anonymize Human Data. Data Sift, 2015.
Week 6: October 31
- Data science and society
- power, data, and society; ethics of crowdwork
- Assignments due
- Reading reflection
- A3: Crowdwork ethnography
- Agenda
- Homework assigned
- Read both, reflect on one:
- Baumer, E. P. S. (2017). Toward human-centered algorithm design. Big Data & Society.
- Amershi, S., Cakmak, M., Knox, W. B., & Kulesza, T. (2014). Power to the People: The Role of Humans in Interactive Machine Learning. AI Magazine, 35(4), 105.
- Resources
Week 7: November 7
- Human centered machine learning
- algorithmic fairness, transparency, and accountability; methods and contexts for algorithmic audits
- Assignments due
- Reading reflection
- A4: Project proposal
- Agenda
- Homework assigned
- Read and reflect: TBD
- A5: Final project plan
- Resources
Week 8: November 14
- User experience and data science
- algorithmic interpretability; human-centered methods for designing and evaluating algorithmic systems
- Assignments due
- Reading reflection
- A5: Final project plan
- Agenda
- Homework assigned
- Read and reflect: TBD (data science ethics survey paper)
- A6: Final project presentation
- Resources
- Ethical OS Toolkit and Risk Mitigation Checklist. EthicalOS.org.
- Morgan, J. 2016. Evaluating Related Articles recommendations. Wikimedia Research.
- Morgan, J. 2017. Comparing most read and trending edits for the top articles feature. Wikimedia Research.
- Michael D. Ekstrand, F. Maxwell Harper, Martijn C. Willemsen, and Joseph A. Konstan. 2014. User perception of differences in recommender algorithms. In Proceedings of the 8th ACM Conference on Recommender systems (RecSys '14).
- Sean M. McNee, John Riedl, and Joseph A. Konstan. 2006. Making recommendations better: an analytic model for human-recommender interaction. In CHI '06 Extended Abstracts on Human Factors in Computing Systems (CHI EA '06).
- Sean M. McNee, Nishikant Kapoor, and Joseph A. Konstan. 2006. Don't look stupid: avoiding pitfalls when recommending research papers. In Proceedings of the 2006 20th anniversary conference on Computer supported cooperative work (CSCW '06).
- Michael D. Ekstrand and Martijn C. Willemsen. 2016. Behaviorism is Not Enough: Better Recommendations through Listening to Users. In Proceedings of the 10th ACM Conference on Recommender Systems (RecSys '16).
- Jess Holbrook. Human Centered Machine Learning. Google Design Blog. 2017.
- Anderson, Carl. The role of model interpretability in data science. Medium, 2016.
Week 9: November 21
- Data science in organizations
- TBD
- Assignments due
- Reading reflection
- Agenda
- Homework assigned
- Read and reflect: TBD
- Resources
Week 10: November 28 (No Class Session)
- Assignments due
- Reading reflection
- Readings assigned
- NONE
- Homework assigned
- NONE
- Resources
Week 11: December 5
- Final presentations
- presentation of student projects, course wrap up
- Assignments due
- Reading reflection
- A6: Final project presentation
- Readings assigned
- NONE
- Homework assigned
- NONE
- Resources
- NONE
Week 12: Finals Week (No Class Session)
- NO CLASS
- A7: FINAL PROJECT REPORT DUE BY 5:00PM on Tuesday, December 10
- LATE PROJECT SUBMISSIONS NOT ACCEPTED.