Human Centered Data Science (Fall 2019)/Schedule
This page is a work in progress.
Week 1: September 26
- Introduction to Human Centered Data Science
- What is data science? What is human centered? What is human centered data science?
- Assignments due
- Fill out the pre-course survey
- Read and reflect: Provost, Foster, and Tom Fawcett. Data science and its relationship to big data and data-driven decision making. Big Data 1.1 (2013): 51-59.
- Agenda
- Homework assigned
- Read and reflect on both:
- Hickey, Walt. The Dollars and Cents Case Against Hollywood's Exclusion of Women. FiveThirtyEight, 2014.
- Keegan, Brian. The Need for Openness in Data Journalism. 2014.
- Resources
- Aragon, C. et al. (2016). Developing a Research Agenda for Human-Centered Data Science. Human Centered Data Science workshop, CSCW 2016.
- Kling, Rob and Star, Susan Leigh. Human Centered Systems in the Perspective of Organizational and Social Informatics. 1997.
- Harford, T. (2014). Big data: A big mistake? Significance, 11(5), 14–19.
Week 2: October 3
- Reproducibility and Accountability
- data curation, preservation, documentation, and archiving; best practices for open scientific research
- Assignments due
- Week 1 reading reflection
- A1: Data curation
- Agenda
- Homework assigned
- Read and reflect: Olteanu, A., Castillo, C., Diaz, F., & Kıcıman, E. (2019). Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries. Frontiers in Big Data, 2, 13. https://doi.org/10.3389/fdata.2019.00013
- A2: Bias in data
- Resources
- Hickey, Walt. The Bechdel Test: Checking Our Work. FiveThirtyEight, 2014.
- J. Priem, D. Taraborelli, P. Groth, C. Neylon (2010), Altmetrics: A manifesto, 26 October 2010.
- Chapter 2 "Assessing Reproducibility" and Chapter 3 "The Basic Reproducible Workflow Template" from The Practice of Reproducible Research University of California Press, 2018.
Week 3: October 10
- Interrogating datasets
- causes and consequences of bias in data; best practices for selecting, describing, and implementing training data
- Assignments due
- Week 2 reading reflection
- Agenda
- Homework assigned
- Read both, reflect on one:
- Wang, Tricia. Why Big Data Needs Thick Data. Ethnography Matters, 2016.
- Ford, D., Smith, J., Guo, P. J., & Parnin, C. (2016). Paradise unplugged: Identifying barriers for female participation on Stack Overflow. Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering, 13-18-Nove, 846–857. https://doi.org/10.1145/2950290.2950331
- Resources
- Bender, E. M., & Friedman, B. (2018). Data Statements for NLP: Toward Mitigating System Bias and Enabling Better Science. To appear in Transactions of the ACL.
- Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J. W., Wallach, H., Daumé III, H., & Crawford, K. (2018). Datasheets for datasets. arXiv preprint arXiv:1803.09010.
Week 4: October 17
- Introduction to mixed-methods research
- Big data vs thick data; integrating qualitative research methods into data science practice; crowdsourcing
- Assignments due
- Week 3 reading reflection
- A2: Bias in data
- Agenda
- Homework assigned
- Read and reflect: Barocas, Solon and Nissenbaum, Helen. Big Data's End Run around Anonymity and Consent. In Privacy, Big Data, and the Public Good. 2014. (PDF available on Canvas)
- A3: Crowdwork ethnography
- Qualitative research methods resources
- Ladner, S. (2016). Practical ethnography: A guide to doing ethnography in the private sector. Routledge.
- Spradley, J. P. (2016). The ethnographic interview. Waveland Press.
- Spradley, J. P. (2016). Participant observation. Waveland Press.
- Eriksson, P., & Kovalainen, A. (2015). Ch 12: Ethnographic Research. In Qualitative methods in business research: A practical guide to social research. Sage.
- Usability.gov, System usability scale.
- Nielsen, Jakob (2000). Why you only need to test with five users. nngroup.com.
- Crowdwork research resources
- WeAreDynamo contributors. How to be a good requester and Guidelines for Academic Requesters. Wearedynamo.org
Week 5: October 24
- Research ethics for big data
- privacy, informed consent and user treatment
- Assignments due
- Reading reflection
- Agenda
- Homework assigned
- Read and reflect: Gray, M. L., & Suri, S. (2019). Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass. Eamon Dolan Books. (PDF available on Canvas)
- Resources
- National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. The Belmont Report. U.S. Department of Health and Human Services, 1979.
- Bethan Cantrell, Javier Salido, and Mark Van Hollebeke (2016). Industry needs to embrace data ethics: Here's how it could be done. Workshop on Data and Algorithmic Transparency (DAT'16). http://datworkshop.org/
- Javier Salido (2012). Differential Privacy for Everyone. Microsoft Corporation Whitepaper.
- Markham, Annette and Buchanan, Elizabeth. Ethical Decision-Making and Internet Researchers. Association for Internet Research, 2012.
- Hill, Kashmir. Facebook Manipulated 689,003 Users' Emotions For Science. Forbes, 2014.
- Adam D. I. Kramer, Jamie E. Guillory, and Jeffrey T. Hancock. Experimental evidence of massive-scale emotional contagion through social networks. PNAS 2014 111 (24) 8788-8790; published ahead of print June 2, 2014.
- Barbaro, Michael and Zeller, Tom. A Face Is Exposed for AOL Searcher No. 4417749. New York Times, 2006.
- Zetter, Kim. Arvind Narayanan Isn’t Anonymous, and Neither Are You. WIRED, 2012.
- Gray, Mary. When Science, Customer Service, and Human Subjects Research Collide. Now What? Culture Digitally, 2014.
- Tene, Omer and Polonetsky, Jules. Privacy in the Age of Big Data. Stanford Law Review, 2012.
- Dwork, Cynthia. Differential Privacy: A survey of results. Theory and Applications of Models of Computation, 2008.
- Hsu, Danny. Techniques to Anonymize Human Data. Data Sift, 2015.
Week 6: October 31
- Data science and society
- power, data, and society; ethics of crowdwork
- Assignments due
- Reading reflection
- A3: Crowdwork ethnography
- Agenda
- Homework assigned
- Read both, reflect on one:
- Baumer, E. P. S. (2017). Toward human-centered algorithm design. Big Data & Society.
- Amershi, S., Cakmak, M., Knox, W. B., & Kulesza, T. (2014). Power to the People: The Role of Humans in Interactive Machine Learning. AI Magazine, 35(4), 105.
- Resources
- Lilly C. Irani and M. Six Silberman. 2013. Turkopticon: interrupting worker invisibility in Amazon Mechanical Turk. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '13). DOI: https://doi.org/10.1145/2470654.2470742
- Ingold, David and Soper, Spencer. Amazon Doesn’t Consider the Race of Its Customers. Should It? Bloomberg, 2016.
- Julia Angwin, Jeff Larson, Surya Mattu and Lauren Kirchner. Machine Bias: Risk Assessment in Criminal Sentencing. ProPublica, May 2016.
Week 7: November 7
- Human centered machine learning
- algorithmic fairness, transparency, and accountability; methods and contexts for algorithmic audits
- Assignments due
- Reading reflection
- A4: Project proposal
- Agenda
- Homework assigned
- Read and reflect: Kocielnik, R., Amershi, S., & Bennett, P. N. (2019). Will You Accept an Imperfect AI? Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems - CHI ’19, 1–14. https://doi.org/10.1145/3290605.3300641
- A5: Final project plan
- Resources
- Christian Sandvig, Kevin Hamilton, Karrie Karahalios, Cedric Langbort (2014/05/22) Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms. Paper presented to "Data and Discrimination: Converting Critical Concerns into Productive Inquiry," a preconference at the 64th Annual Meeting of the International Communication Association. May 22, 2014; Seattle, WA, USA.
- Shahriari, K., & Shahriari, M. (2017). IEEE standard review - Ethically aligned design: A vision for prioritizing human wellbeing with artificial intelligence and autonomous systems. Institute of Electrical and Electronics Engineers
- ACM US Policy Council Statement on Algorithmic Transparency and Accountability. January 2017.
- Asilomar AI Principles. Future of Life Institute, 2017.
- Diakopoulos, N., Friedler, S., Arenas, M., Barocas, S., Hay, M., Howe, B., … Zevenbergen, B. (2018). Principles for Accountable Algorithms and a Social Impact Statement for Algorithms. Fatml.Org 2018.
- Friedman, B., & Nissenbaum, H. (1996). Bias in Computer Systems. ACM Trans. Inf. Syst., 14(3), 330–347.
- Nate Matias, 2017. How Anyone Can Audit Facebook's Newsfeed. Medium.com
- Hill, Kashmir. Facebook figured out my family secrets, and it won't tell me how. Gizmodo, 2017.
- Blue, Violet. Google’s comment-ranking system will be a hit with the alt-right. Engadget, 2017.
- Google's Perspective API
Week 8: November 14
- User experience and data science
- algorithmic interpretability; human-centered methods for designing and evaluating algorithmic systems
- Assignments due
- Reading reflection
- A5: Final project plan
- Agenda
- Homework assigned
- Read and reflect: Kenneth Holstein, Jennifer Wortman Vaughan, Hal Daumé III, Miro Dudik, and Hanna Wallach. 2019. Improving Fairness in Machine Learning Systems: What Do Industry Practitioners Need? In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19). ACM, New York, NY, USA, Paper 600, 16 pages. DOI: https://doi.org/10.1145/3290605.3300830
- A6: Final project presentation
- Resources
- Ethical OS Toolkit and Risk Mitigation Checklist. EthicalOS.org.
- Sean M. McNee, John Riedl, and Joseph A. Konstan. 2006. Making recommendations better: an analytic model for human-recommender interaction. In CHI '06 Extended Abstracts on Human Factors in Computing Systems (CHI EA '06).
- Sean M. McNee, Nishikant Kapoor, and Joseph A. Konstan. 2006. Don't look stupid: avoiding pitfalls when recommending research papers. In Proceedings of the 2006 20th anniversary conference on Computer supported cooperative work (CSCW '06).
- Michael D. Ekstrand and Martijn C. Willemsen. 2016. Behaviorism is Not Enough: Better Recommendations through Listening to Users. In Proceedings of the 10th ACM Conference on Recommender Systems (RecSys '16).
- Jess Holbrook. Human Centered Machine Learning. Google Design Blog. 2017.
- Anderson, Carl. The role of model interpretability in data science. Medium, 2016.
- Fabien Girardin. Experience design in the machine learning era. Medium, 2016.
- Xavier Amatriain and Justin Basilico. Netflix Recommendations: Beyond the 5 stars. Netflix Tech Blog, 2012.
- Bart P. Knijnenburg, Martijn C. Willemsen, Zeno Gantner, Hakan Soncu, and Chris Newell. 2012. Explaining the user experience of recommender systems. User Modeling and User-Adapted Interaction 22, 4-5 (October 2012), 441-504. DOI=http://dx.doi.org/10.1007/s11257-011-9118-4
- Patrick Austin, Facebook, Google, and Microsoft Use Design to Trick You Into Handing Over Your Data, New Report Warns. Gizmodo, 6/18/2018
- Cremonesi, P., Elahi, M., & Garzotto, F. (2017). User interface patterns in recommendation-empowered content intensive multimedia applications. Multimedia Tools and Applications, 76(4), 5275-5309.
Week 9: November 21
- Data science in context
- Doing human centered data science in product organizations; communicating across roles and disciplines; data science for social good
- Assignments due
- Reading reflection
- Agenda
- Homework assigned
- Read and reflect: Alkhatib, A., & Bernstein, M. (2019). Street-Level Algorithms: A Theory at the Gaps Between Policy and Decisions. Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3290605.3300760
- A7: Final project report
- Resources
Week 10: November 28 (No Class Session)
- Assignments due
- Reading reflection
- Readings assigned
- NONE
- Homework assigned
- NONE
- Resources
- Marilynn Larkin, How to give a dynamic scientific presentation. Elsevier Connect, 2015.
- Megan Risdal, Communicating data science: a guide to presenting your work. Kaggle blog, 2016.
- Megan Risdal, Communicating data science: Why and how to visualize information. Kaggle blog, 2016.
- Megan Risdal, Communicating data science: an interview with a storytelling expert. Kaggle blog, 2016.
- Brent Dykes, Data Storytelling: The Essential Data Science Skill Everyone Needs. Forbes, 2016.
Week 11: December 5
- Final presentations
- presentation of student projects, course wrap up
- Assignments due
- Reading reflection
- A6: Final project presentation
- Readings assigned
- NONE
- Homework assigned
- NONE
- Resources
- NONE
Week 12: Finals Week (No Class Session)
- NO CLASS
- A7: FINAL PROJECT REPORT DUE BY 5:00PM on Tuesday, December 10 via Canvas
- LATE PROJECT SUBMISSIONS NOT ACCEPTED.