Week 1: September 28
- Course overview
- What is data science? What is human centered? What is human centered data science?
- Assignments due
- fill out the pre-course survey
- Agenda
- Course overview & orientation
- What do we mean by "data science?"
- What do we mean by "human centered?"
- How does human centered design relate to data science?
- Readings assigned
- Watch: Why Humans Should Care About Data Science (Cecilia Aragon, 2016 HCDE Seminar Series)
- Read: Aragon, C. et al. (2016). Developing a Research Agenda for Human-Centered Data Science. Human Centered Data Science workshop, CSCW 2016.
- Read: Provost, Foster, and Tom Fawcett. Data science and its relationship to big data and data-driven decision making. Big Data 1.1 (2013): 51-59.
- Read: Kling, Rob and Star, Susan Leigh. Human Centered Systems in the Perspective of Organizational and Social Informatics. 1997.
- Homework assigned
- Reading reflection
- Resources
- IDEO.org. The Field Guide to Human-Centered Design. 2015.
- Faraway, Julian. The Decline and Fall of Statistics. Faraway Statistics, 2015.
- Press, Gil. Data Science: What's The Half-Life Of A Buzzword? Forbes, 2013.
- Bloor, Robin. A Data Science Rant. Inside Analysis, 2013.
- Various authors. Position papers from 2016 CSCW Human Centered Data Science Workshop. 2016.
Week 2: October 5
- Ethical considerations in Data Science
- privacy, informed consent and user treatment
- Assignments due
- Week 1 reading reflection
- Agenda
- Informed consent in the age of Data Science
- Privacy
- User expectations
- Inferred information
- Correlation
- Anonymisation strategies
- Readings assigned
- Read: Markham, Annette and Buchanan, Elizabeth. Ethical Decision-Making and Internet Research. Association of Internet Researchers (AoIR), 2012.
- Read: Barocas, Solon and Nissenbaum, Helen. Big Data's End Run around Anonymity and Consent. In Privacy, Big Data, and the Public Good. 2014. (PDF on Canvas)
- Homework assigned
- Reading reflection
- Resources
- Wittkower, D.E. Lurkers, creepers, and virtuous interactivity: From property rights to consent and care as a conceptual basis for privacy concerns and information ethics
- National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. The Belmont Report. U.S. Department of Health and Human Services, 1979.
- Hill, Kashmir. Facebook Manipulated 689,003 Users' Emotions For Science. Forbes, 2014.
- Adam D. I. Kramer, Jamie E. Guillory, and Jeffrey T. Hancock. Experimental evidence of massive-scale emotional contagion through social networks. PNAS 2014 111 (24) 8788-8790; published ahead of print June 2, 2014.
- Barbaro, Michael and Zeller, Tom. A Face Is Exposed for AOL Searcher No. 4417749. New York Times, 2006.
- Zetter, Kim. Arvind Narayanan Isn’t Anonymous, and Neither Are You. WIRED, 2012.
- Gray, Mary. When Science, Customer Service, and Human Subjects Research Collide. Now What? Culture Digitally, 2014.
- Tene, Omer and Polonetsky, Jules. Privacy in the Age of Big Data. Stanford Law Review, 2012.
- Dwork, Cynthia. Differential Privacy: A survey of results. Theory and Applications of Models of Computation, 2008.
- Green, Matthew. What is Differential Privacy? A Few Thoughts on Cryptographic Engineering, 2016.
- Hsu, Danny. Techniques to Anonymize Human Data. Data Sift, 2015.
- Metcalf, Jacob. Twelve principles of data ethics. Ethical Resolve, 2016.
- Poor, Nathaniel and Davidson, Roei. When The Data You Want Comes From Hackers, Or, Looking A Gift Horse In The Mouth. CSCW Human Centered Data Science Workshop, 2016.
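
The anonymization and differential privacy resources above (Dwork; Green) center on adding calibrated noise to aggregate statistics so that no individual's record can be recovered from a released result. Below is a minimal, illustrative sketch of the Laplace mechanism, assuming a count query with sensitivity 1 and made-up privacy budgets; it is a teaching toy, not a vetted privacy implementation.

```python
import numpy as np

def laplace_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Release a differentially private count by adding Laplace noise.

    The noise scale is sensitivity / epsilon: a smaller epsilon means
    stronger privacy and a noisier released answer.
    """
    rng = rng or np.random.default_rng()
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

# Hypothetical example: how many survey respondents reported attribute X?
true_count = 42
for eps in (0.1, 1.0, 10.0):   # illustrative privacy budgets
    print(eps, round(laplace_count(true_count, eps), 1))
```

Running it a few times shows the trade-off directly: the eps=0.1 releases vary widely around 42, while the eps=10 releases are close to the true count but offer much weaker protection.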
Week 3: October 12
- Data provenance, preparation, and reproducibility
- data curation, preservation, documentation, and archiving; best practices for open scientific research
- Assignments due
- Week 2 reading reflection
- Agenda
- Final project overview
- Introduction to open research
- Understanding data licensing and attribution
- Supporting replicability and reproducibility
- Making your research and data accessible
- Working with Wikipedia datasets
- Assignment 1 description
- Readings assigned
- Read: Chapter 2 "Assessing Reproducibility" and Chapter 3 "The Basic Reproducible Workflow Template" from The Practice of Reproducible Research. University of California Press, 2018.
- Read: Hickey, Walt. The Dollars and Cents Case Against Hollywood's Exclusion of Women. FiveThirtyEight, 2014. AND Keegan, Brian. The Need for Openness in Data Journalism. 2014.
- Homework assigned
- Reading reflection
- A1: Data curation
- Examples of well-documented open research projects
- Keegan, Brian. WeatherCrime. GitHub, 2014.
- Geiger, Stuart R. and Halfaker, Aaron. Operationalizing conflict and cooperation between automated software agents in Wikipedia: A replication and expansion of "Even Good Bots Fight". GitHub, 2017.
- Thain, Nithum; Dixon, Lucas; and Wulczyn, Ellery. Wikipedia Talk Labels: Toxicity. Figshare, 2017.
- Narayan, Sneha et al. Replication Data for: The Wikipedia Adventure: Field Evaluation of an Interactive Tutorial for New Users. Harvard Dataverse, 2017.
- Examples of not-so-well documented open research projects
- Eclarke. SWGA paper. GitHub, 2016.
- David Lefevre. Lefevre and Cox: Delayed instructional feedback may be more effective, but is this contrary to learners’ preferences? Figshare, 2016.
- Alneberg. CONCOCT Paper Data. GitHub, 2014.
- Other resources
- Press, Gil. Cleaning Big Data: Most Time-Consuming, Least Enjoyable Data Science Task, Survey Says. Forbes, 2016.
- Christensen, Garret. Manual of Best Practices in Transparent Social Science Research. 2016.
- Hickey, Walt. The Bechdel Test: Checking Our Work. FiveThirtyEight, 2014.
- Chapman et al. Cross Industry Standard Process for Data Mining. IBM, 2000.
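
The agenda item "Working with Wikipedia datasets" and the A1 data curation assignment both come down to acquiring data, recording its provenance, and keeping the raw response so the analysis can be rerun and audited. The sketch below assumes the Wikimedia Pageviews REST API per-article endpoint and a hypothetical output filename; check the current API documentation and terms of use before reusing it.

```python
import json
import requests

# Monthly desktop pageviews for one article (illustrative parameters).
ENDPOINT = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
            "en.wikipedia/desktop/user/{article}/monthly/{start}/{end}")
HEADERS = {"User-Agent": "HCDS course exercise"}  # identify yourself, per API etiquette

def fetch_pageviews(article, start="2017010100", end="2017100100"):
    url = ENDPOINT.format(article=article, start=start, end=end)
    response = requests.get(url, headers=HEADERS)
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    data = fetch_pageviews("Data_science")
    # Save the raw response next to a note of when and how it was collected,
    # so every later processing step starts from a documented artifact.
    with open("pageviews_Data_science_raw.json", "w") as f:
        json.dump(data, f, indent=2)
```

Committing the raw JSON, the script, and a short README describing the source, license, and collection date is the minimum needed for someone else to reproduce the downstream analysis.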
Week 4: October 19
- Study design
- understanding your data; framing research questions; planning your study
- Assignments due
- Reading reflection
- A1: Data curation
- Agenda
- How Wikipedia works (and how it doesn't)
- guest speaker: Morten Warncke-Wang, Wikimedia Foundation
- Sources of bias in data science research
- Sources of bias in Wikipedia data
- Readings assigned
- Shyong (Tony) K. Lam, Anuradha Uduwage, Zhenhua Dong, Shilad Sen, David R. Musicant, Loren Terveen, and John Riedl. 2011. WP:clubhouse?: an exploration of Wikipedia's gender imbalance. In Proceedings of the 7th International Symposium on Wikis and Open Collaboration (WikiSym '11). ACM, New York, NY, USA, 1-10. DOI=http://dx.doi.org/10.1145/2038558.2038560
- Homework assigned
- Reading reflection
- A2: Bias in data
- Resources
- Aschwanden, Christie. Science Isn't Broken FiveThirtyEight, 2015.
- Halfaker, Aaron et al. The Rise and Decline of an Open Collaboration Community: How Wikipedia's reaction to sudden popularity is causing its decline. American Behavioral Scientist, 2012.
- Warncke-Wang, Morten. Autoconfirmed article creation trial. Wikimedia, 2017.
- Wikipedia Or Encyclopædia Britannica: Which Has More Bias? Forbes, 2015. Based on Greenstein, Shane, and Feng Zhu. Do Experts or Collective Intelligence Write with More Bias? Evidence from Encyclopædia Britannica and Wikipedia. Harvard Business School working paper.
Week 5: October 26
- Machine learning
- ethical AI, algorithmic transparency, societal implications of machine learning
- Assignments due
- Reading reflection
- Agenda
- Social implications of machine learning
- Consequences of algorithmic bias
- Sources of algorithmic bias
- Addressing algorithmic bias
- Auditing algorithms
- Readings assigned
- Christian Sandvig, Kevin Hamilton, Karrie Karahalios, Cedric Langbort (2014/05/22) Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms. Paper presented to "Data and Discrimination: Converting Critical Concerns into Productive Inquiry," a preconference at the 64th Annual Meeting of the International Communication Association. May 22, 2014; Seattle, WA, USA.
- Homework assigned
- Reading reflection
- A3: Final project plan
- Resources
- Bamman, David Interpretability in Human-Centered Data Science. 2016 CSCW workshop on Human-Centered Data Science.
- Anderson, Carl. The role of model interpretability in data science. Medium, 2016.
- Hill, Kashmir. Facebook figured out my family secrets, and it won't tell me how. Gizmodo, 2017.
- Blue, Violet. Google’s comment-ranking system will be a hit with the alt-right. Engadget, 2017.
- Ingold, David and Soper, Spencer. Amazon Doesn’t Consider the Race of Its Customers. Should It?. Bloomberg, 2016.
- Mars, Roman. The Age of the Algorithm. 99% Invisible Podcast, 2017.
- Google's Perspective API
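
The Sandvig et al. reading frames an algorithm audit as a comparison of outcomes across groups. One simple disparity check, shown here as a rough sketch with made-up group labels and model decisions, computes the positive-decision rate per group and the gap between them (a demographic-parity style measure, not the full audit methodology from the readings).

```python
from collections import defaultdict

def positive_rate_by_group(groups, predictions):
    """Fraction of positive (1) decisions the model makes for each group."""
    totals, positives = defaultdict(int), defaultdict(int)
    for g, p in zip(groups, predictions):
        totals[g] += 1
        positives[g] += p
    return {g: positives[g] / totals[g] for g in totals}

# Hypothetical audit data: each pair is (group label, model decision 0/1).
groups      = ["A", "A", "A", "A", "B", "B", "B", "B"]
predictions = [ 1,   1,   0,   1,   0,   1,   0,   0 ]

rates = positive_rate_by_group(groups, predictions)
print(rates)                                   # {'A': 0.75, 'B': 0.25}
print("disparity:", max(rates.values()) - min(rates.values()))
```

A large gap is a signal to investigate, not a verdict: the readings stress that interpreting such numbers requires understanding how the data, the model, and the deployment context produced them.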
Week 6: November 2
- Mixed-methods research
- Big data vs thick data; qualitative research in data science
- Assignments due
- Reading reflection
- A2: Bias in data
- Agenda
- Guest speakers: Aaron Halfaker, Caroline Sinders (Wikimedia Foundation)
- Mixed methods research
- Ethnographic methods in data science
- Project plan brainstorm/Q&A session
- Readings assigned
- R. Stuart Geiger and Aaron Halfaker. 2017. Operationalizing conflict and cooperation between automated software agents in Wikipedia: A replication and expansion of "Even Good Bots Fight". Proceedings of the ACM on Human-Computer Interaction (Nov 2017 issue, CSCW 2018 Online First) 1, 2, Article 49. DOI: https://doi.org/10.1145/3134684
- Homework assigned
- Reading reflection
- Resources
- Maximillian Klein. Gender by Wikipedia Language. Wikidata Human Gender Indicators (WHGI), 2017.
- Benjamin Collier and Julia Bear. Conflict, criticism, or confidence: an empirical examination of the gender gap in wikipedia contributions. In Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work (CSCW '12). DOI: https://doi.org/10.1145/2145204.2145265
- Christina Shane-Simpson, Kristen Gillespie-Lynch, Examining potential mechanisms underlying the Wikipedia gender gap through a collaborative editing task, In Computers in Human Behavior, Volume 66, 2017, https://doi.org/10.1016/j.chb.2016.09.043. (PDF on Canvas)
- Amanda Menking and Ingrid Erickson. 2015. The Heart Work of Wikipedia: Gendered, Emotional Labor in the World's Largest Online Encyclopedia. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI '15). https://doi.org/10.1145/2702123.2702514
- Andrea Forte, Nazanin Andalibi, and Rachel Greenstadt. Privacy, Anonymity, and Perceived Risk in Open Collaboration: A Study of Tor Users and Wikipedians. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW '17). DOI: https://doi.org/10.1145/2998181.2998273
Week 7: November 9
- Human computation
- ethics of crowdwork, crowdsourcing methodologies for analysis, design, and evaluation
- Assignments due
- Reading reflection
- A3: Final project plan
- Agenda
- the role of qualitative research in human centered data science
- scaling qualitative research through crowdsourcing
- types of crowdwork
- ethical and practical considerations for crowdwork
- Introduction to assignment 4: Mechanical Turk ethnography
- Readings assigned (read both, reflect on one)
- Lilly C. Irani and M. Six Silberman. 2013. Turkopticon: interrupting worker invisibility in amazon mechanical turk. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '13). DOI: https://doi.org/10.1145/2470654.2470742
- Shilad Sen, Margaret E. Giesel, Rebecca Gold, Benjamin Hillmann, Matt Lesicko, Samuel Naden, Jesse Russell, Zixiao (Ken) Wang, and Brent Hecht. 2015. Turkers, Scholars, "Arafat" and "Peace": Cultural Communities and Algorithmic Gold Standards. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing (CSCW '15). DOI: http://dx.doi.org/10.1145/2675133.2675285
- Homework assigned
- Reading reflection
- A4: Crowdwork ethnography
- Resources
- WeAreDynamo contributors. How to be a good requester and Guidelines for Academic Requesters. Wearedynamo.org
- Wang, Tricia. Why Big Data Needs Thick Data. Ethnography Matters, 2016.
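
When crowdworkers produce the "gold standard" labels discussed in the Sen et al. reading, one basic quality check is how often two annotators agree beyond what chance would produce. Below is a minimal Cohen's kappa sketch with made-up toxicity judgments; real studies usually involve more raters and measures such as Krippendorff's alpha.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Agreement between two raters, corrected for chance agreement."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical judgments from two crowdworkers on the same six comments.
rater_1 = ["toxic", "ok", "ok", "toxic", "ok", "ok"]
rater_2 = ["toxic", "ok", "toxic", "toxic", "ok", "ok"]
print(round(cohens_kappa(rater_1, rater_2), 2))   # 0.67
```

Low agreement can mean unclear task instructions or genuinely ambiguous items, which is exactly the kind of question the crowdwork ethnography assignment asks you to notice from the worker's side.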
Week 8: November 16
- User experience and big data
- user-centered design and evaluation of recommender systems; UI design for data science, collaborative visual analytics
- Assignments due
- Reading reflection
- Agenda
- HCD process in the design of data-driven applications
- understanding user needs, user intent, and context of use in recommender system design
- trust, empowerment, and seamful design
- HCD in data analysis and visualization
- final project lightning feedback sessions
- Readings assigned
- Michael D. Ekstrand, F. Maxwell Harper, Martijn C. Willemsen, and Joseph A. Konstan. 2014. User perception of differences in recommender algorithms. In Proceedings of the 8th ACM Conference on Recommender systems (RecSys '14). ACM, New York, NY, USA, 161-168. DOI: https://doi.org/10.1145/2645710.2645737
- Chen, N., Brooks, M., Kocielnik, R., Hong, R., Smith, J., Lin, S., Qu, Z., Aragon, C. Lariat: A visual analytics tool for social media researchers to explore Twitter datasets. Proceedings of the 50th Hawaii International Conference on System Sciences (HICSS), Data Analytics and Data Mining for Social Media Minitrack (2017)
- Homework assigned
- Reading reflection
- Resources
- Sean M. McNee, John Riedl, and Joseph A. Konstan. 2006. Making recommendations better: an analytic model for human-recommender interaction. In CHI '06 Extended Abstracts on Human Factors in Computing Systems (CHI EA '06). ACM, New York, NY, USA, 1103-1108. DOI=http://dx.doi.org/10.1145/1125451.1125660
- Kevin Crowston and the Gravity Spy Team. 2017. Gravity Spy: Humans, Machines and The Future of Citizen Science. In Companion of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW '17 Companion). ACM, New York, NY, USA, 163-166. DOI: https://doi.org/10.1145/3022198.3026329
- Michael D. Ekstrand and Martijn C. Willemsen. 2016. Behaviorism is Not Enough: Better Recommendations through Listening to Users. In Proceedings of the 10th ACM Conference on Recommender Systems (RecSys '16). ACM, New York, NY, USA, 221-224. DOI: https://doi.org/10.1145/2959100.2959179
- Jess Holbrook. Human Centered Machine Learning. Google Design Blog. 2017.
- Xavier Amatriain and Justin Basilico. Netflix Recommendations: Beyond the 5 stars. Netflix Tech Blog, 2012.
- Fabien Girardin. Experience design in the machine learning era. Medium, 2016.
- Brian Whitman. How music recommendation works - and doesn't work. Variogram, 2012.
- Paul Lamere. How good is Google's Instant Mix?. Music Machinery, 2011.
- Snyder, Jaime. Values in the Design of Visualizations. 2016 CSCW workshop on Human-Centered Data Science.
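
The Ekstrand et al. and McNee et al. readings argue that offline accuracy metrics alone do not capture how users perceive recommendations, yet such metrics are usually where evaluation starts. Here is a minimal sketch of one common offline measure, precision@k against held-out items, using hypothetical data; this week's point is that a number like this needs to be complemented with user-centered evaluation.

```python
def precision_at_k(recommended, relevant, k=5):
    """Fraction of the top-k recommended items the user actually liked."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

# Hypothetical output of two recommender variants for one user,
# scored against that user's held-out (known-liked) items.
held_out  = {"m12", "m40", "m77", "m90"}
variant_a = ["m12", "m03", "m77", "m55", "m21"]
variant_b = ["m90", "m12", "m40", "m77", "m08"]

print("A:", precision_at_k(variant_a, held_out))   # 0.4
print("B:", precision_at_k(variant_b, held_out))   # 0.8
```

Variant B "wins" offline here, but the readings show users may still prefer the other list for reasons (diversity, novelty, trust) the metric never sees.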
Week 9: November 23
- Human-centered data science in the wild
- community data science; data science for social good
- Assignments due
- Reading reflection
- A4: Crowdwork ethnography
- Agenda
- NO CLASS - work on your own
- Readings assigned
- Hill, B. M., Dailey, D., Guy, R. T., Lewis, B., Matsuzaki, M., & Morgan, J. T. (2017). Democratizing Data Science: The Community Data Science Workshops and Classes. In N. Jullien, S. A. Matei, & S. P. Goggins (Eds.), Big Data Factories: Scientific Collaborative approaches for virtual community data collection, repurposing, recombining, and dissemination. New York, New York: Springer Nature. [Preprint/Draft PDF]
- Bivens, R. and Haimson, O.L. 2016. Baking Gender Into Social Media Design: How Platforms Shape Categories for Users and Advertisers. Social Media + Society. 2, 4 (2016), 205630511667248. DOI:https://doi.org/10.1177/2056305116672486.
- Schlesinger, A. et al. 2017. Intersectional HCI: Engaging Identity through Gender, Race, and Class. Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems - CHI ’17. (2017), 5412–5427. DOI:https://doi.org/10.1145/3025453.3025766.
- Homework assigned
- Reading reflection
- Resources
- Berney, Rachel, Bernease Herman, Gundula Proksch, Hillary Dawkins, Jacob Kovacs, Yahui Ma, Jacob Rich, and Amanda Tan. Visualizing Equity: A Data Science for Social Good Tool and Model for Seattle. Data Science for Social Good Conference, September 2017, Chicago, Illinois USA (2017).
- Sayamindu Dasgupta and Benjamin Mako Hill. Learning With Data: Designing for Community Introspection and Exploration. Position paper for Developing a Research Agenda for Human-Centered Data Science (a CSCW 2016 workshop).
Week 10: November 30
- Communicating methods, results, and implications
- translating for non-data scientists
- Assignments due
- Reading reflection
- Agenda
- communicating about your research effectively and honestly to different audiences
- publishing your research openly
- disseminating your research
- final project workshop
- Readings assigned
- Megan Risdal, Communicating data science: a guide to presenting your work. Kaggle blog, 2016.
- Marilynn Larkin, How to give a dynamic scientific presentation. Elsevier Connect, 2015.
- Homework assigned
- Reading reflection
- A5: Final presentation
- Resources
- Bart P. Knijnenburg, Martijn C. Willemsen, Zeno Gantner, Hakan Soncu, and Chris Newell. 2012. Explaining the user experience of recommender systems. User Modeling and User-Adapted Interaction 22, 4-5 (October 2012), 441-504. DOI=http://dx.doi.org/10.1007/s11257-011-9118-4
- Sean M. McNee, Nishikant Kapoor, and Joseph A. Konstan. 2006. Don't look stupid: avoiding pitfalls when recommending research papers. In Proceedings of the 2006 20th anniversary conference on Computer supported cooperative work (CSCW '06). ACM, New York, NY, USA, 171-180. DOI=http://dx.doi.org/10.1145/1180875.1180903
- Megan Risdal, Communicating data science: Why and how to visualize information. Kaggle blog, 2016.
- Megan Risdal, Communicating data science: an interview with a storytelling expert. Kaggle blog, 2016.
- Richard Garber, Power of brief speeches: World War I and the Four Minute Men. Joyful Public Speaking, 2010.
- Brent Dykes, Data Storytelling: The Essential Data Science Skill Everyone Needs. Forbes, 2016.
Week 11: December 7
- Future of human centered data science
- course wrap up, final presentations
- Assignments due
- Reading reflection
- A5: Final presentation
- Agenda
- future directions of human centered data science
- final presentations
- Readings assigned
- none!
- Homework assigned
- none!
- Resources
- none!
Week 12: Finals Week
- NO CLASS
- A6: FINAL PROJECT REPORT DUE BY 11:59PM on Sunday, December 10
- LATE PROJECT SUBMISSIONS NOT ACCEPTED.