User:Groceryheist/drafts/Data Science Syllabus: Difference between revisions

From CommunityData
Revision as of 03:24, 9 February 2019

Data Science and Organizational Communication
Principal instructor
Nate TeBlunthuis
Course Catalog Description
Fundamental principles of data science and its implications, including research ethics; data privacy; legal frameworks; algorithmic bias, transparency, fairness and accountability; data provenance, curation, preservation, and reproducibility; human computation; data communication and visualization; the role of data science in organizational context and the societal impacts of data science.

Course Description

The rise of "data science" reflects a broad and ongoing shift in how many teams, organizational leaders, communities of practice, and entire industries create and use knowledge. This class teaches "data science" not only as practiced by data-intensive knowledge workers but also as it is positioned in historical, organizational, institutional, and societal contexts. Students will gain an appreciation for the technical and intellectual aspects of data science, consider critical questions about how data science is often practiced, and envision ethical and effective science practice in their current and future organizational roles. The format of the class will be a mix of lecture, discussion, in-class activities, and qualitative and quantitative research assignments.

The course is designed around two high-stakes projects. In the first stage of the course, students will attend the Community Data Science Workshop (CDSW). I am one of the organizers and instructors of this three-week intensive workshop on basic programming and data analysis skills. The first course project is to apply these skills, together with the conceptual material covered in the course so far, to conduct an original data analysis on a topic of the student's interest. The second high-stakes project is a critical analysis of an organization or work team. For this project, students will serve as consultants to an organizational unit involved in data science. Through interviews and workplace observations, they will gain an understanding of the socio-technical and organizational context of their team. They will then synthesize this understanding with the knowledge they gained from the course material to compose a report offering actionable insights to their team.

Learning Objectives

By the end of this course, students will be able to:

  • Understand what it means to analyze large and complex data effectively and ethically with an understanding of human, societal, organizational, and socio-technical contexts.
  • Take into account the ethical, social, organizational, and legal considerations of data science in organizational and institutional contexts.
  • Combine quantitative and qualitative data to generate critical insights into human behavior.
  • Discuss and evaluate ethical, social, organizational and legal trade-offs of different data analysis, testing, curation, and sharing methods.


Schedule


This page is a work in progress.





Week 1:

Introduction to Human Centered Data Science
What is data science? What is human centered? What is human centered data science?
Assignments due


Readings assigned
Homework assigned
  • Reading reflection
  • Attend week 2 of CDSW





Week 2:

Ethical considerations
privacy, informed consent and user treatment
Assignments due
  • Week 1 reading reflection


Readings assigned
Homework assigned





Week 3:

Reproducibility and Accountability
data curation, preservation, documentation, and archiving; best practices for open scientific research
Assignments due
  • Week 2 reading reflection
  • Attend week 2 of CDSW


Readings assigned
Homework assigned
  • Reading reflection
  • Attend week 3 of CDSW







Week 4: October 18

Interrogating datasets
causes and consequences of bias in data; best practices for selecting, describing, and implementing training data


Assignments due
  • Reading reflection


Readings assigned (Read both, reflect on one)
  • Barley, S. R. (1986). Technology as an occasion for structuring: evidence from observations of CT scanners and the social order of radiology departments. Administrative Science Quarterly, 31(1), 78–108.
  • Orlikowski, W. J., & Barley, S. R. (2001). Technology and institutions: what can research on information technology and research on organizations learn from each other? MIS Q., 25(2), 145–165. https://doi.org/10.2307/3250927
Homework assigned






Week 5:

Technology and Organizing
Assignments due


Readings assigned
  • Passi, S., & Jackson, S. J. (2018). Trust in Data Science: Collaboration, Translation, and Accountability in Corporate Data Science Projects. Proc. ACM Hum.-Comput. Interact., 2(CSCW), 136:1–136:28. https://doi.org/10.1145/3274405
Homework assigned




Week 6:

Data science in Organizational Contexts
Assignments due
Readings assigned (Read both, reflect on one)




Week 7: October 25

Day 5 plan

Day 5 slides

Introduction to mixed-methods research
Big data vs thick data; integrating qualitative research methods into data science practice; crowdsourcing


Assignments due
  • Reading reflection


Readings assigned (Read both, reflect on one)


Homework assigned
  • Reading reflection







Week 8:

Algorithms
algorithmic fairness, transparency, and accountability; methods and contexts for algorithmic audits
Assignments due
  • Reading reflection


Readings assigned
Homework assigned
  • Reading reflection













Week 9: November 22 (No Class Session)

Day 9 plan

Data science for social good
Community-based and participatory approaches to data science; Using data science for society's benefit
Assignments due
  • Reading reflection
  • A4: Final project plan
Agenda
  • Reading reflections discussion
  • Feedback on Final Project Plans
  • Guest lecture: Steven Drucker (Microsoft Research)
  • UI patterns & UX considerations for ML/data-driven applications
  • Final project presentation: what to expect
  • In-class activity: final project peer review


Readings assigned
Homework assigned
  • Reading reflection
Resources






Week 10: November 29

Day 10 plan

Day 10 slides

User experience and big data
Design considerations for machine learning applications; human-centered data visualization; data storytelling
Assignments due
  • Reading reflection


Readings assigned
  • NONE
Homework assigned
  • A5: Final presentation





Week 11: December 6

Day 11 plan

Final presentations
course wrap up, presentation of student projects


Assignments due
  • A5: Final presentation


Agenda
  • Student final presentations
  • Course wrap-up


Readings assigned
  • none!
Homework assigned
  • A6: Final project report (by 11:59pm)
Resources
  • none




Week 12: Finals Week (No Class Session)

  • NO CLASS
  • A6: FINAL PROJECT REPORT DUE BY 11:59PM