Data Into Insights (Spring 2021)

= Course Information =
 * COM 495/6/7: Turning Data into Insight and Stories
 * Location: ONLINE
 * Class Hours: Tuesdays and Thursdays; 10:30-11:45am

Instructor

 * Instructor: Jeremy Foote
 * Email: jdfoote@purdue.edu
 * Office Hours: Fridays 10am-noon and by appointment

= Course Overview and Learning Objectives =

We are increasingly surrounded by data, and those with the technical skills to analyze it are highly sought after. Even more valuable are those who can not only identify insights from data, but can communicate and persuade with those insights. This course will focus on both developing data skills and crafting persuasive data stories.

Students who complete this course will be able to:
 * 1) Understand the role of narrative in interpreting and producing data analyses
 * 2) Competently import, process, and prepare data for analysis in the R programming language
 * 3) Critically analyze data visualizations and presentations, and recognize poor or misleading visualizations
 * 4) Produce beautiful, well-designed data visualizations in R using ggplot2
 * 5) Craft compelling data presentations

= Required resources and texts =

Laptop
This is a data analysis class and you will need access to a decent computer. You will need a machine with at least 2GB of memory. Windows, Mac OS, and Linux are all fine but an iPad or Android tablet won't work.

Readings

 * Required texts:
 * Data Visualization: A Practical Introduction by Kieran Healy. Web version (free!) or Print version (Amazon)
 * R for Data Science by Hadley Wickham and Garrett Grolemund. Web version (free!) or Print version (Amazon)
 * Effective Data Storytelling by Brent Dykes. Purdue libraries or Print version (Amazon)


 * Other readings: Readings will be linked to from this page. Where necessary, they will be put on Brightspace

Reading Academic Articles
Some of the readings will be academic articles. I do not expect you to read every word of these articles. Rather, you should practice intentional directed skimming. This article gives a nice overview. The TL;DR is that you should carefully read the abstract, introduction, and conclusion. For the rest of the article, focus on section headings and topic sentences to extract the main ideas.

= Course logistics =

Note About This Syllabus
This is my first time teaching this course and this syllabus will be a dynamic document. Although the core expectations for this class are fixed, the details of readings and assignments may shift based on how the class goes. As a result, there are three important things to keep in mind:


 * 1) Although details on this syllabus will change, I will not change readings or assignments less than one week before they are due. If I don't fill in a "To Be Determined" one week before it's due, it is dropped. If you plan to read more than one week ahead, contact me first.
 * 2) Closely monitor the class Discord. Because this is a wiki, you will be able to track every change by clicking the history button on this page. I will also summarize these changes in an announcement on Discord that should be emailed to everybody in the class.
 * 3) I will ask the class for voluntary anonymous feedback frequently. Please let me know what is working and what can be improved.

Class Sessions
This course will follow a "flipped" classroom model. I expect you to learn most of the content of the course asynchronously. The goal of our time together is not to tell you new things, but to consolidate knowledge and to clear up misconceptions.

The Tuesday meeting will be a collaborative, discussion-centric session. Typically, about half of each session will be devoted to going over assignments and the other half will be a discussion of the readings and videos from that week. We will take collaborative notes using this Etherpad.

If you would like to create collaborative summaries of the readings, you can use this Etherpad.

The Thursday meetings will be more like a lab. Some of these sessions will include synchronous activities but they will often be more of a co-working time, where you can work synchronously on assignments and I can be available to answer questions.

Getting Help
Your first place to look for help should be each other. By asking and answering questions on Discord, you will not only help to build a repository of shared information, but also reinforce our learning community.

I will also hold office hours Friday mornings on Discord (sign up here). If you come with a programming question, I will expect that you have already tried to solve it yourself in multiple ways and that you have discussed it with a classmate (e.g., on Discord). This policy lets me have time to help more students, but it's also a useful strategy. Often just trying to explain your code can help you to recognize where you've gone wrong.

I will also keep an eye on Discord during normal business hours. I encourage you to post questions there, and to use it as a space where we can help and instruct each other. In general, you should contact me there. I am also available by email. You can reach me at [mailto:jdfoote@purdue.edu jdfoote@purdue.edu]. I try hard to maintain a boundary between work and home and I typically respond only on weekdays during business hours.

Resources
Especially for the programming assignments, I will often create video walkthroughs that will be linked from the schedule. I also created the following general videos that may be helpful:


 * Explanation of ggplot (and Chapter 3 in R4DS) [Video]
 * Finding and fixing bugs in your code [Video] [R Markdown file] [HTML file]

= Assignments =

There will be multiple types of assignments, designed to encourage learning in different ways.

Participation
This will be a very participatory class, and I expect you to be an active member of our class, engaged in helping us all to gain insight and inspiration. This includes paying attention in class, participating in activities, and being actively engaged in learning, thinking about, and trying to understand the material.

This also includes doing the readings and watching the videos. To make sure that everyone has an opportunity to participate and to encourage you to do the assignments, I will randomly select students to answer discussion questions or to explain portions of homework assignments and labs. I will keep track of the quantity and quality of your responses and I will make that data available to you to help guide our discussion around grades.

Discussion Questions
This course will have two "modes". For much of the class, we will be reading about theories of communication and rhetoric, about principles of data visualization, etc. For these sessions, you will be required to submit 1-2 discussion questions on Discord on Monday by noon. I will then curate some of these questions (and add some of my own) to use to guide our discussion on Tuesday. I will post the questions on the Etherpad at https://etherpad.wikimedia.org/p/com-495-data-insight

Questions should engage with the readings and either connect to other concepts or to the "real world". Here are some good example questions:


 * The readings this week talked a lot about how data visualizations can be misleading. How can we tell when visualizations are intentionally trying to mislead versus when they are just poorly designed?
 * I was confused by the reading on counterfactuals. We obviously can't really know what would have happened in different conditions, so why even try?
 * Imagine you were asked to create an ad campaign to recruit students to Purdue. What types of appeals would you use and why?

During other weeks, we will be more focused on learning practical skills (mostly data manipulation and visualization in R). In those weeks, discussions will center on identifying places where folks are still confused, and students will be randomly selected to share their responses to homework questions.

Homework/Labs
There will be a number of homework assignments. At the beginning of the class, these will be designed to help you to grasp foundational concepts about storytelling, visualization, and data. As the class progresses, more and more of them will be based on learning and developing proficiency in visualizing data in R.

Exams
There will be one in-class exam. It will assess your understanding of core concepts around storytelling and visualization.

Final Project
The main outcome of this course will be your final project, which will be a data presentation, either as a website or a slide deck + presentation. A detailed description of the project is at this link.

There will be a number of intermediate assignments through the semester to help you to identify a dataset, explore the data for insights, and get and give feedback on visualizations and story elements.

= Grades =

This course will follow a "self-assessment" philosophy. I am more interested in helping you to learn things that will be useful to you than in assigning grades. In general, I think that my time is much better spent in providing better feedback and in being available to work through problems together.

The university still requires grades, so you will be leading the evaluation of your work. This will be completed with me in four stages, at the end of weeks 4, 8, 12, and 16. In each stage, you will use this form to reflect on what you have accomplished thus far and how it has met, not met, or exceeded expectations, based on both rubrics and personal goals and objectives. At each of these stages you will receive feedback on your assessments. By the end of the semester, you should have a clear vision of your accomplishments and growth, which you will turn into a grade. As the instructor-of-record, I maintain the right to disagree with your assessment and alter grades as I see fit, but any time that I do this it will be accompanied by an explanation and discussion. These personal assessments, which should show honest and meaningful reflection on your work, will be the most important factor in final grades.

We will use the following rubric in our assessment:


 * 20%: class participation, including attendance and participation in discussions and group work
 * 20%: Labs and homework assignments
 * 25%: Exam
 * 35%: Final Project

The exam will be graded like a normal exam and the score will make up 25% of your grade. For the rest of the assignments (and the other 75% of your grade), I will provide feedback which will inform an ongoing conversation about your work.

My interpretation of grade levels (A, B, C, D/F) is the following:

A: Reflects work that exceeds expectations on multiple fronts and to a great degree. Students reaching this level of achievement will:
 * Do what it takes to learn the principles and techniques of data storytelling, including looking to outside sources if necessary.
 * Engage thoughtfully with an ambitious final project.
 * Take intellectual risks, offering interpretations based on synthesizing material and asking for feedback from peers.
 * Share work early, allowing extra time for engagement with others.
 * Write reflections that grapple meaningfully with lessons learned as well as challenges.
 * Complete all or nearly all homework assignments at a high level.

B: Reflects strong work. Work at this level will be of consistently high quality. Students reaching this level of achievement will:
 * Produce work that is safer or more consistent than the work described above.
 * Ask meaningful questions of peers and engage them in fruitful discussion.
 * Exceed requirements, but in fairly straightforward ways - e.g., an additional post in discussion every week.
 * Compose complete and sufficiently detailed reflections.
 * Complete many of the homework assignments.

C: This reflects meeting the minimum expectations of the course. Students reaching this level of achievement will:
 * Turn in and complete the final project on time.
 * Be collegial and continue discussion by asking simple or limited questions.
 * Compose reflections with straightforward and easily manageable goals and/or avoid discussions of challenges.
 * Leave some homework assignments incomplete, or turn some in hastily.

D/F: These are reserved for cases in which students do not complete work or participate. Students may also be impeding the ability of others to learn.

Extra Credit for Participating in Research Studies
The Brian Lamb School of Communication uses an online program that expedites the process of recruiting, signing up, and granting extra credit to students for participating in research studies. The program is called the Research Participation System, and it provides an easy online method for you to sign up for research studies, to keep track of the studies you have completed, and to view how many credits you have earned for each study. You can access the system online at any time, from any computer with a standard web browser. By participating in studies done within the Brian Lamb School of Communication, you can learn first hand how a study is conducted, you can contribute to the advancement of the field, and you can improve your grade by earning extra credit.


 * You earn ½ percent of extra credit for every half-hour that you participate in a study. The maximum extra credit that you can earn for this course is 3%, which will be added to your total course points.
 * If you sign up to participate in a study and fail to show up without canceling your appointment in advance (up to 2 hours before the study), you can be restricted from signing up for any studies for 30 days. You may quickly cancel your appointment online using the Research Participation System.
 * Please review the instructions before you sign up for studies; to view the instructions go to https://www.cla.purdue.edu/communication/research/participation/students.html
 * You can sign up to participate in studies by logging into http://purdue-comm.sona-systems.com/.
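
As a rough illustration of the extra-credit arithmetic above, the rule can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function name is hypothetical, and it assumes that only completed half-hours count. The Research Participation System itself determines how many credits each study is worth.

```python
# Sketch of the extra-credit rule described above.
# Assumption (not stated in the policy): only completed half-hours count.
CREDIT_PER_HALF_HOUR = 0.5   # percentage points per half-hour of participation
MAX_EXTRA_CREDIT = 3.0       # course-wide cap, in percentage points


def extra_credit(minutes_participated: int) -> float:
    """Return the extra-credit percentage points earned for total study time."""
    if minutes_participated < 0:
        raise ValueError("minutes_participated must be non-negative")
    completed_half_hours = minutes_participated // 30
    return min(completed_half_hours * CREDIT_PER_HALF_HOUR, MAX_EXTRA_CREDIT)


if __name__ == "__main__":
    for minutes in (30, 90, 180, 240):
        print(f"{minutes} minutes -> {extra_credit(minutes)}% extra credit")
```

Under these assumptions, three hours of participation (six half-hours) reaches the 3% cap, and additional study time earns no further credit in this course.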

= Schedule =

NOTE: This section will be modified throughout the course to meet the class's needs. Check back in weekly.

Week 1: Introduction
January 19

Assignment Due:
 * Sign up for Discord and introduce yourself
 * Take this very brief survey

Readings (before class):
 * None

Class Schedule:
 * Class overview and expectations — We'll walk through this syllabus.

January 21

Assignment Due:
 * Read the entire syllabus (this document)

Week 2: Storytelling and Narratives
January 26

Assignment Due:
 * Discussion questions

Readings (before class):
 * Zak, P. (2013). How stories change the brain
 * Langston, C. How to use rhetoric to get what you want (video)
 * Leighfield, L. Ethos, Pathos & Logos: Aristotle’s Modes of Persuasion
 * Purdue OWL Aristotle's Rhetorical Situation
 * Kurt Vonnegut's Shapes of Stories
 * Lafrance, A. The Six Main Arcs in Storytelling, as Identified by an A.I.
 * (Optional) A Rulebook for Arguments (link on Brightspace)

Class Schedule:

Week 3: Data insights and data stories
February 2

Assignment Due:
 * Discussion questions

Readings:
 * Effective Data Storytelling (EDS) Ch. 1-3 (Purdue libraries copy)
 * Matei, S. What is a (data) story?
 * Counterfactuals and Storytelling lecture [4:49]
 * (Optional) Levy, J. (2015). Counterfactuals, Causal Inference, and Historical Analysis
 * (Optional) Storytelling for Data Scientists
 * (Optional) How to properly tell a story with data — and common pitfalls to avoid

Class Schedule:
 * Identifying insights
 * Counterfactual thinking
 * The role of statistics

Week 4: The ethics of data stories (Part I)
February 9

Assignment Due:
 * Turn in your Self Assessment Reflection on Brightspace
 * Case Study (Be prepared to talk about this case, based on the readings and the class so far)
 * No Discussion Questions (but feel free to have discussions on Discord!)

Readings:
 * Salganik, M. (2017). Chapter 6: Ethics from Bit by Bit.
 * Kassner, M. 5 ethics principles big data analysts must follow
 * McNulty, K. (2018). Beware of 'storytelling' in data and analytics
 * (Optional) Steinmann, M., Matei, S. A., & Collmann, J. (2016). A Theoretical Framework for Ethical Reflection in Big Data Research. (On Brightspace)

Class Schedule:
 * Ethical frameworks
 * What are ethical data stories?
 * When do analysts need to make ethical decisions?
 * Transparency, respect, beneficence, honesty

Week 5: Where does data come from?
February 16

Assignment Due:
 * Discussion questions

Readings:
 * Where data comes from lecture [14:02]
 * Pelz, W. Measurement of Constructs in Research Methods for the Social Sciences.
 * Dirty Data article
 * Salganik, M. Observing behavior in Bit by Bit
 * EDS Chapter 5
 * Perkel, J. A toolkit for data transparency takes shape
 * (Optional) Tayi, G. K. and Ballou, D. P. (1998). Examining Data Quality

Class Schedule:

Week 6: Introduction to R
February 23

Assignment Due:
 * R Lab 1
 * Video to help with lab [7:39]

Readings: (Optional)
 * Why Programming + Intro to R lecture [12:53]
 * What I Learned Recreating One Chart Using 24 Tools. Lisa Charlotte Rost
 * R4DS Ch. 1
 * Unit 1: Basic Basics (R Ladies Sydney)
 * Intro to R tutorial (Aaron Shaw)

Class Schedule:

Week 7: Making figures in R
March 2

Assignment Due:


 * R4DS Chapter 3 Exercises
 * Video overview of how to do assignment + ggplot explanation [13:33]

Readings:
 * R4DS Chapter 3
 * DV Chapter 2

Class Schedule:
 * ggplot2

Week 8: Manipulating and Aggregating Data
March 9

Assignment Due:
 * Start R4DS Chapter 5 Exercises
 * Video explanation of homework [26:45]
 * Turn in your Self Assessment Reflection on Brightspace

Readings:
 * R4DS Chapter 4 - Workflow Basics
 * R4DS Chapter 5 - Data transformation

Week 9: Visualization Principles
March 16

Assignment Due:
 * R4DS Chapter 5 Exercises
 * Discussion questions

Readings:
 * Graphic Design by Andrew Heiss. Make sure to watch all 4 videos.
 * EDS Chapter 7
 * Healy, K. Data Visualization Chapter 1
 * (Optional) Gelman, A. and Unwin, A. (2012). Infovis and statistical graphics: Different goals, different looks.
 * (Optional) Williams, R. (2008). The Non-Designer's Design Book, Chapters 1-6

Class Schedule:

March 18 - READING DAY

Week 10: Visualization Principles II and Exploratory Data Analysis
March 23

Assignment Due:
 * Submit the data source for your final project
 * Visualization Project

Readings:
 * DV Chapter 4: Show the right numbers
 * EDS Chapter 8
 * Hullman, J. How to get better at embracing unknowns
 * Yau, N. Visualizing the uncertainty in data.
 * (Optional) Review R4DS Ch 5

Class Schedule:
 * Summarize and discuss readings
 * Peer feedback on data source + visualization project
 * R4DS Chapter 5 (continued)

Week 11: Text as data
March 30

Assignment Due:
 * Discussion questions - One discussion question and one or more examples of "bad" visualizations that you found

Readings:


 * Grimmer, J., & Stewart, B. M. (2013). Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis.
 * Reagan, A. J., Mitchell, L., Kiley, D., Danforth, C. M., & Dodds, P. S. (2016). The emotional arcs of stories are dominated by six basic shapes. EPJ Data Science.

Class Schedule:
 * Guest lecture by Ryan J. Gallagher

Week 12: Advanced visualizations in R
April 6

Assignment Due:
 * Self Assessment Reflection
 * Story Time Mini-project

Readings:
 * DV Chapter 7: Maps
 * R4DS Ch. 28

Class Schedule:
 * Maps
 * Networks
 * Annotations

Week 13: Importing and cleaning data
April 13

READING DAY


 * Synchronous session moved to April 15

April 15

Assignment Due:
 * Proposal for final project
 * R4DS Chapter 12 (12.2 and 12.3)

Readings:
 * R4DS Chapters 11-12
 * (Optional) Wickham, H. (2014). Tidy Data. Journal of Statistical Software, 59(10), 1-23.
 * (Optional) Huntington-Klein, N. Data Wrangling with R and the Tidyverse

Class schedule:
 * Provide peer feedback on final project proposal

Week 14: Crafting data stories
April 20

Assignment Due:
 * One discussion question
 * New version of final project proposal (edited following peer feedback)
 * R4DS Chapter 12 (12.4-12.6)

Readings:
 * Kim, Y. et al. (2017). Explaining the Gap: Visualizing One’s Predictions Improves Recall and Comprehension of Data.
 * Knaflic, C. N. (2019). Storytelling with Data Chapter 6
 * EDS Chapter 9

Week 15: Ethics of data stories (Part II)
April 27

Assignment Due:
 * 1 Discussion question
 * Final project rough draft for peer feedback

Readings:
 * Re-read McNulty, K. (2018). Beware of 'storytelling' in data and analytics and reflect on how you see this differently now that you know more about data storytelling

Topics:
 * What does an ethical data story look like?

April 29

Assignment Due:
 * Peer feedback (via email or Discord)

Week 16: Finals week
Assignment Due:
 * Final Project - Due Thursday, May 6
 * Turn in your Final self reflection on Brightspace

= Policies =

Attendance
In general, I expect students to attend our Tuesday meetings and to typically attend our Thursday meetings. If you need to miss a class, please let me know well in advance so that arrangements can be made for making up the work that was missed. It is your responsibility to seek out support from classmates for notes, handouts, and other information.

Only the instructor can excuse a student from a course requirement or responsibility. When conflicts can be anticipated, such as for many University-sponsored activities and religious observances, the student should inform the instructor of the situation as far in advance as possible. For unanticipated or emergency conflicts, when advance notification to an instructor is not possible, the student should contact me as soon as possible on Discord or by email. In cases of bereavement, quarantine, or isolation, the student or the student’s representative should contact the Office of the Dean of Students via email or phone at 765-494-1747. Our course Brightspace includes a link to the Dean of Students under 'Campus Resources.'

Classroom Discussions and Peer Feedback
Throughout the course, you may receive, read, collaborate on, and/or comment on classmates’ work. These assignments are for class use only. You may not share them with anybody outside of class without explicit written permission from the document’s author for that specific piece.

It is essential to the success of this class that all participants feel comfortable discussing questions, thoughts, ideas, fears, reservations, apprehensions and confusion. Therefore, you may not create any audio or video recordings during class time nor share verbatim comments with those not in class linked to people’s identities unless you get clear and explicit permission. If you want to share general impressions or specifics of in-class discussions with those not in class, please do so without disclosing personal identities or details.

Academic Integrity
While I encourage collaboration, I expect that any work that you submit is your own. Basic guidelines for Purdue students are outlined here but I expect you to be exemplary members of the academic community. Please get in touch if you have any questions or concerns.

Nondiscrimination
I strongly support Purdue's policy of nondiscrimination (below). If you feel like any member of our classroom, including me, is not living up to these principles, then please come and talk to me about it.

Purdue University is committed to maintaining a community which recognizes and values the inherent worth and dignity of every person; fosters tolerance, sensitivity, understanding, and mutual respect among its members; and encourages each individual to strive to reach his or her own potential. In pursuit of its goal of academic excellence, the University seeks to develop and nurture diversity. The University believes that diversity among its many members strengthens the institution, stimulates creativity, promotes the exchange of ideas, and enriches campus life.

Accessibility
Purdue University strives to make learning experiences as accessible as possible. If you anticipate or experience physical or academic barriers based on disability, you are welcome to let me know so that we can discuss options. You are also encouraged to contact the Disability Resource Center at: drc@purdue.edu or by phone: 765-494-1247.

Emergency Preparation
In the event of a major campus emergency, I will update the requirements and deadlines as needed.

Mental Health
If you or someone you know is feeling overwhelmed, depressed, and/or in need of mental health support, services are available. For help, such individuals should contact Counseling and Psychological Services (CAPS) at 765-494-6995 during and after hours, on weekends and holidays, or by going to the CAPS office on the second floor of the Purdue University Student Health Center (PUSH) during business hours.

Incompletes
A grade of incomplete (I) will be given only in unusual circumstances. The request must describe the circumstances, along with a proposed timeline for completing the course work. Submitting a request does not ensure that an incomplete grade will be granted. If granted, you will be required to fill out and sign an “Incomplete Contract” form that will be turned in with the course grades. Any requests made after the course is completed will not be considered for an incomplete grade.

Additional Policies
Links to additional Purdue policies are on our Brightspace page. If you have questions about policies, please get in touch.