Data Into Insights (Spring 2021)/Final project

From CommunityData

The final project in this class will be a full data storytelling presentation. The goal is to showcase the skills that you have learned to create exactly the kind of presentation that you may be asked to work on in your careers. I hope that you create something that you can show to potential employers, graduate admissions committees, or love interests.

Step 1: Identify a dataset

You all have different interests and I want you to be able to tell a story about something that interests you. So, the first step is to identify a dataset that you want to work on and to brainstorm at least three questions that you would like to explore in the dataset.

I strongly recommend that you find a dataset that is already organized and fairly well-cleaned. In other words, I don't recommend doing something that will require merging data from multiple datasets, gathering data from APIs, etc. (although talk to me if you have a great idea that you think you can pull off!). In general, you should be looking for data that's in CSV format (there are some even better formats like `.feather` or `.RData` but CSV is usually as good as it gets).
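
As a quick check that a candidate dataset is in usable shape, a first look in R might resemble the sketch below. It uses only base R. The small `movies` table and its column names are made up for illustration and written to a temporary file so the example runs on its own; in practice you would point `read.csv()` at the file you downloaded.

```r
# Hypothetical example data, written to a temporary CSV so the sketch is self-contained.
movies <- data.frame(
  title  = c("Alpha", "Bravo", "Charlie", "Delta"),
  genre  = c("drama", "comedy", "drama", "comedy"),
  rating = c(7.1, 6.4, NA, 8.0)
)
csv_path <- tempfile(fileext = ".csv")
write.csv(movies, csv_path, row.names = FALSE)

# Read the file back in, as you would with a downloaded dataset.
dat <- read.csv(csv_path, stringsAsFactors = FALSE)

# First look: dimensions, column types, and missing values per column.
dim(dat)
str(dat)
n_missing <- colSums(is.na(dat))
n_missing
```

If the column types look wrong or many values are missing, that is a sign the dataset may need more cleaning than you want to take on.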


There are lots of places to find data. Here are a few lists but feel free to identify your own:

Due date: March 23

Step 2: Explore the data and write a proposal

Take some time to explore the data: read the documentation and start exploring the questions that you brainstormed. Identify two promising insights from the data that you would like to pursue, and identify a stakeholder who would be interested in those insights.
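
One way to start probing a brainstormed question is to compare a summary statistic across groups. The sketch below uses base R's `table()` and `aggregate()` on a made-up data frame; the `genre` and `rating` columns are hypothetical and stand in for whatever variables your own question involves.

```r
# Hypothetical data standing in for your own dataset.
dat <- data.frame(
  genre  = c("drama", "comedy", "drama", "comedy", "drama"),
  rating = c(7.0, 6.0, 8.0, 5.0, 9.0)
)

# How many observations fall in each group?
genre_counts <- table(dat$genre)
genre_counts

# Mean rating per genre: a first pass at "do ratings differ by genre?"
by_genre <- aggregate(rating ~ genre, data = dat, FUN = mean)
by_genre
```

A difference between groups here is only a starting point; it suggests an insight worth pursuing, not yet an explanation for it.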

Write a short proposal (~1 page) explaining what you have done, why you think it is interesting, and your plan for analysis moving forward.

Due date: April 15

Step 3: Write a rough draft

Next, you will use counterfactual thinking to identify possible explanations for the insights you have found, and you will pursue further data gathering and/or analyses to explore these possibilities.

Then, you will craft a data story intended to persuade the stakeholder that you identified about what you have found, using the tools of data storytelling that we have learned, including narrative and visualizations.

Note that at this point you may well learn that your original story is not true. In that case, tell us the story of why it isn't true!


For this project, I want you to be able to practice crafting both of the major kinds of data stories: static reports and presentations. Sometimes, these can be quite similar, and can use the same figures to tell the same story in the same way. On the other hand, you may also want to take advantage of the different affordances that each medium offers. For example, in a static report much of the narrative has to occur in the text, while presentations can do things like build up a visualization one piece at a time.

There are two deliverables for your rough draft and your final project:

  1. An R Markdown report with ~3,000 words of text, knitted into a Word document
  2. An ~8-minute recorded slide presentation. This can also use R Markdown if you want, or you can use something like Google Slides, PowerPoint, etc.
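
Knitting to Word can be done with the Knit button in RStudio or from the console. The sketch below assumes the `rmarkdown` package and pandoc are installed; the file name, title, and body text are placeholders. It writes a minimal R Markdown file whose YAML header requests `word_document` output, then renders it only if the tooling is available.

```r
# Minimal R Markdown source; the title and body text are placeholders.
rmd_lines <- c(
  "---",
  "title: \"Final project report\"",
  "output: word_document",
  "---",
  "",
  "Report text goes here."
)
rmd_path <- file.path(tempdir(), "report.Rmd")
writeLines(rmd_lines, rmd_path)

# Render to .docx only when rmarkdown and pandoc are both available.
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  docx_path <- rmarkdown::render(rmd_path, output_format = "word_document", quiet = TRUE)
  print(docx_path)
}
```

In a real report the body would also contain your narrative text and the code chunks that produce your figures.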

Each deliverable should include 3-5 visualizations created by you, each of which helps to tell a story and make an argument.

You will create a rough draft of each of these deliverables and will be randomly assigned to a partner. On April 29, we will use our Thursday sessions to provide feedback.

Due date: April 27

Step 4: Turn in final project

You will take the feedback you get and refine the rough draft into a polished final project.

Due date: May 6