Editing Statistics and Statistical Programming (Winter 2021)/Problem set 5
From CommunityData
Latest revision | Your text | ||
Line 46: | Line 46: | ||
===PC7. Create a bivariate table=== | ===PC7. Create a bivariate table=== | ||
Now that you have some categorical variables to work with, let's go ahead and create a bivariate table so that you can examine the distributions of some of these values. Use the <code>table()</code> command to create a cross-tabulation of the recoded versions of the <code>k</code> variable and the <code>j</code> variable. | Now that you have some categorical variables to work with, let's go ahead and create a bivariate table so that you can examine the distributions of some of these values. Use the <code>table()</code> command to create a cross-tabulation of the recoded versions of the <code>k</code> variable and the <code>j</code> variable. | ||
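As a minimal sketch of what a cross-tabulation with <code>table()</code> looks like, the example below uses made-up data. The data frame <code>d</code> and the column names <code>k_factor</code> and <code>j_logical</code> are hypothetical stand-ins for your recoded variables, not the names in the actual dataset.

```r
# Hypothetical data standing in for the recoded variables; the column
# names below are illustrative and do not come from the course dataset.
set.seed(42)
d <- data.frame(
  k_factor  = factor(sample(c("none", "some", "lots"), 50, replace = TRUE)),
  j_logical = sample(c(TRUE, FALSE), 50, replace = TRUE)
)

# Cross-tabulate two categorical variables: the first argument defines
# the rows and the second defines the columns.
xtab <- table(d$k_factor, d$j_logical)
xtab

# Naming the arguments labels the dimensions, and useNA = "ifany" adds a
# row or column for missing values when there are any.
xtab_labeled <- table(k = d$k_factor, j = d$j_logical, useNA = "ifany")
xtab_labeled
```

Passing the result to <code>prop.table()</code> converts the counts to proportions, which can make the cells easier to compare.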
===PC8. Create a bivariate visualization=== | |||
Visualize two variables in the Problem Set #2 dataset using <code>ggplot2</code> and the <code>geom_point()</code> function to produce a scatterplot of <code>x</code> on the x-axis and <code>y</code> on the y-axis. '''Optional bonus:''' Incorporate any of the other variables on other dimensions (e.g., color, shape, and/or size are all good options). If you run into any issues plotting these dimensions, revisit the examples in the tutorial and the ggplot2 documentation and consider that ggplot2 can be very picky about the classes of objects. | |||
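The sketch below illustrates the general pattern with simulated data rather than the Problem Set #2 dataset. The data frame <code>d</code> and the extra variable <code>grp</code> are assumptions made for illustration. It also shows one common case of ggplot2 being picky about classes: a numeric variable cannot be mapped to <code>shape</code> until it is converted to a factor.

```r
library(ggplot2)

# Simulated data; x and y mirror the names used in the problem, while
# grp is a hypothetical 0/1 variable added for the bonus dimensions.
set.seed(1)
d <- data.frame(
  x   = rnorm(100),
  y   = rnorm(100),
  grp = sample(0:1, 100, replace = TRUE)
)

# Basic scatterplot with x on the x-axis and y on the y-axis.
p <- ggplot(d, aes(x = x, y = y)) + geom_point()

# grp is numeric, so it is wrapped in factor() before being mapped to
# color and shape; shape only accepts discrete variables.
p_bonus <- ggplot(d, aes(x = x, y = y,
                         color = factor(grp), shape = factor(grp))) +
  geom_point()

# Print the plot objects (p or p_bonus) to draw them.
```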
== Statistical Questions == | == Statistical Questions == | ||
Line 66: | Line 69: | ||
'''Optional bonus statistical question''' | '''Optional bonus statistical question''' | ||
''We talked about birthdays in the context of one of the textbook exercises for ''OpenIntro'' Chapter 3. Here's an opportunity to apply your knowledge and extend that exercise. Note that you can absolutely use R to help calculate the solutions to both parts of this problem. That said, it's a super famous problem and answers/examples are all over the internet, so if you want to challenge yourself, don't look at them while you're working on it! The only hint I'll give you is that you may find [https://en.wikipedia.org/wiki/Binomial_coefficient binomial coefficients] useful and the <code>choose()</code> function can calculate them for you in R.'' | |||
# | # The first time I taught this course, there were 25 people in it (including the members of the teaching team). Imagine that I offered you a choice between two bets: Bet #1 is determined by the flip of a fair coin. You can choose heads or tails and you win the bet if your choice turns out to be correct. Bet #2 is determined by whether any two members of that previous version of the class shared a birthday. If a birthday was shared, I win the bet; if no birthdays were shared, you win the bet. Assuming you want the best chance of winning, which bet should you choose? | ||
# Now calculate the probability that any two members of our | # Now calculate the probability that any two members of our 7-person class share a birthday and compare this probability with the results of SQ2.1 above. |
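If you have not used <code>choose()</code> before, the short sketch below shows what it computes. It uses a made-up group of 5 people (not either class size from the problem) and deliberately does not solve either part.

```r
# choose(n, k) is the binomial coefficient: the number of ways to pick
# k items from n when order does not matter.
# Hypothetical example: the number of distinct pairs in a group of 5.
n_pairs <- choose(5, 2)
n_pairs  # 10

# The same value computed from the factorial definition n! / (k! (n-k)!).
by_formula <- factorial(5) / (factorial(2) * factorial(3))
by_formula  # 10
```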