Editing Statistics and Statistical Programming (Winter 2021)/Problem set 5
From CommunityData
===SQ1. Interpret bivariate analyses===
Return to the dataset you imported and worked with in the programming challenges above. Imagine that it comes from a year-long study of bicyclists, conducted a few years ago (let's say 2018, just to pick a year), that combined survey and ride-tracking data from Divvy bikeshare members in the Chicagoland area. Each row in the data corresponds to a single Divvy cyclist/member and the variables correspond to the following measures:
* <code>x</code>: Average daily distance cycled (in miles), measured via bicycle dock check-in/check-out data.
* <code>j</code>: An indicator (True/False) of whether any rides were recorded between January and March.
* <code>l</code>: An indicator (True/False) of whether the cyclist also uses vehicle rideshare provided by Lyft (the company that owns Divvy).
* <code>k</code>: A measure of how frequently the cyclist rode in bad weather, with bad weather defined using a standard measure provided by the U.S. NOAA (National Oceanic and Atmospheric Administration) and the categories (none, some, a lot, all) defined in terms of empirical quartiles within the dataset.
* <code>y</code>: A continuous measure of income, calculated in tens of thousands of dollars and scaled so that "0" = average income for a Divvy member (i.e., a value of "5" = $50,000 more per year than an average Divvy member).
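To make the scaling of <code>y</code> concrete, here is a minimal sketch of the conversion from the centered, tens-of-thousands scale back to a dollar difference from the average Divvy member. The function name <code>income_offset_dollars</code> is hypothetical and is not part of the problem set; it only restates the arithmetic in the variable description.

```python
def income_offset_dollars(y):
    """Convert y (tens of thousands of dollars, centered on the average
    Divvy member) into a dollar difference from that average."""
    return y * 10_000


# A value of 5 means $50,000 more per year than the average member.
print(income_offset_dollars(5))
# Negative values mean the member earns less than the average member.
print(income_offset_dollars(-1.5))
```

Note that the result is a difference from the average, not an absolute income: the average itself is not given in the variable description.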
# Return to the conditional means you created in PC6 above. Given the information you now have about the study, how would you interpret them? Does there seem to be any sort of relationship between the two variables?
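As a reminder of what a conditional mean is, the sketch below computes the mean of one variable within each level of another. It assumes, purely for illustration, that the two variables are <code>x</code> (miles per day) and <code>k</code> (bad-weather riding); substitute whichever pair you used in PC6. The rows are made-up toy values, not the study data, and <code>conditional_means</code> is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Made-up illustrative rows, NOT the study data: (k category, x miles per day).
rows = [
    ("none", 1.0), ("none", 2.0),
    ("some", 2.0), ("some", 4.0),
    ("a lot", 5.0), ("a lot", 7.0),
    ("all", 8.0), ("all", 10.0),
]


def conditional_means(pairs):
    """Return the mean of the numeric value within each category."""
    groups = defaultdict(list)
    for category, value in pairs:
        groups[category].append(value)
    return {category: mean(values) for category, values in groups.items()}


means = conditional_means(rows)
# Print in the natural order of the categories rather than alphabetically.
for level in ["none", "some", "a lot", "all"]:
    print(level, means[level])
```

In toy data like this, means that rise steadily across the ordered categories would suggest a positive association between the two variables; whether your PC6 results show anything similar is the question to answer.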