Editing Community Data Science Course (Spring 2023)/Week 7 coding challenges
Latest revision | Your text | ||
Line 12: | Line 12: | ||
# Take a look at the "RecordType" column which describes the kinds of complaints that come in. What are the types of categories? How many are in each category? Show both with numbers and with a simple visualization (a histogram, perhaps?). For each category, print out the "Description" of several examples. What kinds of things are included? | # Take a look at the "RecordType" column which describes the kinds of complaints that come in. What are the types of categories? How many are in each category? Show both with numbers and with a simple visualization (a histogram, perhaps?). For each category, print out the "Description" of several examples. What kinds of things are included? | ||
# Build a new dataset that includes only the "RecordType" | # Build a new dataset that includes only the "RecordType" and "OriginalZip" columns. | ||
# Use this second dataset to filter the dataset down to just rows from your zipcode. If you don't live in Seattle, you can just use my zip code (98112) which covers north Capitol Hill and Montlake or you can pick an area you think is interesting from [https://www.usmapguide.com/washington/seattle-zip-code-map/ this map]. | # Use this second dataset to filter the dataset down to just rows from your zipcode. If you don't live in Seattle, you can just use my zip code (98112) which covers north Capitol Hill and Montlake or you can pick an area you think is interesting from [https://www.usmapguide.com/washington/seattle-zip-code-map/ this map]. | ||
## Now look at the number and proportion of different types of records in this subset. | ## Now look at the number and proportion of different types of records in this subset. | ||
## Be ready to explain whether the distribution in this zipcode is different from the distribution in Seattle overall. If so, how is it different? | ## Be ready to explain whether the distribution in this zipcode is different from the distribution in Seattle overall. If so, how is it different? | ||
## Once again, print out the "Description" of several examples from each category. What kinds of things are included? | ## Once again, print out the "Description" of several examples from each category. What kinds of things are included? | ||
# Use pandas to write out the | # Use pandas to write out the two-column dataset to TSV (with ''tabs'' instead of commas). | ||
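The counting, filtering, and TSV steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the rows are made-up stand-ins rather than real complaint records, the variable names are hypothetical, and zip codes are assumed to be stored as strings (check the actual column type in your data).

```python
import io

import pandas as pd

# Hypothetical stand-in rows; the real dataset has many more rows and columns.
complaints = pd.DataFrame({
    "RecordType": ["Complaint", "Inquiry", "Complaint", "Compliment", "Complaint"],
    "OriginalZip": ["98112", "98101", "98112", "98112", "98105"],
    "Description": ["example a", "example b", "example c", "example d", "example e"],
})

# How many records fall into each category across the whole dataset.
type_counts = complaints["RecordType"].value_counts()

# Two-column subset, then only the rows for one zip code.
# Assumption: zips are strings here; compare against an int if yours are numeric.
two_col = complaints[["RecordType", "OriginalZip"]]
my_zip = two_col[two_col["OriginalZip"] == "98112"]

# Number and proportion of each record type within that zip code.
zip_counts = my_zip["RecordType"].value_counts()
zip_props = my_zip["RecordType"].value_counts(normalize=True)

# Write tab-separated output. A StringIO buffer keeps the sketch self-contained;
# passing a filename instead would write to disk.
buffer = io.StringIO()
two_col.to_csv(buffer, sep="\t", index=False)
tsv_text = buffer.getvalue()
```

If matplotlib is installed, <code>type_counts.plot(kind="bar")</code> is one way to get a simple bar chart of the categories.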
== It's about time == | == It's about time == | ||
Line 23: | Line 23: | ||
First, let's return to the full dataset and not the two-column subset. | First, let's return to the full dataset and not the two-column subset. | ||
# Create a new timeseries (use a pandas Series) that contains zip code and that uses the "OpenDate" column as the index. Be sure to check the type of the " | # Create a new timeseries (use a pandas Series) that contains zip code and that uses the "OpenDate" column as the index. Be sure to check the type of the "OpenDate" column and make sure it's in the pandas datetime format. | ||
# Use the <code>.resample()</code> function associated with your pandas time series so that | # Use the <code>.resample()</code> function associated with your pandas time series so that it shows the number of complaints per week overall, and visualize this as a time series plot. | ||
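One possible shape for the time series steps above is sketched here. The dates and zip codes are invented for illustration, the variable names are hypothetical, and the sketch assumes "OpenDate" arrives as text that <code>pd.to_datetime</code> can parse; the real column may need a different format or may already be datetime.

```python
import pandas as pd

# Hypothetical stand-in rows; real "OpenDate" values may be formatted differently.
records = pd.DataFrame({
    "OpenDate": ["2023-01-02", "2023-01-03", "2023-01-10", "2023-01-11", "2023-01-12"],
    "OriginalZip": ["98112", "98101", "98112", "98105", "98112"],
})

# Convert the date column to pandas datetimes before using it as an index.
records["OpenDate"] = pd.to_datetime(records["OpenDate"])

# A Series of zip codes indexed by the date each record was opened.
zip_series = records.set_index("OpenDate")["OriginalZip"]

# Weekly bins ("W" groups into weeks ending on Sunday); size() counts rows per bin.
weekly_counts = zip_series.resample("W").size()
```

With matplotlib installed, <code>weekly_counts.plot()</code> draws the weekly counts as a line over time.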
== You've got questions, you've got answers == | == You've got questions, you've got answers == | ||
Line 31: | Line 31: | ||
# Explicitly state the question | # Explicitly state the question | ||
# Include the pandas code to answer it | # Include the pandas code to answer it | ||
# Write a sentence or two explaining what you found and interpret the finding. | # Write a sentence or two explaining what you found and interpret the finding for you. |