Community Data Science Workshops (Spring 2016)/Day 2 Projects/Socrata





Building a Dataset using the Socrata API and data.seattle.gov
In this project, we will explore a few ways to gather data from data.seattle.gov using the Socrata open data API. Once we've done that, we will extend this code to create our own datasets of civic data that we might be able to use to ask and answer questions in the final session.

Goals

 * Get set up to build datasets with the Socrata API
 * Have fun collecting different types of data from data.seattle.gov
 * Practice reading and extending other people's code
 * Create a few collections of different types of data from Socrata that you can do research with in the final workshop session

Download the Socrata project
Click here to download the scripts

If you are confused by these steps, go back and refresh your memory with the Day 0 setup and tutorial.

(Estimated time: 10 minutes)

Topics to cover

 * Explain the Socrata open data platform, which also runs on other government websites
 * Navigate to the API page, show the documentation, and point out the examples
 * Introduce the API sandbox as a tool for building queries
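
To make the idea of "building a query" concrete, here is a minimal sketch of how a query URL for a Socrata dataset can be assembled in Python. It assumes the standard Socrata endpoint pattern (`/resource/<dataset id>.json`) and `$`-prefixed query parameters such as `$limit`. The dataset ID `xxxx-xxxx` is a placeholder, not a real dataset, and the helper names `build_query_url` and `fetch_rows` are made up for this illustration; they are not part of the project scripts.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address for dataset endpoints on data.seattle.gov.
BASE = "https://data.seattle.gov/resource"


def build_query_url(dataset_id, **soql):
    """Return an endpoint URL; keyword arguments become $-prefixed parameters."""
    params = {"$" + key: value for key, value in soql.items()}
    url = f"{BASE}/{dataset_id}.json"
    if params:
        # safe="$" keeps the dollar sign readable instead of encoding it as %24.
        url += "?" + urlencode(params, safe="$")
    return url


def fetch_rows(url):
    """Download the URL and parse the JSON response into a list of dicts."""
    with urlopen(url) as response:
        return json.load(response)


# "xxxx-xxxx" is a placeholder: copy the real ID from the dataset's API page.
example_url = build_query_url("xxxx-xxxx", limit=5)
print(example_url)
```

Once the placeholder is replaced with a real dataset ID, `fetch_rows(example_url)` should return the first five rows as Python dictionaries. Pasting the printed URL into a browser is a quick way to check a query before running it from a script.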

Example questions

 * What Seattle neighborhood has the most art galleries? - neighborhood_culture1.py
 * What Seattle neighborhood has the most square feet devoted to arts & culture? - neighborhood_culture2.py
 * Which trail gets more bike traffic per month: the Burke-Gilman or the Mountains to Sound Trail? - traffic_counter1.py, traffic_counter2.py
 * Do people use the trails less when it's cold? - monthly_weather.py
 * Does the Burke-Gilman get more bike or pedestrian traffic? - bike_and_peds1.py
 * What is the primary commute direction on the Burke-Gilman? - bike_and_peds2.py


Other example questions

 * What day of the week does the Burke-Gilman have the most total traffic?
 * What day has the most pedestrian traffic?
 * What time of day has the most southbound traffic, on average?
 * How many shoplifting calls has the Seattle Police Department (SPD) responded to this month so far?
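
As a starting point for the day-of-week question, here is a minimal sketch that adds up hourly counts by weekday. It runs on a small hand-made sample so that it works offline. The field names `date` and `total` are assumptions for illustration, so check the real column names on the dataset page. It also assumes timestamps look like `2016-04-02T09:00:00.000` and that counts may arrive as strings, so they are converted before adding.

```python
from collections import defaultdict
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def total_by_weekday(rows, time_field="date", count_field="total"):
    """Sum the count field for each day of the week."""
    totals = defaultdict(int)
    for row in rows:
        if count_field not in row:
            continue  # skip hours where the counter reported nothing
        when = datetime.strptime(row[time_field], "%Y-%m-%dT%H:%M:%S.%f")
        # Convert through float in case a count arrives as "120" or "120.0".
        totals[DAY_NAMES[when.weekday()]] += int(float(row[count_field]))
    return dict(totals)


# Hand-made sample rows (not real data) shaped like an API response.
sample_rows = [
    {"date": "2016-04-02T09:00:00.000", "total": "120"},  # a Saturday
    {"date": "2016-04-02T10:00:00.000", "total": "150"},  # same Saturday
    {"date": "2016-04-04T09:00:00.000", "total": "80"},   # a Monday
    {"date": "2016-04-05T09:00:00.000"},                  # missing count
]

totals = total_by_weekday(sample_rows)
busiest = max(totals, key=totals.get)
print(totals, busiest)
```

The same function can be pointed at a pedestrian or southbound column by changing `count_field`, which covers a couple of the other questions above.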

Resources
API resources
 * API Sandbox
 * API-powered app
 * filtering results
 * dealing with timestamps
 * writing API queries
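
The filtering and timestamp topics above come together when you ask for a single month of data. The sketch below builds a `$where` clause that selects one calendar month and turns it into a query string. The field name `date` is an assumption (use the column name from the dataset you are querying), and `month_filter` is a helper made up for this example.

```python
from datetime import datetime
from urllib.parse import urlencode

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"


def month_filter(field, year, month):
    """Return a filter clause matching every timestamp in the given month."""
    start = datetime(year, month, 1)
    # The first day of the following month; December rolls over to January.
    if month == 12:
        end = datetime(year + 1, 1, 1)
    else:
        end = datetime(year, month + 1, 1)
    return (f"{field} >= '{start.strftime(TIMESTAMP_FORMAT)}' "
            f"AND {field} < '{end.strftime(TIMESTAMP_FORMAT)}'")


clause = month_filter("date", 2016, 3)
# urlencode escapes the spaces and quotes so the clause is safe inside a URL.
query_string = urlencode({"$where": clause, "$order": "date", "$limit": 1000})
print(clause)
print(query_string)
```

Appending `"?" + query_string` to a dataset's `.json` endpoint gives a URL that asks only for that month's rows. Using "less than the first day of the next month" avoids having to know how many days each month has.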

Datasets used in this project
You don't need to download these, but you may want to open the links below to view them online, so that you understand the data we will be working with.


 * Burke-Gilman Trail bike and pedestrian traffic counts: this is our primary dataset. It contains hourly traffic counts on the Burke-Gilman Trail (BGT), measured at its intersection with NE 70th Street in Northeast Seattle, broken down by bike vs. pedestrian and northbound vs. southbound.
 * Mountains to Sound Trail bike and pedestrian traffic counts: this is a 'bonus' dataset, which you will need in order to complete the 'bonus' coding challenges.


Videos

 * https://data.seattle.gov/videos
 * https://www.youtube.com/watch?v=YlKzXTrTLOQ
 * https://www.youtube.com/watch?v=Whfp8ojMf0U
 * https://www.youtube.com/watch?v=Vd6bwz3ivVA


Other Socrata sites

 * https://data.austintexas.gov/
 * https://data.cityofchicago.org/
 * https://data.cityofnewyork.us/