Seattle open data
In this project, we will gather civic data from data.seattle.gov and use it to ask and answer important questions about the Emerald City! We will start with a series of analyses of bike and pedestrian traffic patterns on the Burke-Gilman Trail.
We will learn how to collect data through the API of Seattle's open data portal, filter and transform that data, and create timeseries graphs that show daily, weekly, and yearly traffic trends.
Goals
In this session, we will:
- Learn how to pose useful research questions that can be asked and answered with civic data
- Learn how to filter, bucket, and format data for building timeseries graphs in a spreadsheet program
- Familiarize ourselves with a new API
- Practice reading and extending other people's code
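As a rough illustration of the "filter, bucket, and format" goal, the sketch below turns hourly counter readings into daily totals using only Python's standard library. The records and the field names (`date`, `total`) are made up for this example; they are assumptions, not the confirmed columns of any dataset on data.seattle.gov.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical hourly readings; values and field names are illustrative only.
hourly_rows = [
    {"date": "2019-01-01T07:00:00", "total": "12"},
    {"date": "2019-01-01T08:00:00", "total": "30"},
    {"date": "2019-01-02T07:00:00", "total": "18"},
    {"date": "2018-12-31T23:00:00", "total": "5"},
]

def daily_totals(rows, year):
    """Keep rows from `year`, then sum the hourly counts for each calendar day."""
    totals = defaultdict(int)
    for row in rows:
        timestamp = datetime.fromisoformat(row["date"])
        if timestamp.year != year:          # filter: drop other years
            continue
        day = timestamp.date().isoformat()  # bucket: one key per calendar day
        totals[day] += int(row["total"])
    # format: sorted (day, count) pairs, ready to write out for a spreadsheet
    return sorted(totals.items())

for day, count in daily_totals(hourly_rows, 2019):
    print(day, count)
```

The same pattern works for weekly or yearly buckets by changing the key that each timestamp is mapped to.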
Setup
If you are confused by these steps, go back and refresh your memory with the Day 0 setup and tutorial.
Download the Seattle open data project
- Click the following link and save the file to your computer: https://github.com/jtmorgan/cdsw-2020/archive/master.zip
- Unzip cdsw-2020-master.zip and place the resulting folder in your CDSW working directory (or just on your desktop)
Test the Seattle open data API
- Test an API call to data.seattle.gov
- Open the Jupyter notebook SODA_API_demo.ipynb
- Run the first code cell in the notebook
The output of the cell should look like:
"https://data.seattle.gov/resource/76t5-zqzr.json?$where=(PermitNum='6531736-PH')"
[{'applieddate': '2016-10-07',
  'contractorcompanyname': 'M A MORTENSON COMPANY',
  'description': 'Construct institutional building (University of Washington, '
                 'Computer Science and Engineering Dept.), occupy per plan.',
  'estprojectcost': '23886804',
  'expiresdate': '2020-04-03',
  'housingunitsadded': '0',
  'housingunitsremoved': '0',
  'issueddate': '2017-04-03',
  'latitude': '47.65300378',
  'link': {'url': 'https://cosaccela.seattle.gov/portal/customize/LinkToRecord.aspx?altId=6531736-PH'},
  'location1': {'human_address': '{"address": "3800 EAST STEVENS WAY NE", '
                                 '"city": "SEATTLE", "state": "WA", "zip": '
                                 '"98195"}',
                'latitude': '47.65300378',
                'longitude': '-122.30500427'},
  'longitude': '-122.30500427',
  'originaladdress1': '3800 EAST STEVENS WAY NE',
  'originalcity': 'SEATTLE',
  'originalstate': 'WA',
  'originalzip': '98195',
  'permitclass': 'Institutional',
  'permitclassmapped': 'Non-Residential',
  'permitnum': '6531736-PH',
  'permittype': 'Building',
  'permittypedesc': 'New',
  'statuscurrent': 'Completed'}]
Analyzing traffic on the Burke-Gilman trail
We will spend the first part of today's session walking through the included notebook Burke-Gilman_commuter_traffic.ipynb. We will reproduce this notebook section by section, coding as we go, and finish by exporting a CSV file that can be used to build a timeseries visualization of trail traffic.
After that, you'll have time to explore next steps on your own, either tackling the "Challenge questions" below, exploring the capabilities of the SODA API, or asking your own research questions with any of the other datasets on data.seattle.gov!
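The CSV export step at the end of the walkthrough could look roughly like the sketch below, which writes per-hour totals to a file that a spreadsheet program can open. The column names, the file name, and the counts are placeholders chosen for this example, not the notebook's actual output.

```python
import csv

# Placeholder hourly totals: (hour of day, bike count, pedestrian count).
hourly_totals = [
    (7, 120, 45),
    (8, 210, 60),
    (17, 190, 80),
]

def export_csv(rows, path):
    """Write a header row followed by one row per hour."""
    # newline="" prevents blank lines between rows on Windows.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["hour", "bikes", "pedestrians"])
        writer.writerows(rows)

export_csv(hourly_totals, "bgt_hourly_totals.csv")
```

Opening the resulting file in a spreadsheet program gives one column per series, which is the shape most charting tools expect for a line graph.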
Research questions we will answer in this session
- How many people used the Burke-Gilman Trail during commute hours in 2019?
- What were the busiest hours on the Burke-Gilman Trail in 2019?
- What are the busiest hours for bikes vs. pedestrians?
- What are the busiest hours for bikes vs. pedestrians AND northbound vs. southbound?
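To make these questions concrete, here is a small sketch of how hourly readings could be summed by hour of day and compared across modes. The sample rows, the field names (`bike`, `ped`), and the definition of "commute hours" are all assumptions made for illustration; the notebook may define them differently.

```python
from collections import Counter
from datetime import datetime

# Made-up hourly readings; field names are assumptions, not confirmed columns.
rows = [
    {"date": "2019-03-04T08:00:00", "bike": 40, "ped": 10},
    {"date": "2019-03-04T12:00:00", "bike": 15, "ped": 25},
    {"date": "2019-03-04T17:00:00", "bike": 55, "ped": 20},
    {"date": "2019-03-05T08:00:00", "bike": 35, "ped": 5},
]

# Assumed definition of "commute hours": readings starting 7-9 am and 4-6 pm.
COMMUTE_HOURS = {7, 8, 9, 16, 17, 18}

def totals_by_hour(rows, field):
    """Sum one count field for each hour of the day (0-23)."""
    totals = Counter()
    for row in rows:
        hour = datetime.fromisoformat(row["date"]).hour
        totals[hour] += row[field]
    return totals

bike_hours = totals_by_hour(rows, "bike")
ped_hours = totals_by_hour(rows, "ped")

# A Counter returns 0 for hours with no readings, so missing hours are safe.
commute_total = sum(bike_hours[h] + ped_hours[h] for h in COMMUTE_HOURS)
print("Trips during commute hours:", commute_total)
print("Busiest hour for bikes:", bike_hours.most_common(1)[0][0])
print("Busiest hour for pedestrians:", ped_hours.most_common(1)[0][0])
```

Splitting further by direction would only require calling `totals_by_hour` once per direction-specific field, assuming the data provides one.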
Challenge questions to apply what you've learned
These are questions you now have the basic tools to answer using the BGT dataset (potentially in combination with other open datasets listed below):
- What day of the week is busiest on the Burke-Gilman Trail?
- What day of the week is busiest for bikes? Is it the same as the busiest day for pedestrians?
- What month of the year is busiest? (aka do Seattleites really like to ride in the rain?)
- Has the Burke-Gilman Trail gotten busier over time? (the dataset we have goes back to 2014!)
- Do fewer people commute on the Burke-Gilman Trail when it's cold out? (hint: try combining this dataset with the dataset on road temperature over time!)
- Do more people commute into Seattle in the mornings by bike on the Burke-Gilman Trail, or on the Mountain to Sound Trail?
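One possible starting point for the day-of-week questions is sketched below: it groups daily totals by weekday and averages them. The daily counts are invented for the example, so the result says nothing about real trail usage.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Made-up daily totals keyed by ISO date; values are illustrative only.
daily_counts = {
    "2019-06-01": 900,   # Saturday
    "2019-06-02": 1100,  # Sunday
    "2019-06-03": 700,   # Monday
    "2019-06-08": 1000,  # Saturday
}

def average_by_weekday(counts):
    """Average the daily totals for each weekday name."""
    buckets = defaultdict(list)
    for iso_day, total in counts.items():
        # date.weekday() numbers Monday as 0 and Sunday as 6.
        name = WEEKDAYS[date.fromisoformat(iso_day).weekday()]
        buckets[name].append(total)
    return {name: sum(vals) / len(vals) for name, vals in buckets.items()}

averages = average_by_weekday(daily_counts)
busiest = max(averages, key=averages.get)
print(busiest, averages[busiest])
```

Averaging rather than summing keeps the comparison fair when the date range contains more of one weekday than another. Swapping the weekday lookup for the month number would adapt the same idea to the busiest-month question.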
SODA API tutorial
The included notebook SODA_API_demo.ipynb can help you familiarize yourself with the Socrata Open Data API (SODA), which is used on data.seattle.gov. This API allows you to write powerful queries to get exactly the data you want from Seattle's open data portal (as well as from any other site that uses the SODA API!). If you'd like to spend more time in the session practicing with this API, grab a mentor!
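As a taste of what such queries look like, the sketch below assembles a query URL from a few common SoQL parameters (`$select`, `$where`, `$limit`). The helper `soda_query_url` is a hypothetical name invented for this example, and the field names are borrowed from the building permits output in the setup section; check each dataset's own documentation for its actual columns.

```python
from urllib.parse import urlencode

def soda_query_url(domain, dataset_id, **soql):
    """Build a SODA query URL; keyword names become $-prefixed SoQL parameters."""
    params = {f"${name}": value for name, value in soql.items()}
    # safe="$" keeps the leading dollar sign readable in the query string.
    query = urlencode(params, safe="$")
    return f"https://{domain}/resource/{dataset_id}.json?{query}"

url = soda_query_url(
    "data.seattle.gov",
    "76t5-zqzr",  # building permits dataset
    select="permitnum,statuscurrent",
    where="permitclass='Institutional'",
    limit=5,
)
print(url)
```

Because the domain is a parameter, the same helper should work for other portals that expose the SODA API, given a valid dataset identifier for that portal.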
Data sources that use this API
- https://data.medicare.gov/
- https://opendata.cityofnewyork.us/
- https://data.cityofchicago.org/
- Most (all?) of the sites listed at https://www.opendatanetwork.com/
Other open Seattle datasets to explore
- Fremont bridge bicycle counter: https://data.seattle.gov/Transportation/Fremont-Bridge-Bicycle-Counter/65db-xm6k
- Spokane Street bridge bicycle counter: https://data.seattle.gov/Transportation/Spokane-St-Bridge-Bicycle-Counter/upms-nr8w
- Mountain to Sound trail bicycle + pedestrian counter: https://data.seattle.gov/Transportation/MTS-Trail-west-of-I-90-Bridge-Bicycle-and-Pedestri/u38e-ybnc
- Seattle police Terry stops: https://data.seattle.gov/Public-Safety/Terry-Stops/28ny-9ts8
- Seattle building permits: https://data.seattle.gov/Permitting/Building-Permits/76t5-zqzr
- Seattle road temperature: https://data.seattle.gov/Public-Safety/Road-Weather-Information-Stations/egc4-d24i/data
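Each dataset page above carries an identifier of the form `xxxx-xxxx` in its URL, and the API request in the setup section uses the same kind of identifier in its `/resource/<id>.json` path. The sketch below converts a dataset page URL into an API endpoint on that assumption; `json_endpoint` is a hypothetical helper, and the pattern should be verified against each dataset before relying on it.

```python
import re

# Dataset identifiers are two groups of four characters, e.g. "65db-xm6k".
DATASET_ID = re.compile(r"/([a-z0-9]{4}-[a-z0-9]{4})(?:/|$)")

def json_endpoint(portal_url):
    """Turn a dataset page URL into its /resource/<id>.json API endpoint."""
    match = DATASET_ID.search(portal_url)
    if match is None:
        raise ValueError(f"No dataset id found in {portal_url!r}")
    return f"https://data.seattle.gov/resource/{match.group(1)}.json"

pages = [
    "https://data.seattle.gov/Transportation/Fremont-Bridge-Bicycle-Counter/65db-xm6k",
    "https://data.seattle.gov/Public-Safety/Road-Weather-Information-Stations/egc4-d24i/data",
]
for page in pages:
    print(json_endpoint(page))
```

The regular expression tolerates a trailing path segment such as `/data`, which appears on the road temperature link.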