Editing Seattle open data

[[File:Burke_gilman.jpg|thumb|right|250px|Who's riding on the Burke Gilman trail this week?]]


In this project, we will gather civic data from [https://data.seattle.gov data.seattle.gov] and use it to ask and answer important questions about the Emerald City! We will start with a series of analyses of bike and pedestrian traffic patterns on the [https://en.wikipedia.org/wiki/Burke-Gilman_Trail Burke-Gilman Trail].


We will learn how to collect that data from the API of Seattle's open data portal, filter and transform it, and create timeseries graphs that show daily, weekly, and yearly traffic trends.


== Goals ==
[[File:SeattleGovLogoHome.png|right|250px]]
In this session, we will:

* Learn how to pose useful research questions that can be asked and answered with civic data
* Learn how to gather datasets from data.seattle.gov with the Socrata API and the Open Data Portal
* Learn how to filter, bucket, and format data for building timeseries graphs in a spreadsheet program
* Practice reading and extending other people's code


== Setup ==


=== Download the Seattle open data project ===
# Click the following link and save the file to your computer: https://github.com/jtmorgan/cdsw-2020/archive/master.zip
# Unzip <tt>cdsw-2020-master.zip</tt> and place the resulting folder in your CDSW working directory (or just on your desktop)


=== Test the Seattle open data API ===
;Test an API call to data.seattle.gov


#Open the Jupyter notebook <tt>SODA_API_demo.ipynb</tt>
#Run the first code cell in the notebook

The output of the cell should look like:
"https://data.seattle.gov/resource/76t5-zqzr.json?$where=(PermitNum='6531736-PH')"
[{'applieddate': '2016-10-07',
  'contractorcompanyname': 'M A MORTENSON COMPANY',
  'description': 'Construct institutional building (University of Washington, '
                'Computer Science and Engineering Dept.), occupy per plan.',
  'estprojectcost': '23886804',
  'expiresdate': '2020-04-03',
  'housingunitsadded': '0',
  'housingunitsremoved': '0',
  'issueddate': '2017-04-03',
  'latitude': '47.65300378',
  'link': {'url': 'https://cosaccela.seattle.gov/portal/customize/LinkToRecord.aspx?altId=6531736-PH'},
  'location1': {'human_address': '{"address": "3800 EAST STEVENS WAY NE", '
                                '"city": "SEATTLE", "state": "WA", "zip": '
                                '"98195"}',
                'latitude': '47.65300378',
                'longitude': '-122.30500427'},
  'longitude': '-122.30500427',
  'originaladdress1': '3800 EAST STEVENS WAY NE',
  'originalcity': 'SEATTLE',
  'originalstate': 'WA',
  'originalzip': '98195',
  'permitclass': 'Institutional',
  'permitclassmapped': 'Non-Residential',
  'permitnum': '6531736-PH',
  'permittype': 'Building',
  'permittypedesc': 'New',
  'statuscurrent': 'Completed'}]
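
For reference, the request behind the output above can be sketched in plain Python. This is a minimal sketch, not the code in <tt>SODA_API_demo.ipynb</tt>: the function names <tt>permit_query_url</tt> and <tt>fetch_json</tt> are made up for illustration, and the notebook may build its request differently. Note that <tt>urlencode</tt> percent-encodes the <tt>$where</tt> parameter, so the URL looks different from the one printed above but should ask for the same record.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Building permits dataset on data.seattle.gov (ID taken from the URL above).
BASE_URL = "https://data.seattle.gov/resource/76t5-zqzr.json"


def permit_query_url(permit_num):
    """Build a SODA URL that filters the dataset to one permit number."""
    # Hypothetical helper: $where holds a SoQL filter expression.
    query = urlencode({"$where": f"PermitNum='{permit_num}'"})
    return f"{BASE_URL}?{query}"


def fetch_json(url, timeout=30):
    """Download a URL and parse the response body as JSON (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


url = permit_query_url("6531736-PH")
print(url)
# To actually call the API, uncomment the next line:
# records = fetch_json(url)
```

If the call works, <tt>records</tt> should be a list containing one dictionary like the one shown above.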


== Analyzing traffic on the Burke-Gilman trail ==
[[File:Bgt_bikes_and_peds_2019.png|thumb|right|250px|In this session we'll learn how to analyze and transform data about traffic on the Burke-Gilman trail over time, and create useful timeseries visualizations like this one!]]
We will spend the first part of today's session walking through the included notebook <tt>Burke-Gilman_commuter_traffic.ipynb</tt>. We will reproduce this notebook section by section, coding as we go, and finish by exporting a CSV file that can be used to build the timeseries visualization above.
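
To give a sense of where the walkthrough ends up, here is a rough sketch of bucketing hourly counts into daily totals and writing them out as CSV. It is not the notebook's code. The sample rows are invented, and the field names (<tt>date</tt>, <tt>bike_north</tt>, <tt>bike_south</tt>, <tt>ped_north</tt>, <tt>ped_south</tt>) are assumptions about how the counter data is labeled, so check them against the real dataset.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows; field names are assumptions, values are made up.
hourly_rows = [
    {"date": "2019-01-01T07:00:00.000", "bike_north": "4", "bike_south": "9",
     "ped_north": "2", "ped_south": "3"},
    {"date": "2019-01-01T08:00:00.000", "bike_north": "6", "bike_south": "11",
     "ped_north": "1", "ped_south": "4"},
    {"date": "2019-01-02T07:00:00.000", "bike_north": "5", "bike_south": "8",
     "ped_north": "0", "ped_south": "2"},
]


def daily_totals(rows):
    """Sum hourly bike and pedestrian counts into one total per calendar day."""
    totals = defaultdict(lambda: {"bikes": 0, "peds": 0})
    for row in rows:
        day = row["date"][:10]  # keep only the YYYY-MM-DD part of the timestamp
        totals[day]["bikes"] += int(row["bike_north"]) + int(row["bike_south"])
        totals[day]["peds"] += int(row["ped_north"]) + int(row["ped_south"])
    return dict(sorted(totals.items()))


def write_daily_csv(totals, handle):
    """Write one row per day, ready to chart in a spreadsheet program."""
    writer = csv.writer(handle)
    writer.writerow(["date", "bikes", "peds"])
    for day, counts in totals.items():
        writer.writerow([day, counts["bikes"], counts["peds"]])


totals = daily_totals(hourly_rows)
buffer = io.StringIO()  # swap for open("daily.csv", "w", newline="") to write a file
write_daily_csv(totals, buffer)
csv_text = buffer.getvalue()
```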


After that, you'll have time to explore next steps on your own, either tackling the "Challenge questions" below, exploring the capabilities of the SODA API, or asking your own research questions with any of the other datasets on data.seattle.gov!


=== Research questions we will answer in this session ===
# How many people used the Burke-Gilman during commute hours in 2019?
# What were the busiest hours on the Burke-Gilman in 2019?
# What are the busiest hours for bikes vs. pedestrians?
# What are the busiest hours for bikes vs. peds AND northbound vs. southbound?
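
Questions like these mostly come down to grouping counts by hour of day. The sketch below shows one possible approach on invented sample rows. The field names and the choice of "commute hours" (the hours starting at 7, 8, 16, and 17) are assumptions made for illustration, not definitions taken from the notebook.

```python
from collections import Counter
from datetime import datetime

# Assumption: "commute hours" means the hours starting at 7, 8, 16 and 17.
COMMUTE_HOURS = {7, 8, 16, 17}

# Invented sample rows; field names are assumptions about the dataset.
sample_rows = [
    {"date": "2019-03-04T08:00:00.000", "bike_north": "20", "bike_south": "35",
     "ped_north": "3", "ped_south": "4"},
    {"date": "2019-03-04T12:00:00.000", "bike_north": "6", "bike_south": "5",
     "ped_north": "12", "ped_south": "10"},
    {"date": "2019-03-04T17:00:00.000", "bike_north": "30", "bike_south": "15",
     "ped_north": "5", "ped_south": "6"},
    {"date": "2019-03-05T08:00:00.000", "bike_north": "18", "bike_south": "32",
     "ped_north": "2", "ped_south": "5"},
]


def hour_of(row):
    """Return the hour of day (0-23) from a row's timestamp."""
    return datetime.strptime(row["date"][:19], "%Y-%m-%dT%H:%M:%S").hour


def totals_by_hour(rows, fields):
    """Add up the given count fields for each hour of the day."""
    counts = Counter()
    for row in rows:
        counts[hour_of(row)] += sum(int(row[field]) for field in fields)
    return counts


bike_hours = totals_by_hour(sample_rows, ["bike_north", "bike_south"])
ped_hours = totals_by_hour(sample_rows, ["ped_north", "ped_south"])

busiest_bike_hour = bike_hours.most_common(1)[0][0]
busiest_ped_hour = ped_hours.most_common(1)[0][0]
commute_total = sum(
    count for hour, count in (bike_hours + ped_hours).items() if hour in COMMUTE_HOURS
)
```

Passing a single field such as <tt>["bike_north"]</tt> to <tt>totals_by_hour</tt> gives the northbound vs. southbound breakdown.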


=== Challenge questions to apply what you've learned ===
''These are questions you now have the basic tools to answer using the BGT dataset (potentially in combination with other open datasets listed below):''
# What day of the week is busiest on the Burke-Gilman?
# What day of the week is busiest for bikes? Is it the same as the busiest day for pedestrians?
# What month of the year is busiest? (aka do Seattleites really like to ride in the rain?)
# Has the Burke-Gilman gotten busier over time? (the dataset we have goes back to 2014!)
# Do fewer people commute on the Burke-Gilman when it's cold out? (hint: try combining this dataset with the dataset on road temperature over time!)
# Do more people commute into Seattle in the mornings by bike on the Burke-Gilman, or on the Mountain to Sound Trail?
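
As a starting point for the day-of-week questions, here is one possible sketch. It averages daily totals per weekday rather than summing them, so a weekday that simply appears more often in the data does not look busier. The rows are invented, and the field name <tt>bgt_north_of_ne_70th_total</tt> is an assumption about the dataset's total-count column.

```python
from collections import defaultdict
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Invented sample rows; the total-count field name is an assumption.
rows = [
    {"date": "2019-03-04T08:00:00.000", "bgt_north_of_ne_70th_total": "62"},
    {"date": "2019-03-04T17:00:00.000", "bgt_north_of_ne_70th_total": "56"},
    {"date": "2019-03-09T11:00:00.000", "bgt_north_of_ne_70th_total": "140"},
    {"date": "2019-03-11T08:00:00.000", "bgt_north_of_ne_70th_total": "58"},
]


def average_daily_total_by_weekday(rows, field):
    """Average the per-day totals of `field` for each day of the week."""
    # Step 1: collapse hourly rows into one total per calendar date.
    daily = defaultdict(int)
    for row in rows:
        daily[row["date"][:10]] += int(row[field])
    # Step 2: group the daily totals by weekday name.
    by_weekday = defaultdict(list)
    for day, total in daily.items():
        name = WEEKDAYS[datetime.strptime(day, "%Y-%m-%d").weekday()]
        by_weekday[name].append(total)
    # Step 3: take the mean of each weekday's daily totals.
    return {name: sum(values) / len(values) for name, values in by_weekday.items()}


weekday_averages = average_daily_total_by_weekday(rows, "bgt_north_of_ne_70th_total")
busiest_day = max(weekday_averages, key=weekday_averages.get)
```

The same pattern should work for months: group on <tt>day[:7]</tt> (the <tt>YYYY-MM</tt> prefix) instead of the weekday name.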


== SODA API tutorial ==
The included notebook <tt>SODA_API_demo.ipynb</tt> can help you familiarize yourself with the [https://dev.socrata.com/ Socrata Open Data API] (which is used on data.seattle.gov). This API allows you to write powerful queries to get exactly the data you want from any of these Seattle Open Data portal sites (as well as any other site that uses the SODA API!). If you'd like to spend more time in the session practicing with this API, grab a mentor!
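
As a taste of what these queries can look like, the sketch below builds a URL that asks the API to count building permits per permit class, so the grouping happens on the server. The helper <tt>soql_url</tt> is hypothetical (not from the notebook). The field names come from the sample output in the Setup section, and the date filter is just an example.

```python
from urllib.parse import urlencode


def soql_url(domain, dataset_id, **clauses):
    """Build a SODA URL from SoQL clauses such as select, where, group, order, limit."""
    # Hypothetical helper: each keyword becomes a "$"-prefixed query parameter.
    params = {f"${name}": value for name, value in clauses.items()}
    return f"https://{domain}/resource/{dataset_id}.json?{urlencode(params)}"


# Count permits issued since the start of 2019, grouped by permit class.
permits_by_class_url = soql_url(
    "data.seattle.gov",
    "76t5-zqzr",  # building permits dataset listed on this page
    select="permitclass, count(*) AS permits",
    where="issueddate >= '2019-01-01'",
    group="permitclass",
    order="permits DESC",
    limit=10,
)
print(permits_by_class_url)
```

Because the domain is a parameter, the same helper should work for any of the other portals that use the SODA API, as long as you know the dataset ID and its field names.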


=== Data sources that use this API ===
* https://data.medicare.gov/
* https://opendata.cityofnewyork.us/
* https://data.cityofchicago.org/
* Most (all?) of the sites listed at https://www.opendatanetwork.com/


== Other open Seattle datasets to explore ==
* Fremont bridge bicycle counter: https://data.seattle.gov/Transportation/Fremont-Bridge-Bicycle-Counter/65db-xm6k
* Spokane Street bridge bicycle counter: https://data.seattle.gov/Transportation/Spokane-St-Bridge-Bicycle-Counter/upms-nr8w
* Mountain to Sound trail bicycle + pedestrian counter: https://data.seattle.gov/Transportation/MTS-Trail-west-of-I-90-Bridge-Bicycle-and-Pedestri/u38e-ybnc
* Seattle police [https://en.wikipedia.org/wiki/Terry_stop Terry stops]: https://data.seattle.gov/Public-Safety/Terry-Stops/28ny-9ts8
* Seattle building permits: https://data.seattle.gov/Permitting/Building-Permits/76t5-zqzr
* Seattle road temperature: https://data.seattle.gov/Public-Safety/Road-Weather-Information-Stations/egc4-d24i/data


== External links ==