Difference between revisions of "Community Data Science Workshops (Fall 2015)/Day 3 Projects/Civic data"

From CommunityData
 
(19 intermediate revisions by 2 users not shown)
Line 1: Line 1:
[[File:Seattle_building_cost_heatmap.png|right|250px]]
+
[[File:Seattle_building_cost_heatmap.png|thumb|right|250px|What neighborhoods have changed the most since the beginning of the Seattle [http://www.seattletimes.com/pacific-nw-magazine/seattles-building-boom-is-good-news-for-a-new-generation-of-workers/ building boom]?]]
[[File:SeattleGovLogoHome.png|right|150px]]
 
[[File:Socrata-square-color.png|right|75px]]
 
  
 
__NOTOC__
 
__NOTOC__
 
== Building and visualizing datasets using data.seattle.gov and Google apps ==
 
== Building and visualizing datasets using data.seattle.gov and Google apps ==
  
In this project, we will explore a few ways to gather data from [https://data.seattle.gov data.seattle.gov] using the [[w:Socrata|Socrata]] API, the Google Maps API, and Google Fusion tables.
+
In this project, we will explore a few ways to analyze and visualize data from [https://data.seattle.gov data.seattle.gov]. We will examine trends in the construction of houses, townhomes, and condos in Seattle over the last five years, in order to determine the true extent of the current "construction boom".
  
;Today's dataset: [https://data.seattle.gov/Permitting/Building-Permits-Current/mags-97de Click here to view the Seattle Building Permits database]
+
* We'll download data from two APIs  ([[w:Socrata|Socrata]] open data platform and Google Maps)
 +
* We'll analyze trends over time in our dataset
 +
* We'll visualize our data in graphs and heatmaps
  
;Today's scripts: [http://jtmorgan.net/cdsw/may9cdsw2.zip Click here to download today's scripts]
+
=== Topics we will cover ===
 +
* Writing and testing complex API queries
 +
* Reading and writing CSV and JSON files
 +
* Filtering and aggregating data
 +
* Combining data from multiple APIs
 +
* Graphing and mapping data
 +
 
 +
<!-- We will also explore an important non-technical part of data science: thinking critically about data. Thinking critically about the data you have, and what conclusions you can draw from it, is especially important when you are visualizing data, because it's easy to mislead people with visualizations. We'll discuss how visualizations based on incorrect data can lead people to draw false conclusions, using an example from a recent visualization of building demolitions published in the Northwest design magazine ''Arcade''. -->
 +
 
 +
 
 +
== Part 1: Downloads ==
 +
''If you are confused by anything today, go back and refresh your memory with the [[Community Data Science Workshops (Fall 2015)/Day 0 setup and tutorial|Day 0 setup and tutorial]] and [[Community Data Science Workshops (Fall 2015)/Day 0 tutorial|Day 0 tutorial]]''
 +
 
 +
;Step 1: Download today's scripts: [http://jtmorgan.net/cdsw/nov7cdsw.zip Python scripts]
 +
 
 +
;Step 2: Get the dataset: Unzip the folder, navigate to it in your terminal, and run this script <pre>download_building_permit_data.py</pre>
 +
 
 +
Now open up the CSV output file to verify that you got all the data you asked for.
 +
 
 +
You can view the full set of permit applications at: https://data.seattle.gov/Permitting/Building-Permits-Current/mags-97de
 +
 
 +
;Challenge question: how could we change our API query to download the applicant names for all COMMERCIAL building permits issued within the past year?
 +
 
 +
== Part 2: charting new construction by month ==
 +
;Question: Has the rate of residential construction increased since 2010?
 +
 
 +
Now we want to learn whether there is, in fact, a housing boom in Seattle. We'll do this by counting how many new permits are issued each month, and then plotting these counts on a graph.
 +
 
 +
Run the script <pre>residential_permits_by_month.py</pre>
 +
 
 +
Now open up the CSV output file and check your data. Does this look right to you?
 +
 
 +
;Challenge question: how could we separate out single family homes from apartments, and count/plot them separately?
 +
 
 +
== Part 3: charting new construction by neighborhood ==
 +
;Question: which Seattle neighborhoods have had the most multifamily residential construction (apartments and townhomes) since 2010?
  
 +
Now we want to learn where all this new construction is happening. We'll do this by sending the location of each MULTIFAMILY permit to the Google Maps Geocoding API, which will return the neighborhood where that location is found.
  
=== Goals ===
+
Run the script <pre>multifamily_permits_by_neighborhood.py</pre>
  
* Collect data on building permits in Seattle, by neighborhood and by year awarded.
+
Now open up the CSV output file and check your data. Does this look right to you?
* Combine those data with neighborhood data from the Google Maps API
 
* Download the results in a CSV file
 
* Upload results to Google's [https://support.google.com/fusiontables/answer/2571232 Fusion Table] visualization engine
 
  
 +
;Challenge question: In which Seattle neighborhood is the cost of new apartment construction projects highest, on average?
 +
 +
== Part 4: Mapping new home construction in Seattle ==
 +
;Question: what locations have experienced the highest density of new construction since 2010?
 +
 +
Now we will try to get an even more detailed picture of where this construction is occurring, using Google Fusion tables, a powerful visualization application that makes it easy to plot data on a map.
 +
 +
Go [https://support.google.com/fusiontables/answer/2571232?hl=en here] and follow the steps to upload your '''new_all_2010-2015.csv''' file.
 +
 +
*Create a point map and heatmap.
 +
*Experiment with filtering and weighting the points on your map.
 +
 +
;Example map: https://www.google.com/fusiontables/data?docid=1gm0wVqnK7zQ7hgp5bsKoBIetMZ75jo-VMs5noDoJ#map:id=3
 +
 +
== Going further ==
 +
''Try to answer these additional questions that draw on the data and methods we're learning today. Ask a mentor if you get stuck!''
 +
#Which developer has spent the most on new construction in Seattle since 2010?
 +
#How many townhouses have been constructed in Seattle since 2010?
 +
#Where in Seattle are the most ''commercial'' buildings being constructed in 2015?
 +
#How does the rate of residential construction in Seattle from 2010-2015 compare to [https://data.seattle.gov/Permitting/Building-Permits-Older-than-5-years/47eb-r92t the previous 5 years]?
  
=== Topics we will cover ===
 
  
* creating API queries and filtering and grouping the results
 
* combining results from multiple APIs
 
* downloading API results as CSV files
 
* building visualizations of timeseries and geolocated data
 
  
 +
== Resources ==
  
=== Questions we will (attempt to) answer with our data ===
+
;Press about the construction boom
# which Seattle neighborhoods have had the most multifamily residential construction (apartments and townhomes) since 2010?
+
*http://www.theurbanist.org/2015/10/20/fact-check-no-explosion-in-demolitions/
# how much has the rate of apartment construction increased since 2010?
+
*http://arcadenw.org/article/changing-seattle
# where are the most expensive construction projects currently being built?
+
*http://www.seattlemag.com/article/demolitions-seattle-no-neighborhood-unaffected
 +
*http://www.seattletimes.com/pacific-nw-magazine/seattles-building-boom-is-good-news-for-a-new-generation-of-workers/
  
 +
=== Datasets ===
 +
* Permits, last 5 years (today's dataset): https://data.seattle.gov/Permitting/Building-Permits-Current/mags-97de
 +
* Permits older than 5 years: https://data.seattle.gov/Permitting/Building-Permits-Older-than-5-years/47eb-r92t
  
=== Other questions you can answer with these data ===
+
;Custom Socrata datasets
# which Seattle developer has built the most apartments and townhomes since 2010?
+
multifamily 2010-2015: https://data.seattle.gov/Permitting/Building-permits-new-multifamily-residential-const/ma3y-m69a
# where in Seattle are the most single-family homes being constructed?
+
single and multifamily 2010-2015: https://data.seattle.gov/Permitting/Building-permits-new-residential-construction/kdfe-reh3
  
 +
=== Sample visualizations ===
 +
*Building permit charts: https://docs.google.com/spreadsheets/d/15DBcWnCroga4B1_ss66YjW9hlJxEk3g1UK7khFLpbQM/edit#gid=0
 +
*Building permit Fusion table: https://www.google.com/fusiontables/DataSource?docid=1gm0wVqnK7zQ7hgp5bsKoBIetMZ75jo-VMs5noDoJ#rows:id=1
  
=== Resources ===
+
=== APIs ===
''If you are confused by anything today, go back and refresh your memory with the [[Community Data Science Workshops (Fall 2015)/Day 0 setup and tutorial|Day 0 setup and tutorial]] and [[Community Data Science Workshops (Fall 2015)/Day 0 tutorial|Day 0 tutorial]]''
+
* Hurl.it API testing tool: https://www.hurl.it
  
 
;Sample API queries
 
;Sample API queries
Line 48: Line 104:
 
* Google maps location (using lat/long): http://maps.googleapis.com/maps/api/geocode/json?latlng=47.66979666%2C-122.38570052
 
* Google maps location (using lat/long): http://maps.googleapis.com/maps/api/geocode/json?latlng=47.66979666%2C-122.38570052
  
;Sample charts/maps
 
* Google Fusion table map of permits 2010-2015 [https://www.google.com/fusiontables/DataSource?docid=1aOSWYuXXqh7U2bnUsBH7mM2w3JfsYHzjk29ihbi8]
 
  
;Help resources and inspiration
+
 
* About Google maps API: https://developers.google.com/maps/documentation/geocoding/#ReverseGeocoding
+
=== Google tools ===
* About Google Fusion tables: https://support.google.com/fusiontables/answer/2571232
+
*Fusion tables: https://support.google.com/fusiontables/answer/2571232
* Google Fusion Tables mapmaking tutorial: https://support.google.com/fusiontables/answer/2527132?hl=en&topic=2573107&ctx=topic
+
*About Fusion table heatmaps: https://support.google.com/fusiontables/answer/1152262
* API sandbox tool: https://www.hurl.it
+
*About Google maps geocoding API: https://developers.google.com/maps/documentation/geocoding/
* API-powered app: http://web6.seattle.gov/mnm/
+
 
 +
=== Socrata open data portal===  
 +
*More about the Socrata open data portal and API: http://www.socrata.com/products/open-data-portal/
 
* Socrata API help resources: http://dev.socrata.com/consumers/getting-started.html
 
* Socrata API help resources: http://dev.socrata.com/consumers/getting-started.html
 
:* filtering results: http://dev.socrata.com/docs/filtering.html
 
:* filtering results: http://dev.socrata.com/docs/filtering.html
Line 62: Line 118:
 
:* writing API queries: http://dev.socrata.com/docs/queries.html
 
:* writing API queries: http://dev.socrata.com/docs/queries.html
  
;Other data.seattle.gov datasets with neighborhood, timeseries, and/or location data
+
;Data portal-powered apps
*[https://data.seattle.gov/Community/Seattle-Cultural-Space-Inventory/vsxr-aydq Seattle Cultural Space Inventory]
+
*https://www.seattleinprogress.com/
*[https://data.seattle.gov/Transportation/MTS-Trail-west-of-I-90-Bridge/u38e-ybnc MTS trail bike/ped traffic]
+
*http://web6.seattle.gov/mnm/
*[https://data.seattle.gov/Transportation/Burke-Gilman-Trail-north-of-NE-70th-St-Bike-and-Pe/2z5v-ecg8 Burke Gilman trail bike/ped traffic]
 
*[https://data.seattle.gov/Transportation/Road-Weather-Information-Stations/egc4-d24i Road temps in Seattle]
 
*[https://data.seattle.gov/Public-Safety/Seattle-Police-Department-911-Incident-Response/3k2p-39jp SPD 911 incident response]
 
  
;Instructional videos
+
;Socrata API instructional videos
 
* https://data.seattle.gov/videos
 
* https://data.seattle.gov/videos
 
* https://www.youtube.com/watch?v=YlKzXTrTLOQ
 
* https://www.youtube.com/watch?v=YlKzXTrTLOQ
Line 75: Line 128:
 
* https://www.youtube.com/watch?v=Vd6bwz3ivVA
 
* https://www.youtube.com/watch?v=Vd6bwz3ivVA
  
;Other Socrata sites that use this API
+
;Some other government websites that use the Socrata API
 
* https://data.austintexas.gov/
 
* https://data.austintexas.gov/
 
* https://data.cityofchicago.org/
 
* https://data.cityofchicago.org/

Latest revision as of 23:48, 7 November 2015

What neighborhoods have changed the most since the beginning of the Seattle building boom?


Building and visualizing datasets using data.seattle.gov and Google apps

In this project, we will explore a few ways to analyze and visualize data from data.seattle.gov. We will examine trends in the construction of houses, townhomes, and condos in Seattle over the last five years, in order to determine the true extent of the current "construction boom".

  • We'll download data from two APIs (Socrata open data platform and Google Maps)
  • We'll analyze trends over time in our dataset
  • We'll visualize our data in graphs and heatmaps

Topics we will cover

  • Writing and testing complex API queries
  • Reading and writing CSV and JSON files
  • Filtering and aggregating data
  • Combining data from multiple APIs
  • Graphing and mapping data


Part 1: Downloads

If you are confused by anything today, go back and refresh your memory with the Day 0 setup and tutorial and Day 0 tutorial

Step 1
Download today's scripts: Python scripts (http://jtmorgan.net/cdsw/nov7cdsw.zip)
Step 2
Get the dataset: Unzip the folder, navigate to it in your terminal, and run this script
download_building_permit_data.py

Now open up the CSV output file to verify that you got all the data you asked for.

You can view the full set of permit applications at: https://data.seattle.gov/Permitting/Building-Permits-Current/mags-97de
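
To make the idea of an API query more concrete, here is a minimal sketch of how a filtered request URL for this dataset could be assembled. It is not the code in download_building_permit_data.py. The dataset id mags-97de comes from the URL above and the /resource/ path is the usual Socrata pattern, but the column names category and issue_date are assumptions made for illustration; check the dataset's column list before using them.

```python
from urllib.parse import urlencode

# Socrata serves each dataset at /resource/<dataset id>.<format>.
# "mags-97de" is the id taken from the Building Permits URL above.
BASE_URL = "https://data.seattle.gov/resource/mags-97de.json"


def build_permit_query_url(category, since, limit=1000):
    """Return a query URL for permits in one category issued on or after `since`.

    The column names `category` and `issue_date` are assumptions, not
    confirmed field names from the dataset.
    """
    params = {
        "$where": f"category = '{category}' AND issue_date >= '{since}'",
        "$order": "issue_date",
        "$limit": limit,
    }
    # urlencode percent-encodes spaces, quotes, and the leading "$".
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_permit_query_url("MULTIFAMILY", "2010-01-01"))
```

Pasting the printed URL into a browser or an API testing tool is a quick way to check a query before putting it in a script.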

Challenge question
how could we change our API query to download the applicant names for all COMMERCIAL building permits issued within the past year?

Part 2: charting new construction by month

Question
Has the rate of residential construction increased since 2010?

Now we want to learn whether there is, in fact, a housing boom in Seattle. We'll do this by counting how many new permits are issued each month, and then plotting these counts on a graph.

Run the script

residential_permits_by_month.py

Now open up the CSV output file and check your data. Does this look right to you?
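
One way to check the monthly counts is to recompute them yourself. The sketch below is an illustration rather than the contents of residential_permits_by_month.py: it assumes each row has a date column named issue_date (a hypothetical name) whose value starts with YYYY-MM, and it counts rows per month.

```python
import csv
import io
from collections import Counter


def count_permits_by_month(rows, date_field="issue_date"):
    """Count rows per 'YYYY-MM' month, skipping rows that have no date.

    `date_field` is an assumed column name; change it to match your CSV.
    """
    counts = Counter()
    for row in rows:
        date = (row.get(date_field) or "").strip()
        if date:
            counts[date[:7]] += 1  # 'YYYY-MM-DD...' becomes 'YYYY-MM'
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # A tiny hand-written sample standing in for the real CSV file.
    sample = io.StringIO(
        "permit_number,issue_date\n"
        "1,2010-01-15\n"
        "2,2010-01-30\n"
        "3,2010-02-02\n"
        "4,\n"
    )
    for month, count in count_permits_by_month(csv.DictReader(sample)).items():
        print(month, count)
```

If the totals from a check like this differ from the output file, the usual suspects are rows with missing dates or dates in an unexpected format.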

Challenge question
how could we separate out single family homes from apartments, and count/plot them separately?

Part 3: charting new construction by neighborhood

Question
which Seattle neighborhoods have had the most multifamily residential construction (apartments and townhomes) since 2010?

Now we want to learn where all this new construction is happening. We'll do this by sending the location of each MULTIFAMILY permit to the Google Maps Geocoding API, which will return the neighborhood where that location is found.

Run the script

multifamily_permits_by_neighborhood.py

Now open up the CSV output file and check your data. Does this look right to you?
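
The neighborhood lookup depends on picking the right piece out of a geocoding response. As a rough sketch (not necessarily how multifamily_permits_by_neighborhood.py does it), the function below scans the address components of a Google Maps Geocoding API JSON response for one whose types include "neighborhood". The sample response is hand-written and abbreviated for illustration; it is not real API output.

```python
def extract_neighborhood(geocode_response):
    """Return the first neighborhood name in a geocoding response, or None."""
    for result in geocode_response.get("results", []):
        for component in result.get("address_components", []):
            if "neighborhood" in component.get("types", []):
                return component.get("long_name")
    return None  # no result carried a neighborhood component


if __name__ == "__main__":
    # Hand-written, abbreviated sample in the shape of a geocoding response.
    sample_response = {
        "status": "OK",
        "results": [
            {
                "address_components": [
                    {"long_name": "Ballard", "short_name": "Ballard",
                     "types": ["neighborhood", "political"]},
                    {"long_name": "Seattle", "short_name": "Seattle",
                     "types": ["locality", "political"]},
                ]
            }
        ],
    }
    print(extract_neighborhood(sample_response))
```

Returning None instead of raising an error makes it easy to spot, in the output file, which permits could not be matched to a neighborhood.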

Challenge question
In which Seattle neighborhood is the cost of new apartment construction projects highest, on average?

Part 4: Mapping new home construction in Seattle

Question
what locations have experienced the highest density of new construction since 2010?

Now we will try to get an even more detailed picture of where this construction is occurring, using Google Fusion tables, a powerful visualization application that makes it easy to plot data on a map.

Go to https://support.google.com/fusiontables/answer/2571232?hl=en and follow the steps to upload your new_all_2010-2015.csv file.

  • Create a point map and heatmap.
  • Experiment with filtering and weighting the points on your map.
Example map
https://www.google.com/fusiontables/data?docid=1gm0wVqnK7zQ7hgp5bsKoBIetMZ75jo-VMs5noDoJ#map:id=3
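
Weighting points on a heatmap works best when every row has coordinates and a numeric weight. The sketch below shows one possible cleanup pass before uploading; the column names latitude, longitude, and value are assumptions for illustration and may not match the columns in new_all_2010-2015.csv.

```python
import csv
import io


def rows_for_heatmap(rows, weight_field="value"):
    """Keep rows that have coordinates and coerce the weight column to a number.

    `latitude`, `longitude`, and `weight_field` are assumed column names.
    """
    cleaned = []
    for row in rows:
        lat, lng = row.get("latitude"), row.get("longitude")
        if not lat or not lng:
            continue  # rows without coordinates cannot be placed on a map
        try:
            weight = float(row.get(weight_field) or 0)
        except ValueError:
            weight = 0.0  # non-numeric weights are treated as zero
        cleaned.append({"latitude": lat, "longitude": lng, weight_field: weight})
    return cleaned


if __name__ == "__main__":
    # Hand-written sample rows standing in for the real CSV file.
    sample = io.StringIO(
        "latitude,longitude,value\n"
        "47.6698,-122.3857,250000\n"
        ",-122.3000,100000\n"
    )
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["latitude", "longitude", "value"])
    writer.writeheader()
    writer.writerows(rows_for_heatmap(csv.DictReader(sample)))
    print(out.getvalue())
```

Treating unreadable weights as zero keeps the point on the map while preventing it from inflating the heatmap; dropping those rows instead would be an equally reasonable choice.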

Going further

Try to answer these additional questions that draw on the data and methods we're learning today. Ask a mentor if you get stuck!

  1. Which developer has spent the most on new construction in Seattle since 2010?
  2. How many townhouses have been constructed in Seattle since 2010?
  3. Where in Seattle are the most commercial buildings being constructed in 2015?
  4. How does the rate of residential construction in Seattle from 2010-2015 compare to the previous 5 years (https://data.seattle.gov/Permitting/Building-Permits-Older-than-5-years/47eb-r92t)?


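Resources

Press about the construction boom

  • http://www.theurbanist.org/2015/10/20/fact-check-no-explosion-in-demolitions/
  • http://arcadenw.org/article/changing-seattle
  • http://www.seattlemag.com/article/demolitions-seattle-no-neighborhood-unaffected
  • http://www.seattletimes.com/pacific-nw-magazine/seattles-building-boom-is-good-news-for-a-new-generation-of-workers/

Datasets

  • Permits, last 5 years (today's dataset): https://data.seattle.gov/Permitting/Building-Permits-Current/mags-97de
  • Permits older than 5 years: https://data.seattle.gov/Permitting/Building-Permits-Older-than-5-years/47eb-r92t

Custom Socrata datasets

  • multifamily 2010-2015: https://data.seattle.gov/Permitting/Building-permits-new-multifamily-residential-const/ma3y-m69a
  • single and multifamily 2010-2015: https://data.seattle.gov/Permitting/Building-permits-new-residential-construction/kdfe-reh3

Sample visualizations

  • Building permit charts: https://docs.google.com/spreadsheets/d/15DBcWnCroga4B1_ss66YjW9hlJxEk3g1UK7khFLpbQM/edit#gid=0
  • Building permit Fusion table: https://www.google.com/fusiontables/DataSource?docid=1gm0wVqnK7zQ7hgp5bsKoBIetMZ75jo-VMs5noDoJ#rows:id=1

APIs

  • Hurl.it API testing tool: https://www.hurl.it

Sample API queries

  • Google maps location (using lat/long): http://maps.googleapis.com/maps/api/geocode/json?latlng=47.66979666%2C-122.38570052

Google tools

  • Fusion tables: https://support.google.com/fusiontables/answer/2571232
  • About Fusion table heatmaps: https://support.google.com/fusiontables/answer/1152262
  • About Google maps geocoding API: https://developers.google.com/maps/documentation/geocoding/

Socrata open data portal

  • More about the Socrata open data portal and API: http://www.socrata.com/products/open-data-portal/
  • Socrata API help resources: http://dev.socrata.com/consumers/getting-started.html
      • filtering results: http://dev.socrata.com/docs/filtering.html
      • writing API queries: http://dev.socrata.com/docs/queries.html

Data portal-powered apps

  • https://www.seattleinprogress.com/
  • http://web6.seattle.gov/mnm/

Socrata API instructional videos

  • https://data.seattle.gov/videos
  • https://www.youtube.com/watch?v=YlKzXTrTLOQ
  • https://www.youtube.com/watch?v=Vd6bwz3ivVA

Some other government websites that use the Socrata API

  • https://data.austintexas.gov/
  • https://data.cityofchicago.org/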