Difference between revisions of "Community Data Science Workshops (Fall 2015)/Day 3 Projects/Civic data"

From CommunityData

Revision as of 00:20, 7 November 2015

What neighborhoods have changed the most since the beginning of the Seattle building boom?


Building and visualizing datasets using data.seattle.gov and Google apps

In this project, we will explore a few ways to analyze and visualize data from data.seattle.gov. We will examine trends in the construction of houses, townhomes, and condos in Seattle over the last five years, in order to determine the true extent of the current "construction boom".

  • We'll download data from two APIs (Socrata open data platform and Google Maps)
  • We'll analyze trends over time in our dataset
  • We'll visualize our data in graphs and heatmaps

Topics we will cover

  • Writing and testing complex API queries
  • Reading and writing CSV and JSON files
  • Filtering and aggregating data
  • Combining data from multiple APIs
  • Graphing and mapping data


Preparation

Step 1
Download today's scripts: Python scripts
Step 2
Download today's datasets: datasets

If you are confused by anything today, go back and refresh your memory with the Day 0 setup and tutorial and the Day 0 tutorial.

Part 1: getting the building permit data

Dataset
https://data.seattle.gov/Permitting/Building-Permits-Current/mags-97de

Run the script

download_building_permit_data.py

Now open up the CSV output file to verify that you got all the data you asked for.
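To see roughly what a download step like this involves, here is a minimal sketch. It is not the contents of download_building_permit_data.py. It assumes the dataset is reachable through the standard Socrata (SODA) endpoint pattern `/resource/<dataset id>.json` and uses the SODA `$where` and `$limit` query parameters. The field names `issue_date`, `permit_type`, and `value` are hypothetical placeholders, so check the dataset page for the real column names.

```python
import csv
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Dataset ID copied from the dataset URL above. The /resource/<id>.json
# pattern is the usual Socrata convention (an assumption about this portal).
BASE_URL = "https://data.seattle.gov/resource/mags-97de.json"


def build_query_url(where_clause, limit=1000):
    """Build a SODA query URL with a filter clause and a row limit."""
    params = {"$where": where_clause, "$limit": limit}
    return BASE_URL + "?" + urlencode(params)


def fetch_rows(url):
    """Download the query result and parse it into a list of dicts."""
    with urlopen(url) as response:
        return json.load(response)


def write_rows_to_csv(rows, fieldnames, out_file):
    """Write selected fields of each row to an open file as CSV."""
    # extrasaction="ignore" drops fields we did not ask for;
    # fields missing from a row are written as empty strings.
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


if __name__ == "__main__":
    # "issue_date" is a hypothetical column name used for illustration.
    url = build_query_url("issue_date >= '2010-01-01'", limit=5)
    print(url)
```

Building the URL separately from fetching it makes it easy to paste the query into a browser and test it before running the whole download.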


Part 2: charting new construction by month

Question
Has the rate of residential construction increased since 2010?

Now we want to learn whether there is, in fact, a housing boom in Seattle. We'll do this by counting how many new permits are issued each month, and then plotting these counts on a graph.

Run the script

building_permits_by_month.py

Now open up the CSV output file and check your data. Does this look right to you?
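The monthly counting step described above can be sketched as follows. This is an illustration under assumptions, not the code in building_permits_by_month.py: it assumes each permit is a dict with a date stored as ISO 8601 text (for example `2014-03-17T00:00:00`), and the field name `issue_date` is a hypothetical placeholder.

```python
from collections import Counter


def count_permits_by_month(rows, date_field="issue_date"):
    """Return a sorted list of (year-month, permit count) pairs."""
    counts = Counter()
    for row in rows:
        date_text = row.get(date_field)
        if not date_text:
            continue  # skip permits with no recorded date
        # The first 7 characters of an ISO 8601 date are "YYYY-MM".
        counts[date_text[:7]] += 1
    return sorted(counts.items())


if __name__ == "__main__":
    sample = [
        {"issue_date": "2014-03-17T00:00:00"},
        {"issue_date": "2014-03-02T00:00:00"},
        {"issue_date": "2014-04-09T00:00:00"},
        {},  # a permit with no date is ignored
    ]
    for month, total in count_permits_by_month(sample):
        print(month, total)
```

Because the "YYYY-MM" keys sort alphabetically in date order, the sorted list can go straight into a CSV file or a line chart.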

Challenge question
How could we separate out single-family homes from apartments, and count/plot them separately?


Part 3: charting new construction by neighborhood

Question
Which Seattle neighborhoods have had the most multifamily residential construction (apartments and townhomes) since 2010?

Now we want to learn where all this new construction is happening. We'll do this by sending the address for each MULTIFAMILY permit to the Google Maps Geocoding API, which returns the neighborhood where that address is located.

Run the script

building_permits_by_neighborhood.py

Now open up the CSV output file and check your data. Does this look right to you?
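As a rough sketch of the neighborhood lookup, the code below pulls the neighborhood name out of a geocoding response and tallies permits per neighborhood. It assumes responses in the JSON shape the Google Maps Geocoding API returns (a `results` list whose entries have `address_components`, each tagged with `types`). It works on already-downloaded responses and makes no API calls. Whether building_permits_by_neighborhood.py is organized this way is an assumption, and the "Unknown" label is our own choice.

```python
from collections import Counter


def extract_neighborhood(geocode_response):
    """Return the first neighborhood name in a geocoding response, or None."""
    for result in geocode_response.get("results", []):
        for component in result.get("address_components", []):
            if "neighborhood" in component.get("types", []):
                return component.get("long_name")
    return None


def count_by_neighborhood(responses):
    """Tally responses per neighborhood, most common first."""
    counts = Counter()
    for response in responses:
        # Addresses with no neighborhood component are grouped together.
        counts[extract_neighborhood(response) or "Unknown"] += 1
    return counts.most_common()


if __name__ == "__main__":
    capitol_hill = {
        "status": "OK",
        "results": [{
            "address_components": [
                {"long_name": "Capitol Hill", "short_name": "Capitol Hill",
                 "types": ["neighborhood", "political"]},
                {"long_name": "Seattle", "short_name": "Seattle",
                 "types": ["locality", "political"]},
            ]
        }],
    }
    no_match = {"status": "ZERO_RESULTS", "results": []}
    print(count_by_neighborhood([capitol_hill, capitol_hill, no_match]))
```

Counting the "Unknown" group explicitly is a quick way to check your data: a large number there suggests many addresses did not geocode cleanly.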

Challenge question
In which Seattle neighborhood is the cost of new apartment construction projects highest, on average?

Part 4: Mapping new home construction in Seattle

Question
What locations have experienced the highest density of new construction since 2010?

Now we will try to get an even more detailed picture of where this construction is occurring, using Google Fusion Tables, a powerful visualization application that makes it easy to plot data on a map.

Go to FIXME and upload your data. Create a point map and heatmap.

Going further

Try to answer these additional questions that draw on the data and methods we're learning today. Ask a mentor if you get stuck!

  1. Which developer has spent the most on new construction in Seattle since 2010?
  2. How many townhouses have been constructed in Seattle since 2010?
  3. Where in Seattle are the most commercial buildings being constructed in 2015?
  4. How does the rate of residential construction in Seattle from 2010-2015 compare to the previous 5 years?


Resources