Community Data Science Course (Spring 2016)/Day 4 Notes
Revision as of 05:06, 20 April 2017
We will be discussing the SDOT Collisions data set (https://data.seattle.gov/Transportation/SDOT-Collisions/v7k9-7dn4).
- One of the most important qualities of the Scientific Revolution was that results were broadly shared, so new results could build on top of existing knowledge.
- Repeatability is the key to science (even data science): your results are only scientific if they are repeatable by a third party.
Today's Lecture
Let's go end to end on a data question: are there factors that predict injuries and fatalities in automobile accidents?
- Download data
- Explore the data: find missing values and identify which fields are categorical, numerical, or ordinal
- Transform (filter, project)
- Analyze (see todo for prompts)
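As a rough illustration of the exploration step, the sketch below counts missing values per column and makes a first guess at whether each field is numerical or categorical. The column names and sample rows are hypothetical stand-ins, not the real collisions export, and the helper name summarize_columns is made up for this example. Ordinal fields cannot be detected this way; deciding that a categorical field has a meaningful order still takes a human reading of the values.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the real export's column names may differ.
SAMPLE_CSV = """SEVERITY,WEATHER,INJURIES,SPEEDING
Injury,Raining,2,Y
Property Damage,,0,
Fatality,Clear,1,Y
Property Damage,Clear,0,
"""


def is_number(text):
    """Return True if text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def summarize_columns(lines):
    """Count missing values and guess a rough type for each column."""
    reader = csv.DictReader(lines)
    names = reader.fieldnames
    missing = Counter()
    values = {name: [] for name in names}
    for row in reader:
        for name in names:
            value = row.get(name)
            if value is None or value.strip() == "":
                missing[name] += 1
            else:
                values[name].append(value.strip())

    summary = {}
    for name in names:
        seen = values[name]
        # A column is treated as numerical only if every non-missing value parses.
        kind = "numerical" if seen and all(is_number(v) for v in seen) else "categorical"
        summary[name] = {
            "missing": missing[name],
            "kind": kind,
            "distinct": len(set(seen)),
        }
    return summary


summary = summarize_columns(io.StringIO(SAMPLE_CSV))
for name, info in summary.items():
    print(name, info)
```

To run this against a downloaded file instead of the in-memory sample, pass an open file object to summarize_columns in place of the io.StringIO wrapper.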
- Find data. Let's start at Seattle Data.
- brief aside: Socrata
- Download it.
- Write exploratory scripts
- Using open() to open a file in Python.
- Using
- Write transformation script
- In groups, answer the todo prompts.
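A transformation script along the lines of the outline above might look like the following minimal sketch. It uses open() to read a CSV file, filters to collisions with at least one injury, and projects down to a chosen set of columns. The file names, column names, and sample rows are all hypothetical; the script writes its own small input file so it can run without the real download.

```python
import csv
import os
import tempfile

# Hypothetical column names and rows, standing in for the downloaded file.
SAMPLE_ROWS = [
    ["SEVERITY", "WEATHER", "INJURIES"],
    ["Injury", "Raining", "2"],
    ["Property Damage", "Clear", "0"],
    ["Fatality", "Clear", "1"],
]


def transform(in_path, out_path, keep_columns):
    """Filter to rows with at least one injury, then keep only keep_columns."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=keep_columns)
        writer.writeheader()
        kept = 0
        for row in reader:
            # Treat an empty INJURIES cell as zero injuries.
            if int(row["INJURIES"] or 0) > 0:
                writer.writerow({name: row[name] for name in keep_columns})
                kept += 1
    return kept


# Write the sample input to a temporary directory so the sketch is self-contained.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "collisions.csv")
out_path = os.path.join(workdir, "injuries.csv")
with open(in_path, "w", newline="") as f:
    csv.writer(f).writerows(SAMPLE_ROWS)

kept = transform(in_path, out_path, ["SEVERITY", "INJURIES"])

with open(out_path, newline="") as f:
    result = list(csv.reader(f))
print(kept, result)
```

Keeping the filter and the projection in a script, rather than editing the file by hand, means a third party can rerun the same steps on the same download and get the same output.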