Editing Community Data Science Course (Spring 2017)


Latest revision Your text
Line 12: Line 12:
In a world that is increasingly driven by software and data, developing a basic level of fluency with programming and the basic tools of data analysis is a crucial skill. This course will introduce basic programming and data science tools to give students the skills to use data to answer questions about social media and online communities.


In particular, the class will cover the basics of the Python programming language and an introduction to web APIs, and will teach basic tools and techniques for data analysis and visualization. To cover an end-to-end data analysis project efficiently, we will focus on publicly available data sets from the United States Government and the City of Seattle. Our goal is to enable you to gather and analyze data from any available source, but there are often subtle differences between data providers, and I would rather we see the full process once than get bogged down in data collection. Time will also be reserved to cover data access for popular social media platforms, including Twitter.
In particular, the class will cover the basics of the Python programming language and an introduction to web APIs, and will teach basic tools and techniques for data analysis and visualization. To cover an end-to-end data analysis project efficiently, we will focus on publicly available data sets from the United States Government and the City of Seattle. We will also cover data collection from Twitter.


As part of the class, participants will learn to write software in Python to collect data from web APIs and process that data to produce numbers, hypothesis tests, tables, and graphical visualizations that answer real questions. The class will be built around student-designed independent projects. Every student will pick a question or issue they are interested in pursuing in the first week and will work with the instructor to build from that question toward a completed analysis of data that the student has collected using software they have written.
Line 21: Line 21:
I will consider this class a complete success if, at the end, every student can:


* Write or modify a program to collect a dataset from a publicly available data source.
* Write or modify a program to collect a dataset from the Wikipedia and Twitter APIs.
* Read web API documentation and write Python software to parse and understand a new and unfamiliar web API.
* Read web API documentation and write Python software to parse and understand a new and unfamiliar JSON-based web API.
* Use both Python-based tools like Matplotlib and tools like LibreOffice, Google Docs, or Microsoft Excel to graph and analyze data effectively.
* Use web-based data to effectively answer a substantively interesting question and to present this data effectively in the context of both a formal presentation and a written report.
* The ideal outcome is that students will have the working knowledge to more effectively collaborate with data professionals in their careers. They will be both more informed about the process and more likely to spot undeclared assumptions in their colleagues' work.
* The ideal outcome is that students will have the working knowledge to more effectively collaborate with data professionals in their careers. They will be both more informed about the process and more likely to spot un-declared assumptions in their colleagues' work.
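
To give a sense of the kind of program the goals above describe, here is a minimal sketch of parsing a JSON response in Python. The payload, its field names (`results`, `title`, `views`), and both helper functions are hypothetical and made up for illustration; they do not come from any particular API. A real program would first fetch the text from a web API and would follow that API's documentation for the actual structure.

```python
import json

# Hypothetical response body, standing in for what a JSON-based web API
# might return. The field names here are assumptions for illustration.
SAMPLE_RESPONSE = """
{
  "results": [
    {"title": "Seattle", "views": 1200},
    {"title": "Python", "views": 3400},
    {"title": "Wikipedia", "views": 560}
  ]
}
"""


def total_views(response_text):
    """Parse a JSON document and sum the 'views' field across all results."""
    data = json.loads(response_text)
    return sum(item["views"] for item in data["results"])


def most_viewed(response_text):
    """Return the title of the result with the largest 'views' value."""
    data = json.loads(response_text)
    top = max(data["results"], key=lambda item: item["views"])
    return top["title"]


if __name__ == "__main__":
    print(total_views(SAMPLE_RESPONSE))
    print(most_viewed(SAMPLE_RESPONSE))
```

The same pattern (load the JSON text, walk the resulting dictionaries and lists, compute a number) should carry over to most JSON-based data sources, though the keys will differ from provider to provider.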


== Note About This Syllabus ==  
