Community Data Science Workshops (Spring 2014)/Reflections
== Session 3: Data Analysis and Visualization ==

The goal of this session was to get users to the point where they could take data from a web API and ask and answer basic data science questions by using Python to manipulate data and by creating simple visualizations.

Our philosophy in Session 3 was to teach users to get data into tools they already know and use. We thought this would be a better use of their time and would help make users independent earlier. Based on feedback from the application, we knew that almost every user who attended our sessions had at least basic experience with spreadsheets and with using spreadsheets to create simple charts. We tried to help users process data with Python into formats that they could load into existing tools like ''LibreOffice'', ''Microsoft Excel'', or ''Google Docs''.

=== Lecture ===

Because much of our analysis was going to take place outside of Python, the lecture focused on review and on new concepts for data manipulation.

The lecture began with a detailed walk-through of code [[User:Mako|Mako]] wrote to build a dataset of metadata for all revisions to articles about [https://en.wikipedia.org/wiki/Harry_Potter Harry Potter] on English Wikipedia. After this review, we focused on counting, binning, and grouping data in order to ask and answer simple questions like:

* What proportion of edits to ''Harry Potter'' articles are minor?
* What proportion of edits to ''Harry Potter'' articles are made by "anonymous" contributors?
* What are the most edited ''Harry Potter'' articles?
* Who are the most active editors on ''Harry Potter'' articles?

Because it did not require installing software and because it ran on every platform, we did sorting and visualization in [http://docs.google.com Google Docs].

=== Projects ===

In the afternoon projects, one group continued to work on the ''Harry Potter'' dataset from English Wikipedia. In this case, the group focused on building a time series dataset. We were able to bin edits by day and to graph the number of edits to these articles over time. Users could easily see the releases of the ''Harry Potter'' books and movies in the time series, and this was a major ''aha'' moment for many of the participants.

A second project focused on [http://matplotlib.org/ Matplotlib] and generated heatmaps of contributions to articles about men and women in Wikipedia, based on time in Wikipedia's lifetime and time in the subject's lifetime. The heatmaps were popular with participants and were something that could not easily be done with spreadsheets.

[[File:Matplotlib-hist2d.png|400px]]

The main challenge with ''Matplotlib'' was installation, which took an enormous amount of time when several learners ran into trouble. In the future, we will use [https://store.continuum.io/cshop/anaconda/ Anaconda], which we hope will address these issues because ''Anaconda'' includes ''Matplotlib''.
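
As an illustration of the counting, grouping, and binning steps described in this session, here is a minimal sketch in Python. It is not the code used in the workshop. The sample rows, the field names (''title'', ''user'', ''minor'', ''anon'', ''timestamp''), and the helper functions are all hypothetical and stand in for whatever the real revision dataset contained. The last helper produces CSV text, one plain format that a spreadsheet can open.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows. The field names are assumptions made for this
# sketch and are not the schema of the workshop's actual dataset.
revisions = [
    {"title": "Harry Potter", "user": "Alice", "minor": False,
     "anon": False, "timestamp": "2007-07-21T09:15:00Z"},
    {"title": "Harry Potter", "user": "10.0.0.1", "minor": False,
     "anon": True, "timestamp": "2007-07-21T13:40:00Z"},
    {"title": "Hermione Granger", "user": "Alice", "minor": True,
     "anon": False, "timestamp": "2007-07-21T18:05:00Z"},
    {"title": "Harry Potter", "user": "Bob", "minor": True,
     "anon": False, "timestamp": "2007-07-22T08:30:00Z"},
    {"title": "Severus Snape", "user": "Alice", "minor": False,
     "anon": False, "timestamp": "2007-07-23T11:00:00Z"},
]


def proportion(rows, field):
    """Return the share of rows whose boolean `field` is true."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if row[field]) / len(rows)


def top_counts(rows, field, n=3):
    """Group rows by `field` and return the n most common values with counts."""
    return Counter(row[field] for row in rows).most_common(n)


def edits_per_day(rows):
    """Bin edits by calendar day using the YYYY-MM-DD prefix of the timestamp."""
    days = Counter(row["timestamp"][:10] for row in rows)
    return sorted(days.items())


def to_csv_text(pairs, header):
    """Render (key, count) pairs as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(pairs)
    return buffer.getvalue()


print("Proportion minor:", proportion(revisions, "minor"))
print("Proportion anonymous:", proportion(revisions, "anon"))
print("Most edited articles:", top_counts(revisions, "title"))
print("Most active editors:", top_counts(revisions, "user"))
print(to_csv_text(edits_per_day(revisions), ["day", "edits"]))
```

Saving the CSV text to a file and opening it in a spreadsheet would leave sorting and charting to a tool that participants already know, which matches the approach taken in this session.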