PyData Seattle 2015 proposal: Difference between revisions

From CommunityData
;Title:
;Title: Democratizing Data Science


;Short Description (400 char):
;Short Description (400 char):


The Community Data Science Workshops are the
What if programming and data science were something everybody learned? As part of the Community Data Science Workshops, more than 50 volunteers have taught more than 200 complete beginners the basics of Python and data analysis in a series of 4-day workshops. We'll talk about our approach and describe the successes and challenges of approaching Python and data science as basic literacies.
- new curriculum developed in 2014-2015
- taught three times in seattle
- focused on absolute beginners
- nearly 400 people have signed up for the first three sessions
- and 200 were admitted and attended one of the workshop series
- NNN volunteer mentors
- curriculum has been adapted and run elsewhere with more in progress


;Abstract:
;Abstract:


The Community Data Science Workshops are a series of project-based workshops for anyone interested in learning how to use programming and data science tools to ask and answer questions about online communities like Wikipedia, Twitter, free and open source software, and civic media.
The Community Data Science Workshops (CDSW) are a series of project-based workshops for anyone interested in learning how to use programming and data science tools to ask and answer questions about online communities like Wikipedia, Twitter, free and open source software, and civic media. The workshops are designed for people with no previous programming experience and bring together researchers, academics, and participants and leaders in online communities. Run three times in 2014 and 2015, the workshops have all been free of charge and open to the public.


The workshops are for people with no previous programming experience. The goal is to bring together both researchers and academics as well as participants and leaders in online communities. The workshops have all been free of charge and are open to the public.
The sessions are scheduled for one Friday evening and three Saturdays all day. Each session involves a period for lecture and technical demonstrations in the morning. The rest of the day consists of self-directed work on programming and data science projects supported by more experienced mentors.


The sessions are schedule for one Friday evening and three Saturdays all day. Each session involves a period for lecture and technical demonstrations in the morning. This is followed by a lunch. The rest of the day consists of self-directed work on programming and data science projects supported by more experienced mentors.
Our goal is that, after the three workshops, participants will be able to use data to produce numbers, hypothesis tests, tables, and graphical visualizations to answer questions like: Are new contributors to an article in Wikipedia sticking around longer or contributing more than people who joined last year? Who are the most active or influential users of a particular Twitter hashtag? Are people who participated in a Wikipedia outreach event staying involved? How do they compare to people who joined the project outside of the event?


Our goal is that, after the three workshops, participants will be able to use data to produce numbers, hypothesis tests, tables, and graphical visualizations to answer questions like:
Our very first workshops were originally modeled after the Boston Python Workshops, but most of our curriculum is new and has been developed and modified by the mentors with feedback from the participants. The CDSW curriculum, now being taught outside Seattle by others inspired by our model, is entirely based on Python. Our most recent round of workshops in Spring 2015 was taught entirely using Python 3.


- Are new contributors to an article in Wikipedia sticking around longer or contributing more than people who joined last year?
Teaching data science over only four days to people who begin without any familiarity with concepts like the command line or variables is a major departure from traditional data science curricula that assume at least some familiarity with programming and statistics.
- Who are the most active or influential users of a particular Twitter hashtag?
- Are people who participated in a Wikipedia outreach event staying involved? How do they compare to people that joined the project outside of the event?


Our very first workshops were originally modeled after the Boston Python Workshop, but most of our curriculum is brand new and has been developed and modified by the mentors with feedback from the participants.
This talk will describe the approach we have taken to refine our material over the three times we have run the workshops and will share details of our experience. CDSW's organizers are professional programmers and data scientists and several of us have experience teaching data science in more traditional university and corporate settings. Our talk will describe how "democratized" data science is similar to — and sometimes extremely different from — these more traditional approaches. We will talk about some of the challenges we have faced and highlight some of our most inspirational successes.  


features:
<!-- LocalWords: CDSW CDSW's
 
-->
- teaching to complete novices
- focusing on using data from day1 (e.g., not like BPW)
- focusing on data best practices along the way
 
technical details:
- moved to anaconda as default (reduced number of)
- moved to python 3 in 2015 with zero hitches (solved /many/ encoding issues)

Latest revision as of 05:21, 8 June 2015

Title
Democratizing Data Science
Short Description (400 char)

What if programming and data science were something everybody learned? As part of the Community Data Science Workshops, more than 50 volunteers have taught more than 200 complete beginners the basics of Python and data analysis in a series of 4-day workshops. We'll talk about our approach and describe the successes and challenges of approaching Python and data science as basic literacies.

Abstract

The Community Data Science Workshops (CDSW) are a series of project-based workshops for anyone interested in learning how to use programming and data science tools to ask and answer questions about online communities like Wikipedia, Twitter, free and open source software, and civic media. The workshops are designed for people with no previous programming experience and bring together researchers, academics, and participants and leaders in online communities. Run three times in 2014 and 2015, the workshops have all been free of charge and open to the public.

The sessions are scheduled for one Friday evening and three Saturdays all day. Each session involves a period for lecture and technical demonstrations in the morning. The rest of the day consists of self-directed work on programming and data science projects supported by more experienced mentors.

Our goal is that, after the three workshops, participants will be able to use data to produce numbers, hypothesis tests, tables, and graphical visualizations to answer questions like: Are new contributors to an article in Wikipedia sticking around longer or contributing more than people who joined last year? Who are the most active or influential users of a particular Twitter hashtag? Are people who participated in a Wikipedia outreach event staying involved? How do they compare to people who joined the project outside of the event?

Our very first workshops were originally modeled after the Boston Python Workshops, but most of our curriculum is new and has been developed and modified by the mentors with feedback from the participants. The CDSW curriculum, now being taught outside Seattle by others inspired by our model, is entirely based on Python. Our most recent round of workshops in Spring 2015 was taught entirely using Python 3.

Teaching data science over only four days to people who begin without any familiarity with concepts like the command line or variables is a major departure from traditional data science curricula that assume at least some familiarity with programming and statistics.

This talk will describe the approach we have taken to refine our material over the three times we have run the workshops and will share details of our experience. CDSW's organizers are professional programmers and data scientists and several of us have experience teaching data science in more traditional university and corporate settings. Our talk will describe how "democratized" data science is similar to — and sometimes extremely different from — these more traditional approaches. We will talk about some of the challenges we have faced and highlight some of our most inspirational successes.