Community Data Science Workshops (Fall 2015)/Debrief

From CommunityData

Not sure who these notes should be shared with. We like to talk, eh?


Topics discussed during a freeflowin’ debrief session on CDSW Fall 2015:

Jason has some thoughts about improving the Matplotlib session. Ask Jason.

How can we make best use of the many mentors we have? Should we do a mentor survey?

  • find out what their experience is like / improving the mentor experience
  • their ideas on improving the student experience
  • recruit for more specific activities like logistics help - possible workshops or other follow-ons?

By session 3, did we give people enough confidence with dictionaries and lists?

How can we create better continuity and/or better telegraph the continuity between the morning and afternoon sessions?

Would summarizing the “big points” at the end of the morning lecture be helpful?

What if we reversed the order of the workshops so that analytics came before data collection?

What if we started the first workshop session 1 with a few examples of what they’ll be able to do by the end?

Chris thinks IPython Notebook would make things easier for students and simplify setup.

How to make the afternoon sessions consistently more interactive?

  • comes with practice?

Should we expand Day 1 afternoon sessions beyond Baby Names?

  • No. It’s good to have everyone doing the same thing.
  • Yes. Let’s give people data that is structured on par with Baby Names, yet gives them a feel for the data they’ll play with in later sessions: Twitter, Wikipedia, CDC, Seattle / King County; Rotten Tomatoes

We talk about the options for workshops in terms of where the data comes from. Are we also considering the how and what of each data set? Are we helping people work with both continuous and categorical data?

What is too much to offer? E.g. should we offer stats in Python as an afternoon session? Yes, if we make it clear what is being taught and what is a prerequisite.

Should we separate the morning and afternoon admissions so more people could attend in the afternoon?

Why have so few students proposed to work on their own thing?

Should we do follow-on activities for students?

  • Jason is willing to kick back up the monthly meet-ups.

Improving engagement between sessions?

Improving networking within workshops?

Outcomes: Expanding collaboration opportunities. I spoke a bit with one of the Fall15 CDSW students who happens to run the HCDE MS program (Liz Sanocki) and she said she did not see herself programming more on her own in the future, but did feel she gained a conceptual understanding that would make her more confident in collaborations and/or project management.


What is the mission of this workshop? How does CDSW differ from other offerings like Software Carpentry?

Mika: Our mission is to help ordinary people learn Python and do data science.

Student input
  • What are the “types” this workshop appeals to? Their goals?
  • An all time participant survey?
  • Follow-up with specific individual students e.g.
  • What do we think about so many HCDE and HCI+D students coming?
Potential metrics so far

How well do we do by the people who have attended?

  • people who did science with workshop skills, e.g. published papers
  • follow-up to get the examples?
  • students who became mentors (Mika; Julia; Monica; Dharma... others?)
  • got jobs
  • created other community resources (Miku’s meet-ups; Illana offered her own CDSW workshop at ...? after mentoring at CDSW)
  • helps students to identify other resources and/or network to find other helpers?
  • ????

What are the backgrounds and types that the workshop serves well now? Are we content with that? How to improve for them? For others?

  • JMo: People playing with medium sized data sets.

Next steps


Who are the target people for this workshop?

  • Start-ups? Students? Government workers? Non-profits?

Reaching more people outside of the University such as from city government and non-profits?

  • Commit to working with data or other regional data?
  • Promote afternoon workshops specifically to these audiences.

  • Offering things off campus?
  • Encouraging mentors to take the workshop elsewhere
  • Scholarships to bring people from elsewhere to mentor a bit then bring CDSW back to where they come from
  • Expanding to having two simultaneous morning sessions in the two interactive classrooms at Odegaard.
  • Getting more mentors involved in helping out with logistics (food, space scheduling)
  • Getting Twitter to sponsor a portion of the workshop
  • Continue to improve the online materials
  • Recruit fresh faces to run the lectures
  • Let’s recruit ASAP, so people have time to get confident about it / prep / get involved in group planning

  • Sound isolation is an issue in the back of the Research Commons
  • Projector is hard to see in the back of the Commons
  • Research Commons has most flexibility in terms of flexing the size of break-out groups

The interactive classrooms at Odegaard make it hard to read the room while instructing. Having mentors sit at the tables helps, though it seems students mostly look at the screens.

Notes on Twitter session 3

  • start with pseudocode
  • walk through an example line by line
  • give people time to try things on their own
  • end with a walk-through of example(s) created in the workshop

Excel was a problem for students who got messy CSV files out of the programs they wrote during the free time. Chris argues IPython Notebook would help avoid this issue.
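The notes don't record what made the CSV files messy. One common cause (an assumption here, not something confirmed in the session) is building rows by joining values with commas, which breaks as soon as tweet text contains commas, quotes, or line breaks. A minimal sketch of letting Python's csv module handle the quoting instead, using made-up rows and a hypothetical helper name `rows_to_csv`:

```python
import csv
import io

# Made-up rows for illustration; the text fields deliberately contain
# a comma, double quotes, and a line break.
sample_rows = [
    {"user": "alice", "text": 'She said "hello", then left'},
    {"user": "bob", "text": "line one\nline two"},
]

def rows_to_csv(rows, fieldnames):
    """Return CSV text for a list of dicts, quoting fields where needed."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = rows_to_csv(sample_rows, ["user", "text"])
print(csv_text)
```

To write to a file instead, the same writer can be pointed at `open("tweets.csv", "w", newline="", encoding="utf-8")`; the file name is only an example.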

In spite of having several mentors in the workshop, one student (an undergrad) didn’t get her free-time project working.

Notes on Twitter session 2 (API session)

No one who attended had prior experience working with Twitter data.

Examples of things students said they’d like to be able to do with Twitter data:

  • Most retweeted tweets
  • Networks of users
  • Popular hashtags
  • Trends within an event like Diwali
  • Identifying structure on Twitter (?)
  • Identifying words associated with virality (e.g. clickbait)
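Two of the wishes above, popular hashtags and most retweeted tweets, can be sketched in a few lines once tweets have been collected. This is only an illustration under stated assumptions: the tweets below are made up, each is assumed to be a dictionary with `text` and `retweet_count` keys, and `hashtags_in` is a hypothetical helper, not part of any Twitter library.

```python
from collections import Counter

# Made-up tweets standing in for data collected during a session.
sample_tweets = [
    {"text": "Happy Diwali! #Diwali #lights", "retweet_count": 12},
    {"text": "Lamps everywhere #diwali", "retweet_count": 40},
    {"text": "No tags here", "retweet_count": 3},
]

def hashtags_in(text):
    """Return lowercased hashtags found in a tweet's text."""
    tags = []
    for word in text.split():
        if word.startswith("#"):
            tag = word.strip(".,!?:;").lower()
            if len(tag) > 1:  # skip a bare "#"
                tags.append(tag)
    return tags

# Popular hashtags: count every hashtag across all tweets.
hashtag_counts = Counter()
for tweet in sample_tweets:
    hashtag_counts.update(hashtags_in(tweet["text"]))

# Most retweeted tweet: the one with the largest retweet_count.
most_retweeted = max(sample_tweets, key=lambda tweet: tweet["retweet_count"])

print(hashtag_counts.most_common(2))
print(most_retweeted["text"])
```

Both steps rely only on lists and dictionaries, so they line up with the material from the earlier sessions.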