Community Data Science Workshops (Fall 2014)/Reflections


Scheduling the next workshop: Not too close to the end of the quarter.

Session 0 (Friday, November 7th, 6:30-9:30pm) went smoothly. No one reported “many problems with set-up.” All respondents reported either no problems or a few problems.

Anaconda was key to the smoothness compared to the first workshop series. However, Anaconda is not open source. It reduced issues, but was not 100% issue free. When one person’s home directory name was in Chinese, Anaconda got confused. This was fixed by a mentor who changed the path.

At least one mentor was confused about whether mentees should self-report they’d completed the steps or whether the mentor should verify that the steps were all taken. In future, email mentors ahead of time to let them know.

Improvements for session 0 for next time. The process people use to flag for mentor help: We didn’t model the use of sticky notes enough during the early lectures.

Technology improvements: Get less ambiguous sticky notes.


Space improvements: Set up/arrange/select the space to facilitate better circulation of mentors. When mentors can circulate easily things are better for mentees.

Streamlining the instructions for set-up for next time.

Q: How to reduce the number of steps and the number of operating-system-specific versions?

This time CDSW moved to PowerShell from CMD. PowerShell doesn’t work well on PCs. People were instructed to “find the Windows mentor.” No one had XP. Next time, we might be able to move away from separate instructions for Linux, Mac, and Windows.

Consider writing install instructions that do not rely on Anaconda so people have a fully open source option.

In the first two workshop sequences, no mentees were running Linux. Possibly, in future a Linux workshop would be good. Presently, Linux help/instructions may be moot.

Maintenance errors on the wiki. There was a need for several on-the-fly corrections of the instructions and files on the wiki during the workshop.


Q: How to handle when mentees want to refer back to the workshop material that they experienced?

A: Create an archive template for the page they are looking at during the workshop. Each project can be its own namespace as opposed to having event-specific pages.

Mentors should post the code generated in the break-outs. Encourage them to capture the code.


General observation about mentoring: Being a mentor is kind of hard, especially being a good mentor. Some steps were skipped in helping mentors that were in place last time.

It was hard to tell who was a mentor and who wasn’t.

Improvement: Help the mentors to be visually identifiable. E.g. Paper them head to foot in sticky notes.

Questions about mentorship: How to help the mentors to mentor well?

Suggestion for mentors: Walk around to every single person. Ask, “How are you doing? What are you working on? Show me what you’re doing.”

How much do you help somebody?

Should there be a page of guidelines for mentors?

Where is uniformity needed in mentor style and where do we want to encourage diverse approaches?

Let’s have a mentors workshop! At a bar! With BEER! and PIZZA!

The pizza party, er, mentor workshop could cover: norms, best practices, goals. Planning, etc.


Q: Should only fully open source tools be selected for workshops? A: Our job is not to extol the virtues of open source. Our job is to help mentees solve their data problem. “We are teaching you how to do things with data that help you achieve your goal.” However, open source tools are desirable.

Q: Should we be teaching Python 3? A: Yes, but when? It may solve some technical issues that are occurring now.


Student demographics. This time there was more gender balance in both students and mentors.

2/3 of mentees were from UW. Others came from random places, including someone who works for the city of Seattle. Many random Wikipedians were there. It's cool that people who are not doing research but are part of online communities were in the mix with the researchers.

We had 16 students from HCDE, but also a bunch of mentors. They were good mentors.

Demographics of Applicants. Several people applied who are already good at programming. Why do they apply? Maybe they want more exposure to data science?


Desired applicants. The constraint on scaling the workshop is the number of mentors. Every mentor means that the workshop can accommodate four more mentees.

Is it good to have mentees who have some programming skills along with those who don’t have any? Or is it a better use of the seats to only take those with no programming background?

Who are the priorities? Get more of the newbies and invest more in them?

Improving Retention: Anecdotally, there is a sense that those who are dropping are those who had more trouble but didn’t struggle visibly.

Q: Would it help with retention if we show people what will happen in the following weeks?

A: Several mentors say “yes.” We’re doing that, but let’s do more of it.

Pair programming for those who want it might be helpful. Working in groups is another possibility.


Mining research interests/goals. Could we help match up people with similar interests?


How can we support self-directed projects?

Can we give mentees more guidance to support their project interests? It’s easier to do that if people are pre-clustered.

Bring up people’s ideas at the end.

The size of the breakout workshops varied and that means different degrees of engagement were feasible.


The BIG feedback from the first series of workshops: Bring people back together more often. Bringing people together in the end was effective this time. We need a go-between for each session to remind people to reconvene. An emcee.

Flow of the workshops. Q: What degree of dependency should there be between workshops?

Feedback on lectures. About half found Frances’s lecture either too fast or too slow and about half found the lecture to be just right.

Getting other people to do some of the lectures. Diversity is desired. Mako does not want to be the only one.


Selecting workshops for next time. Do we need more break-out sessions? Or do we need to break out the best of the break-out sessions? Two mentors thumb wrestle.

Wrestler one: Smaller groups of the same break-out session might be good.

Precanned sessions make it easier for new mentors to feel confident and be successful.

Wrestler two: Diversity of projects inspires people to do the kinds of things that people can do with this new knowledge. 


What else can encourage generative-ness? Giving mentees generative moments within sessions and lectures might be empowering. Perhaps, calling out mentees who are doing generative things.

Basic statistical analysis in Python would be a fun thing to teach (says Mako) and at least ten mentees would be enthusiastic about it.

Some people love R, some people don’t. The world goes round.


Q: How can we strengthen the relationship between the lectures and the break-outs?

Baby Names is a good project because it feels data-science-y. Baby Names does everything that Wordplay does but it has the stink of science about it. Next time, let’s have two small rooms doing the exact same thing. Wordplay is kind of boring. Twitter had too many people in it. If you ask people to do some steps in advance and not others, mayhem ensues. Next time have them download all resources. A bitly URL that made the download easier to find streamlined things.

A bunch of people found the Twitter session way too fast. Tweepy is not well documented. Squeeze the JSON out of it before the mentees have to cope with it. Get the mentors on it beforehand. Yay!
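One way to read "squeeze the JSON out of it" is to hand mentees flat records instead of nested ones. The sketch below is only an illustration of that idea: the sample data is made up (it is shaped loosely like a tweet, not copied from real API output), and `flatten_tweet` is a hypothetical helper name, not something from the workshop materials.

```python
import json

# Hypothetical sample, shaped loosely like nested tweet data.
# This is NOT real API output; field names are assumptions for illustration.
RAW = """
[
  {"text": "Learning Python at the workshop!",
   "user": {"screen_name": "mentee_a", "followers_count": 12},
   "entities": {"hashtags": [{"text": "cdsw"}]}},
  {"text": "Baby names are data too",
   "user": {"screen_name": "mentee_b", "followers_count": 340},
   "entities": {"hashtags": []}}
]
"""


def flatten_tweet(tweet):
    """Reduce one nested tweet-like dict to a flat dict of simple values."""
    return {
        "user": tweet["user"]["screen_name"],
        "followers": tweet["user"]["followers_count"],
        "text": tweet["text"],
        "hashtags": [tag["text"] for tag in tweet["entities"]["hashtags"]],
    }


# Parse the JSON once, then work only with the flat records.
tweets = [flatten_tweet(t) for t in json.loads(RAW)]

for tweet in tweets:
    print(tweet["user"], tweet["followers"], tweet["text"])
```

With something like this prepared ahead of time, mentees could start from a list of simple dictionaries and only look at the nested structure if they are curious.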


Wikipedia workshop. The mentor explained stuff very clearly. That was frustrating for those who didn’t need it, BUT super great for people that wanted/needed a lot of explanation.

Graduated challenges in a workshop that go from less challenging to more and more challenging help with the fact there is a range of mentee levels.

SQL workshop. Seemed to work really well. Did a good job of giving people an overview of data science and a way to hook themselves into it. Next session, also do a workshop that closes the loop between SQL and Python. Can we host an open SQL database somewhere?
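As a rough idea of what "closing the loop between SQL and Python" could look like, here is a minimal sketch using the `sqlite3` module from the Python standard library. The table name, columns, and rows are invented for illustration, and an in-memory database stands in for whatever hosted database might be used.

```python
import sqlite3

# In-memory database as a stand-in; a real session might point at a hosted one.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, made up for this example.
conn.execute("CREATE TABLE names (name TEXT, year INTEGER, count INTEGER)")
conn.executemany(
    "INSERT INTO names VALUES (?, ?, ?)",
    [("Ada", 2012, 40), ("Ada", 2013, 55), ("Grace", 2013, 70)],
)

# Run a SQL query from Python and get the results back as Python tuples.
rows = conn.execute(
    "SELECT name, SUM(count) FROM names GROUP BY name ORDER BY name"
).fetchall()

# From here on it is ordinary Python: build a dictionary from the rows.
totals = {name: total for name, total in rows}
print(totals)

conn.close()
```

The point for mentees would be that the query results arrive as plain Python data, so everything learned in the earlier sessions (loops, dictionaries, printing) applies directly.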


Session 3: AM lecture. The goal of the lecture was to walk people through the actual mess of writing code.

Maybe the week 2 lecture should introduce APIs and functions. People thought that the week 2 lecture was slow, so adding functions would be good. Functions can be reinforced in the week 2 workshops. Lecture 2 is both the earliest and the latest point at which it makes sense to introduce functions. Introduce the idea that code is reusable.
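A small example of the "code is reusable" idea, of the kind that might fit in a week 2 lecture. This is only a sketch: the function name `count_matching` and the sample lists are made up here and are not taken from the workshop materials.

```python
def count_matching(words, prefix):
    """Return how many words start with the given prefix, ignoring case."""
    prefix = prefix.lower()
    return sum(1 for word in words if word.lower().startswith(prefix))


# The same function works on two unrelated lists without being rewritten.
names = ["Ada", "Alan", "Grace", "Anita"]
words = ["apple", "banana", "avocado"]

print(count_matching(names, "a"))
print(count_matching(words, "b"))
```

Writing the counting logic once and calling it on different data is the whole lesson; the break-outs could then reuse the same pattern on their own datasets.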

Afternoon of Session 3:

The spreadsheets session. People were modifying the code to build their own dataset and did their own visualizations. At least a few people. That was cool!

The Matplotlib session. Most people in the session were deeply lost. The mentors who taught it were not at any of the other sessions and therefore didn’t go in with a good sense of where the mentees were at. Several people left and went to another room. In future, ensure mentor success by having them loop in better to where the mentees are at. Next time, consider encouraging new mentors to do a practice session with some friendly folks before they let loose. Also, next session, consider using Seaborn instead of Matplotlib.


The ethnographers get the last word: Some observations about the culture of mentoring from a first-time mentor: There are some distinct values that came through strongly. There is a clear vision of empowerment through programming. The degree of inclusivity is impressive. The culture of feedback, iteration, and reflection was really surprising, for example the amount of effort that goes into improving the materials and the teaching. Also surprising is the way that other organizations are able to use (and are using) the materials, and the way that this is building the community. For example, mentees are organizing their own meet-ups (though that could be encouraged even more).

The pragmatism of what is taught demonstrates a clear value. It would be helpful to make sure that all mentors are clear that part of what is expected of them is that they give pragmatic coaching. That is, they should lead mentees to something that works rather than telling them what an expert would do.