Community Data Science Workshops (Fall 2014)/Reflections
Revision as of 02:18, 24 December 2014
Scheduling the next workshop: Not too close to the end of the quarter.
Session 0 (Friday, November 7th, evening, 6:30-9:30pm) went smoothly. No one reported “many problems with set-up.” All respondents reported either no problems or a few problems.
Anaconda was key to the smoothness compared to the first workshop series. However, Anaconda is not open source. It reduced issues but was not 100% issue-free. When one person’s home directory was in Chinese, Anaconda got confused. This was fixed by a mentor who changed the path.
At least one mentor was confused about whether mentees should self-report they’d completed the steps or whether the mentor should verify that the steps were all taken. In future, email mentors ahead of time to let them know.
Improvements for session 0 for next time. The process people use to flag for mentor help: We didn’t model enough using sticky notes during lectures early on.
Technology improvements: Get less ambiguous sticky notes.
Space improvements:
Set up/arrange/select the space to facilitate better circulation of mentors.
When mentors can circulate easily things are better for mentees.
Streamlining the instructions for set-up for next time.
Q: How to reduce the number of steps and the number of operating-system-specific versions?
This time CDSW moved from CMD to PowerShell. PowerShell doesn’t work well on Windows XP PCs. People were instructed to “find the Windows mentor.” No one had XP. Next time, we might be able to move away from separate instructions for Linux, Mac, and Windows.
Consider writing install instructions that do not rely on Anaconda so people have a fully open source option.
In the first two workshop sequences, no mentees were running Linux. Possibly, in future a Linux workshop would be good. Presently, Linux help/instructions may be moot.
Maintenance errors on the wiki. Several on-the-fly corrections of the instructions and files on the wiki were needed during the workshop.
Q: How to handle when mentees want to refer back to the workshop material that they experienced?
A: Create an archive template for the page they are looking at during the workshop. Each project can be its own namespace as opposed to having event-specific pages.
Mentors should post the code generated in the break-outs. Encourage them to capture the code.
General observation about mentoring: Being a mentor is kind of hard, especially being a good mentor. Some steps were skipped in helping mentors that were in place last time.
It was hard to tell who was a mentor and who wasn’t.
Improvement: Help the mentors to be visually identifiable. E.g. Paper them head to foot in sticky notes.
Questions about mentorship: How to help the mentors to mentor well?
Suggestion for mentors: Walk around to every single person. Ask, “How are you doing? What are you working on? Show me what you’re doing.”
How much do you help somebody?
Should there be a page of guidelines for mentors?
Where is uniformity needed in mentor style and where do we want to encourage diverse approaches?
Let’s have a mentors workshop! At a bar! With BEER! and PIZZA!
The pizza party, er, mentor workshop could cover: norms, best practices, goals. Planning, etc.
Should only fully open source tools be selected for workshops?
A: Our job is not to extol the virtues of open source. Our job is to help mentees solve their data problem. “We are teaching you how to do things with data that help you achieve your goal.” However, open source tools are desirable.
Q: Should we be teaching Python 3? A: Yes, but when? It may solve some technical issues that are occurring now.
Student demographics.
This time there was more gender balance in both students and mentors.
2/3 of mentees were from UW. Others came from a range of places, including someone who works for the City of Seattle. Many Wikipedians were there too. It's cool that people who are not doing research but are part of online communities were in the mix with the researchers.
We had 16 students from HCDE there, but also a bunch of mentors. They were good mentors.
Demographics of Applicants. Several people applied who are already good at programming. Why do they apply? Maybe they want more exposure to data science?
Desired applicants.
The constraint on scaling the workshop is the number of mentors. Every mentor means that the workshop can accommodate four more mentees.
Is it good to have mentees who have some programming skills along with those who don’t have any? Or is it a better use of the seats to only take those with no programming background?
Who are the priorities? Get more of the newbies and invest more in them?
Improving Retention: Anecdotally, there is a sense that those who are dropping are those who had more trouble but didn’t struggle visibly.
Q: Would it help with retention if we show people what will happen in the following weeks?
A: Several mentors say “yes.” We’re doing that, but let’s do more of it.
Pair programming for those who want it might be helpful. Working in groups is another possibility.
Mining research interests/goals.
Could we help match up people with similar interests?
How can we support self-directed projects?
Can we give mentees more guidance to support their project interests? It’s easier to do that if people are pre-clustered.
Bring up people’s ideas at the end.
The size of the breakout workshops varied and that means different degrees of engagement were feasible.
The BIG feedback from the first series of workshops: Bring people back together more often. Bringing people together in the end was effective this time. We need a go-between for each session to remind people to reconvene: an emcee.
Flow of the workshops Q: What degree of dependencies should there be between workshops?
Feedback on lectures. About half found Frances’s lecture either too fast or too slow and about half found the lecture to be just right.
Getting other people to do some of the lectures. Diversity is desired. Mako does not want to be the only one.
Selecting workshops for next time.
Do we need more break-out sessions? OR do we need to break out best of the break-out sessions? Two mentors thumb wrestle.
Wrestler one: Smaller groups of the same break-out session might be good.
Precanned sessions make it easier for new mentors to feel confident and be successful.
Wrestler two: Diversity of projects inspires people to do the kinds of things that people can do with this new knowledge.
What else can encourage generative-ness? Giving mentees generative moments within sessions and lectures might be empowering. Perhaps, calling out mentees who are doing generative things.
Basic statistical analysis in Python would be a fun thing to teach (says Mako) and at least ten mentees would be enthusiastic about it.
Some people love R, some people don’t. The world goes round.
Q: How can we strengthen the relationship between the lectures and the break-outs?
Baby Names is a good project because it feels data-science-y. Baby Names does everything that Wordplay does but it has the stink of science about it. Next time, let’s have two small rooms doing the exact same thing. Wordplay is kind of boring. Twitter had too many people in it. If you ask people to do some steps in advance and not others, mayhem ensues. Next time have them download all resources. A bitly URL that helped people find the download more easily streamlined things.
A bunch of people found the Twitter session way too fast. Tweepy is not well documented. Squeeze the JSON out of it before the mentees have to cope with it. Get the mentors on it beforehand. Yay!
Wikipedia workshop. The mentor explained stuff very clearly. That was frustrating for those who didn’t need it, BUT super great for people that wanted/needed a lot of explanation.
Graduated challenges in a workshop, going from less challenging to more and more challenging, help with the fact that there is a range of mentee levels.
SQL workshop. Seemed to work really well. Did a good job of giving people an overview of data science and a way to hook themselves into it. Next session, also do a workshop that closes the loop between SQL and Python. Can we host an open SQL database somewhere?
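As one possible starting point for a session that closes the loop between SQL and Python, here is a minimal sketch using only the `sqlite3` module from the standard library. The in-memory database, the `names` table, its columns, and the `total_by_name` helper are all hypothetical and made up for illustration; they are not taken from the workshop materials.

```python
import sqlite3

# Hypothetical in-memory database standing in for a hosted SQL database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (name TEXT, year INTEGER, births INTEGER)")
conn.executemany(
    "INSERT INTO names VALUES (?, ?, ?)",
    [("Ada", 2012, 5), ("Ada", 2013, 10), ("Grace", 2013, 7)],
)


def total_by_name(connection):
    """Run a GROUP BY query and return the result as a Python dict."""
    rows = connection.execute(
        "SELECT name, SUM(births) FROM names GROUP BY name ORDER BY name"
    ).fetchall()
    # Each row is a (name, total) tuple, so dict() turns the list into a mapping.
    return dict(rows)


totals = total_by_name(conn)
print(totals)
```

The point of the sketch is that the query does the aggregation in SQL, and the result lands in an ordinary Python data structure that mentees already know how to loop over.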
Session 3:
AM lecture. The goal of the lecture was to walk people through the actual mess of writing code.
Maybe the week 2 lecture should introduce APIs and functions. People thought that the week 2 lecture was slow, so adding functions would be good. Functions can be reinforced in the week 2 workshops. Lecture 2 is both the earliest and the latest point at which it makes sense to introduce functions. Introduce the idea that code is reusable.
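To illustrate the idea that code is reusable, a lecture could show something as small as the sketch below. The function name `count_matching` and the sample list are hypothetical, chosen only to echo the Baby Names project; they are not from the actual lecture.

```python
def count_matching(names, prefix):
    """Count how many names start with the given prefix, ignoring case."""
    prefix = prefix.lower()
    return sum(1 for name in names if name.lower().startswith(prefix))


# A made-up sample list; the same function is reused with different prefixes.
sample_names = ["Anna", "Andrew", "Beth", "annabel"]
print(count_matching(sample_names, "an"))
print(count_matching(sample_names, "b"))
```

Writing the loop once and calling it twice is the whole lesson: the afternoon projects can then reuse the same function instead of copying and pasting the loop.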
Afternoon of Session 3:
The spreadsheets session. People were modifying the code to build their own dataset and did their own visualizations. At least a few people. That was cool!
The Matplotlib session. Most people in the session were deeply lost. The mentors who taught it were not at any of the other sessions and therefore didn’t go in with a good sense of where the mentees were at. Several people left and went to another room. In future, ensure mentor success by having them loop in better to where the mentees are at. Next time, consider encouraging new mentors to do a practice session with some friendly folks before they are let loose. Also, next session, consider using Seaborn instead of Matplotlib.
The ethnographers get the last word:
Some observations about the culture of mentoring from a first time mentor: There are some distinct values that came through strongly. There is a clear vision of empowerment through programming. The degree of inclusivity is impressive. The culture of feedback, iteration, and reflection was really surprising, such as the amount of effort that goes into improving the materials and the teaching. So is the way that other organizations are able to use (and are using) the materials, and the way that this is building the community. For example, mentees are organizing their own meet-ups (though that could be encouraged even more).
The pragmatism of what is taught demonstrates a clear value. It would be helpful to make sure that all mentors are clear that part of what is expected of them is that they give pragmatic coaching. That is, they should lead mentees to something that works rather than telling them what an expert would do.
Mako's Raw Notes
- general
anaconda solved problems
- next time recruit fewer mentors for the first session
sticky notes didn't work as well this time
- especially during the lectures - we did this better last time
-> having better sticky notes would have been helpful
rooms:
maybe get the Odegaard room
architecture of the space has quickly become the limiting factor
checkout - not everybody loves the checkout, maybe there's a way we can make it more fun?
communicate the whole setup process to the mentors ahead of time
-> maybe streamline the process
-> finding the directory continues to be hard
we moved the curriculum from cmd to powershell
- windows xp is broke now. make sure you have a person with xp skills on hand
not a single person in our session had xp
we can move away from 3 separate installations in the setup information.
- everybody can use zip instead of a zip/tar.gz both
maybe we can consolidate the wiki pages into a single page which will be much easier to install and keep updated in the future
generally, let's stop copying and pasting new stuff into the wiki. when we archive the old version, we can create links to the old version of the wiki pages (install the templates from english wikipedia)
get rid of pages that are event specific
- friday evening
better material/training and information for mentors on what to expect
mentors should meet 15-20 minutes early to get to know each other and go over things
- maybe t-shirts, buttons, etc. or something to distinguish mentors
- encourage people to reach out
topics to cover:
how much should you help? (not too much)
anaconda
- non-free and we're unhappy with that
- linux seems like we might actually want to do sidewizse but it does work
- if something fully free and almost as good comes along, we'll use it
-> write installation instructions for linux
3 people who used it out of 80 had problems
-> anaconda choked on a person's unicode path because the user's homedir was in simplified chinese
broader unicode support won't be fixed until we can move to python3 and we still seem a little while away from that
- demographics
people come in: departments? maybe build a table?
- org suggestions
let people join in the later session
letting people skip #1 could be useful
-> maybe we can accept people afterwards
-> alternatively, we can try to accept more newbies and improve retention
mixed feelings
ways to improve and retain people
-> lay out what we're going to do in the next sessions
go to show why we are learning things up front
focus on broad research questions
pair programming?
encouraging people to work in teams or with others on problems they suggest
next time maybe mine the registration for a list of research questions
note:
next time make it explicit that folks can work in groups
tip: introduce mentors to everybody very clearly
introductions would have been good but are hard to do
- session 1
bring folks back together to go over things
- post examples of code used in the lectures
- create code base
- turn on logging in the console and post it after the lecture
mentor workshop:
- get people together before
encourage people to get involved, maybe bar meetup
- track diversity of people along more dimensions
- the sql workshop was well received although slightly mixed in terms of feedback
more breakout sessions next time
colorwall was gone and nobody missed it
- babynames
- try to integrate more year
- huge success
- may split rooms into two baby names
list questions up front and let folks choose what to work on and what to bring back together
generally:
- note places to bring folks back together
- session 3
generally:
showcase what students have accomplished and places people can change things and do things differently
e.g., the ferguson thing with the example from ha=rry party
strong connection between the lecture and the introduction
-> more connections and takeaways to emphasize the session more clearly
how to tap mentors on topics more effectively
wordplay
- kinda boring
next time
- public health and epi data session
end of semester was too late. maybe have it early next year
twitter:
- have people do the setup ahead of time
-> that was clear ahead of time and it happened in the beginning of class. either fix the instruction or make sure that everybody is doing the same thing
speed was an issue
the opaqueness of tweepy was a problem.. option to create a version of tweepy that just gives you json
or miku or michael for details on how to do that
dharma might be able to do this.
sql session:
- maybe split this into two sessions next time
- merge in some more python this time
- 1 intro into sql
- 2 using python to grab data and bring python and pandas
wikipedia
- too slow
we can do it faster
lecture
- stress defining functions more and earlier.. maybe in the first project and certainly in session #2 so we can use it in the afternoon projects and tweepy
session 3:
show and tell at the end was very effective
we need a designated mc who can go between rooms
bring people up to the
matplotlib
- maybe replace it with seaborn?
- tommy will teach it
idiomatic python
talk to chris to try to fix those things