Day 1 pre-lecture

From CommunityData
Revision as of 01:31, 5 October 2015 by Jtmorgan

Why Programming and Data Science

The question of who controls our technology, our information, and our data is increasingly the question of who controls our experience of the world and each other. Programming is the power to define technology. It can be, in this sense, deeply empowering.

In a technological and data-driven world, being able to program and do data science is a kind of literacy. Imagine a world in which everybody could read but only some people could write.

Our goal here is not to turn you into the programming equivalent of novelists or journalists. Our goal is to demystify things and give you enough information to become dangerous.

Programming, as you will find (probably a little today and a lot more later on), is also enormously fun. For me, it's like meditation and problem solving. It's exactly as frustrating as a difficult puzzle and even more rewarding, because your solution accomplishes something else you were trying to do.

Why Python

I know a dozen programming languages and write 4-5 regularly. But Python is the right one.

Python is a fantastic language to learn

Believe it or not, compared to other programming languages:

  • Python has a low "syntactic overhead".
  • It's easy to get work done quickly.
  • It's relatively forgiving.

Python is versatile and useful for a range of applications

There are easier programming languages to learn. But Python is important because it is not a toy. In designing the curriculum for these workshops, we have tried to only teach tools that we, as professional data scientists and programmers, use ourselves and find useful.

Python is used for:

  • Web applications (Instagram, Pinterest, and the Washington Post all run websites written largely in Python).
  • Python can be used to extend existing applications. You can use it to script many graphical applications.
  • Python is fantastic for dealing with and manipulating text.
  • Python can be used to build graphical games (Frets on Fire).
  • Python really shines when it comes to dealing with data and with the web.
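To give a rough sense of what "dealing with and manipulating text" can look like, here is a minimal sketch that counts the most frequent words in a sentence. The function name most_common_words and the sample sentence are made up for illustration; they are not part of the workshop projects.

```python
from collections import Counter


def most_common_words(text, n=2):
    """Return the n most frequent words in text, ignoring case and punctuation."""
    # Split on whitespace, then trim common punctuation and lowercase each word.
    words = [w.strip(".,!?;:\"'()").lower() for w in text.split()]
    # Drop anything that was only punctuation.
    words = [w for w in words if w]
    return Counter(words).most_common(n)


# Hypothetical sample text, invented for this example.
sample = "The cat saw the dog. The dog saw the cat, and the cat ran."
print(most_common_words(sample))
```

The whole thing is a handful of lines with very little ceremony, which is roughly what "low syntactic overhead" means in practice.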

Outcomes

  • Examples of projects people did: [Figure: Matplotlib-hist2d.png]
  • For academic types: I know of at least one person who has finished and submitted a paper for publication using data they collected from the Reddit online community.
  • Miku Lenentine has organized a series of monthly meetings with participants!
  • I am beyond proud to announce that we have several mentors this time who were enrolled as students last time!
  • Ben Lewis started out as a mentor and is now an organizer!
  • Dharma Dailey started out as a student, then became a mentor, and then became an organizer!

Housekeeping Notes Before We Begin

  • Several people from eSciences are here to help learn about how folks are learning data science. They are also here to learn and teach. You can read more about their project in this information sheet they have prepared. If you're bothered by their presence, let them know or let me know.

Schedule

  • Broad overview: three sessions
  • Lecture
  • Lunch over in CMU Building
  • Projects until 3:30: three projects, think about them

Final Notes

  • Choose sessions: Baby Names Room 1 and Room 2 Code Academy (and then Wordplay)
  • Food will be out in the atrium outside CMU 126. I'm told it's here.
  • We will get rooms divided up and put them on the whiteboard in CMU 104. Go there to see where things are.