Day 1 pre-lecture

From CommunityData

Slides for the talk are available in Google Docs. This page includes rough notes.

Why Programming and Data Science

The question of who controls our technology, our information, and our data is increasingly the question of who controls our experience of the world and each other. Programming is the power to define technology. It can be, in this sense, deeply empowering.

In a technological and data-driven world, being able to program and do data science is a kind of literacy. Imagine a world in which everybody could read but only some people could write.

Our goal here is not to turn you into the programming or data science equivalent of novelists or journalists, at least not in these three weekends! Our goal is to demystify things and give you enough information to become dangerous.

Programming, you will find (probably a little today and a lot more later on), is also enormously fun. For me, it's like meditation and problem solving. It's exactly as frustrating as a difficult puzzle and even more rewarding because your solution accomplishes something else you were trying to do.

Why Python

I know a dozen programming languages and write 4-5 regularly. But Python is the correct first language to learn.

Python is a fantastic language to learn

Believe it or not, compared to other programming languages:

  • Python has a low "syntactic overhead".
  • It's easy to get work done quickly.
  • It's relatively forgiving.
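To give a rough sense of what low syntactic overhead looks like, here is a minimal sketch that counts the most frequent words in a piece of text. The function name `most_common_words` and the sample sentence are made up for illustration; `Counter` comes from Python's standard library.

```python
from collections import Counter


def most_common_words(text, n=3):
    """Return the n most frequent words in text, ignoring case."""
    # Lowercase the text and split it on whitespace to get a list of words.
    words = text.lower().split()
    # Counter tallies each word; most_common(n) returns the top n as (word, count) pairs.
    return Counter(words).most_common(n)


# Hypothetical sample text, just for illustration.
sample = "the cat and the hat and the bat"
print(most_common_words(sample, 2))  # prints [('the', 3), ('and', 2)]
```

There are no type declarations, no braces, and no boilerplate around the function: most of what is on the screen is the actual work being done.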

Python is versatile and useful for a range of applications

There are easier programming languages to learn. But Python is important because it is not a toy. In designing the curriculum for these workshops, we have tried to only teach tools that we, as professional data scientists and programmers, use ourselves and find useful.

Python is used for:

  • Big companies like Google use Python to run much of the code that collects all the webpages that go into their search results.
  • Web applications (Instagram, Pinterest, and the Washington Post all run websites written largely in Python).
  • Python can be used to extend existing applications. You can use it to script many graphical applications.
  • Python can be used to build graphical games (e.g., Civilization 4; Frets on Fire is a free version of Guitar Hero).
  • Python is used for movies and graphics (e.g., Industrial Light and Magic; Disney Feature Animation).
  • Python is one of the languages for science (national weather services, NASA, genomics researchers).
  • Python really shines when it comes to dealing with data and with the web.

Outcomes

  • Examples of projects people did: [image: Matplotlib-hist2d.png]
  • For academic types: I know of at least one person who has finished and submitted a paper for publication using data they collected from the Reddit online community.
  • Miku Lenentine and several others have organized a series of monthly meetings with participants!
  • I am beyond proud to announce that we have at least several mentors this time who were enrolled as students last time!
  • Ben Lewis started out as a mentor and is now an organizer!
  • Dharma Dailey started out as a student and then became a mentor and has been an organizer for the last two sessions!
  • Many returning mentors!

Schedule

  • Broad overview: three sessions
  • Lecture
  • Lunch downstairs in the hallway — many more mentors will join us
    • Lunches get "better" (or at least more expensive!) over time: pizza today; Indian food next week; Greek the third time.
    • Everything is vegetarian and there are always many vegan and gluten-free options. Other lunch options are available on campus.
  • Projects until 3:30: 2.5 projects, think about them

Final Notes

  • Choose sessions: Baby Names Room 1 and Room 2 Code Academy (and then Wordplay)
  • We will divide up the rooms and list them on the whiteboard here. Go there to see where things are.