Day 1 pre-lecture

Why Programming and Data Science
The question of who controls our technology, our information, and our data is increasingly the question of who controls our experience of the world and each other. Programming is the power to define technology. It can be, in this sense, deeply empowering.

In a technological and data-driven world, being able to program and do data science is a kind of literacy. Imagine a world in which everybody could read but only some people could write.

Our goal here is not to turn you into the programming or data science equivalent of novelists or journalists, at least not in these three weekends! Our goal is to demystify things and give you enough information to become dangerous.

You will also find (probably a little today and a lot more later on) that programming is enormously fun. For me, it's like meditation and problem solving. It's exactly as frustrating as a difficult puzzle and even more rewarding, because your solution accomplishes something you were trying to do.

Why Python
I know a dozen programming languages and write 4-5 regularly. But Python is the correct first language to learn.

Python is a fantastic language to learn
Believe it or not, compared to other programming languages:


 * Python has a low "syntactic overhead".
 * It's easy to get work done quickly.
 * It's relatively forgiving.

Python is versatile and useful for a range of applications
There are easier programming languages to learn. But Python is important because it is not a toy. In designing the curriculum for these workshops, we have tried to only teach tools that we, as professional data scientists and programmers, use ourselves and find useful.

Python is used for:


 * Big Companies — Big companies like Google use Python to run much of the code that collects all the webpages that go into their search results.
 * Web applications (Instagram, Pinterest, and the Washington Post all run websites written largely in Python).
 * Python can be used to extend existing applications. You can use it to script many graphical applications.
 * Python can be used to build graphical games (e.g., Civilization 4, or Frets on Fire, a free game in the style of Guitar Hero).
 * Python is used for movies and graphics (e.g., Industrial Light and Magic; Disney Feature Animation)
 * Python is one of the languages for science (national weather services, NASA, genomics researchers).
 * Python really shines when it comes to dealing with data and with the web.

Outcomes

 * Examples of projects people did (figure: Matplotlib-hist2d.png)
 * For academic types: I know of at least one person who has finished and submitted a paper for publication using data they collected from the Reddit online community.
 * Miku Lenentine has organized a series of monthly meetings with participants!
 * I am beyond proud to announce that we have several mentors this time who were enrolled as students last time!
 * Ben Lewis started out as a mentor and is now an organizer!
 * Dharma Dailey started out as a student, then became a mentor, and then became an organizer!

Housekeeping Notes Before We Begin

 * Several people from eSciences are here to help learn about how folks are learning data science. They are also here to learn and teach. You can read more about their project in this information sheet they have prepared. If you're bothered by their presence, let them know or let me know.

Schedule

 * Broad overview: three sessions


 * Lecture
 * Lunch over in CMU Building
 * Projects until 3:30: three projects, think about them

Final Notes

 * Choose sessions: Baby Names Room 1 and Room 2 Code Academy (and then Wordplay)
 * Food will be out in the atrium outside CMU 126. I'm told it's here.
 * We will get rooms divided up and put them on the whiteboard in CMU 104. Go there to see where things are.