Editing Intro to Programming and Data Science (Summer 2021)

From CommunityData

Latest revision Your text
Line 1: Line 1:
{{Old Class}}
= Course Information =
= Course Information =
:'''COM 674: Introduction to Programming and Data Science'''
:'''COM 674: Introduction to Programming and Data Science'''
:'''Location:''' Discord
:'''Location:''' https://meet.jit.si/COM674
:'''Class Hours:''' M-F, 10 am - 12 pm
:'''Class Hours:''' M-F, 10 am - 12 pm


Line 9: Line 7:
:'''Instructor:''' [https://jeremydfoote.com Jeremy Foote]  
:'''Instructor:''' [https://jeremydfoote.com Jeremy Foote]  
:'''Email:''' jdfoote@purdue.edu
:'''Email:''' jdfoote@purdue.edu
:'''Office Hours:''' By appointment; on Discord
:'''Office Hours:''' By appointment; at https://meet.jit.si/JeremyOffice
 


<div style="float:right;">__TOC__</div>
<div style="float:right;">__TOC__</div>
Line 42: Line 41:
== Readings ==
== Readings ==


* Required text: '''[https://www.py4e.com/book Python for Everybody]''' [[https://www.py4e.com/html3/ HTML Version]] [[http://do1.dr-chuck.com/pythonlearn/EN_us/pythonlearn.pdf PDF version]] by Charles R. Severance. The book is [https://creativecommons.org/licenses/by/3.0/us/ freely licensed] and available online for free. You can also buy the book if you prefer a hard copy.
* Required text: '''[https://www.py4e.com/book Python for Everybody]''' by Charles R. Severance. The book is [https://creativecommons.org/licenses/by/3.0/us/ freely licensed] and available online for free. You can also buy the book if you prefer a hard copy.


I will list required chapters in the schedule below. In general, you should expect to spend far more time working on programming tasks than reading. Much like math or other technical courses, this course will build on itself every day. You should make every effort to cover the reading and exercise material every day in preparation for the next day.
I will list required chapters in the schedule below. In general, you should expect to spend far more time working on programming tasks than reading. Much like math or other technical courses, this course will build on itself every day. You should make every effort to cover the reading and exercise material every day in preparation for the next day.
Line 54: Line 53:
== Note About This Syllabus ==
== Note About This Syllabus ==


Although the core expectations for this class are fixed, the details of readings and assignments may shift based on how the class goes. As a result, there are three important things to keep in mind:
This is my first time teaching this course as a summer module. Although the core expectations for this class are fixed, the details of readings and assignments may shift based on how the class goes. As a result, there are three important things to keep in mind:


# Although details on this syllabus will change, I will not change readings or assignments less than three days before they are due. If you plan to read more than three days ahead, contact me first.
# Although details on this syllabus will change, I will not change readings or assignments less than three days before they are due. If you plan to read more than three days ahead, contact me first.
# Closely monitor Discord. Because this is a wiki, you will be able to track every change by clicking the ''history'' button on this page. I will also summarize these changes in announcements that will be posted in the #announcements channel on Discord.
# Closely monitor your email. Because this is a wiki, you will be able to track every change by clicking the ''history'' button on this page. I will also summarize these changes in an announcement that will be emailed to everybody in the class.
# I will ask the class for voluntary anonymous feedback frequently. Please let me know what is working and what can be improved.
# I will ask the class for voluntary anonymous feedback frequently. Please let me know what is working and what can be improved.


== Lectures ==
== Lectures ==


The synchronous part of the course will be held starting at 1pm every day, on Discord. The typical format will be a discussion of the day's reading, followed by a brief lecture about the topic for that week, then a discussion of the previous day's homework questions, and finally optional co-working time to start on the next day's assignment.
This is an online course and I am not requiring any synchronous participation. I am planning to hold an online lecture from approximately 10 am - 11 am every day at meet.jit.si/COM674, where I introduce the concepts for the upcoming reading and assignment. I will record these lectures and will upload them to Brightspace. Following the lecture, I will stick around to answer questions as folks begin working on the coding challenges.


I highly encourage you to attend as many of our synchronous sessions as possible. In general, my teaching style is more conversational than a formal lecture. I prefer that students feel they can "politely interrupt" at any time to seek clarification or make a well-informed point, and the lectures will be much better if I can get real-time feedback about what is and isn't making sense.
I highly encourage those who can attend synchronously to do so. In general, my teaching style is more conversational than a formal lecture. I prefer that students feel they can "politely interrupt" at any time to seek clarification or make a well-informed point, and the lectures will be much better if I can get real-time feedback about what is and isn't making sense.


== Office hours and email ==
== Office hours and email ==
Line 76: Line 75:


# '''Research Project:''' The main outcome of this course will be a research project exploring a social science question using Python, and the bulk of your grade will be based on that project. Submit the project deliverables via Brightspace.
# '''Research Project:''' The main outcome of this course will be a research project exploring a social science question using Python, and the bulk of your grade will be based on that project. Submit the project deliverables via Brightspace.
# '''Coding Challenges:''' There will be daily programming assignments that I will ask you to turn in on Brightspace but which will only be graded as complete/incomplete. I will also randomly assign someone to present their solution to each of the problems during our synchronous sessions.
# '''Coding Challenges:''' There will be daily programming assignments that I will ask you to turn in on Brightspace but which will only be graded as complete/incomplete. I will also randomly assign someone to present their solution to each of the problems, on [piazza.com/purdue/summer2020/com674 Piazza].
# '''Paper Discussion:''' Each day we will read and discuss a paper which uses computational approaches to address social science questions.
# '''Paper Discussion:''' Each day we will read and discuss a paper which uses computational approaches to address social science questions.


Line 88: Line 87:
* '''Ensure that your work is replicable''' — You will need to provide code and data for your analysis in a way that makes your work replicable by other researchers.
* '''Ensure that your work is replicable''' — You will need to provide code and data for your analysis in a way that makes your work replicable by other researchers.


''I strongly urge you'' to work on a project that will further your academic career outside of the class. There are many ways that this can happen. Some obvious options are to prepare a project that you can submit for publication, that you can use as a pilot analysis to report in a grant or thesis proposal, and/or that fulfills a degree requirement. I prefer that you do projects on your own, but it may be possible to work as a small team (maximum 3 people). Team projects are expected to be more ambitious than individual projects. Preliminary assignments will help you to develop your idea and to get feedback from me and others.
''I strongly urge you'' to produce a project that will further your academic career outside of the class. There are many ways that this can happen. Some obvious options are to prepare a project that you can submit for publication, that you can use as a pilot analysis to report in a grant or thesis proposal, and/or that fulfills a degree requirement. I prefer that you do projects on your own, but it may be possible to work as a small team (maximum 3 people). Team projects are expected to be more ambitious than individual projects. Preliminary assignments will help you to develop your idea and to get feedback from me and others.


There are several intermediate milestones and deadlines to help you accomplish a successful research project. Unless otherwise noted, all deliverables should be submitted via Brightspace.
There are several intermediate milestones and deadlines to help you accomplish a successful research project. Unless otherwise noted, all deliverables should be submitted via Brightspace.
Line 94: Line 93:
=== Project idea and dataset identification ===
=== Project idea and dataset identification ===


;Due date: May 19
;Due date: May 20
;Maximum length: 500 words (~1-2 pages)
;Maximum length: 500 words (~1-2 pages)


Line 100: Line 99:


* An abstract of the proposed study including the topic, research question, theoretical motivation, object(s) of study, and anticipated research contribution.
* An abstract of the proposed study including the topic, research question, theoretical motivation, object(s) of study, and anticipated research contribution.
* An identification of the dataset you will use and a description of the columns or type of data it will include. If you do not currently have access to these data, explain why not and when you will have access (If you need ideas, [[Data_Into_Insights_(Spring_2021)/Final_project#Datasets|this page]] from one of my undergrad classes lists some open datasets).
* An identification of the dataset you will use and a description of the columns or type of data it will include. If you do not currently have access to these data, explain why and when you will.
* A short (several sentences) description of how the project will fit into your career trajectory.
* A short (several sentences) description of how the project will fit into your career trajectory.


=== Project planning document ===
=== Project planning document ===


;Due date: May 27
;Due date: May 29
;Maximum length: ~4-5 pages
;Maximum length: ~4-5 pages


Line 117: Line 116:
=== Project presentation and report ===
=== Project presentation and report ===


;Report due date: June 11
;Report due date: June 12
;Maximum length: 4000 words (~15 pages)
;Maximum length: 4000 words (~15 pages)


;Presentation due date: June 10
;Presentation due date: June 11
;Maximum length: 8 minutes
;Maximum length: 8 minutes


Line 143: Line 142:
Nearly every day I will give you a set of coding challenges before the end of class that will involve writing code or adding to code that I've given you. These coding challenges will be turned in on Brightspace but will not be graded. I encourage you to work together on these challenges but to make sure that you understand the concepts yourself.
Nearly every day I will give you a set of coding challenges before the end of class that will involve writing code or adding to code that I've given you. These coding challenges will be turned in on Brightspace but will not be graded. I encourage you to work together on these challenges but to make sure that you understand the concepts yourself.


Each day I will randomly select a set of students to share their solutions to a selected exercise. This will involve putting your solution on Discord at least one hour before the next day's lecture starts, and being prepared to walk us through the solution. If you can't figure out the problem that's been assigned to you, then explain where you got stuck and what you tried. I encourage you to also use Discord to ask and answer each other's questions as you work on the assignments. We will use some of our lecture time to review the problems and I will make sure that a correct solution is posted by the end of that day. As you will see over the course of the module, there are many possible solutions to many programming problems and my own approaches will often be different than yours. That's completely fine! Coding is a creative act!
Each day I will randomly select a set of students to share their solutions to a selected exercise on [piazza.com/purdue/summer2020/com674 Piazza], at least one hour before the next day's lecture starts. I encourage you to also use Piazza to ask and answer each other's questions as you work on the assignments. We will use some of our lecture time to review the problems and I will make sure that a correct solution is posted by the end of that day. As you will see over the course of the module, there are many possible solutions to many programming problems and my own approaches will often be different than yours. That's completely fine! Coding is a creative act!
 
=== DataCamp ===
 
DataCamp is an online coding education site, with lots of great resources about Python. I set up our class for free access for six months - you can sign up at [https://www.datacamp.com/groups/shared_links/c474713e6d04d94d410bb7a04fa0e9bad0f7c2ab47bb39a72a2323958787bcb9 this link].


I will put a few courses as assignments on DataCamp which will be optional. I'll signal when you might want to do these courses in the schedule.


== Paper Discussions ==
== Paper Discussions ==


Every day we will review a paper that uses computational methods. On the first day, I will ask you to sign up to lead the discussion for one of these papers. When leading the discussion, you will prepare a presentation as though you were presenting the paper at a conference and then lead a discussion about it.  
Every day we will review a paper that uses computational methods. On the first day, I will ask you to sign up to lead the discussion two times. When leading the discussion, you will post a video to Brightspace that explains the paper and poses a few discussion questions. Everyone else will respond with their thoughts about the paper.


== Reflection papers ==
== Reflection papers ==
Line 156: Line 160:
= Grades =
= Grades =


This course will follow a "self-assessment" philosophy. I am more interested in helping you to learn things that will be useful to you than in assigning grades. The university still requires grades, so you will be leading the evaluation of your work. In week two and again at the end of the course, you will reflect on what you have accomplished thus far and how it has met, not met, or exceeded expectations, based on both rubrics and personal goals and objectives. At each of these stages you will receive feedback on your assessments. By the end of the semester, you should have a clear vision of your accomplishments and growth, which you will turn into a grade. As the instructor-of-record, I maintain the right to disagree with your assessment and alter grades as I see fit, but any time that I do this it will be accompanied by an explanation and discussion. These personal assessments, reflecting honest and meaningful reflection on your work, will be the most important factor in final grades.
This course will follow a "self-assessment" philosophy. I am more interested in helping you to learn things that will be useful to you than in assigning grades. The university still requires grades, so you will be leading the evaluation of your work. This will be completed with me in four stages, at the end of weeks 4, 8, 12, and 16. In each stage, you will reflect on what you have accomplished thus far and how it has met, not met, or exceeded expectations, based on both rubrics and personal goals and objectives. At each of these stages you will receive feedback on your assessments. By the end of the semester, you should have a clear vision of your accomplishments and growth, which you will turn into a grade. As the instructor-of-record, I maintain the right to disagree with your assessment and alter grades as I see fit, but any time that I do this it will be accompanied by an explanation and discussion. These personal assessments, reflecting honest and meaningful reflection on your work, will be the most important factor in final grades.


I suggest that we use the following rubric in our assessment:
I suggest that we use the following rubric in our assessment:


* 25%: class participation, including attendance, participation in discussions and group work, and significant effort towards weekly assignments.
* 15%: class participation, including attendance, participation in discussions and group work, and significant effort towards weekly assignments.
* 5%: Final Project Idea.
* 5%: Final Project Idea.
* 10%: Final Project Proposal.
* 10%: Final Project Proposal.
* 40%: Final Project paper/Jupyter notebook.
* 50%: Final Project paper/Jupyter notebook.
* 20%: Final Presentation including your slides and presentation.
* 20%: Final Presentation including your slides and presentation.


Line 199: Line 203:
'''NOTE''':  This section will be modified throughout the course to meet the class's needs. Check back in often.
'''NOTE''':  This section will be modified throughout the course to meet the class's needs. Check back in often.


There are links to each day's slides. Note that these are slides from an earlier version of the class and will typically be updated the day of each class.
In general, the lecture for a certain day will cover the same material as the P4E reading for that day. You are welcome to read P4E either before or after the lecture.




== Day 1: Introduction to Python and Computational Thinking (May 17) ==
== Day 1: Introduction to Python and Computational Thinking (May 18) ==


'''Assignment Due:'''  
'''Assignment Due:'''  
Line 212: Line 216:
'''Agenda:'''
'''Agenda:'''
* Class overview and expectations — We'll walk through this syllabus.
* Class overview and expectations — We'll walk through this syllabus.
* [[/Day_1_Coding_Challenge| Day 1 Coding challenge]] - Includes installing Python and going through a number of exercises.
* [[Intro_to_Programming_and_Data_Science_(Summer_2020)/Day_1_Coding_Challenge| Day 1 Coding challenge]] - Includes installing Python and going through a number of exercises.
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_1/lecture/day_1.html Today's slides]
* [https://jeremydfoote.com/teaching/2020-summer/intro_to_programming/day_1.html Today's slides]


'''By the end of class you will:'''
'''By the end of class you will:'''
Line 219: Line 223:
* Have written your first program in the Python language.
* Have written your first program in the Python language.


== Day 2: Variables, conditionals, and functions (May 18) ==
== Day 2: Variables, conditionals, and functions (May 19) ==


'''Assignments Due:'''  
'''Assignment Due:'''  
* Finish Day 1 exercises and tutorials
* Finish Day 1 exercises and tutorials
* Fill out this [https://docs.google.com/forms/d/e/1FAIpQLSfUiGogs2jDXIHaXz1ooVBZFkRF2NdMaf00IgZvk7f69rby9w/viewform?usp=sf_link short survey]
* Fill out this [https://docs.google.com/forms/d/e/1FAIpQLSfUiGogs2jDXIHaXz1ooVBZFkRF2NdMaf00IgZvk7f69rby9w/viewform?usp=sf_link short survey]
* Sign up to be a discussant [https://docs.google.com/spreadsheets/d/1uSo-Ya5DghaLu1BYk94EVU2kBVmExRWwOa1586GbFUU/edit?usp=sharing here]
* Sign up to be a discussant [https://docs.google.com/spreadsheets/d/1uSo-Ya5DghaLu1BYk94EVU2kBVmExRWwOa1586GbFUU/edit?usp=sharing here] (Make sure to sign up for '''2''' readings)
* [[/Day_2_Coding_Challenges|Day 2 Coding Challenge]] (turn in on Brightspace)


'''Readings (before class):'''  
'''Readings (before class):'''  
* Python for Everybody, chapters 1-4
* Bit by Bit, [https://www.bitbybitbook.com/en/1st-ed/introduction/ Introduction]
* Bit by Bit, [https://www.bitbybitbook.com/en/1st-ed/introduction/ Introduction]
* Python for Everybody, chapters 1-4
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_2/day_2.ipynb Today's Jupyter Notebook]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6387775/View Notebook walkthrough]


'''Agenda:'''
'''Agenda:'''
* Review Day 1 and Day 2 Exercises
* Variables, conditionals, functions
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_2/lecture/day_2.html Today's slides]
* [https://jeremydfoote.com/teaching/2020-summer/intro_to_programming/day_2.html Today's slides]
* Introduce wordplay project
 
'''Code Challenge:'''
* [[Intro to Programming and Data Science (Summer 2021)/Day_2_Coding_Challenges|Day 2 Coding Challenge]]


== Day 3: Iteration, strings, and lists (May 19) ==
== Day 3: Iteration, strings, and lists (May 20) ==


'''Assignment Due:'''
'''Assignment Due:'''
* Final project dataset and idea (turn in on Brightspace).
* Final project dataset and idea (turn in on Brightspace).
* [[/Day_3_Coding_Challenges|Day 3 Coding Challenge]]
* Finish [[Intro to Programming and Data Science (Summer 2021)/Day_2_Coding_Challenges|Day 2 Coding Challenge]] (turn in on Brightspace)


'''Readings:'''  
'''Readings:'''  
* Python for Everybody
* Python for Everybody
  chapters_to_read = [5, 6, 8]
  chapters_to_read = [5, 6, 8]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_3/day_3.ipynb Today's Jupyter Notebook]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6390578/View Notebook walkthrough]
* Foote, J., Shaw, A., & Hill, B.M. (2017). [https://jeremydfoote.com/files/foote_computational_2017.pdf Computational analysis of social media scholarship]. In Burgess, J., Poell, T., Marwick, A. (Eds.), The Sage Handbook of Social Media. Sage.
* Foote, J., Shaw, A., & Hill, B.M. (2017). [https://jeremydfoote.com/files/foote_computational_2017.pdf Computational analysis of social media scholarship]. In Burgess, J., Poell, T., Marwick, A. (Eds.), The Sage Handbook of Social Media. Sage.
** Discussant: Juan Pablo
** Discussant: Tamara


'''Agenda:'''
'''Agenda:'''
* Programming principles (iteration, strings, and lists)
* Programming principles (iteration, strings, and lists)
** Follow along with [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2020/day_3/day_3.ipynb this Jupyter Notebook]
* Go over last day's assignment
* Go over last day's assignment
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_3/lecture/day_3.html Today's slides]
* Introduce wordplay project
* [https://jeremydfoote.com/teaching/2020-summer/intro_to_programming/day_3.html Today's slides]
 
'''Coding Challenge'''
* [[Intro to Programming and Data Science (Summer 2021)/Day 3 Coding Challenges|Day 3 Coding Challenges]]
* (Optional) [https://learn.datacamp.com/courses/intro-to-python-for-data-science DataCamp Chapters 1-3]


== Day 4: Reading from and writing to files (May 20) ==
== Day 4: Reading from and writing to files (May 21) ==


'''Assignment Due:'''
'''Assignment Due:'''
* [[/Day 4 Coding Challenges|Day 4 Coding Challenges]]
* [[Intro to Programming and Data Science (Summer 2021)/Day 3 Coding Challenges|Day 3 Coding Challenges]]


'''Readings:'''  
'''Readings:'''  
Line 268: Line 275:
         read(chapter)
         read(chapter)
  book.close()
  book.close()
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_4/day_4.ipynb Today's Jupyter Notebook]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6399269/View Notebook walkthrough]
* Nelson, Laura K. 2017. "[https://doi.org/10.1177%2F0049124117729703 Computational Grounded Theory: A Methodological Framework]." Sociological Methods and Research.
* Nelson, Laura K. 2017. "[https://doi.org/10.1177%2F0049124117729703 Computational Grounded Theory: A Methodological Framework]." Sociological Methods and Research.
** Discussant: Beth Ann
** Discussant: Tiwalade


'''Agenda:'''
'''Agenda:'''
* Reading from and writing to files
* Reading from and writing to files
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_4/lecture/day_4.html Today's slides]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2020/day_4/day_4.ipynb Today's Jupyter Notebook]
* [https://jeremydfoote.com/teaching/2020-summer/intro_to_programming/day_4.html Today's slides]


== Day 5: Dictionaries and Tuples (May 21) ==
'''Coding Challenge:'''
* [[Intro to Programming and Data Science (Summer 2021)/Day 4 Coding Challenges|Day 4 Coding Challenges]]
 
== Day 5: Dictionaries and Tuples (May 22) ==


'''Assignment Due:'''
'''Assignment Due:'''
* [[/Day 5 Coding Challenges|Day 5 Coding Challenges]]
* [[Intro to Programming and Data Science (Summer 2021)/Day 4 Coding Challenges|Day 4 Coding Challenges]]
* Do the [[/Twitter_authentication_setup|Twitter Authentication Setup]]


'''Readings:'''
'''Readings:'''
* Python for Everybody, chapters 9 and 10
* Python for Everybody, chapters 9 and 10
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_5/day_5.ipynb Today's Jupyter Notebook]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6415595/View Video walkthrough]
* Margolin, D. B., Hannak, A., & Weber, I. (2018). [https://doi.org/10.1080/10584609.2017.1334018 Political Fact-Checking on Twitter: When Do Corrections Have an Effect?] Political Communication, 35(2), 196–219.
* Margolin, D. B., Hannak, A., & Weber, I. (2018). [https://doi.org/10.1080/10584609.2017.1334018 Political Fact-Checking on Twitter: When Do Corrections Have an Effect?] Political Communication, 35(2), 196–219.
** Discussant: Katelyn
** Discussant: Vanessa


'''Agenda:'''
'''Agenda:'''
* Dictionaries
* Dictionaries
* Tuples
* Tuples
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_5/lecture/day_5.html Today's slides]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2020/day_5/day_5.ipynb Today's Jupyter Notebook]
* [https://jeremydfoote.com/teaching/2020-summer/intro_to_programming/day_5.html Today's slides]


== Day 6: Dataframes and Visualization (May 24) ==
'''Coding challenge:'''
* [[Intro to Programming and Data Science (Summer 2021)/Day 5 Coding Challenges|Day 5 Coding Challenges]]
 
== Day 6: Dataframes and Visualization (May 26) ==


'''Assignment Due:'''
'''Assignment Due:'''
* [[/Day 6 Coding Challenges|Day 6 Coding Challenges]]
* Turn in (on Brightspace) your solutions to the Day 5 coding challenges


'''Readings:'''
'''Readings:'''
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_6/day_6.ipynb Day 6 notebook]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6419983/View Notebook walkthrough]
* Benefield, G. A., Shen, C., & Leavitt, A. (2016). [https://doi.org/10.1145/2818048.2819935 Virtual Team Networks: How Group Social Capital Affects Team Success in a Massively Multiplayer Online Game]. Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, 679–690.
* Benefield, G. A., Shen, C., & Leavitt, A. (2016). [https://doi.org/10.1145/2818048.2819935 Virtual Team Networks: How Group Social Capital Affects Team Success in a Massively Multiplayer Online Game]. Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, 679–690.
** Discussant: Anna
** Discussant: Nate




'''Agenda:'''
'''Agenda:'''
* Dataframes and visualization
* Dataframes and visualization
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_6/lecture/day_6.html Today's slides]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2020/day_6/day_6.ipynb Day 6 notebook]
* [https://jeremydfoote.com/teaching/2020-summer/intro_to_programming/day_6.html Today's slides]


== Day 7: Dataframes and visualization (continued) (May 25) ==
'''Coding challenges:'''
* [[Intro to Programming and Data Science (Summer 2021)/Day 6 Coding Challenges|Day 6 Coding Challenges]]
* (Optional) Begin work on [https://campus.datacamp.com/courses/intermediate-python-for-data-science DataCamp Intermediate Python], Chapters 1-3
 
== Day 7: Dataframes and visualization (continued) (May 27) ==


'''Assignment Due:'''
'''Assignment Due:'''
* [[/Day 7 Coding Challenges|Day 7 Coding Challenges]]
* Finish Day 6 Coding Challenges
* Do the [[Intro_to_Programming_and_Data_Science_(Summer_2020)/Twitter_authentication_setup|Twitter Authentication Setup]]


'''Readings:'''
'''Readings:'''
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_7/day_7.ipynb Day 7 notebook]
* Lazer, D., & Radford, J. (2017). Data ex Machina: Introduction to Big Data. Annual Review of Sociology, 43(1), 19–39. https://doi.org/10.1146/annurev-soc-060116-053457
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6429518/View Notebook walkthrough]
** Discussant:
* Lazer, D., & Radford, J. (2017). [https://doi.org/10.1146/annurev-soc-060116-053457 Data ex Machina: Introduction to Big Data]. Annual Review of Sociology, 43(1), 19–39.
** Discussant: Yong
 


'''Agenda:'''
'''Agenda:'''
* Visualizations in Seaborn
* Visualizations in Seaborn
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_7/lecture/day_7.html Today's slides]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2020/day_7/day_7.ipynb Day 7 notebook]
* [https://jeremydfoote.com/teaching/2020-summer/intro_to_programming/day_7.html Today's slides]
 
'''Coding challenges:'''
* [[Intro to Programming and Data Science (Summer 2021)/Day 7 Coding Challenges|Day 7 Coding Challenges]]


== Day 8: Collecting Data with APIs (May 26) ==
== Day 8: Collecting Data with APIs (May 28) ==


'''Assignment Due:'''
'''Assignment Due:'''
* [[/Day 8 Coding Challenges|Day 8 Coding Challenges]].
* Turn in Day 7 Coding Challenges
** [https://youtu.be/TASX3evcgG4 Video instructions to install tweepy]
* Before class make sure you've installed tweepy ([https://youtu.be/TASX3evcgG4 video instructions])


'''Readings:'''
'''Readings:'''
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_8/day_8.ipynb Intro to APIs Notebook]
** (Long) [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6430558/View walkthrough of notebook]
* Kieran Healy and James Moody (2014). “[https://doi.org/10.1146/annurev-soc-071312-145551 Data Visualization in Sociology].” Annual Review of Sociology. 40: 105-28.
* Kieran Healy and James Moody (2014). “[https://doi.org/10.1146/annurev-soc-071312-145551 Data Visualization in Sociology].” Annual Review of Sociology. 40: 105-28.
** Discussant: Pearlynne
** Discussant: Jessie


'''Agenda:'''
'''Agenda:'''
* Introduce the [https://2.python-requests.org/en/master/ requests] library
* Introduce the [https://2.python-requests.org/en/master/ requests] library
* Discuss the main kinds of online data gathering: downloading, scraping, and APIs.
* Discuss the main kinds of online data gathering: downloading, scraping, and APIs.
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_8/lecture/day_8.html Today's slides]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2020/day_8/day_8.ipynb Intro to APIs Notebook]
* [https://jeremydfoote.com/teaching/2020-summer/intro_to_programming/day_8.html Today's slides]
 
'''Coding challenges:'''
* [[Intro to Programming and Data Science (Summer 2021)/Day 8 Coding Challenges|Day 8 Coding Challenges]].
* (Optional) [https://learn.datacamp.com/courses/introduction-to-data-science-in-python DataCamp: Intro to Data Science] and/or [https://learn.datacamp.com/courses/intermediate-python Intermediate Python]


== Day 9: Collecting Data with APIs (continued) (May 27) ==
== Day 9: Collecting Data with APIs (continued) (May 29) ==


'''Assignment Due:'''
* Start on [[Intro to Programming and Data Science (Summer 2021)/Day 9 Coding Challenges|Day 9 Coding Challenges]]
 
* First [[Self_Assessment_Reflection | self-assessment reflection]] is due (on Brightspace).
* Project Planning Document Due
Line 353: Line 371:


'''Readings:'''
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_9/day_9.ipynb Day 9 Notebook]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6437812/View Notebook walkthrough]
* Python for Everybody, Chapter 13
* Vitak, J., Shilton, K., & Ashktorab, Z. (2016). [https://doi.org/10.1145/2818048.2820078 Beyond the Belmont Principles: Ethical Challenges, Practices, and Beliefs in the Online Data Research Community]. Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, 941–953.
** Discussant: Casey Lynn
** Discussant: Zhaozhe
 
* (Optional) Williams, M. L., Burnap, P., & Sloan, L. (2017). [https://doi.org/10.1177/0038038517708140 Towards an Ethical Framework for Publishing Twitter Data in Social Research: Taking into Account Users’ Views, Online Context and Algorithmic Estimation]: Sociology.
* (Optional) Salganik, M. [https://www.bitbybitbook.com/en/1st-ed/ethics/ Ethics] chapter from Bit By Bit.
Line 366: Line 381:
'''Agenda:'''
* A workflow for doing work with APIs
* Ethics of digital trace data
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2020/day_9/day_9.ipynb Day 9 Notebook]
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_9/lecture/day_9.html Today's slides]
* [https://jeremydfoote.com/teaching/2020-summer/intro_to_programming/day_9.html Today's slides]
 
'''Coding Challenge:'''
* [[Intro to Programming and Data Science (Summer 2021)/Day 9 Coding Challenges|Day 9 Coding Challenges]]
* (Optional) [https://learn.datacamp.com/courses/analyzing-social-media-data-in-python DataCamp: Analyzing Social Media]


== Day 10: Introduction to Computational Text Analysis (May 28) ==
== Day 10: Introduction to Computational Text Analysis (June 1) ==


'''Assignment Due:'''
* [[/Day 9 Coding Challenges|Day 9 Coding Challenges]]
* [[Intro to Programming and Data Science (Summer 2021)/Day 8 Coding Challenges|Day 8 Coding Challenges]]
* [[/Day 10 Coding Challenges|Day 10 Coding Challenges]]
* [[Intro to Programming and Data Science (Summer 2021)/Day 9 Coding Challenges|Day 9 Coding Challenges]]


'''Readings:'''
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_10/day_10.ipynb Today's Notebook]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6445663/View Notebook walkthrough]
* Christopher A. Bail et al. 2018. [https://doi.org/10.1073/pnas.1804840115 Exposure to opposing views on social media can increase political polarization]. PNAS 115(37): 9216-9221
** Discussant: Caitlyn
** Discussant: Zhaozhe


'''Agenda:'''
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_10/lecture/day_10.html Today's slides]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2020/day_10/day_10.ipynb Today's Notebook]
* [https://jeremydfoote.com/teaching/2020-summer/intro_to_programming/day_10.html Today's slides]


'''Resources:'''
Line 388: Line 406:
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/blob/master/resources/solutions/Twitter_answers.ipynb My answers to the Day 8 problems]


== Day 11: Data cleaning and operationalization (June 1) ==
* Coding Challenges:
* [[Intro_to_Programming_and_Data_Science_(Summer_2020)/Day_10_Coding_Challenges|Day 10 Coding Challenges]]
 
== Day 11: Data cleaning and operationalization (June 2) ==


'''Assignment Due:'''
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_11/day_11.ipynb Day 11 Coding Challenges]
* [[Intro_to_Programming_and_Data_Science_(Summer_2020)/Day_10_Coding_Challenges|Day 10 Coding Challenges]]




'''Readings:'''
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_11/day_11.ipynb Today's Notebook]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6450755/View Notebook walkthrough]
* Robert K. Merton. 1948. [https://www-jstor-org.ezproxy.lib.purdue.edu/stable/2087142?sid=primo&origin=crossref&seq=1#metadata_info_tab_contents The Bearing of Empirical Research Upon the Development of Social Theory]. American Sociological Review 13(5): 505-515.
* Sara Klingenstein, Tim Hitchcock, and Simon DeDeo. 2014. [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4084475/ The civilizing process in London’s Old Bailey]. Proceedings of the National Academy of Sciences 111(26): 9419-9424.
** Discussant: Jeremy
** Discussant: Carly


'''Resources:'''
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_11/lecture/day_11.html Today's slides]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2020/day_11/day_11.ipynb Today's Notebook]
* [https://jeremydfoote.com/teaching/2020-summer/intro_to_programming/day_11.html Today's slides]
 
'''Coding Challenges:'''
* [[Intro_to_Programming_and_Data_Science_(Summer_2020)/Day_11_Coding_Challenges|Day 11 Coding Challenges]]


== Day 12: Organizing and storing computational projects (June 2) ==
== Day 12: Organizing and storing computational projects (June 3) ==


'''Assignment Due:'''
* [[/Day_12_Coding_Challenges|Day 12 Coding Challenges]]
* Day 11 Coding Challenges


'''Readings:'''
* [https://youtu.be/-_mjC3lAKL4 Video introducing a way to organize code and data] (from the Spring 2020 version of the class)
* [https://www.youtube.com/watch?v=SWYqp7iY_Tc Git & GitHub Crash Course For Beginners] - YouTube video (not by me) introducing Git and GitHub
* [https://learngitbranching.js.org/ Interactive git branching tutorial]
* DellaPosta, D., Shi, Y., & Macy, M. (2015). [https://doi.org/10.1086/681254 Why Do Liberals Drink Lattes]? American Journal of Sociology, 120(5), 1473–1511.
** Discussant: Lucy
** Discussant: Naomi


'''Agenda:'''
* Tour of GitHub
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_12/lecture/day_12.html Today's slides]
* [https://jeremydfoote.com/teaching/2020-summer/intro_to_programming/day_12.html Today's slides]
* [https://youtu.be/-_mjC3lAKL4 Video introducing a way to organize code and data] (from the Spring version of the class)


'''Resources:'''
* [https://www.youtube.com/watch?v=SWYqp7iY_Tc Git & GitHub Crash Course For Beginners] - YouTube video (not by me) introducing Git and GitHub
* [https://learngitbranching.js.org/ Interactive git branching tutorial]


'''Resources:'''
'''Coding Challenge:'''
* [[Intro_to_Programming_and_Data_Science_(Summer_2020)/Day_12_Coding_Challenges|Day 12 Coding Challenges]]


== Day 13: Statistical summaries and tests (June 3) ==
== Day 13: Statistical summaries and tests (June 4) ==


'''Assignment Due:'''
* [[/Day_13_Coding_Challenges|Day 13 Coding Challenges]]
 
* [[Intro_to_Programming_and_Data_Science_(Summer_2020)/Day_12_Coding_Challenges|Day 12 Coding Challenges]]




'''Readings:'''
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_13/day_13.ipynb Day 13 Notebook]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6459652/View Notebook walkthrough]
* Tan, C. (2018). [https://aaai.org/ocs/index.php/ICWSM/ICWSM18/paper/view/17811 Tracing community genealogy: How new communities emerge from the old]. Proceedings of the Twelfth International Conference on Web and Social Media (ICWSM ’18), 395–404.
** Discussant:  
** Discussant: Ji-young


'''Agenda:'''
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_13/lecture/day_13.html Today's slides]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2020/day_13/day_13.ipynb Day 13 Notebook]
* [https://youtu.be/j8e8JPWAHr8 Video explanation of notebook from Spring]
* [https://jeremydfoote.com/teaching/2020-summer/intro_to_programming/day_13.html Today's slides]
 
'''Coding Challenges:'''
* [[Intro_to_Programming_and_Data_Science_(Summer_2020)/Day_13_Coding_Challenges|Day 13 Coding Challenges]]


== Day 14: Screen scraping (June 4) ==
== Day 14: Screen scraping (June 5) ==


'''Assignment Due:'''
* [[Intro_to_Programming_and_Data_Science_(Summer_2020)/Day_13_Coding_Challenges|Day 13 Coding Challenges]]


'''Readings:'''
* Shaw, A., & Hill, B. M. (2014). [https://doi.org/10.1111/jcom.12082 Laboratories of oligarchy? How the iron law extends to peer production]. Journal of Communication, 64(2), 215–238.
** Discussant:  
** Discussant: Yihan
* [https://towardsdatascience.com/ethics-in-web-scraping-b96b18136f01 Ethics in Web Scraping] by James Densmore


Line 451: Line 481:
* [https://youtu.be/daUuC-PMZc4 Very brief lecture on web scraping from Spring 2020].


== Day 15-17: Work on final project (June 7-9) ==
== Day 15-17: Work on final project (June 8-10) ==


'''Agenda:'''
Line 465: Line 495:
* [https://bbengfort.github.io/snippets/2018/06/22/corenlp-nltk-parses.html Tutorial on syntax parsing in Python] (It's complicated!)


== Day 18: Final project presentation (June 10) ==
== Day 18: Final project presentation (June 11) ==


'''Assignment Due:'''
Line 479: Line 509:




== Day 19: Final Paper Due (June 11) ==
== Day 19: Final Paper Due (June 12) ==


'''Assignment Due:'''
* Final paper due
* [[/Final_self_reflection|Final self reflection]] due
* [[Intro_to_Programming_and_Data_Science_(Summer_2020)/Final_self_reflection|Final self reflection]] due


= Administrative Notes =