Editing Intro to Programming and Data Science (Fall 2021)

From CommunityData

Latest revision Your text
Line 1: Line 1:
{{Old Class}}
= Course Information =
= Course Information =
:'''COM 674: Introduction to Programming and Data Science'''
:'''COM 674: Introduction to Programming and Data Science'''
:'''Location:''' BRNG 2273
:'''Location:''' Discord
:'''Class Hours:''' Thursdays, 5–7:50 pm
:'''Class Hours:''' M-F, 10 am - 12 pm


== Instructor ==
== Instructor ==
:'''Instructor:''' [https://jeremydfoote.com Jeremy Foote]  
:'''Instructor:''' [https://jeremydfoote.com Jeremy Foote]  
:'''Email:''' jdfoote@purdue.edu
:'''Email:''' jdfoote@purdue.edu
:'''Office Hours:''' Thursdays, 2–4 pm; BRNG 2156 or at https://meet.jit.si/JeremyOffice
:'''Office Hours:''' By appointment; on Discord


<div style="float:right;">__TOC__</div>
<div style="float:right;">__TOC__</div>
Line 62: Line 60:
== Lectures ==
== Lectures ==


Our class time will follow the "flipped" classroom model. I will provide asynchronous materials (readings, recorded lectures, assignments, etc.) which you will work on before class and we will use our class time to review concepts, identify confusion, and synthesize.
The synchronous part of the course will be held starting at 1pm every day, on Discord. The typical format will be a discussion of the reading for the day, then a brief lecture about the topic for that week, then a discussion of the previous day's homework questions, and finally optional co-working time to start on the next day's assignment.
 
The class sessions will typically follow the same format. First, we will discuss the reading for the day, with discussions generally led by a student. We will then have a discussion about the concepts that are still confusing. Next, we will go over the homework questions followed by optional co-working time to start on the next assignment.


I try to make the classes useful and to tailor them to the needs of the students, and I highly encourage you to attend as many of our sessions as possible. My teaching style is very conversational and relies on students being willing to speak up to express confusion, seek clarification, or make a well-informed point, so please participate!
I highly encourage you to attend as many of our synchronous sessions as possible. In general, my teaching style is more conversational than a formal lecture. I prefer that students feel they can "politely interrupt" at any time to seek clarification or make a well-informed point, and the lectures will be much better if I can get real-time feedback about what is and isn't making sense.


== Office hours and email ==
== Office hours and email ==


* I will hold office hours from 2-4 on Thursdays and by appointment. If you come with a programming question, I will expect that you have already tried to solve it yourself in multiple ways and that you have discussed it with a classmate. This policy lets me have time to help more students, but it's also a useful strategy. Often [https://en.wikipedia.org/wiki/Rubber_duck_debugging just trying to explain your code] can help you to recognize where you've gone wrong.
* I will hold office hours by appointment. If you come with a programming question, I will expect that you have already tried to solve it yourself in multiple ways and that you have discussed it with a classmate. This policy lets me have time to help more students, but it's also a useful strategy. Often [https://en.wikipedia.org/wiki/Rubber_duck_debugging just trying to explain your code] can help you to recognize where you've gone wrong.
* I am also available by email. You can reach me at [mailto:jdfoote@purdue.edu jdfoote@purdue.edu]. I try hard to maintain a boundary between work and home and I typically respond only on weekdays during business hours (~9-5) but during the week I will generally respond within 24 hours.
* I am also available by email. You can reach me at [mailto:jdfoote@purdue.edu jdfoote@purdue.edu]. I try hard to maintain a boundary between work and home and I typically respond only on weekdays during business hours (~9-5) but during the week I will generally respond within 24 hours.


Line 96: Line 92:
=== Project idea and dataset identification ===
=== Project idea and dataset identification ===


;Due date: September 9
;Due date: May 19
;Maximum length: 500 words (~1-2 pages)
;Maximum length: 500 words (~1-2 pages)


Line 102: Line 98:


* An abstract of the proposed study including the topic, research question, theoretical motivation, object(s) of study, and anticipated research contribution.
* An abstract of the proposed study including the topic, research question, theoretical motivation, object(s) of study, and anticipated research contribution.
* An identification of the dataset you will use and a description of the columns or type of data it will include. If you do not currently have access to these data, explain why not and when you will have access (If you need ideas, [[/Datasets|this page]] lists some open datasets).
* An identification of the dataset you will use and a description of the columns or type of data it will include. If you do not currently have access to these data, explain why not and when you will have access (If you need ideas, [[Data_Into_Insights_(Spring_2021)/Final_project#Datasets|this page]] from one of my undergrad classes lists some open datasets).
* A short (several sentences) description of how the project will fit into your career trajectory.
* A short (several sentences) description of how the project will fit into your career trajectory.


=== Project planning document ===
=== Project planning document ===


;Due date: October 21
;Due date: May 27
;Maximum length: ~4-5 pages
;Maximum length: ~4-5 pages


Line 119: Line 115:
=== Project presentation and report ===
=== Project presentation and report ===


;Report due date: December 15
;Report due date: June 11
;Maximum length: 4000 words (~15 pages)
;Maximum length: 4000 words (~15 pages)


;Presentation due date: December 9
;Presentation due date: June 10
;Maximum length: 8 minutes
;Maximum length: 8 minutes


==== The project report ====
==== The project report ====


You will write a document or a Jupyter Notebook that will ideally provide the foundation for a high quality short research paper that you might revise and submit for publication. I do not expect the report to be ready for publication, but it should contain polished drafts of all the necessary components of a scholarly quantitative empirical research study. In terms of the structure, please see the page on the [[structure of a quantitative empirical research paper]].
You will craft a Jupyter Notebook that will ideally provide the foundation for a high quality short research paper that you might revise and submit for publication. I do not expect the report to be ready for publication, but it should contain polished drafts of all the necessary components of a scholarly quantitative empirical research study. In terms of the structure, please see the page on the [[structure of a quantitative empirical research paper]].


The great thing about a Jupyter Notebook is that it allows you to provide data, code, and any documentation sufficient to enable the replication of all analysis and visualizations. If you choose to write the report as a Word document, then you will need to include the code in a separate file.
The great thing about a Jupyter Notebook is that it allows you to provide data, code, and any documentation sufficient to enable the replication of all analysis and visualizations. If that is not possible/appropriate for some reason, please talk to me so that we can find another solution.


Because the emphasis in this class is on methods and because I'm not an expert in each of your fields, I'm happy to assume that your paper, proposal, or thesis chapter has already established the relevance and significance of your study and has a comprehensive literature review, well-grounded conceptual approach, and compelling reason why this research is important. As a result, you need not focus on these elements of the work in your written submission. Instead, feel free to start with a brief summary of the purpose and importance of this research followed by an introduction of your research questions or hypotheses. If you provide more detail, that's fine, but I won't give you detailed feedback on these parts and they will not figure prominently in my assessment of the work.
Because the emphasis in this class is on methods and because I'm not an expert in each of your fields, I'm happy to assume that your paper, proposal, or thesis chapter has already established the relevance and significance of your study and has a comprehensive literature review, well-grounded conceptual approach, and compelling reason why this research is important. As a result, you need not focus on these elements of the work in your written submission. Instead, feel free to start with a brief summary of the purpose and importance of this research followed by an introduction of your research questions or hypotheses. If you provide more detail, that's fine, but I won't give you detailed feedback on these parts and they will not figure prominently in my assessment of the work.
Line 139: Line 135:
The presentation will provide an opportunity to share a brief summary of your project and findings with the other members of the class. However, don't treat it as a comprehensive overview of your paper: I would rather you tell a subset of the story well than the whole story in a rushed fashion. For instance, you can give a completely successful presentation by describing the motivation and walking through one plot in your paper. Since you will all give other research presentations throughout your career, I strongly encourage you to take the opportunity to refine your academic presentation skills.
The presentation will provide an opportunity to share a brief summary of your project and findings with the other members of the class. However, don't treat it as a comprehensive overview of your paper: I would rather you tell a subset of the story well than the whole story in a rushed fashion. For instance, you can give a completely successful presentation by describing the motivation and walking through one plot in your paper. Since you will all give other research presentations throughout your career, I strongly encourage you to take the opportunity to refine your academic presentation skills.


I anticipate that most people will either create a PowerPoint presentation or will walk us through a simple Jupyter Notebook. All presentations will need to be ''a maximum of 8 minutes long''. Concisely communicating an idea in the time allotted is an important skill in its own right. Presentations should be uploaded to the Discussion forum on Brightspace created for this purpose.
I anticipate that most people will either create a PowerPoint presentation or will walk us through their Jupyter Notebook. All presentations will need to be ''a maximum of 8 minutes long''. Concisely communicating an idea in the time allotted is an important skill in its own right. Presentations should be uploaded to the Discussion forum on Brightspace created for this purpose.


== Daily Coding Challenges ==
== Daily Coding Challenges ==
Line 150: Line 146:
== Paper Discussions ==
== Paper Discussions ==


Every day we will review a paper that uses computational methods. On the first day, I will ask you to sign up to lead the discussion for one or more of these papers. When leading the discussion, you will prepare a presentation as though you were presenting the paper at a conference and then lead a discussion about it.
Every day we will review a paper that uses computational methods. On the first day, I will ask you to sign up to lead the discussion for one of these papers. When leading the discussion, you will prepare a presentation as though you were presenting the paper at a conference and then lead a discussion about it.  


== Reflection papers ==
== Reflection papers ==
Line 158: Line 154:
= Grades =
= Grades =


This course will follow a "self-assessment" philosophy. I am more interested in helping you to learn things that will be useful to you than in assigning grades. The university still requires grades, so you will be leading the evaluation of your work. At the beginning of the course, I will encourage you to think about and write down what you hope to get out of the course. Three times during the course you will reflect on what you have accomplished thus far, how it has met, not met, or exceeded expectations, based both on rubrics and personal goals and objectives. At each of these stages you will receive feedback on your assessments. By the end of the semester, you should have a clear vision of your accomplishments and growth, which you will turn into a grade. As the instructor-of-record, I maintain the right to disagree with your assessment and alter grades as I see fit, but any time that I do this it will be accompanied by an explanation and discussion. These personal assessments, grounded in honest and meaningful reflection on your work, will be the most important factor in final grades.
This course will follow a "self-assessment" philosophy. I am more interested in helping you to learn things that will be useful to you than in assigning grades. The university still requires grades, so you will be leading the evaluation of your work. In week two and again at the end of the course, you will reflect on what you have accomplished thus far, how it has met, not met, or exceeded expectations, based both on rubrics and personal goals and objectives. At each of these stages you will receive feedback on your assessments. By the end of the semester, you should have a clear vision of your accomplishments and growth, which you will turn into a grade. As the instructor-of-record, I maintain the right to disagree with your assessment and alter grades as I see fit, but any time that I do this it will be accompanied by an explanation and discussion. These personal assessments, grounded in honest and meaningful reflection on your work, will be the most important factor in final grades.


I suggest that we use the following rubric in our assessment:
I suggest that we use the following rubric in our assessment:
Line 176: Line 172:
* Sharing work early allowing extra time for engagement with others.
* Sharing work early allowing extra time for engagement with others.
* Write reflections that grapple meaningfully with lessons learned as well as challenges.
* Write reflections that grapple meaningfully with lessons learned as well as challenges.
* Complete all or nearly all assignments at a high level.
* Complete most, if not all, programming assignments at a high level.


B: Reflects strong work. Work at this level will be of consistently high quality. Students reaching this level of achievement will:
B: Reflects strong work. Work at this level will be of consistently high quality. Students reaching this level of achievement will:
* Be more safe or consistent than the work described above.
* Be more safe or consistent than the work described above.
* Ask meaningful questions of peers and engage them in fruitful discussion.
* Ask meaningful questions of peers and engage them in fruitful discussion.
* Exceed requirements, but in fairly straightforward ways.
* Exceed requirements, but in fairly straightforward ways - e.g., an additional post in discussion every week.
* Compose complete and sufficiently detailed reflections.
* Compose complete and sufficiently detailed reflections.
* Complete many of the programming assignments at a high level.
* Complete many of the programming assignments.


C: This reflects meeting the minimum expectations of the course. Students reaching this level of achievement
C: This reflects meeting the minimum expectations of the course. Students reaching this level of achievement
Line 194: Line 190:
D/F: These are reserved for cases in which students do not complete work or participate. Students may also be
D/F: These are reserved for cases in which students do not complete work or participate. Students may also be
impeding the ability of others to learn.
impeding the ability of others to learn.


= Schedule =
= Schedule =
Line 202: Line 200:




== Day 1: Introduction to Python and Computational Thinking (August 26) ==
== Day 1: Introduction to Python and Computational Thinking (May 17) ==


'''Assignment Due:'''  
'''Assignment Due:'''  
Line 219: Line 217:
* Have written your first program in the Python language.
* Have written your first program in the Python language.


== Day 2: Variables, conditionals, and functions (September 2) ==
== Day 2: Variables, conditionals, and functions (May 18) ==


'''Assignments Due:'''  
'''Assignments Due:'''  
Line 225: Line 223:
* Fill out this [https://docs.google.com/forms/d/e/1FAIpQLSfUiGogs2jDXIHaXz1ooVBZFkRF2NdMaf00IgZvk7f69rby9w/viewform?usp=sf_link short survey]
* Fill out this [https://docs.google.com/forms/d/e/1FAIpQLSfUiGogs2jDXIHaXz1ooVBZFkRF2NdMaf00IgZvk7f69rby9w/viewform?usp=sf_link short survey]
* Sign up to be a discussant [https://docs.google.com/spreadsheets/d/1uSo-Ya5DghaLu1BYk94EVU2kBVmExRWwOa1586GbFUU/edit?usp=sharing here]
* Sign up to be a discussant [https://docs.google.com/spreadsheets/d/1uSo-Ya5DghaLu1BYk94EVU2kBVmExRWwOa1586GbFUU/edit?usp=sharing here]
* [[/Discord Signup|Sign up for Discord]] and introduce yourself
* [[/Day_2_Coding_Challenges|Day 2 Coding Challenge]] (turn in on Brightspace)
* [[/Day_2_Coding_Challenges|Day 2 Coding Challenge]] (turn in on Brightspace)


Line 231: Line 228:
* Bit by Bit, [https://www.bitbybitbook.com/en/1st-ed/introduction/ Introduction]
* Bit by Bit, [https://www.bitbybitbook.com/en/1st-ed/introduction/ Introduction]
* Python for Everybody, chapters 1-4
* Python for Everybody, chapters 1-4
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_2/day_2.ipynb Today's Jupyter Notebook] (Right-click, save, and open in Jupyter)
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_2/day_2.ipynb Today's Jupyter Notebook]
** [https://purdue.brightspace.com/d2l/le/content/335095/viewContent/6819498/View Notebook walkthrough]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6387775/View Notebook walkthrough]


'''Agenda:'''
'''Agenda:'''
Line 239: Line 236:
* Introduce wordplay project
* Introduce wordplay project


== Day 3: Iteration, strings, and lists (September 9) ==
== Day 3: Iteration, strings, and lists (May 19) ==


'''Assignment Due:'''
'''Assignment Due:'''
Line 248: Line 245:
* Python for Everybody
* Python for Everybody
  chapters_to_read = [5, 6, 8]
  chapters_to_read = [5, 6, 8]
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_3/day_3.ipynb Today's Jupyter Notebook]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_3/day_3.ipynb Today's Jupyter Notebook]
** [https://purdue.brightspace.com/d2l/le/content/335095/viewContent/6819499/View Notebook walkthrough]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6390578/View Notebook walkthrough]
* Foote, J., Shaw, A., & Hill, B.M. (2017). [https://jeremydfoote.com/files/foote_computational_2017.pdf Computational analysis of social media scholarship]. In Burgess, J., Poell, T., Marwick, A. (Eds.), The Sage Handbook of Social Media. Sage.
* Foote, J., Shaw, A., & Hill, B.M. (2017). [https://jeremydfoote.com/files/foote_computational_2017.pdf Computational analysis of social media scholarship]. In Burgess, J., Poell, T., Marwick, A. (Eds.), The Sage Handbook of Social Media. Sage.
** Discussant:  
** Discussant: Juan Pablo


'''Agenda:'''
'''Agenda:'''
Line 258: Line 255:
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_3/lecture/day_3.html Today's slides]
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_3/lecture/day_3.html Today's slides]


== Day 4: Reading from and writing to files (September 16) ==
== Day 4: Reading from and writing to files (May 20) ==


'''Assignment Due:'''
'''Assignment Due:'''
Line 269: Line 266:
         read(chapter)
         read(chapter)
  book.close()
  book.close()
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_4/day_4.ipynb Today's Jupyter Notebook]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_4/day_4.ipynb Today's Jupyter Notebook]
** [https://purdue.brightspace.com/d2l/le/content/335095/viewContent/6819500/View Notebook walkthrough]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6399269/View Notebook walkthrough]
* Nelson, Laura K. 2017. "[https://doi.org/10.1177%2F0049124117729703 Computational Grounded Theory: A Methodological Framework]." Sociological Methods and Research.
* Nelson, Laura K. 2017. "[https://doi.org/10.1177%2F0049124117729703 Computational Grounded Theory: A Methodological Framework]." Sociological Methods and Research.
** Discussant: Elizabeth
** Discussant: Beth Ann


'''Agenda:'''
'''Agenda:'''
Line 278: Line 275:
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_4/lecture/day_4.html Today's slides]
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_4/lecture/day_4.html Today's slides]


== Day 5: Dictionaries and Tuples (September 23) ==
== Day 5: Dictionaries and Tuples (May 21) ==


'''Assignment Due:'''
'''Assignment Due:'''
Line 286: Line 283:
'''Readings:'''
'''Readings:'''
* Python for Everybody, chapters 9 and 10
* Python for Everybody, chapters 9 and 10
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_5/day_5.ipynb Today's Jupyter Notebook]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_5/day_5.ipynb Today's Jupyter Notebook]
** [https://purdue.brightspace.com/d2l/le/content/335095/viewContent/6819501/View Video walkthrough]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6415595/View Video walkthrough]
* Margolin, D. B., Hannak, A., & Weber, I. (2018). [https://doi.org/10.1080/10584609.2017.1334018 Political Fact-Checking on Twitter: When Do Corrections Have an Effect?] Political Communication, 35(2), 196–219.
* Margolin, D. B., Hannak, A., & Weber, I. (2018). [https://doi.org/10.1080/10584609.2017.1334018 Political Fact-Checking on Twitter: When Do Corrections Have an Effect?] Political Communication, 35(2), 196–219.
** Discussant:  
** Discussant: Katelyn


'''Agenda:'''
'''Agenda:'''
Line 296: Line 293:
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_5/lecture/day_5.html Today's slides]
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_5/lecture/day_5.html Today's slides]


== CATCH UP Week (September 30) ==
== Day 6: Dataframes and Visualization (May 24) ==
 
'''Readings:'''
* Shen, C., Monge, P., & Williams, D. (2014). [https://libkey.io/libraries/228/articles/5013123/full-text-file Virtual brokerage and closure: Network structure and social capital in a massively multiplayer online game]. Communication Research. 41(4): 459–480.
 
 
== Day 6: Dataframes and Visualization (October 7) ==


'''Assignment Due:'''
'''Assignment Due:'''
Line 308: Line 299:


'''Readings:'''
'''Readings:'''
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_6/day_6.ipynb Day 6 notebook]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_6/day_6.ipynb Day 6 notebook]
** [https://purdue.brightspace.com/d2l/le/content/335095/viewContent/6819502/View Notebook walkthrough]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6419983/View Notebook walkthrough]
* Shaw, A., & Hill, B. M. (2014). Laboratories of oligarchy? How the iron law extends to peer production. Journal of Communication, 64(2), 215–238. https://doi.org/10.1111/jcom.12082
* Benefield, G. A., Shen, C., & Leavitt, A. (2016). [https://doi.org/10.1145/2818048.2819935 Virtual Team Networks: How Group Social Capital Affects Team Success in a Massively Multiplayer Online Game]. Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, 679–690.
** Discussant: Anna




Line 317: Line 309:
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_6/lecture/day_6.html Today's slides]
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_6/lecture/day_6.html Today's slides]


== Day 7: Dataframes and visualization (continued) (October 14) ==
== Day 7: Dataframes and visualization (continued) (May 25) ==


'''Assignment Due:'''
'''Assignment Due:'''
Line 323: Line 315:


'''Readings:'''
'''Readings:'''
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_7/day_7.ipynb Day 7 notebook]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_7/day_7.ipynb Day 7 notebook]
** [https://purdue.brightspace.com/d2l/le/content/335095/viewContent/6819503/View Notebook walkthrough]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6429518/View Notebook walkthrough]
* Lazer, D., & Radford, J. (2017). [https://doi.org/10.1146/annurev-soc-060116-053457 Data ex Machina: Introduction to Big Data]. Annual Review of Sociology, 43(1), 19–39.
* Lazer, D., & Radford, J. (2017). [https://doi.org/10.1146/annurev-soc-060116-053457 Data ex Machina: Introduction to Big Data]. Annual Review of Sociology, 43(1), 19–39.
** Discussant: Elizabeth
** Discussant: Yong




Line 333: Line 325:
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_7/lecture/day_7.html Today's slides]
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_7/lecture/day_7.html Today's slides]


== Day 8: Collecting Data with APIs (October 21) ==
== Day 8: Collecting Data with APIs (May 26) ==


'''Assignment Due:'''
'''Assignment Due:'''
* [[/Day 8 Coding Challenges|Day 8 Coding Challenges]].
* [[/Day 8 Coding Challenges|Day 8 Coding Challenges]].
** [https://youtu.be/TASX3evcgG4 Video instructions to install tweepy]
** [https://youtu.be/TASX3evcgG4 Video instructions to install tweepy]
* First [[Self_Assessment_Reflection | self-assessment reflection]] is due (on Brightspace).
* Project Planning Document Due


'''Readings:'''
'''Readings:'''
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_8/day_8.ipynb Intro to APIs Notebook]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_8/day_8.ipynb Intro to APIs Notebook]
** (Long) [https://purdue.brightspace.com/d2l/le/content/335095/viewContent/6819504/View walkthrough of notebook]
** (Long) [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6430558/View walkthrough of notebook]
* Kieran Healy and James Moody (2014). “[https://doi.org/10.1146/annurev-soc-071312-145551 Data Visualization in Sociology].” Annual Review of Sociology. 40: 105-28.
* Kieran Healy and James Moody (2014). “[https://doi.org/10.1146/annurev-soc-071312-145551 Data Visualization in Sociology].” Annual Review of Sociology. 40: 105-28.
** Discussant:
** Discussant: Pearlynne


'''Agenda:'''
'''Agenda:'''
Line 352: Line 342:
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_8/lecture/day_8.html Today's slides]
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_8/lecture/day_8.html Today's slides]


== Day 9: Collecting Data with APIs (continued) (October 28) ==
== Day 9: Collecting Data with APIs (continued) (May 27) ==


'''Assignment Due:'''
'''Assignment Due:'''
* [[/Day 9 Coding Challenges|Day 9 Coding Challenges]]
* Start on [[Intro to Programming and Data Science (Fall 2021)/Day 9 Coding Challenges|Day 9 Coding Challenges]]
 
* First [[Self_Assessment_Reflection | self-assessment reflection]] is due (on Brightspace).
* Project Planning Document Due




'''Readings:'''
'''Readings:'''
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_9/day_9.ipynb Day 9 Notebook]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_9/day_9.ipynb Day 9 Notebook]
** [https://purdue.brightspace.com/d2l/le/content/335095/viewContent/6819505/View Notebook walkthrough]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6437812/View Notebook walkthrough]
* Python for Everybody, Chapter 13
* Python for Everybody, Chapter 13
* Vitak, J., Shilton, K., & Ashktorab, Z. (2016). [https://doi.org/10.1145/2818048.2820078 Beyond the Belmont Principles: Ethical Challenges, Practices, and Beliefs in the Online Data Research Community]. Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, 941–953.
* Vitak, J., Shilton, K., & Ashktorab, Z. (2016). [https://doi.org/10.1145/2818048.2820078 Beyond the Belmont Principles: Ethical Challenges, Practices, and Beliefs in the Online Data Research Community]. Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, 941–953.
** Discussant: Diane
** Discussant: Casey Lynn


* (Optional) Williams, M. L., Burnap, P., & Sloan, L. (2017). [https://doi.org/10.1177/0038038517708140 Towards an Ethical Framework for Publishing Twitter Data in Social Research: Taking into Account Users’ Views, Online Context and Algorithmic Estimation]. Sociology.
* (Optional) Williams, M. L., Burnap, P., & Sloan, L. (2017). [https://doi.org/10.1177/0038038517708140 Towards an Ethical Framework for Publishing Twitter Data in Social Research: Taking into Account Users’ Views, Online Context and Algorithmic Estimation]. Sociology.
* (Optional) Salganik, M. [https://www.bitbybitbook.com/en/1st-ed/ethics/ Ethics] chapter from Bit by Bit.
* (Optional) Salganik, M. [https://www.bitbybitbook.com/en/1st-ed/ethics/ Ethics] chapter from Bit by Bit.
* (Optional) Crawford, K., & Finn, M. (2015). [https://doi.org/10.1007/s10708-014-9597-z The limits of crisis data: Analytical and ethical challenges of using social and mobile data to understand disasters]. GeoJournal, 80(4), 491–502.
* (Optional) Crawford, K., & Finn, M. (2015). [https://doi.org/10.1007/s10708-014-9597-z The limits of crisis data: Analytical and ethical challenges of using social and mobile data to understand disasters]. GeoJournal, 80(4), 491–502.
* If you are interested in doing web scraping, then look at this [https://github.com/CU-ITSS/Web-Data-Scraping-S2019 incredible mini-course on the topic]. It is all done with Jupyter Notebooks and you have all of the prerequisite knowledge to understand it.
* [https://youtu.be/daUuC-PMZc4 Very brief lecture on web scraping from Spring 2020].




Line 380: Line 367:
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_9/lecture/day_9.html Today's slides]
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_9/lecture/day_9.html Today's slides]


== Day 10: Introduction to Computational Text Analysis (November 4) ==
== Day 10: Introduction to Computational Text Analysis (May 28) ==


'''Assignment Due:'''
'''Assignment Due:'''
* [[/Day 9 Coding Challenges|Day 9 Coding Challenges]]
* [[/Day 10 Coding Challenges|Day 10 Coding Challenges]]
* [[/Day 10 Coding Challenges|Day 10 Coding Challenges]]


'''Readings:'''
'''Readings:'''
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_10/day_10.ipynb Today's Notebook]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_10/day_10.ipynb Today's Notebook]
** [https://purdue.brightspace.com/d2l/le/content/335095/viewContent/6819506/View Notebook walkthrough]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6445663/View Notebook walkthrough]
* Christopher A. Bail et al. 2018. [https://doi.org/10.1073/pnas.1804840115 Exposure to opposing views on social media can increase political polarization]. PNAS 115(37): 9216-9221
** Discussant: Diane
** Discussant: Caitlyn


'''Agenda:'''
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/blob/master/resources/solutions/Twitter_answers.ipynb My answers to the Day 8 problems]


== Day 11: Data cleaning and operationalization (November 11) ==
== Day 11: Data cleaning and operationalization (June 1) ==


'''Assignment Due:'''
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_11/day_11.ipynb Day 11 Coding Challenges]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_11/day_11.ipynb Day 11 Coding Challenges]




'''Readings:'''
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_11/day_11.ipynb Today's Notebook]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_11/day_11.ipynb Today's Notebook]
** [https://purdue.brightspace.com/d2l/le/content/335095/viewContent/6819507/View Notebook walkthrough]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6450755/View Notebook walkthrough]
* Robert K. Merton. 1948. [https://www-jstor-org.ezproxy.lib.purdue.edu/stable/2087142?sid=primo&origin=crossref&seq=1#metadata_info_tab_contents The Bearing of Empirical Research Upon the Development of Social Theory]. American Sociological Review 13(5): 505-515.
* Sara Klingenstein, Tim Hitchcock, and Simon DeDeo. 2014. [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4084475/ The civilizing process in London’s Old Bailey]. Proceedings of the National Academy of Sciences 111(26): 9419-9424.
** Discussant:  
** Discussant: Jeremy


'''Resources:'''
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_11/lecture/day_11.html Today's slides]


== Day 12: Organizing and storing computational projects (November 18) ==
== Day 12: Organizing and storing computational projects (June 2) ==


'''Assignment Due:'''
* [https://learngitbranching.js.org/ Interactive git branching tutorial]
* DellaPosta, D., Shi, Y., & Macy, M. (2015). [https://doi.org/10.1086/681254 Why Do Liberals Drink Lattes?] American Journal of Sociology, 120(5), 1473–1511.
** Discussant:   
** Discussant:  Lucy


'''Agenda:'''
'''Resources:'''


== Day 13: Statistical summaries and tests (December 2) ==
== Day 13: Statistical summaries and tests (June 3) ==


'''Assignment Due:'''


'''Readings:'''
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_13/day_13.ipynb Day 13 Notebook]
* [https://github.com/jdfoote/Intro-to-Programming-and-Data-Science/raw/summer2021/day_13/day_13.ipynb Day 13 Notebook]
** [https://purdue.brightspace.com/d2l/le/content/335095/viewContent/6819508/View Notebook walkthrough]
** [https://purdue.brightspace.com/d2l/le/content/306917/viewContent/6459652/View Notebook walkthrough]
* Tan, C. (2018). [https://aaai.org/ocs/index.php/ICWSM/ICWSM18/paper/view/17811 Tracing community genealogy: How new communities emerge from the old]. Proceedings of the Twelfth International Conference on Web and Social Media (ICWSM ’18), 395–404.
** Discussant: Sandra
** Discussant:  


'''Agenda:'''
* [https://jeremydfoote.com/Intro-to-Programming-and-Data-Science/day_13/lecture/day_13.html Today's slides]


== Day 14: Screen scraping (June 4) ==
'''Assignment Due:'''


== Day 15: Final Project Presentation (December 9) ==
'''Readings:'''
* Shaw, A., & Hill, B. M. (2014). [https://doi.org/10.1111/jcom.12082 Laboratories of oligarchy? How the iron law extends to peer production]. Journal of Communication, 64(2), 215–238.
** Discussant:
* [https://towardsdatascience.com/ethics-in-web-scraping-b96b18136f01 Ethics in Web Scraping] by James Densmore
 
'''Agenda:'''
* If you are interested in doing web scraping, then look at this [https://github.com/CU-ITSS/Web-Data-Scraping-S2019 incredible mini-course on the topic]. It is all done with Jupyter Notebooks and you have all of the prerequisite knowledge to understand it.
* [https://youtu.be/daUuC-PMZc4 Very brief lecture on web scraping from Spring 2020].
 
== Day 15-17: Work on final project (June 7-9) ==
 
'''Agenda:'''
* I will be available to answer questions and provide help
 
 
'''Additional Resources:'''
* [https://www.youtube.com/watch?v=K8L6KVGG-7o Regular Expressions]
* [https://www.youtube.com/watch?v=3dt4OGnU5sM List Comprehensions]
* [https://youtu.be/flwcAf1_1RU Network Analysis]
* [https://youtu.be/KBDJhhz4oXA Getting data from Reddit]
* [https://www.youtube.com/watch?v=ZDa-Z5JzLYM Classes and Object-oriented programming] (This is a set of videos)
* [https://bbengfort.github.io/snippets/2018/06/22/corenlp-nltk-parses.html Tutorial on syntax parsing in Python] (It's complicated!)
 
== Day 18: Final project presentation (June 10) ==


'''Assignment Due:'''




 
== Day 19: Final Paper Due (June 11) ==
== Day 16: Final Paper Due (December 16) ==


'''Assignment Due:'''
* Final paper due
* [[/Final_self_reflection|Final self reflection]] due
= Additional Resources =
These resources cover some of the topics we touched on in class in more depth:
* [https://youtu.be/rQEsIs9LERM Using Tweepy to do full historical search on Twitter]
* [https://www.youtube.com/watch?v=K8L6KVGG-7o Regular Expressions]
* [https://www.youtube.com/watch?v=3dt4OGnU5sM List Comprehensions]
* [https://youtu.be/flwcAf1_1RU Network Analysis]
* [https://youtu.be/KBDJhhz4oXA Getting data from Reddit]
* [https://www.youtube.com/watch?v=ZDa-Z5JzLYM Classes and Object-oriented programming] (This is a set of videos)
* [https://bbengfort.github.io/snippets/2018/06/22/corenlp-nltk-parses.html Tutorial on syntax parsing in Python] (It's complicated!)


= Administrative Notes =