Internet Research Methods (Spring 2016)

Designing Internet Research
COM528 - Department of Communication, University of Washington
Instructor: Benjamin Mako Hill (University of Washington)
Course Websites:
Course Catalog Description:
Focuses on designing Internet research, assessing the adaptation of proven methods to Internet tools and environments, and developing new methods in view of particular capacities and characteristics of Internet applications. Legal and ethical aspects of Internet research receive ongoing consideration.

Overview and Learning Objectives

What new lines of inquiry and approaches to social research are made possible and necessary by the Internet? In what ways have established research methods been affected by the Internet? How does the Internet challenge established methods of social research? How are researchers responding to these challenges?

These are some of the key questions we will explore in this course. The course will focus on assessing the incorporation of Internet tools in established and emergent methods of social research, the adaptation of social research methods to study online phenomena, and the development of new methods and tools that correspond with the particular capacities and characteristics of the Internet. The readings will include both descriptions of Internet-related research methods, with an eye to introducing skills, and examples of studies that use them. The legal and ethical aspects of Internet research will receive ongoing consideration throughout the course. The purpose of this course is to help prepare students to design high quality research projects that use the Internet to study online communicative, social, cultural, and political phenomena.

I will consider the course a complete success if every student is able to do all of these things at the end of the quarter:

  • Discuss and compare distinct types of Internet research including: web archiving; textual analysis; ethnography; interviews; network analyses of social and hyperlink networks; analysis of digital trace data; traditional, natural, and field experiments; design research; survey research; and narrative and visual analyses.
  • Describe particular challenges and threats to research validity associated with each method.
  • For at least one method, be able to provide a detailed description of a research project and feel comfortable embarking on a formative study using this methodology.
  • Given a manuscript (e.g., in the context of a request for peer review), be able to evaluate an Internet-based study in terms of its methodological choices.
  • Use a modern programming language (e.g., Python) to collect a dataset from a web API like the APIs from Twitter and Wikipedia.

Note About This Syllabus

You should expect this syllabus to be a dynamic document and you will notice that there are a few places marked "To Be Determined." Although the core expectations for this class are fixed, the details of readings and assignments will shift. As a result, there are three important things to keep in mind:

  1. Although details on this syllabus will change, I will not change readings or assignments less than one week before they are due. If I don't fill in a "To Be Determined" one week before it's due, it is dropped. If you plan to read more than one week ahead, contact me first.
  2. Closely monitor your email or the announcements section on the course website on Canvas. When I make changes, they will be recorded in the history of this page so that you can track what has changed, and I will summarize them in an announcement on Canvas that will be emailed to everybody in the class.
  3. I will ask the class for voluntary anonymous feedback frequently — especially toward the beginning of the quarter. Please let me know what is working and what can be improved. In the past, I have made many adjustments based on this feedback.

Books

This class has no textbook and I am not requiring you to buy any books for this class. That said, several required readings, and many suggested readings, will come from several excellent books which you should consider purchasing for your library.

These books include:

  1. Rogers, R. (2013). Digital Methods. Cambridge, Massachusetts: The MIT Press.
  2. Hesse-Biber, S. N. (Ed.). (2011). The Handbook of Emergent Technologies in Social Research (1st edition). New York: Oxford University Press.
  3. Ackland, R. (2013). Web Social Science. SAGE Publications Ltd.

Technical Skills

Nearly all of our structured in-person meetings and all of our readings will focus on teaching conceptual skills related to Internet research. These skills involve the "softer" skills of understanding, designing, and critiquing research plans. These are harder to teach, evaluate, and learn but are ultimately what will make a research project interesting, useful, or valid. When the course has been taught in the past by other faculty, it has been entirely focused on these types of conceptual skills.

That said, I also believe that any skilled Internet researcher must be comfortable writing code to collect a dataset from the web or, at the very least, should have enough experience doing so that they know what is involved and what is possible and impossible. This is essential even if your only goal is to manage somebody else writing code and gathering data. As a result, being successful in this class will also require technical skills.

Because students are going to come to the class with different technical skillsets, we will be devoting only a relatively small chunk of class time to developing technical skills. Instead, I'm requiring that students build these skills outside of the class if they do not have them already.

In particular, I want every student to have the following three things:

  1. Basic skills in a general purpose high-level programming language used for Internet-based data collection and analysis. I strongly recommend the Python programming language although other programming languages like Ruby and Perl are also good choices. Generally speaking, statistical programming languages like R, Stata, and Matlab are not well suited for this.
  2. Familiarity with the technologies of web APIs. In particular, students should understand what APIs are, how they work, and should be able to read, interpret, and process data in JSON.
  3. Knowledge of how to process and move data from a website or API into a format that they will be able to use for analysis. The final format will depend on the nature of the result but this might be a statistical programming environment like R, Stata, SAS, SPSS, etc., or a qualitative data analysis tool like ATLAS.ti, NVivo, or Dedoose.
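
As a rough illustration of the second and third skills, the following minimal Python sketch parses JSON of the kind a web API might return and flattens it into CSV, a format that R, Stata, SPSS, and most other analysis tools can import. The sample data, the field names (items, user, tags), and the function names are all hypothetical and are not taken from any particular API; a real API will have its own structure that you will need to read about in its documentation.

```python
import csv
import io
import json

# Hypothetical API response; real APIs will use different field names.
SAMPLE_RESPONSE = """
{
  "items": [
    {"id": 1, "user": {"name": "alice"}, "text": "First post", "tags": ["intro", "hello"]},
    {"id": 2, "user": {"name": "bob"}, "text": "Second post", "tags": []}
  ]
}
"""


def flatten_item(item):
    """Turn one nested JSON record into a flat dict with one value per column."""
    return {
        "id": item["id"],
        "user_name": item["user"]["name"],
        "text": item["text"],
        # A list cannot go into a single CSV cell, so join it into one string.
        "tags": ";".join(item["tags"]),
    }


def json_to_csv(json_text, out_file):
    """Parse JSON text and write one CSV row per item; return the row count."""
    data = json.loads(json_text)
    rows = [flatten_item(item) for item in data["items"]]
    writer = csv.DictWriter(out_file, fieldnames=["id", "user_name", "text", "tags"])
    writer.writeheader()
    writer.writerows(rows)
    return len(rows)


if __name__ == "__main__":
    # Write to an in-memory buffer here; in practice you would pass an open file.
    buffer = io.StringIO()
    count = json_to_csv(SAMPLE_RESPONSE, buffer)
    print(f"Wrote {count} rows")
    print(buffer.getvalue())
```

In practice the JSON text would come from an HTTP request to the API rather than from a string in the script, and the output would go to a file opened with open("data.csv", "w", newline="").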

If you are already comfortable doing these things, great.

If you are not yet comfortable, I am going to be organizing three free workshops called the Community Data Science Workshops on Saturdays in April and May and I very strongly recommend that you attend them. The workshops will teach exactly the skills I'm expecting you to have and attending the workshops will be enough to fulfill this requirement.

The workshops will meet four times so please block these out on your calendar now:

  1. Friday 4/8 6-9pm
  2. Saturday 4/9 9:45am-4pm
  3. Saturday 4/23 9:45am-4pm
  4. Saturday 5/7 9:45am-4pm

I have taught these workshops twice before in the Spring and Fall quarters of 2014 and 2015. If you have taken them in the past, you do not need to take them again. If you are feeling unsure about your skills, you will be welcome to come back to review and brush up on the material.

If you do not have the technical skills required above and you will not attend the workshops, you're going to be responsible for learning this material on your own. If you know you will be in this situation, contact me before the quarter starts.

Assignments

The assignments in this class are designed to give you an opportunity to try your hand at using the conceptual material taught in the class. There will be no exams or quizzes. Unless otherwise noted, all assignments are due at the end of the day (i.e., 11:59pm on the day they are due).

Method Presentation

Related to participation, every student will be assigned a research method and asked to investigate how it is being adapted to or developing within Internet studies and to report on these results in a new Wikipedia article or in a major revision of an existing article.

The article should include several links to, and examples of, the method from published literature, an assessment of the potential affordances and constraints of this method for Internet research, a neutral and even-handed critique of some of the ways it has been employed in Internet research to date, and a list of references. All of these should be formatted according to Wikipedia policies.

Links to articles will be distributed ahead of class and all students will be expected to read them before we meet.

Wikipedia Task #1 - Create an account and Wikipedia orientation

Due: April 3
Deliverables: Make contributions in Wikipedia
  • Finish the online student orientation for our Wikipedia course. During this training, you will create an account, make edits in a sandbox, and learn the basic rules of the Wikipedia community.
  • Create a user page, and sign up on the list of students on the course page.
  • To practice editing and communicating on Wikipedia, introduce yourself to me and at least one classmate on Wikipedia.

Wikipedia Task #2 - Draft of Wikipedia Article

Maximum Length: 2000 words
Deliverables: Make contributions in Wikipedia and share link in Canvas discussion
Due Date: 11:59pm on the day before the class session in which we will discuss the method
  • Compile a bibliography of relevant research.
  • Write or expand a Wikipedia article on the method you have selected — with citations — in your Wikipedia sandbox.
  • Add your sandboxed article to the class's course page with the template.

Wikipedia Task #3 - Finalize and Peer Review Your Classmates' Articles

Deliverables: Make contributions in Wikipedia
Due Date: June 12
  • Move sandbox articles into the main namespace.
  • Peer review two of your classmates' articles. Leave suggestions on the article talk pages.
  • Copy-edit the two reviewed articles.
  • Make edits to your article based on peers' feedback. If you disagree with a suggestion, use talk pages to politely discuss and come to a consensus on your edit.

Discussion Facilitation

Due Date: Class session in which we will discuss the method
  • In addition to the essay, you will be responsible for facilitating the discussion of your assigned method in class. This means you should come prepared with questions and notes.

Research Project

As a demonstration of your learning in this course, you will design a plan for an Internet research project and will, if possible, also collect (at least) an initial sample of a dataset that you will use to complete the project.

The paper you produce can be in one of the following three genres:

  1. A draft of a manuscript for submission to a conference or journal.
  2. A proposal for funding (e.g., for submission to the NSF for a graduate student fellowship).
  3. A draft of the methods chapter of your dissertation.

In any of the three paths, I expect you to take this opportunity to produce a document that will further your academic career outside of the class.

Project Identification

Due Date: April 10
Maximum paper length: 500 words (~1-2 pages)
Deliverables: Turn in on Canvas

Early on, I want you to identify your final project. Your proposal should be short and can be either paragraphs or bullets. It should include the following things:

  • The genre of the project and a short description of how it fits into your career trajectory.
  • A one paragraph abstract of the proposed study and research question, theory, community, and/or groups you plan to study.
  • A short description of the type of data you plan to collect as part of your final project.

Final Project

Outline Due Date: May 8
Maximum outline length: 2 pages
Paper Due Date: June 12
Maximum paper length: 6000 words (~20 pages)
Presentation Date: June 2
All Deliverables: Turn in on Canvas

Because the emphasis in this class is on methods and because I'm not an expert in each of your areas or fields, I'm happy to assume that your paper, proposal, or thesis chapter has already established the relevance and significance of your study and has a comprehensive literature review, well-grounded conceptual approach, and compelling reason why this research is so important. Instead of providing all of these details, feel free to start with a brief summary of the purpose and importance of this research, and an introduction of your research questions or hypotheses. If you provide more detail, that's fine, but I won't give you detailed feedback on these parts.

The final paper should include:

  • a statement of the purpose, central focus, relevance and significance of this research;
  • a description of the specific Internet application(s) and/or environment(s) and/or objects to be studied and employed in the research;
  • key research questions or hypotheses;
  • operationalization of key concepts;
  • a description and rationale of the specific method(s) (if more than one method will be used, explain how the methods will produce complementary findings);
  • a description of the step-by-step plan for data collection;
  • a description and rationale of the level(s), unit(s) and process of analysis (if more than one kind of data are generated, explain how each kind will be analyzed individually and/or comparatively);
  • an explanation of how these analyses will enable you to answer the RQs;
  • a sample instrument (as appropriate);
  • a sample dataset and description of a formative analysis you have completed;
  • a description of actual or anticipated results and any potential problems with their interpretation;
  • a plan for publishing/disseminating the findings from this research;
  • a summary of technical, ethical, human subjects and legal issues that may be encountered in this research, and how you will address them;
  • a schedule (using specific dates) and proposed budget.

Although I'm not going to require it, I would love for you to actually begin collecting data for your project and describe your progress in your paper. If collecting data for a proposed project is impractical (e.g., because of IRB applications, funding, etc.), I would love for you to engage in the collection of a public dataset as part of a pilot or formative study. If this is not feasible or useful, we can discuss other options.

I have a strong preference for you to write this paper individually but I'm open to the idea that you may want to work with others in the class.

Participation

The course relies heavily on participation and discussion. It is important to realize that we will not summarize reading in class and I will not cover it in lecture. I expect you all to have read it and we will jump in and start discussing it. The "Participation Rubric" section of my detailed page on assessment gives the rubric I will use in evaluating participation.

Grading

I have put together a very detailed page that describes the grading rubric I will be using in this course. Please read it carefully. I will assign grades for each of the following items on the UW 4.0 grade scale according to the weights below:

  • Participation: 25%
  • Presentation of method/approach: 15%
  • Proposal identification: 5%
  • Final paper outline: 5%
  • Final Presentation: 10%
  • Final Paper: 40%

Schedule

Week 1: Monday March 28: Introduction and Framing

Required Readings:

  • Agre, Philip, “Internet Research: For and Against,” in Mia Consalvo, Nancy Baym, Jeremy Hunsinger, Klaus Bruhn Jensen, John Logie, Monica Murero, and Leslie Regan Shade, eds, Internet Research Annual, Volume 1: Selected Papers from the Association of Internet Researchers Conferences 2000-2002, New York: Peter Lang, 2004. [Free Online]
  • Lazer, D., Pentland, A., Adamic, L., Aral, S., Barabasi, A.-L., Brewer, D., … Van Alstyne, M. (2009). Computational Social Science. Science, 323(5915), 721–723. [Available through UW Libraries]
  • Sandvig, Christian, 2010, "Why the Internet is On the Verge of Blowing Up All Our Methods Courses" [Free Online]

Optional Reading:

  • Gane, Nicholas, and Beer, David, 2008, "Introduction: Concepts and Media", from New Media: The Key Concepts, Berg, pp. 1-13. [Available in Canvas]
  • Ragin, Charles (1994), "The Goals of Social Research" (pp. 31-48 can be found here, skip 49-50, page 51 is here), and “The Process of Social Research,” from Constructing Social Research, Pine Forge Press. [Available in Canvas]
  • Bruhn Jensen, Klaus, 2011, "New media, old methods — Internet methodologies and the online/offline divide," in Consalvo & Ess (Eds.), The Handbook of Internet Studies, Blackwell, pp. 43-58. [Available in Canvas]
  • Hesse-Biber, Sharlene Nagy, "Emergent Technologies in Social Research: Pushing Against the Boundaries of Research Praxis," [HET], pp. 3-24. [Available in Canvas]
  • December, John. (March, 1996). "Units of Analysis for Internet Communication," Journal of Computer-Mediated Communication, V.1, N.4. [Available through UW libraries]
  • Steven M. Schneider & Kirsten A. Foot, "The Web as an Object of Study", New Media and Society, V. 6, N.1, 114-122, 2004. [Free Online]
  • Gunkel, David, "To Tell the Truth: The Internet and Emergent Epistemological Challenges in Social Research," [HET], pp. 47-64. [Available in Canvas]
  • Baym, Nancy. (2006). "Finding the Quality in Qualitative Internet Research," in Critical Cyberculture Studies, David Silver and Adrienne Massanari, eds., New York University Press, NY. pp. 79-87. [Available in Canvas]
  • Rogers, Introduction, Chapters 1-2 from Digital Methods, pp. 1-60.
  • Digital Methods Initiative. (2009). The Spheres. [Free Online]
  • Hackett, Edward, "Possible dreams: Research technologies and the transformation of the human sciences," Ch 1 in HET. [Available in Canvas]

Week 1: Wednesday March 30: Ethics

Required Readings:

  • Association of Internet Researchers, Ethics Working Committee, 2011, "Ethics Guidelines" Review Draft. [Free Online]
  • Kramer, A. D. I., Guillory, J. E., & Hancock, J. T. (2014). Experimental evidence of massive-scale emotional contagion through social networks. Proceedings of the National Academy of Sciences, 111(24), 8788–8790. [Available through UW Libraries]
  • [Look Over Briefly] Grimmelmann, James. (2014) The Facebook Emotional Manipulation Study: Sources. [Free Online]
  • Carr, N. (2014, September 14). The Manipulators: Facebook’s Social Engineering Project. Retrieved March 26, 2015. [Free Online]
  • Bernstein, M. (2014, July 7). The Destructive Silence of Social Computing Researchers. Retrieved March 26, 2015. [Free Online]
  • Lampe, C. (2014, July 8). Facebook Is Good for Science. [Free Online]

Optional Readings:

  • The Belmont Report. (1979).
  • American Association for the Advancement of Science, 1999, “Ethical and Legal Aspects of Human Subjects Research in Cyberspace.” [Free Online]
  • Digital Millennium Copyright Act and these explanatory/commentary essays & sites:
    • The Electronic Frontier Foundation's page on the DMCA.
    • Templeton, Brad's A Brief Intro to Copyright & 10 Big Myths about Copyright Explained
    • Sections on Copyright, Privacy, and Social Media in the "Internet Case Digest" of the Perkins Coie LLP site.

Week 2: Monday April 4: NO CLASS

Week 2: Wednesday April 6: Web Archiving

Required Readings:

  • Bruegger, Niels, "Web archiving — Between past, present, and future," in Consalvo & Ess, The Handbook of Internet Studies, Blackwell, pp. 24-42. [Available in Canvas]
  • Weber, M. S. (2014). Observing the Web by Understanding the Past: Archival Internet Research. In Proceedings of the Companion Publication of the 23rd International Conference on World Wide Web Companion (pp. 1031–1036). Republic and Canton of Geneva, Switzerland: International World Wide Web Conferences Steering Committee. [Available through UW Libraries]
  • Rogers, Richard, Chapter 3 "The Website as Archived Object" from Digital Methods, pp. 61-82. [Available through Canvas]

Optional Readings:

  • Schneider, Steven, Kirsten Foot, and Paul Wouters, 2009, “Web Archiving as E-Research,” in e-Research: Transformation in Scholarly Practice, Nicholas Jankowski (Ed.), Routledge, pp. 205-221. [Available in Canvas]
  • Vargo, C. J., Guo, L., McCombs, M., & Shaw, D. L. (2014). Network Issue Agendas on Twitter During the 2012 U.S. Presidential Election. Journal of Communication, 64(2), 296–316. [Available through UW Libraries]
  • boyd, danah, and Kate Crawford, (2012) "Critical Questions for Big Data," Information, Communication, & Society, May. [Available through UW Libraries]
  • Gherab-Martin, Karim, "Digital repositories, folksonomies, and interdisciplinary research: New social epistemology tools," Ch. 10 in HET. [Available in Canvas]
  • Rogers, Richard, Chapter 4 "Googlization and the Inculpable Search Engine" from Digital Methods.

Week 2: Friday April 8: CDSW Session 0

As described in the section on technical skills above, I expect everybody who is not comfortable with at least basic programming and data collection to attend the Community Data Science Workshops (Spring 2016) which I am running concurrently with this class.

This session will run from 6-9pm and is the only session which can probably be missed. Please do contact me, however, if you will not be able to attend it.

Week 2: Saturday April 9: CDSW Session 1

As described in the section on technical skills above, I expect everybody who is not comfortable with at least basic programming and data collection to attend the Community Data Science Workshops (Spring 2016) which I am running concurrently with this class.

This session will run from 9am-3pm.

Week 3: Monday April 11: Digital Ethnography & Trace Ethnography

Week 3: Wednesday April 13: Crowdsourced Data Analysis

Week 4: Monday April 18: Textual Analyses

Week 4: Wednesday April 20: Visual Analysis

Week 4: Saturday April 23: CDSW Session 2

Week 5: Monday April 25: Social Network Analysis

Week 5: Wednesday April 27: Crowdsourcing

Week 6: Monday May 2: Online "Lab" Experiments

Week 6: Wednesday May 4: Field Experiments

Week 6: Saturday May 7: CDSW Session 3

Week 7: Monday May 9: NO CLASS

Week 7: Wednesday May 11: Hyperlink Networks

Week 8: Monday May 16: Sensor Data

Week 8: Wednesday May 18: Visual Analysis

Week 9: Monday May 23: Narrative Analysis

Week 9: Wednesday May 25: ????

Week 10: Monday May 30: Final Presentations

Week 10: Wednesday June 1: Final Presentations

Surveys

Online Interviews