Designing Internet Research (Winter 2020)
=== Week 2: Tuesday January 14: (I) Internet Data Collection (II) Textual Analysis ===

==== Part I: Internet Data Collection ====

'''Required Readings:'''

* Mislove, Alan, and Christo Wilson. 2018. “A Practitioner’s Guide to Ethical Web Data Collection.” In The Oxford Handbook of Networked Communication, edited by Brooke Foucault Welles and Sandra González-Bailón. London, UK: Oxford University Press. https://doi.org/10.1093/oxfordhb/9780190460518.001.0001. ''[[https://doi.org/10.1093/oxfordhb/9780190460518.001.0001 Available through UW libraries]]''
* Brügger, Niels. 2018. “Web History and Social Media.” In The SAGE Handbook of Social Media, edited by Jean Burgess, Alice Marwick, and Thomas Poell, 196–212. London, UK: SAGE Publications Ltd. https://doi.org/10.4135/9781473984066. ''[[https://doi.org/10.4135/9781473984066 Available through UW Libraries]]''
* Shumate, Michelle, and Matthew S. Weber. 2015. “The Art of Web Crawling for Social Science Research.” In Digital Research Confidential: The Secrets of Studying Behavior Online, edited by Eszter Hargittai and Christian Sandvig, 234–59. Cambridge, MA: The MIT Press. ''[[https://canvas.uw.edu/files/61060105/download?download_frd=1 Available in Canvas]]''
* Freelon, Deen. 2018. “Computational Research in the Post-API Age.” Political Communication 35 (4): 665–68. https://doi.org/10.1080/10584609.2018.1477506. ''[[https://doi.org/10.1080/10584609.2018.1477506 Available through UW Libraries]]''
* '''[Example]''' Graeff, Erhardt, Matt Stempeck, and Ethan Zuckerman. 2014. “The Battle for ‘Trayvon Martin’: Mapping a Media Controversy Online and Off-Line.” First Monday 19 (2). http://firstmonday.org/ojs/index.php/fm/article/view/4947. ''[[http://firstmonday.org/ojs/index.php/fm/article/view/4947 Available free online]]''

'''Optional Readings:'''

* Ankerson, Megan Sapnar. 2015. “Read/Write the Digital Archive: Strategies for Historical Web Research.” In Digital Research Confidential: The Secrets of Studying Behavior Online, edited by Eszter Hargittai and Christian Sandvig, 29–54. Cambridge, MA: MIT Press. ''[[https://canvas.uw.edu/files/61061872/download?download_frd=1 Available in Canvas]]''
* Spaniol, Marc, Dimitar Denev, Arturas Mazeika, Gerhard Weikum, and Pierre Senellart. 2009. “Data Quality in Web Archiving.” In Proceedings of the 3rd Workshop on Information Credibility on the Web, 19–26. WICOW ’09. New York, NY, USA: ACM. https://doi.org/10.1145/1526993.1526999. ''[[https://doi.org/10.1145/1526993.1526999 Available through UW Libraries]]''
* Schneider, Steven M., and Kirsten A. Foot. 2004. “The Web as an Object of Study.” New Media & Society 6 (1): 114–22. https://doi.org/10.1177/1461444804039912. ''[[https://doi.org/10.1177/1461444804039912 Available through UW Libraries]]''
* Weber, Matthew S. 2014. “Observing the Web by Understanding the Past: Archival Internet Research.” In Proceedings of the Companion Publication of the 23rd International Conference on World Wide Web Companion, 1031–1036. WWW Companion ’14. Republic and Canton of Geneva, Switzerland: International World Wide Web Conferences Steering Committee. https://doi.org/10.1145/2567948.2579213. ''[[https://doi.org/10.1145/2567948.2579213 Available through UW Libraries]]''

'''Optional readings related to the ethics of data collection online:'''

* Amy Bruckman's two 2016 blog posts about researchers violating Terms of Service (TOS) while doing academic research: [https://nextbison.wordpress.com/2016/02/26/tos/ Do Researchers Need to Abide by Terms of Service (TOS)? An Answer.] and [https://nextbison.wordpress.com/2016/02/29/tos2/ More on TOS: Maybe Documenting Intent Is Not So Smart]
* [http://www.copyright.gov/legislation/dmca.pdf Digital Millennium Copyright Act] and these explanatory/commentary essays & sites:
** The [https://www.eff.org/ Electronic Frontier Foundation's] [https://www.eff.org/issues/dmca page on the DMCA].
** Templeton, Brad's [http://www.templetons.com/brad/copyright.html A Brief Intro to Copyright] & [http://www.templetons.com/brad/copymyths.html 10 Big Myths about Copyright Explained]
** Sections on Copyright, Privacy, and Social Media in the “Internet Case Digest” of the [http://www.perkinscoie.com/casedigest/ Perkins Coie LLP “Case Digest” site].
* Narayanan, A., and V. Shmatikov. 2008. “Robust De-Anonymization of Large Sparse Datasets.” In IEEE Symposium on Security and Privacy, 2008. SP 2008, 111–25. https://doi.org/10.1109/SP.2008.33. ''[[https://doi.org/10.1109/SP.2008.33 Available through UW Libraries]]''

'''Two useful resources for data collection:'''

* [http://www.archiveteam.org/index.php?title=Main_Page Archive Team] is an online community that archives websites. They are a fantastic resource and include many pieces of detailed technical documentation on the practice of engaging in web archiving. For example, here are detailed explanations of [http://www.archiveteam.org/index.php?title=Wget#Mirroring_a_website mirroring a website with GNU wget], which is the piece of free software I usually use to archive websites.
* [https://www.openhumans.org/ OpenHumans] is an online community where people share personal data with each other and with researchers.

==== Part II: Textual Analyses ====

'''Required Readings:'''

* McMillan, Sally J. 2000. “The Microscope and the Moving Target: The Challenge of Applying Content Analysis to the World Wide Web.” Journalism & Mass Communication Quarterly 77 (1): 80–98. https://doi.org/10.1177/107769900007700107. ''[[https://doi.org/10.1177/107769900007700107 Available through UW Libraries]]''
* Shah, Dhavan V., Joseph N. Cappella, W. Russell Neuman, Rodrigo Zamith, and Seth C. Lewis. 2015. “Content Analysis and the Algorithmic Coder: What Computational Social Science Means for Traditional Modes of Media Analysis.” The ANNALS of the American Academy of Political and Social Science 659 (1): 307–18. https://doi.org/10.1177/0002716215570576. ''[[https://doi.org/10.1177/0002716215570576 Available in UW libraries]]''
* Grimmer, Justin, and Brandon M. Stewart. 2013. “Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts.” Political Analysis, January, mps028. https://doi.org/10.1093/pan/mps028. ''[[https://doi.org/10.1093/pan/mps028 Available through UW Libraries]]''
* DiMaggio, Paul, Manish Nag, and David Blei. 2013. “Exploiting Affinities between Topic Modeling and the Sociological Perspective on Culture: Application to Newspaper Coverage of U.S. Government Arts Funding.” Poetics, Topic Models and the Cultural Sciences, 41 (6): 570–606. https://doi.org/10.1016/j.poetic.2013.08.004. ''[[https://doi.org/10.1016/j.poetic.2013.08.004 Available through UW Libraries]]''
* Feldman, Ronen. 2013. “Techniques and Applications for Sentiment Analysis.” Communications of the ACM 56 (4): 82–90. https://doi.org/10.1145/2436256.2436274. ''[[https://doi.org/10.1145/2436256.2436274 Available in UW libraries]]'' <!-- super industry focused. remove next time and replace w/ something better -->

'''Optional Readings:'''

* Trilling, Damian, and Jeroen G. F. Jonkman. 2018. “Scaling up Content Analysis.” Communication Methods and Measures 12 (2–3): 158–74. https://doi.org/10.1080/19312458.2018.1447655. ''[[https://doi.org/10.1080/19312458.2018.1447655 Available in UW libraries]]''
* Leetaru, Kalev Hannes. 2011. Data Mining Methods for the Content Analyst: An Introduction to the Computational Analysis of Content. Routledge Communication Series. New York, NY: Taylor and Francis. ''[[https://ebookcentral.proquest.com/lib/washington/detail.action?docID=1075229 Available through UW libraries]]''.

I'm assuming you have at least a rough familiarity with [https://en.wikipedia.org/wiki/Content_analysis content analysis] as a methodology. If you're not as comfortable with this, check out the Wikipedia article to start.

The following provide more background on content analysis (in general, and online):

* Van Selm, Martine & Jankowski, Nick, (2005) "[https://canvas.uw.edu/files/36066292/download?download_frd=1 Content Analysis of Internet-Based Documents.]" Unpublished Manuscript. ''[Available in Canvas]''
* Neuendorf, K. A. (2002). The content analysis guidebook. Thousand Oaks, Calif.: Sage Publications. ''[Available from Instructor]''
* Krippendorff, K. (2005). Content analysis: an introduction to its methodology. Thousand Oaks; London; New Delhi: Sage. ''[Available from Instructor]''

Examples of more traditional content analysis using online content:

* Trammell, K. D., Tarkowski, A., Hofmokl, J., & Sapp, A. M. (2006). [http://doi.org/10.1111/j.1083-6101.2006.00032.x Rzeczpospolita blogów (Republic of Blog): Examining Polish Bloggers Through Content Analysis.] Journal of Computer-Mediated Communication, 11(3), 702–722. ''[Available Free Online]''
* Woolley, J. K., Limperos, A. M., & Oliver, M. B. (2010). [http://doi.org/10.1080/15205436.2010.516864 The 2008 Presidential Election, 2.0: A Content Analysis of User-Generated Political Facebook Groups.] Mass Communication and Society, 13(5), 631–652. ''[Available from UW Libraries]''
* Maier, Daniel, A. Waldherr, P. Miltner, G. Wiedemann, A. Niekler, A. Keinert, B. Pfetsch, et al. 2018. “Applying LDA Topic Modeling in Communication Research: Toward a Valid and Reliable Methodology.” Communication Methods and Measures 12 (2–3): 93–118. https://doi.org/10.1080/19312458.2018.1430754.

Another example of topic modeling from political science:

* Barberá, P., Bonneau, R., Egan, P., Jost, J. T., Nagler, J., & Tucker, J. (2014). [http://smapp.nyu.edu/SMAPP_Website_Papers_Articles/leadersAndFollowersMeasuringPolitical.pdf Leaders or Followers? Measuring Political Responsiveness in the US Congress Using Social Media Data.] Presented at the Annual Meeting of the American Political Science Association. ''[Free Online]''