Designing Internet Research (Spring 2022)
From CommunityData
=== Monday April 4: Internet Data Collection ===

'''Required Readings:'''

* Mislove, Alan, and Christo Wilson. 2018. “A Practitioner’s Guide to Ethical Web Data Collection.” In The Oxford Handbook of Networked Communication, edited by Brooke Foucault Welles and Sandra González-Bailón. London, UK: Oxford University Press. https://doi.org/10.1093/oxfordhb/9780190460518.001.0001. {{avail-uw|https://doi.org/10.1093/oxfordhb/9780190460518.001.0001}}
* Brügger, Niels. 2018. “Web History and Social Media.” In The SAGE Handbook of Social Media, edited by Jean Burgess, Alice Marwick, and Thomas Poell, 196–212. London, UK: SAGE Publications Ltd. https://doi.org/10.4135/9781473984066. {{avail-uw|https://doi.org/10.4135/9781473984066}}
* Shumate, Michelle, and Matthew S. Weber. 2015. “The Art of Web Crawling for Social Science Research.” In Digital Research Confidential: The Secrets of Studying Behavior Online, edited by Eszter Hargittai and Christian Sandvig, 234–59. Cambridge, MA: The MIT Press. {{avail-canvas|1=https://canvas.uw.edu/files/90060239/download?download_frd=1}}
* Freelon, Deen. 2018. “Computational Research in the Post-API Age.” Political Communication 35 (4): 665–68. https://doi.org/10.1080/10584609.2018.1477506. {{avail-uw|https://doi.org/10.1080/10584609.2018.1477506}}
* '''[Example]''' Graeff, Erhardt, Matt Stempeck, and Ethan Zuckerman. 2014. “The Battle for ‘Trayvon Martin’: Mapping a Media Controversy Online and Off-Line.” First Monday 19 (2). http://firstmonday.org/ojs/index.php/fm/article/view/4947. {{avail-free|http://firstmonday.org/ojs/index.php/fm/article/view/4947}}

'''Optional Readings:'''

* Ankerson, Megan Sapnar. 2015. “Read/Write the Digital Archive: Strategies for Historical Web Research.” In Digital Research Confidential: The Secrets of Studying Behavior Online, edited by Eszter Hargittai and Christian Sandvig, 29–54. Cambridge, MA: MIT Press. {{avail-canvas|1=https://canvas.uw.edu/files/90060241/download?download_frd=1}}
* Spaniol, Marc, Dimitar Denev, Arturas Mazeika, Gerhard Weikum, and Pierre Senellart. 2009. “Data Quality in Web Archiving.” In Proceedings of the 3rd Workshop on Information Credibility on the Web, 19–26. WICOW ’09. New York, NY, USA: ACM. https://doi.org/10.1145/1526993.1526999. {{avail-uw|https://doi.org/10.1145/1526993.1526999}}
* Schneider, Steven M., and Kirsten A. Foot. 2004. “The Web as an Object of Study.” New Media & Society 6 (1): 114–22. https://doi.org/10.1177/1461444804039912. {{avail-uw|https://doi.org/10.1177/1461444804039912}}
* Weber, Matthew S. 2014. “Observing the Web by Understanding the Past: Archival Internet Research.” In Proceedings of the Companion Publication of the 23rd International Conference on World Wide Web Companion, 1031–1036. WWW Companion ’14. Republic and Canton of Geneva, Switzerland: International World Wide Web Conferences Steering Committee. https://doi.org/10.1145/2567948.2579213. {{avail-uw|https://doi.org/10.1145/2567948.2579213}}

'''Optional readings related to the ethics of data collection online:'''

* Amy Bruckman's two 2016 blog posts about researchers violating Terms of Service (TOS) while doing academic research: [https://nextbison.wordpress.com/2016/02/26/tos/ Do Researchers Need to Abide by Terms of Service (TOS)? An Answer.] and [https://nextbison.wordpress.com/2016/02/29/tos2/ More on TOS: Maybe Documenting Intent Is Not So Smart]
* [http://www.copyright.gov/legislation/dmca.pdf Digital Millennium Copyright Act] and these explanatory/commentary essays & sites:
** The [https://www.eff.org/ Electronic Frontier Foundation's] [https://www.eff.org/issues/dmca page on the DMCA].
** Brad Templeton's [http://www.templetons.com/brad/copyright.html A Brief Intro to Copyright] & [http://www.templetons.com/brad/copymyths.html 10 Big Myths about Copyright Explained]
** Sections on Copyright, Privacy, and Social Media in the “Internet Case Digest” of the [http://www.perkinscoie.com/casedigest/ Perkins Coie LLP “Case Digest” site].
* Narayanan, A., and V. Shmatikov. 2008. “Robust De-Anonymization of Large Sparse Datasets.” In IEEE Symposium on Security and Privacy, 2008. SP 2008, 111–25. https://doi.org/10.1109/SP.2008.33. {{avail-uw|https://doi.org/10.1109/SP.2008.33}}

'''Two useful resources for data collection:'''

* [http://www.archiveteam.org/index.php?title=Main_Page Archive Team] is an online community that archives websites. They are a fantastic resource and maintain detailed technical documentation on the practice of web archiving. For example, here are detailed explanations of [http://www.archiveteam.org/index.php?title=Wget#Mirroring_a_website mirroring a website with GNU wget], the free software I usually use to archive websites.
* [https://www.openhumans.org/ OpenHumans] is an online community where people share personal data with each other and with researchers.
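To make the wget mirroring approach mentioned above a bit more concrete, here is a minimal sketch in Python that only assembles a wget command line without running it. The flags are standard GNU wget options, but the function name, the one-second default wait, the output directory, and the example URL are illustrative assumptions, not Archive Team's recommended recipe. Check a site's terms and robots.txt (see the ethics readings above) before mirroring anything.

```python
import shlex


def build_wget_mirror_command(url, wait_seconds=1, output_dir="mirror"):
    """Return an argument list for mirroring `url` with GNU wget.

    Hypothetical helper for illustration; it builds the command but
    does not execute it.
    """
    return [
        "wget",
        "--mirror",             # recursive download with timestamping
        "--page-requisites",    # also fetch images, CSS, etc. needed to render pages
        "--convert-links",      # rewrite links so the local copy can be browsed offline
        "--adjust-extension",   # save HTML/CSS files with matching extensions
        "--no-parent",          # never ascend above the starting directory
        f"--wait={wait_seconds}",             # pause between requests to limit server load
        f"--directory-prefix={output_dir}",   # where the mirror is written
        url,
    ]


if __name__ == "__main__":
    command = build_wget_mirror_command("https://example.org/")
    # Print the command so it can be reviewed before anyone runs it.
    print(shlex.join(command))
```

To actually run the mirror, the returned list could be passed to <code>subprocess.run(command, check=True)</code>, assuming wget is installed on your machine.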