Editing DS4UX (Spring 2016)/Panama Papers
From CommunityData
Latest revision | Your text | ||
Line 3: | Line 3: | ||
In this project, we will explore a few ways to gather data using two Wikipedia APIs: one provides data related to edits, and the other provides data related to pageviews. Once we've done that, we will extend this code to create our own datasets of Wikipedia edits or other data that you can use as the basis of your Final Project. | In this project, we will explore a few ways to gather data using two Wikipedia APIs: one provides data related to edits, and the other provides data related to pageviews. Once we've done that, we will extend this code to create our own datasets of Wikipedia edits or other data that you can use as the basis of your Final Project. | ||
This project is adapted from material being developed for the [[CDSW|Community Data Science Workshops]] by Ben Lewis and Morten Wang | This project is adapted from material being developed for the [[CDSW|Community Data Science Workshops]] by Ben Lewis and Morten Wang. | ||
== Overview == | == Overview == | ||
In this project we will look at the viewing and editing history of a recently created Wikipedia article about a breaking news event—''[[w:Panama_Papers| Panama Papers]]''. When events of global significance occur, Wikipedia is often among the first places that people look for information about these events. By examining both the editing and viewing history of this article, we can learn a lot about how people create ''and'' consume information on Wikipedia. | In this project we will look at the viewing and editing history of a recently created Wikipedia article about a breaking news event—''[[w:Panama_Papers| Panama Papers]]''. When events of global significance occur, Wikipedia is often among the first places that people look for information about these events. By examining both the editing and viewing history of this article, we can learn a lot about how people create ''and'' consume information on Wikipedia. | ||
=== Goals === | === Goals === | ||
Line 48: | Line 46: | ||
* [https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&format=json&prop=revisions&list=&meta=&titles=Panama_Papers&rvprop=ids%7Ctimestamp%7Cflags%7Ccomment%7Cuser&rvlimit=1&rvdir=newer View query in sandbox] | * [https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&format=json&prop=revisions&list=&meta=&titles=Panama_Papers&rvprop=ids%7Ctimestamp%7Cflags%7Ccomment%7Cuser&rvlimit=1&rvdir=newer View query in sandbox] | ||
* [https://en.wikipedia.org/w/api.php?action=query&format=json&prop=revisions&list=&meta=&titles=Panama_Papers&rvprop=ids%7Ctimestamp%7Cflags%7Ccomment%7Cuser&rvlimit=1&rvdir=newer View result in browser] | * [https://en.wikipedia.org/w/api.php?action=query&format=json&prop=revisions&list=&meta=&titles=Panama_Papers&rvprop=ids%7Ctimestamp%7Cflags%7Ccomment%7Cuser&rvlimit=1&rvdir=newer View result in browser] | ||
Line 55: | Line 52: | ||
* [https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&format=json&prop=revisions&list=&meta=&titles=Panama_Papers&rvprop=ids%7Ctimestamp%7Cflags%7Ccomment%7Cuser&rvlimit=1&rvdir=older View query in sandbox] | * [https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&format=json&prop=revisions&list=&meta=&titles=Panama_Papers&rvprop=ids%7Ctimestamp%7Cflags%7Ccomment%7Cuser&rvlimit=1&rvdir=older View query in sandbox] | ||
* [https://en.wikipedia.org/w/api.php?action=query&format=json&prop=revisions&list=&meta=&titles=Panama_Papers&rvprop=ids%7Ctimestamp%7Cflags%7Ccomment%7Cuser&rvlimit=1&rvdir=older View result in browser] | * [https://en.wikipedia.org/w/api.php?action=query&format=json&prop=revisions&list=&meta=&titles=Panama_Papers&rvprop=ids%7Ctimestamp%7Cflags%7Ccomment%7Cuser&rvlimit=1&rvdir=older View result in browser] | ||
; How many edits has the creator of Panama Papers made to Wikipedia? | ; How many edits has the creator of Panama Papers made to Wikipedia? | ||
* [https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&format=json&list=users&usprop=editcount%7Cregistration&ususers=Czar View query in sandbox] | :* [https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&format=json&list=users&usprop=editcount%7Cregistration&ususers=Czar View query in sandbox] | ||
* [https://en.wikipedia.org/w/api.php?action=query&format=json&list=users&usprop=editcount%7Cregistration&ususers=Czar View result in browser | :* [https://en.wikipedia.org/w/api.php?action=query&format=json&list=users&usprop=editcount%7Cregistration&ususers=Czar View result in browser] | ||
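The same user query can be assembled in Python. The following is a minimal sketch, not the course's own script: the helper names <code>build_user_query_url</code> and <code>extract_edit_count</code> are hypothetical, and the sample response contains made-up values that only imitate the shape of the API's JSON output. It makes no network request.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_user_query_url(username):
    """Return the URL for a list=users query (edit count and registration date)."""
    params = {
        "action": "query",
        "format": "json",
        "list": "users",
        "usprop": "editcount|registration",  # urlencode turns "|" into %7C
        "ususers": username,
    }
    return API_ENDPOINT + "?" + urlencode(params)


def extract_edit_count(response):
    """Pull the edit count for the first user out of a decoded JSON response."""
    return response["query"]["users"][0]["editcount"]


# Placeholder response with made-up values (NOT real data for this user).
sample_response = {
    "query": {
        "users": [
            {
                "userid": 1,
                "name": "Czar",
                "editcount": 12345,
                "registration": "2001-01-01T00:00:00Z",
            }
        ]
    }
}

print(build_user_query_url("Czar"))
print(extract_edit_count(sample_response))
```

Building the parameters as a dictionary and letting <code>urlencode</code> handle the escaping avoids typing <code>%7C</code> by hand, and the resulting URL matches the "View result in browser" link above.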
Line 69: | Line 64: | ||
* [https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&format=json&prop=revisions&titles=Panama+Papers&rvprop=ids%7Ctimestamp%7Ccontent&rvstart=2016-04-04T17%3A58%3A00.000Z&rvend=2016-04-04T17%3A59%3A05.000Z&rvdir=newer View query in sandbox] | * [https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&format=json&prop=revisions&titles=Panama+Papers&rvprop=ids%7Ctimestamp%7Ccontent&rvstart=2016-04-04T17%3A58%3A00.000Z&rvend=2016-04-04T17%3A59%3A05.000Z&rvdir=newer View query in sandbox] | ||
* [https://en.wikipedia.org/w/api.php?action=query&format=json&prop=revisions&titles=Panama+Papers&rvprop=ids%7Ctimestamp%7Ccontent&rvstart=2016-04-04T17%3A58%3A00.000Z&rvend=2016-04-04T17%3A59%3A05.000Z&rvdir=newer View result in browser] | * [https://en.wikipedia.org/w/api.php?action=query&format=json&prop=revisions&titles=Panama+Papers&rvprop=ids%7Ctimestamp%7Ccontent&rvstart=2016-04-04T17%3A58%3A00.000Z&rvend=2016-04-04T17%3A59%3A05.000Z&rvdir=newer View result in browser] | ||
* [https://en.wikipedia.org/w/index.php?title=Panama_Papers&oldid=713548359 revision on Wikipedia] | |||
* [https://en.wikipedia.org/w/index.php?title=Panama_Papers&oldid=713548359 | * [https://en.wikipedia.org/w/index.php?title=Panama_Papers&diff=713548359&oldid=713548357 "diff" view of revision] | ||
* [https://en.wikipedia.org/w/index.php?title=Panama_Papers&diff=713548359&oldid=713548357 | |||
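Once a revision query returns a revision ID and its parent ID, the permanent link and "diff" link can be built from them. This is a rough sketch under stated assumptions: the function names are hypothetical, the page ID and timestamp in the sample response are placeholders, and the parent ID is assumed to be the one that appears in the diff link above.

```python
from urllib.parse import urlencode

INDEX_ENDPOINT = "https://en.wikipedia.org/w/index.php"


def first_revision(response):
    """Return the first revision of the first page in a prop=revisions response."""
    pages = response["query"]["pages"]  # dict keyed by page ID
    page = next(iter(pages.values()))
    return page["revisions"][0]


def revision_links(title, revision):
    """Build the permanent link and the diff link for one revision."""
    permalink = INDEX_ENDPOINT + "?" + urlencode(
        {"title": title, "oldid": revision["revid"]}
    )
    diff_link = INDEX_ENDPOINT + "?" + urlencode(
        {"title": title, "diff": revision["revid"], "oldid": revision["parentid"]}
    )
    return permalink, diff_link


# Placeholder response: page ID "1" and the timestamp are made up.
sample_response = {
    "query": {
        "pages": {
            "1": {
                "title": "Panama Papers",
                "revisions": [
                    {
                        "revid": 713548359,
                        "parentid": 713548357,  # assumed from the diff link
                        "timestamp": "2016-04-04T17:58:30Z",
                    }
                ],
            }
        }
    }
}

permalink, diff_link = revision_links("Panama_Papers", first_revision(sample_response))
print(permalink)
print(diff_link)
```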
Line 111: | Line 105: | ||
Now that we're comfortable building API queries in the sandbox, we will focus on how we can access these APIs with Python. If you would like to review the steps involved in building an API query in Python, check out the resources listed below. | Now that we're comfortable building API queries in the sandbox, we will focus on how we can access these APIs with Python. If you would like to review the steps involved in building an API query in Python, check out the resources listed below. | ||
* [https://github.com/makoshark/wikipedia-cdsw/blob/master/building-a-query.md Querying APIs from Python], a written lecture by Ben Lewis that walks you step-by-step through the process of building and executing an API query in Python. The 'companion script' <code>building_a_query_code.py</code> in the project directory executes all of the code shown in this lecture step-by-step. If you want to execute only some of the code in the lecture, comment out everything below the blocks of code you want to run before you run the script. | * [https://github.com/makoshark/wikipedia-cdsw/blob/master/building-a-query.md Querying APIs from Python], a written lecture by Ben Lewis that walks you step-by-step through the process of building and executing an API query in Python. The 'companion script' <code>building_a_query_code.py</code> in the project directory executes all of the code shown in this lecture step-by-step. If you want to execute only some of the code in the lecture, comment out everything below the blocks of code you want to run before you run the script. | ||
* <code>introduce_while.py</code> — (in project directory) this script uses a while loop to roll two 'virtual dice' until they both come up 6's. | |||
* <code>introduce_while.py</code> — (in project directory) this script uses a while loop to roll two 'virtual dice' until they both come up 6's | * <code>introduce_continue.py</code> — (in project directory) | ||
* <code>introduce_continue.py</code> — (in project directory) | |||
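The dice-rolling idea described for <code>introduce_while.py</code> can be sketched as follows. This is an illustration of the technique only, not the contents of the project script; the function name and the seed value are arbitrary choices made for this example.

```python
import random


def roll_until_double_six(rng):
    """Roll two virtual dice until both come up 6; return every roll made."""
    rolls = []
    die1, die2 = 0, 0  # start with values that cannot be a real roll
    while not (die1 == 6 and die2 == 6):
        die1 = rng.randint(1, 6)  # randint includes both endpoints
        die2 = rng.randint(1, 6)
        rolls.append((die1, die2))
    return rolls


# A seeded generator makes the run repeatable; the seed itself is arbitrary.
rolls = roll_until_double_six(random.Random(2016))
print("Rolled", len(rolls), "times; last roll:", rolls[-1])
```

The loop condition is checked before every pass, so the loop body keeps running until the most recent roll is a pair of sixes. The same pattern is useful when an API returns results in batches and you need to keep requesting until nothing is left.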
Line 128: | Line 121: | ||
;3. How many times was Panama Papers viewed in the first week? What proportion of those views came from mobile devices? | ;3. How many times was Panama Papers viewed in the first week? What proportion of those views came from mobile devices? | ||
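One possible way to approach this question is sketched below. It assumes the Wikimedia Pageviews REST API's per-article endpoint and its access categories (<code>desktop</code>, <code>mobile-web</code>, <code>mobile-app</code>); the helper names are hypothetical, the date range is an assumed "first week", and the view counts are invented numbers used only to show the arithmetic. It makes no network request.

```python
from urllib.parse import quote

# Assumed endpoint for per-article pageview counts.
PAGEVIEWS_ENDPOINT = (
    "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"
)


def build_pageviews_url(article, access, start, end,
                        project="en.wikipedia", agent="user"):
    """Build a daily per-article pageviews URL; dates are YYYYMMDD strings."""
    parts = [project, access, agent, quote(article, safe=""), "daily", start, end]
    return PAGEVIEWS_ENDPOINT + "/" + "/".join(parts)


def total_views(response):
    """Add up the 'views' field across all items in a decoded response."""
    return sum(item["views"] for item in response["items"])


def mobile_proportion(views_by_access):
    """Share of views that came from the mobile web site or the mobile app."""
    mobile = views_by_access["mobile-web"] + views_by_access["mobile-app"]
    total = mobile + views_by_access["desktop"]
    return mobile / total


# Assumed date range for the article's first week.
url = build_pageviews_url("Panama_Papers", "mobile-web", "20160403", "20160409")
print(url)

# Invented totals per access type (NOT real pageview data).
made_up_totals = {"desktop": 600, "mobile-web": 300, "mobile-app": 100}
print(mobile_proportion(made_up_totals))
```

In practice you would request one URL per access type, pass each decoded response to <code>total_views</code>, and feed the three totals into <code>mobile_proportion</code>.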
Line 148: | Line 139: | ||
* [https://en.wikipedia.org/w/api.php?action=help&modules=query API documentation for the query module] | * [https://en.wikipedia.org/w/api.php?action=help&modules=query API documentation for the query module] | ||
* [https://en.wikipedia.org/wiki/Special:ApiSandbox API Sandbox] | * [https://en.wikipedia.org/wiki/Special:ApiSandbox API Sandbox] | ||
* [[Sample | * [[Sample Wikipedia API queries]] | ||
* [https://github.com/ben-zen/wikipedia-session The session lecture notes (in Markdown) and python sources.] | |||
=== Research using Wikipedia data === | === Research using Wikipedia data === | ||
Line 155: | Line 146: | ||
* [http://www.brianckeegan.com/papers/CSCW_2015.pdf ‘Is’ to ‘Was’: Coordination and Commemoration on Posthumous Wikipedia Biographies] — an exploration of editing patterns around Wikipedia articles about people who have recently died. | * [http://www.brianckeegan.com/papers/CSCW_2015.pdf ‘Is’ to ‘Was’: Coordination and Commemoration on Posthumous Wikipedia Biographies] — an exploration of editing patterns around Wikipedia articles about people who have recently died. | ||
* [http://www.brianckeegan.com/papers/ICS_2015.pdf WikiWorthy: Judging a Candidate’s Notability in the Community] — A study that uses the editing activity on Wikipedia articles about political candidates as a predictor of election success. | * [http://www.brianckeegan.com/papers/ICS_2015.pdf WikiWorthy: Judging a Candidate’s Notability in the Community] — A study that uses the editing activity on Wikipedia articles about political candidates as a predictor of election success. | ||
=== Websites that use the MediaWiki API === | === Websites that use the MediaWiki API === |