* Notes on Wikia Dumps [[CommunityData:Wikia Dumps]]
* Notes on the code -- Now with a diagram! [[CommunityData:Message Walls Code]]

= Robustness Checks =
* Pre-period matching placebo test
* Normal placebo test

= Task Management =

==Overview==
(Updated March 15th)
*By first week of October, we have a complete dataset
*By end of October, we build all variables and start analysis
*By end of November, we conduct and finish data analysis
*Sneha gets a lot of writing done during dissertation boot camp (Dec 4-15)
*By January, have a solid draft ready for the CSCW second round

===Get missing wikis===
*'''ASAP''' Need to use wikilist3.csv to determine which wikis we don't have - Salt (with Mako's help)
*'''ASAP''' Download the rest and put them through wikiq and build edit weeks - Salt (with Mako's help)

==Next Steps (Sept 28)==
* (Salt) Verify which dumps are good
* (Salt) Rerun wikiq to get encoded urls
* (Salt) Write R code to define edits as either newcomer or non-newcomer
* (Nate) Finish refactoring build wiki list code
* (Nate) Make code architecture diagram
* (Nate) Continue work on bot and admin scraper
* (Nate, from last week) Convert dates to lubridate
* (Sneha) Contact Danny Horn to get information on message wall rollouts
* (Sneha) Determine inclusion criteria for wikis, and write python code to subset the ones we want
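The newcomer/non-newcomer split in the list above could look roughly like this. The task calls for R, but the logic is the same in any language, so here is a Python sketch; the 60-day window and the field names are placeholders, not settled project decisions.

```python
# Illustrative sketch of the newcomer/non-newcomer split. The cutoff
# (NEWCOMER_WINDOW) and variable names are assumptions for the example,
# not decisions the project has actually made.
from datetime import datetime, timedelta

NEWCOMER_WINDOW = timedelta(days=60)  # assumed cutoff, for illustration

def is_newcomer_edit(edit_date, first_edit_date, window=NEWCOMER_WINDOW):
    """An edit counts as a newcomer edit if it falls within `window`
    of the editor's first recorded edit on the wiki."""
    return edit_date - first_edit_date <= window

first = datetime(2012, 1, 1)
early = datetime(2012, 2, 15)   # 45 days after the first edit
late = datetime(2012, 6, 1)     # well past the window
print(is_newcomer_edit(early, first), is_newcomer_edit(late, first))
# prints: True False
```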
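The python subsetting task above could be sketched as below. The inclusion criteria are still to be decided, so the field names (`total_edits`, `msgwall_date`) and thresholds here are hypothetical stand-ins for whatever wikilist3.csv actually records.

```python
# Hypothetical sketch of the wiki-subsetting step: the real inclusion
# criteria and column names are not yet decided, so everything below
# (field names, thresholds) is illustrative only.

def subset_wikis(wikis, min_total_edits=100, require_transition_date=True):
    """Return the wikis that meet the (placeholder) inclusion criteria."""
    kept = []
    for w in wikis:
        if require_transition_date and w.get("msgwall_date") is None:
            continue  # skip wikis whose message wall rollout date is unknown
        if w.get("total_edits", 0) < min_total_edits:
            continue  # skip very small wikis
        kept.append(w)
    return kept

# Made-up rows standing in for wikilist3.csv records:
wikis = [
    {"url": "a.wikia.com", "total_edits": 500, "msgwall_date": "2012-01-04"},
    {"url": "b.wikia.com", "total_edits": 12, "msgwall_date": "2012-02-01"},
    {"url": "c.wikia.com", "total_edits": 900, "msgwall_date": None},
]
print([w["url"] for w in subset_wikis(wikis)])
# prints: ['a.wikia.com']
```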

===Analysis===
* Another meeting with the full team to go over the results and try to make sense of them (after Sneha takes a first stab)
* Determine any other models we want to run

==Next Steps (Sept 20)==
* (Nate) Convert all dates to lubridate

===Writing===
* Switch from Haythornwaite to Reader to Leader framing (Sneha)
* knitr integration (Sneha + Nate)
* plots (Salt)
* Better pictures of message walls (Sneha)
* Better explanations of why talk pages suck (Sneha)
* Zotero streamlining

==Next Steps (Aug 24)==
* (Sneha) Add list of variables to build to the wiki
* (Salt) Verify whether the wiki dumps are solid
* (Nate) Generate 'contributor experience' variables for every edit in the dataset
* (Nate) Generate bot and admin data for the larger wiki dataset

== Archive ==

==Next Steps (Aug 15)==
* (Salt) Edit build wikilist code to map filenames with message wall transition dates
* (Sneha) Continue preliminary analysis with 25 wikis
* (Nate) Continue investigating what dumps we can get from wikiteam

* [[/Archived_tasks|Past next steps]]

==Next Steps (Aug 1)==
* (Salt) Make file with mapping between urls and the newly scraped dumps.
* (Nate with Mako's help) Figure out what's going on in the wiki mapping code
* (Sneha) Plan for visit in September
* (Sneha) Continue preliminary analysis with 25 wikis

== Retreat Tasks ==
* Document and organize the git repository.
* Data exploration / preliminary analysis.

== Next Steps (July 18) ==
* (Salt with Nate's help) Add new dumps to wikilist
* (Nate) Update wikiteam mapping.
* (Sneha) (Using wikiq data) Check that dumps, even if valid xml, have message wall data.
* (Sneha) Create list of subsetting characteristics (inclusion criteria for Wikis) for study.
* (Sneha) Create exploratory plots for a larger set of wikis of different sizes.
* (Sneha) Request new dumps for missing wikis.

== Next Steps (July 11) ==
* (Sneha) (Using wikiq data) Check that dumps, even if valid xml, have message wall data.
* (Sneha) Create list of subsetting characteristics (inclusion criteria for Wikis) for study.
* (Sneha) Create exploratory plots for a larger set of wikis of different sizes.
* (Sneha) Request new dumps for missing wikis.
* (Salt) Download wikis available on Special:statistics. '''Done'''
* (Nate) Scrape admin and bot edits using a script from Mako. '''Done'''

== Next Steps (June 27th) ==
* (Sneha) (Using wikiq data) Check that dumps, even if valid xml, have message wall data.
* (Sneha) Take a look at namespaces 1200-1202 to understand what they mean. '''Done'''
* (Sneha) Create list of subsetting characteristics (inclusion criteria for Wikis) for study.
* (Sneha) Create exploratory plots for a larger set of wikis of different sizes.
* (Salt) Download wikis available on Special:statistics.
* (Salt) Request new dumps for missing wikis.
* (Nate) Scrape admin and bot edits using a script from Mako.
* (Nate) Finish identifying wikiteam mapping. '''Done'''

==Next Steps (June 20th)==
* (Nate) Improve wiki list by identifying wikis that turn off the feature without turning it on first (Done)
* (Nate) Get <strike>muppet wiki</strike> Dr. Horrible Wiki edit weeks for Sneha (Done)
* (Nate) Do brute force mapping using revision ids and hashing texts (Done)
* (Sneha) Will play with Dr. Horrible data (Done)
* (Sneha) Create list of subsetting characteristics for study

==Next Steps (June 13th)==
* Build a new dataset of dumps of the ~4800 wikis (Salt/Nate) (may take more than a week to generate all the new dumps)
* Build a msgwall version of the build_edit_weeks file from the anon_edits paper (Nate)
* Do analysis of alt history wiki and update (Sneha)
* Create list of criteria to identify wikis we want to use in this study (Sneha)

== Next Steps (June 6th) ==
* Identify list of Wikis we will analyze from the tsv file.
* Attempt to obtain a good dump for each of these wikis. See [[CommunityData:Wikia Dumps]] for information.
** This may depend on mapping between the urls in the tsv file and the dumps. Consider using HTTP redirects from the url under <siteinfo>.
* Modify Wikiq to give an error message if the closing </mediawiki> tag is missing.
* Sneha to take a look at althistory data from Nate.
* Nate will write a version of build_edit_weeks for the message wall project.
* Check back next meeting Tuesday (June 13th).
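The redirect idea in the url-mapping bullet above could be sketched like this. The redirect-fetching step is injected as a function so the logic can be shown without live HTTP; with real requests one would supply a function that issues a HEAD request and reads the Location header. The urls and redirect table below are made up for illustration.

```python
# Sketch of mapping dump urls to canonical wiki urls by following HTTP
# redirects. `get_redirect` is injected so the resolution logic can be
# demonstrated without network access; the toy table below stands in
# for live HTTP responses.

def resolve_url(url, get_redirect, max_hops=5):
    """Follow redirects from `url` until a stable location is reached.

    `get_redirect(url)` returns the redirect target, or None when the
    url does not redirect (i.e. it is already canonical).
    """
    for _ in range(max_hops):
        target = get_redirect(url)
        if target is None or target == url:
            return url
        url = target
    return url  # give up after max_hops to avoid redirect loops

# Hypothetical redirect table:
redirects = {
    "http://old.wikia.com/wiki": "http://community.wikia.com/wiki",
}
canonical = resolve_url("http://old.wikia.com/wiki", redirects.get)
```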
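The truncation check proposed for Wikiq above is cheap to express: a complete MediaWiki XML dump ends with a closing &lt;/mediawiki&gt; tag, so its absence at the end of the file is a quick sign the dump was cut off. A minimal sketch (the function name is an assumption, not Wikiq's actual code):

```python
# Sketch of the truncated-dump check: a complete MediaWiki XML dump
# ends with the closing </mediawiki> tag, so a missing tag at the end
# of the text is a cheap indicator of truncation.
def dump_looks_complete(text):
    """Return True if the dump text ends with the closing </mediawiki>
    tag, ignoring trailing whitespace."""
    return text.rstrip().endswith("</mediawiki>")

good = "<mediawiki><page>...</page></mediawiki>\n"
truncated = "<mediawiki><page>..."
print(dump_looks_complete(good), dump_looks_complete(truncated))
# prints: True False
```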