CommunityData:Message Walls: Difference between revisions

From CommunityData
(→‎Next Steps (June 20th): finished mapping between wikia wikis and wikiteam dumps)
Revision as of 20:55, 27 June 2017

Useful Resources


Task Management

Future Tasks

  • Scrape admin and bot edits using a script from Mako
  • Check that dumps, even if they are valid XML, contain message wall data.
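
One way to run the second check is to count the pages a dump holds in the namespaces of interest. The sketch below is illustrative only: it assumes an uncompressed MediaWiki XML export in which each <page> has an <ns> child, and it assumes that namespaces 1200-1202 (the range mentioned in these notes) are the ones that carry message wall data. The function name count_wall_pages is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumption: 1200-1202 are the namespaces of interest, as listed in these notes.
WALL_NAMESPACES = {1200, 1201, 1202}


def local_name(tag):
    """Strip the '{uri}' prefix ElementTree puts on namespaced tags."""
    return tag.rsplit("}", 1)[-1]


def count_wall_pages(dump_file, namespaces=WALL_NAMESPACES):
    """Count <page> elements per namespace ID, restricted to `namespaces`.

    `dump_file` may be a path or a binary file object.
    """
    counts = Counter()
    for _event, elem in ET.iterparse(dump_file, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        for child in elem:
            if local_name(child.tag) == "ns":
                ns = int(child.text)
                if ns in namespaces:
                    counts[ns] += 1
                break
        elem.clear()  # drop the page's children to keep memory use flat
    return counts
```

A dump for which the returned counter is empty parsed as XML but holds no pages in those namespaces.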

Next Steps (June 27th)

  • Take a look at namespaces 1200-1202 to understand what they mean.
  • (Sneha) Create list of subsetting characteristics (inclusion criteria for wikis) for the study.
  • Download the wikis available on Special:Statistics.
  • Request new dumps for missing wikis.
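
For the namespace item above, a dump's own <siteinfo> header usually declares a name for each namespace ID, so a first look can come straight from the data. This is a minimal sketch, assuming an uncompressed XML export whose header contains <namespace key="..."> entries; the function name wall_namespace_names is hypothetical.

```python
import xml.etree.ElementTree as ET


def wall_namespace_names(dump_file, keys=(1200, 1201, 1202)):
    """Return {namespace ID: declared name} for the requested IDs.

    Reads only the <siteinfo> header and stops before the pages.
    """
    names = {}
    for _event, elem in ET.iterparse(dump_file, events=("end",)):
        tag = elem.tag.rsplit("}", 1)[-1]  # drop the XML namespace prefix
        if tag == "namespace":
            key = int(elem.get("key"))
            if key in keys:
                names[key] = elem.text or ""
        elif tag == "siteinfo":
            break  # header finished; no need to read the pages
    return names
```

IDs that a wiki does not declare are simply absent from the result, which may itself be a hint about whether the feature was ever enabled there.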


Next Steps (June 20th)

  • (Nate) Improve wiki list by identifying wikis that turn off the feature without turning on first (Done)
  • (Nate) Get Dr. Horrible Wiki edit weeks for Sneha (Done)
  • (Nate) Do brute-force mapping using revision IDs and hashes of revision texts (Done)
  • (Sneha) Will play with Dr. Horrible data (Done)
  • (Sneha) create list of subsetting characteristics for study
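
The notes do not record how the brute-force mapping was done. The sketch below shows one plausible reading, assuming the goal is to pair each wiki with the dump that shares the most (revision ID, text hash) fingerprints. The function names, the input shapes, and the choice of SHA-1 are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


def fingerprint(rev_id, text):
    """Pair a revision ID with a SHA-1 hash of the revision text."""
    return (rev_id, hashlib.sha1(text.encode("utf-8")).hexdigest())


def map_wikis_to_dumps(wiki_revs, dump_revs):
    """Match each wiki to the dump sharing the most fingerprints.

    Both arguments map a name to an iterable of (rev_id, text) pairs.
    Returns {wiki name: best dump name, or None if nothing matched}.
    """
    # Index every dump revision by its fingerprint.
    index = {}
    for dump, revs in dump_revs.items():
        for rev_id, text in revs:
            index.setdefault(fingerprint(rev_id, text), set()).add(dump)

    mapping = {}
    for wiki, revs in wiki_revs.items():
        votes = Counter()
        for rev_id, text in revs:
            for dump in index.get(fingerprint(rev_id, text), ()):
                votes[dump] += 1
        mapping[wiki] = votes.most_common(1)[0][0] if votes else None
    return mapping
```

Matching on the ID and the hash together is the point of the design: revision IDs alone collide across wikis, while identical short texts can appear on many wikis.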

Next Steps (June 13th)

  • Build a new dataset of dumps of the ~4800 wikis (Salt/Nate) (May take more than a week to generate all the new dumps)
  • Build a msgwall version of the build_edit_weeks file from the anon_edits paper (Nate)
  • Do analysis of alt history wiki and update (Sneha)
  • Create list of criteria to identify wikis we want to use in this study (Sneha)

Next Steps (June 6th)

  • Identify the list of wikis we will analyze from the TSV file.
    • This may depend on the mapping between the URLs in the TSV file and the dumps. Consider using HTTP redirects from the URL under <siteinfo>.
  • Modify Wikiq to give an error message if the closing </mediawiki> tag is missing.
  • Sneha to take a look at the althistory data from Nate.
  • Nate will write a version of build_edit_weeks for the message wall project
  • Check back next meeting Tuesday (June 13th)
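
The missing-closing-tag check mentioned above can also be done outside Wikiq as a quick pre-filter on truncated dumps. This is a sketch only and not Wikiq's code: it assumes an uncompressed dump file on disk, and the function name and the tail_bytes parameter are hypothetical.

```python
import os


def has_closing_mediawiki_tag(path, tail_bytes=1024):
    """Return True if the file ends with </mediawiki>, ignoring trailing whitespace.

    Only the last `tail_bytes` bytes are read, so large dumps stay cheap to check.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - tail_bytes))
        tail = f.read()
    return tail.rstrip().endswith(b"</mediawiki>")
```

A dump that fails this check was most likely cut off mid-download and is a candidate for the list of dumps to request again.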