CommunityData:Message Walls Code

We are adapting the code that Mako and Aaron worked on for the AnonEdits paper. Here is the status of the code that should work with the MessageWalls dataset:

Ready
lib-01-build_wiki_list-stage1.R

01-build_edit_weeks.R

On Deck
Filter out bot/admin edits in wikiweeks.

Get questions from wikiq.

Get blocked users from AP

Things to find out
How much variance is there in the edit distributions among Wikia wikis? Estimate critical points of the edit distributions and check how much they vary across wikis.

Principles for defining outcomes

 * The definition of experience or newcomer status should not depend on anything that MW changes

Defining newcomers
Nate and Sneha are arguing about this! There are two options we've considered for defining newcomer edits:


 * Any edit made by a user who made their first edit less than m months earlier is a 'newcomer edit'
 * Any edit made by any editor who has made fewer than n edits by the cutoff date is a 'newcomer edit'

We've reached a cautious consensus on taking the intersection of these two measures to determine newcomer status: an edit counts as a newcomer edit only if it meets both criteria.
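
The intersection rule can be sketched as below. This is a minimal illustration, not the project's code (which is in R): the function name, the argument names, the default thresholds for m and n, and the 30-day approximation of a month are all assumptions made for the example, since the notes leave m, n, and the cutoff date unspecified.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds: the notes do not fix values for m or n.
M_MONTHS = 1
N_EDITS = 10
DAYS_PER_MONTH = 30  # assumption: a month is approximated as 30 days


def is_newcomer_edit(edit_time, first_edit_time, edits_by_cutoff,
                     m_months=M_MONTHS, n_edits=N_EDITS):
    """Return True only when the edit meets BOTH newcomer criteria.

    edit_time        -- datetime of the edit being classified
    first_edit_time  -- datetime of the editor's first edit
    edits_by_cutoff  -- editor's edit count by the cutoff date
                        (how the cutoff is chosen is left to the caller)
    """
    # Criterion 1: the editor's first edit was less than m months earlier.
    max_age = timedelta(days=m_months * DAYS_PER_MONTH)
    recent_editor = (edit_time - first_edit_time) < max_age

    # Criterion 2: the editor has made fewer than n edits by the cutoff.
    few_edits = edits_by_cutoff < n_edits

    # Intersection of the two measures.
    return recent_editor and few_edits
```

Taking the intersection is the stricter choice: an old account with few edits and a new account with many edits both count as veterans under this rule.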

Dependent Variables

 * For each wiki week:
 * Proportion of edits made by accounts that are less than one month old.
 * total number of edits made to talk pages/message walls -- DONE
 * number of edits made to talk pages/message walls by 'new users'
 * exclude blocked folks from newcomers
 * number of reverts / reverted newcomers
 * number of edits made to talk pages/message walls by 'veteran users'
 * number of edits made by a newcomer on a veteran's talk page/message wall, or vice versa
 * total number of questions asked on talk pages/message walls
 * number of questions asked by newcomers
 * total number of edits made to article pages -- DONE
 * number of edits made to article pages by newcomers
 * number of edits made to article pages by veterans
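
As a rough sketch of how the first variable in this list (the proportion of edits made by accounts less than one month old, per wiki week) might be computed: the input layout, the function name, the use of ISO weeks, and the 30-day definition of a month are all assumptions for illustration. The real pipeline builds wiki weeks in 01-build_edit_weeks.R and may define them differently.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def newcomer_edit_proportion(edits, max_age_days=30):
    """Proportion of edits per wiki week made by young accounts.

    edits -- iterable of (wiki, edit_time, account_created) tuples;
             this record layout is hypothetical.
    Returns a dict keyed by (wiki, iso_year, iso_week).
    """
    totals = defaultdict(int)  # all edits in each wiki week
    young = defaultdict(int)   # edits by accounts under max_age_days old

    for wiki, edit_time, account_created in edits:
        # Assumption: a "wiki week" is the ISO calendar week of the edit.
        iso_year, iso_week, _ = edit_time.isocalendar()
        key = (wiki, iso_year, iso_week)
        totals[key] += 1
        if edit_time - account_created < timedelta(days=max_age_days):
            young[key] += 1

    # Every key in totals has at least one edit, so no division by zero.
    return {key: young[key] / totals[key] for key in totals}
```

The other per-week counts in the list (talk page edits, article edits, questions) would follow the same group-and-count pattern with a different filter on each edit.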