Editing Wiki language research

From CommunityData

== Action Items ==


* eliminate bots from sample
* write rough draft of preliminary findings for 6/21
* find percent change for talk edits at median values of coefs for each language.
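The last item can be sketched as follows. This is a minimal illustration, not the project's actual analysis: it assumes the models use a log link (e.g. Poisson or negative binomial), so that a coefficient b corresponds to a percent change of 100 * (exp(b) - 1) per unit increase in the predictor. The names <code>percent_change</code>, <code>percent_change_at_median</code> and <code>coefs_by_language</code> are hypothetical, and the numbers are placeholders rather than results.

```python
import math
from statistics import median


def percent_change(coef, delta=1.0):
    """Percent change in the expected count for a `delta` increase in a
    predictor. ASSUMPTION: the model uses a log link."""
    return (math.exp(coef * delta) - 1.0) * 100.0


def percent_change_at_median(coefs_by_language):
    """Take the median coefficient per language and convert it to a
    percent change."""
    return {
        lang: percent_change(median(values))
        for lang, values in coefs_by_language.items()
    }


# Placeholder coefficients for illustration only (NOT project results).
coefs_by_language = {
    "en": [0.10, 0.12, 0.15],
    "de": [0.05, 0.08, 0.20],
}

for lang, pct in percent_change_at_median(coefs_by_language).items():
    print(f"{lang}: {pct:.1f}% change in talk edits per unit increase")
```

If the models use a different link function, the conversion in <code>percent_change</code> would need to change accordingly.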


=== Undergrads ===
* Week 1: Write initial analysis, set up the Google Docs to LaTeX pipeline
* Week 2-3: Flag bot edits, pull new samples for coding based on updated percentiles, write new draft of analysis
* Week 3-6: Develop hypotheses and run analysis
** Cross cultural deliberative practices
** Discussion structure


== meeting logs & notes ==
=== 8-16-16 ===
JM: Processing the ML corpus with the talk page aggregation fixes.
=== 8-9-16 ===
JM: Working on re-processing the ML corpus to account for archived talk pages across the language editions.
All: Read up on Edgar Schein's organizational culture work and think about the ways in which it might apply to the ML communities and what the org cultures are that exist within the different language editions.
DG: Work on getting models of marginals across the quantiles
=== 7-8-16 ===
Considering small groups and small teams / workgroup literature.
Next steps: Read through the previous papers on English-language wikis and see what the general results and findings are. Then use our corpus to see how much of this holds across language editions, and how much seems to be specific to English.
=== 06-28-16 ===
Thinking about ways to do small group analyses within the various language editions. Consider different approaches to network construction here. It would also be good to think about this with respect to sequential data analysis techniques such as chain revision graphs.
=== 06-21-16 ===
JM, DG, and AS discussed the approach for communicating findings and for next steps. We might look into the orgs, coordination, and collective intelligence literature.
=== 05-22-16 ===
DG: I spent some time looking at the data distributions and ran a batch of the simple EN models overnight. The data for len_1 are really long-tailed with very low frequencies, and this is causing the convergence issues. Below is a table of the simple model (len_1 ~ num_editors_1), run through a series of truncated data sets. The model converges on every truncated data set, including the one that drops only the final data point of the 4,077,819 data points we have. In other words, I was able to get convergence by dropping a single data point. Here's a quick table of the results from running the models:


== project resources & links ==
'''07-28-16'''
[https://docs.google.com/document/d/1DKaq6uZdMFiqmJoxgLIKCNTFOvUjmawNrcjec9P1ZHs/edit?usp=sharing CHI 2016 Rough Draft]
'''05-16-16'''

