Editing Human Centered Data Science (Fall 2019)/Assignments

From CommunityData

Warning: You are not logged in. Your IP address will be publicly visible if you make any edits. If you log in or create an account, your edits will be attributed to your username, along with other benefits.

The edit can be undone. Please check the comparison below to verify that this is what you want to do, and then publish the changes below to finish undoing the edit.

Latest revision Your text
Line 220: Line 220:
You are also expected to write a short reflection on the project, that focuses on how both your findings from this analysis and the process you went through to reach those findings helps you understand the causes and consequences of biased data in large, complex data science projects.
You are also expected to write a short reflection on the project, that focuses on how both your findings from this analysis and the process you went through to reach those findings helps you understand the causes and consequences of biased data in large, complex data science projects.


''A repository with a README framework and examples of querying the ORES datastore in R and Python can be found [https://github.com/Ironholds/data-512-a2 here]''


==== Getting the article and population data ====
==== Getting the article and population data ====
Line 233: Line 234:


==== Getting article quality predictions ====
==== Getting article quality predictions ====
''A repository with a template README file and examples of querying the ORES datastore in R and Python can be found [https://github.com/Ironholds/data-512-a2 here]''


Now you need to get the predicted quality scores for each article in the Wikipedia dataset. We're using a machine learning system called [https://www.mediawiki.org/wiki/ORES ORES] ("Objective Revision Evaluation Service"). ORES estimates the quality of an article (at a particular point in time), and assigns a series of probabilities that the article is in one of 6 quality categories. The options are, from best to worst:
Now you need to get the predicted quality scores for each article in the Wikipedia dataset. We're using a machine learning system called [https://www.mediawiki.org/wiki/ORES ORES] ("Objective Revision Evaluation Service"). ORES estimates the quality of an article (at a particular point in time), and assigns a series of probabilities that the article is in one of 6 quality categories. The options are, from best to worst:
Please note that all contributions to CommunityData are considered to be released under the Attribution-Share Alike 3.0 Unported (see CommunityData:Copyrights for details). If you do not want your writing to be edited mercilessly and redistributed at will, then do not submit it here.
You are also promising us that you wrote this yourself, or copied it from a public domain or similar free resource. Do not submit copyrighted work without permission!

To protect the wiki against automated edit spam, we kindly ask you to solve the following CAPTCHA:

Cancel Editing help (opens in new window)