Editing HCDS (Fall 2017)/Assignments
From CommunityData
Latest revision | Your text | ||
Line 1: | Line 1: | ||
<noinclude> | |||
<div style="font-family:Rockwell,'Courier Bold',Courier,Georgia,'Times New Roman',Times,serif; min-width:10em;"> | |||
<div style="float:left; width:100%; margin-right:2%;"> | |||
{{Link/Graphic/Main/2 | |||
|highlight color= 27666b | |||
|color=460c40 | |||
|link= | |||
|image= | |||
|text-align=left | |||
|top font-size= 1.1em | |||
|top color=FFF | |||
|line color=FFF | |||
|top text=This page is a work in progress. | |||
|bottom font-size= 1em | |||
|bottom color= FFF | |||
|bottom text= | |||
|line= none | |||
}}</div></div> | |||
</noinclude> | |||
__FORCETOC__ | __FORCETOC__ | ||
Line 16: | Line 35: | ||
;Scheduled assignments | ;Scheduled assignments | ||
* '''A1 - 5 points''' (due Week 4): Data curation (programming/analysis) | * '''A1 - 5 points''' (due Week 4): Data curation (programming/analysis) | ||
* '''A2 - 10 points''' (due Week | * '''A2 - 10 points''' (due Week 5): Sources of bias in data (programming/analysis) | ||
* '''A3 - 10 points''' (due Week 7): Final project plan (written) | * '''A3 - 10 points''' (due Week 7): Final project plan (written) | ||
* '''A4 - 10 points''' (due Week 9): Crowdwork self-ethnography (written) | * '''A4 - 10 points''' (due Week 9): Crowdwork self-ethnography (written) | ||
Line 179: | Line 198: | ||
The goal of this assignment is to explore the concept of 'bias' through data on Wikipedia articles - specifically, articles on political figures from a variety of countries. For this assignment, you will combine a dataset of Wikipedia articles with a dataset of country populations, and use a machine learning service called ORES to estimate the quality of each article. | The goal of this assignment is to explore the concept of 'bias' through data on Wikipedia articles - specifically, articles on political figures from a variety of countries. For this assignment, you will combine a dataset of Wikipedia articles with a dataset of country populations, and use a machine learning service called ORES to estimate the quality of each article. | ||
You are expected to perform an analysis of how the ''coverage'' of politicians on Wikipedia and the ''quality'' of articles about politicians varies between countries. Your analysis will consist of a series of | You are expected to perform an analysis of how the ''coverage'' of politicians on Wikipedia and the ''quality'' of articles about politicians varies between countries. Your analysis will consist of a series of visualizations that show: | ||
# the countries with the greatest and least coverage of politicians on Wikipedia compared to their population. | # the countries with the greatest and least coverage of politicians on Wikipedia compared to their population. | ||
# the countries with the highest and lowest proportion of high quality articles about politicians. | # the countries with the highest and lowest proportion of high quality articles about politicians. | ||
You are also expected to write a short reflection on the project that describes how this assignment helps you understand the causes and consequences of bias on Wikipedia. | You are also expected to write a short (1-2 paragraph) reflection on the project that describes how this assignment helps you understand the causes and consequences of bias on Wikipedia. | ||
==== Getting the article and population data ==== | ==== Getting the article and population data ==== | ||
Line 241: | Line 260: | ||
* if a country has 10 articles about politicians, and 2 of them are FA or GA class articles, then the percentage of high-quality articles would be 20%. | * if a country has 10 articles about politicians, and 2 of them are FA or GA class articles, then the percentage of high-quality articles would be 20%. | ||
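The percentage calculation in the example above can be sketched in plain Python. This is only an illustration, not the required implementation: the field names <tt>country</tt> and <tt>article_quality</tt>, the country name, and the sample rows are all assumptions made for the example.

```python
from collections import defaultdict

# Quality classes counted as "high quality" in this assignment.
HIGH_QUALITY = {"FA", "GA"}


def high_quality_percentage(rows):
    """Return {country: percentage of its articles rated FA or GA}.

    `rows` is an iterable of dicts with the (assumed) keys
    "country" and "article_quality".
    """
    totals = defaultdict(int)
    high = defaultdict(int)
    for row in rows:
        totals[row["country"]] += 1
        if row["article_quality"] in HIGH_QUALITY:
            high[row["country"]] += 1
    return {country: 100.0 * high[country] / totals[country] for country in totals}


# Hypothetical data: 10 articles for one made-up country, 2 of them FA or GA.
sample_rows = [
    {"country": "Exampleland", "article_quality": quality}
    for quality in ["FA", "GA"] + ["Stub"] * 8
]

percentages = high_quality_percentage(sample_rows)
print(percentages)  # {'Exampleland': 20.0}
```

With 2 high-quality articles out of 10, the result matches the 20% in the example above.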
==== | ==== Visualization ==== | ||
The | The visualization should be pretty straightforward. Produce four visualizations that show: | ||
#10 highest-ranked countries in terms of number of politician articles as a proportion of country population | #10 highest-ranked countries in terms of number of politician articles as a proportion of country population | ||
#10 lowest-ranked countries in terms of number of politician articles as a proportion of country population | #10 lowest-ranked countries in terms of number of politician articles as a proportion of country population | ||
Line 248: | Line 267: | ||
#10 lowest-ranked countries in terms of number of GA and FA-quality articles as a proportion of all articles about politicians from that country | #10 lowest-ranked countries in terms of number of GA and FA-quality articles as a proportion of all articles about politicians from that country | ||
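One possible way to pick the 10 highest-ranked and 10 lowest-ranked countries for either metric is sketched below. The function name <tt>rank_countries</tt> and the sample values are hypothetical; the sketch assumes you have already computed one number per country (for example, articles per capita or percentage of high-quality articles).

```python
def rank_countries(values, n=10):
    """Split a {country: value} mapping into the n highest and n lowest entries.

    Returns (highest, lowest), each a list of (country, value) tuples.
    `highest` is sorted from largest to smallest value,
    `lowest` from smallest to largest.
    """
    ascending = sorted(values.items(), key=lambda item: item[1])
    lowest = ascending[:n]
    highest = ascending[::-1][:n]
    return highest, lowest


# Hypothetical per-country values for 12 made-up countries.
sample_values = {f"Country {i}": i / 100 for i in range(1, 13)}

highest, lowest = rank_countries(sample_values)
print(highest[0])  # ('Country 12', 0.12)
print(lowest[0])   # ('Country 1', 0.01)
```

How countries with tied values are ordered is a design choice that this sketch does not address beyond Python's stable sort; you may want to document your own tie-breaking rule in the README.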
We recommend using bar charts to visualize your data. | |||
To receive full credit, your graphs must be scaled so that the data are easy to read; all units, axes, and values must be clearly labeled; and each graph must have a key and a title. You must also generate a .png or .jpeg image of your final graphs. | |||
You may choose to graph the data in Python, in your notebook. If you decide to use Google Sheets or some other open, public data visualization platform to build your graphs, link to them in the README, and make sure the sharing settings allow anyone who clicks on the links to view the graphs and download the data! | |||
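If you graph the data in Python, a labeled bar chart saved as a .png could look roughly like the following sketch. It assumes the <tt>matplotlib</tt> library is installed; the helper name <tt>save_bar_chart</tt>, the output filename, and the sample numbers are hypothetical and only illustrate a title, axis labels, a key, and an exported image.

```python
import matplotlib

matplotlib.use("Agg")  # render to a file without needing a display
import matplotlib.pyplot as plt


def save_bar_chart(labels, values, title, xlabel, ylabel, series_label, path):
    """Draw one bar per label, add title, axis labels, and a key, then save to `path`."""
    fig, ax = plt.subplots(figsize=(10, 6))
    ax.bar(labels, values, label=series_label)
    ax.set_title(title)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.legend()  # the key, built from `series_label`
    ax.tick_params(axis="x", labelrotation=45)  # keep long country names readable
    fig.tight_layout()
    fig.savefig(path, dpi=150)  # the file extension selects the image format
    plt.close(fig)
    return path


# Hypothetical values for three made-up countries.
chart_path = save_bar_chart(
    labels=["Country A", "Country B", "Country C"],
    values=[0.12, 0.08, 0.05],
    title="Politician articles as a percentage of population (example data)",
    xlabel="Country",
    ylabel="Articles per capita (%)",
    series_label="Politician articles / population",
    path="a2_example_chart.png",
)
```

The same helper can be called once per visualization, with a different <tt>path</tt> for each image.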
==== Writeup ==== | ==== Writeup ==== | ||
Line 257: | Line 280: | ||
#Create the data-512-a2 repository on GitHub with your code and data. | |||
#Complete and add your README and LICENSE file. | #Complete and add your README and LICENSE file. | ||
#Submit the link to your GitHub repo to: https://canvas.uw.edu/courses/1174178/assignments/ | #Submit the link to your GitHub repo to: https://canvas.uw.edu/courses/1174178/assignments/3876066 | ||
==== Required deliverables ==== | ==== Required deliverables ==== | ||
A directory in your GitHub repository called <tt>data-512-a2</tt> that contains the following files: | A directory in your GitHub repository called <tt>data-512-a2</tt> that contains the following files: | ||
:# 1 final data file in CSV format that follows the formatting conventions. | :# 1 final data file in CSV format that follows the formatting conventions. | ||
:# 1 Jupyter notebook named <tt>hcds-a2-bias</tt> that contains all code as well as information necessary to understand each programming step, as well as your writeup (if you have not included it in the README) | :# 1 Jupyter notebook named <tt>hcds-a2-bias</tt> that contains all code as well as information necessary to understand each programming step, as well as your writeup (if you have not included it in the README). | ||
:# 1 README file in .txt or .md format that contains information to reproduce the analysis, including data descriptions, attributions and provenance information, and descriptions of all relevant resources and documentation (inside and outside the repo) and hyperlinks to those resources, and your writeup (if you have not included it in the notebook). | :# 1 README file in .txt or .md format that contains information to reproduce the analysis, including data descriptions, attributions and provenance information, and descriptions of all relevant resources and documentation (inside and outside the repo) and hyperlinks to those resources, and your writeup (if you have not included it in the notebook). | ||
:# 1 LICENSE file that contains an [https://opensource.org/licenses/MIT MIT LICENSE] for your code. | :# 1 LICENSE file that contains an [https://opensource.org/licenses/MIT MIT LICENSE] for your code. | ||
:# 1 .png or .jpeg image of your visualization. | |||
==== Helpful tips ==== | ==== Helpful tips ==== | ||
Line 275: | Line 299: | ||
=== A3: Final project plan === | === A3: Final project plan === | ||
For this assignment, you will write up a study plan for your final class project. The plan will cover a variety of details about your final project, including what data you will use, what you will do with the data (e.g. statistical analysis, train a model), what results you expect or intend, and most importantly, why your project is interesting or important (and to whom, besides yourself). | For this assignment, you will write up a study plan for your final class project. The plan will cover a variety of details about your final project, including what data you will use, what you will do with the data (e.g. statistical analysis, train a model), what results you expect or intend, and most importantly, why your project is interesting or important (and to whom, besides yourself). | ||
=== A4: Crowdwork ethnography === | === A4: Crowdwork self-ethnography === | ||
For this assignment, you will go undercover as a member of the Amazon Mechanical Turk community. You will | For this assignment, you will go undercover as a member of the Amazon Mechanical Turk community. You will perform assigned tasks, participate (or lurk) in Turker discussion forums, and write an ethnographic account of your experience as a human-in-the-loop of data science. | ||
=== A5: Final project presentation === | === A5: Final project presentation === |