Community Data Science Course (Spring 2015)/Day 4 Coding Challenges

Each of the challenges this week will ask you to modify and work with code in the Wikipedia API projects which you should have installed and begun working with in class.

As always, it's not essential that you solve or get through all of these; I'm not grading your answers. That said, being able to work through many of them is a good sign that you have mastered the concepts for the week. It is always fine to collaborate or work together on these problem sets. The only thing I ask is that you do not broadcast answers on Canvas before Sunday at midnight.

Challenges

  1. Save the revision metadata printed in wikipedia1-2.py (i.e., the material already being printed out) to a file called "wikipedia_revisions.tsv".
  2. Print out the revision ids and edit summaries (i.e., comment) of each revision for the article on Python.
  3. Find out what other data or metadata you can print out for a revision for an article.
  4. Which article is in more categories? Python (programming language) or R (programming language)?
  5. How many revisions to the article on "Python (programming language)" were made by user "Peterl"? How about "Hfastedge"?
  6. How would you use the API to find out how many revisions/edits the user "Benjamin Mako Hill" has made to Wikipedia?
  7. Can you build a list of all of the articles edited by "Benjamin Mako Hill"? What is the article with the longest title that user Benjamin Mako Hill has edited?
  8. How many edits to the article "Python (programming language)" were made in 2014?
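
If you get stuck on the file-writing or counting parts of the challenges above, here is one possible starting point rather than a finished answer. It is a minimal sketch that assumes you have already collected revisions as a list of dictionaries in the shape the MediaWiki API returns for prop=revisions. The function names and the SAMPLE_REVISIONS data are made up for illustration and are not taken from wikipedia1-2.py. Nothing here contacts Wikipedia, so you still need to fetch the real revisions yourself.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data (values are made up). Real revisions would come
# from the API's JSON response under query -> pages -> <pageid> -> revisions.
SAMPLE_REVISIONS = [
    {"revid": 101, "timestamp": "2013-12-31T23:59:59Z",
     "user": "ExampleUserA", "comment": "fix typo"},
    {"revid": 102, "timestamp": "2014-01-01T00:00:00Z",
     "user": "ExampleUserB", "comment": "add\tsection"},
    # Some revisions have no comment at all, so the key can be missing.
    {"revid": 103, "timestamp": "2014-06-15T12:30:00Z",
     "user": "ExampleUserA"},
]


def revision_params(title, limit=500):
    """Build query parameters asking for revision metadata of one article."""
    return {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|user|comment",
        "rvlimit": limit,
        "format": "json",
    }


def write_revisions_tsv(revisions, out_file):
    """Write one tab-separated line per revision to an open text file."""
    writer = csv.writer(out_file, delimiter="\t", lineterminator="\n")
    writer.writerow(["revid", "timestamp", "user", "comment"])
    for rev in revisions:
        # Tabs or newlines inside a comment would break the columns,
        # so replace them with spaces.
        comment = rev.get("comment", "").replace("\t", " ").replace("\n", " ")
        writer.writerow([rev.get("revid", ""), rev.get("timestamp", ""),
                         rev.get("user", ""), comment])


def count_by_user(revisions):
    """Return a Counter mapping each user name to their number of revisions."""
    return Counter(rev.get("user", "") for rev in revisions)


def count_in_year(revisions, year):
    """Count revisions whose ISO timestamp starts with the given year."""
    prefix = str(year) + "-"
    return sum(1 for rev in revisions
               if rev.get("timestamp", "").startswith(prefix))
```

To try it, open a file with open("wikipedia_revisions.tsv", "w", encoding="utf-8") and pass it to write_revisions_tsv along with your list of revisions. Remember that the API returns a limited number of revisions per request, so your counts are only right once you have followed the continuation values and collected every batch.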
Here's a much more complicated challenge, but a fun one that you know enough to solve:
Check out the game Catfishing, which shows you categories and has you guess an article. Write a version that uses the Wikipedia API. For example, pick 5 articles and write a program that will randomly show the categories for one of those articles and ask you to guess the article. Read the guess with input() and let the user know if they got it right or wrong!
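
The game loop itself can be sketched separately from the API calls. The following is one possible skeleton, not the official solution. It assumes you have already built a dictionary mapping each article title to its list of category names, for example from prop=categories responses. All function names are hypothetical, and the ask and show parameters exist only so that the round can be tried without typing at the keyboard.

```python
import random


def category_params(title):
    """Build query parameters asking for the visible categories of an article."""
    return {
        "action": "query",
        "prop": "categories",
        "titles": title,
        "cllimit": "max",
        "clshow": "!hidden",  # skip hidden maintenance categories
        "format": "json",
    }


def strip_prefix(category_title):
    """Turn 'Category:Free software' into 'Free software'."""
    prefix = "Category:"
    if category_title.startswith(prefix):
        return category_title[len(prefix):]
    return category_title


def is_correct(guess, title):
    """Compare a guess to the title, ignoring case and outer whitespace."""
    return guess.strip().lower() == title.strip().lower()


def play_round(categories_by_article, ask=input, show=print, rng=random):
    """Show the categories of one random article and check the user's guess."""
    title = rng.choice(sorted(categories_by_article))
    show("Which article is in these categories?")
    for category in categories_by_article[title]:
        show("  - " + strip_prefix(category))
    guess = ask("Your guess: ")
    correct = is_correct(guess, title)
    show("Right!" if correct else "Wrong, it was " + title)
    return correct
```

Calling play_round(categories_by_article) with the defaults reads the guess with input() and prints the result. The remaining work is filling categories_by_article for your 5 articles from the API, and you may want to drop any category whose name gives the answer away.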