DS4UX (Spring 2016)/Day 6 coding challenge

Each of this week's challenges relates to the Panama Papers project, which you should have installed and begun working with in class.

This week, you will be required to attempt the first 5 challenges in this list using Python and the requests module to gather data from APIs. You must upload your solution scripts via Canvas before class. You will NOT be graded on whether your solutions are correct, efficient, or even functional—just on whether you turn in an attempt at a solution that shows you tried. You WILL be marked down if you don't submit your solutions—so be sure to spend time attempting these challenges!
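If you are unsure where to start, here is a minimal sketch (not a full solution) of the kind of request your scripts will make. It uses the requests module to ask the English Wikipedia API for revision metadata on the Panama Papers article; the particular parameters (rvlimit, rvprop, and so on) are just one reasonable starting point, and you will need to handle continuation to retrieve every revision.

import requests

API_URL = "https://en.wikipedia.org/w/api.php"

params = {
    "action": "query",
    "prop": "revisions",
    "titles": "Panama Papers",
    "rvprop": "timestamp|user",  # who edited and when
    "rvlimit": 50,               # revisions per request; page through the rest with rvcontinue
    "rvdir": "newer",            # oldest edits first
    "format": "json",
}

response = requests.get(API_URL, params=params)
data = response.json()

# Revisions are nested under query -> pages -> <page id> -> revisions
for page in data["query"]["pages"].values():
    for rev in page["revisions"]:
        print(rev["timestamp"], rev["user"])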

You do NOT need to complete and turn in your answers to the bonus challenges (#6, #7, and #8). You will not be graded on these. But if you do attempt them, I'd love to see your solutions!

Being able to work through most of these challenges is a very good sign that you have mastered the important Python concepts we've covered so far. As always, it is fine to collaborate or work together on these problem sets, as long as you submit your solutions separately. And this week, please don't broadcast your responses via Canvas before Sunday night.


1. Which editor has made the most edits to the article Panama Papers so far?
2. How many edits did Panama Papers receive, on average, in its first two weeks?
3. What hour in the first two weeks had the highest number of edits?
4. Who were the top 3 editors during that hour?
5. What day in the first two weeks had the most views?
6. Write a script that generates daily edit and view counts for Panama Papers over its first 30 days of existence, and prints them to a CSV or TSV file in reverse-chronological order. Your file should have three columns with the headers "date", "edits", and "views". (A starter sketch for the CSV-writing step appears below this list.)
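For challenge 6, here is one way to write the output file. This is only a sketch, and it assumes you have already built a dictionary mapping each date string to its edit and view counts; the counting itself is up to you, and the values below are placeholders.

import csv

# Placeholder data: replace with the real per-day counts you compute from the APIs.
daily_counts = {
    "2016-04-03": {"edits": 0, "views": 0},
    "2016-04-04": {"edits": 0, "views": 0},
}

with open("panama_papers_daily.csv", "w", newline="", encoding="utf-8") as outfile:
    writer = csv.writer(outfile)
    writer.writerow(["date", "edits", "views"])
    # Reverse-chronological order: newest date first
    for date in sorted(daily_counts, reverse=True):
        writer.writerow([date, daily_counts[date]["edits"], daily_counts[date]["views"]])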


Bonus challenges
7. Write a script that generates daily page view counts for Panama Papers, broken down by desktop and mobile (app + web) access methods, over its first 30 days of existence. Output them to a CSV file that you can open in Excel or a similar spreadsheet program, and use your data to create a graph of page views by access method over time. Make sure your graph follows sound information design principles!
8. Write a script that generates daily page view counts for Panama Papers on German (de.wikipedia.org), English (en.wikipedia.org), and Spanish (es.wikipedia.org) Wikipedias. Output your results to CSV and graph them per the instructions for Challenge #7 above. (A starter sketch for fetching pageview data appears below.)
Hint: the title of this article is in English across all three Wikipedias.
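A sketch of one way to get per-access-method pageview counts, assuming you use the Wikimedia Pageviews REST API. The start and end dates below are placeholders; adjust them to cover the article's actual first 30 days. Change the access segment to "mobile-app" or "mobile-web", or the project segment to "de.wikipedia" or "es.wikipedia", to collect the other series.

import requests

ENDPOINT = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
            "{project}/{access}/all-agents/{article}/daily/{start}/{end}")

url = ENDPOINT.format(
    project="en.wikipedia",   # also try "de.wikipedia" and "es.wikipedia"
    access="desktop",         # also try "mobile-app" and "mobile-web"
    article="Panama_Papers",  # underscores instead of spaces in article titles
    start="2016040300",       # placeholder start date (YYYYMMDDHH)
    end="2016050300",         # placeholder end date
)

# A descriptive User-Agent is good practice when calling Wikimedia APIs.
response = requests.get(url, headers={"User-Agent": "DS4UX-day6-challenge"})

# Each item in the response has a "timestamp" (YYYYMMDDHH) and a "views" count.
for item in response.json()["items"]:
    print(item["timestamp"], item["views"])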