DS4UX (Spring 2016)/Day 4 coding challenge: Difference between revisions

From CommunityData
<!--
Each of the challenges this week will ask you to modify and work with code in the [[DS4UX_(Spring_2016)/Seattle_traffic|Seattle traffic project]], which you should have installed and begun working with in class.
Each of the challenges this week will ask you to modify and work with code in the [[Community Data Science Course (Spring 2015)/Wikipedia API projects|Wikipedia API projects]] which you should have installed and begun working with in class.


As always, it's not essential that you solve or get through all of these — I'm not grading your answers on these. That said, being able to work through at least many of them is a good sign that you have mastered the concepts for the week. It is always fine to collaborate or work together on these problem sets. The only thing I ask is that you do not broadcast answers before Sunday at midnight on Canvas.
This week, you will be required to attempt the first FIXME challenges in this list, and upload your solution scripts via Canvas. You will not be graded on whether your solutions are correct, efficient, or even functional. But you will be marked down if you don't submit your solutions—so be sure to spend time attempting these challenges!
 
Being able to work through most of these challenges is a very good sign that you have mastered the important Python concepts we've covered so far. As always, it is fine to collaborate or work together on these problem sets, as long as you submit your solutions separately. And this week, please don't broadcast your responses via Canvas before Sunday night.


== Challenges ==


# Save the revision metadata printed in <code>wikipedia1-2.py</code> (i.e., the material already being printed out) to a file called "wikipedia_revisions.tsv".
# What day between January 1, 2014 and March 31, 2016 saw the most total traffic on the Burke-Gilman trail?  
# Print out the revision ids and edit summaries (i.e., <code>comment</code>) of each revision for the article on Python.
# What was the busiest hour of ''any'' day for northbound bike traffic? How about southbound pedestrian traffic?
# Find out what other data or metadata you can print out for a revision for an article.
# How much southbound traffic does the Burke-Gilman get, on average, during morning commute hours? How much does it get during evening commute hours?
# Which article is in more categories? [[:wiki:Python (programming language)|Python (programming language)]] or [[:wiki:R (programming language)|R (programming language)]]? 
::*''Note: for this question, assume morning commute hours start at 7am and end at 10am, and that evening commute hours start at 4pm and end at 7pm. You can also consider these hours to be "commute hours" even on weekends, since our data doesn't contain days of the week.''
# Find out how many revisions to the article on "Python (programming language)" were made by user "Peterl"? How about "Hfastedge"?
 
# How would you use the API to find out how many revisions/edits the user "Benjamin Mako Hill" has made to Wikipedia?
=== Bonus challenges ===
# Can you build a list of all of the articles edited by "Benjamin Mako Hill"? What is the article with the longest title that user Benjamin Mako Hill has edited?
You don't need to complete these to get a "complete" grade for this assignment, but you should attempt them anyway!
# How many edits to the article "Python (programming language)" were made in 2014?
 
:1. What day of the week does the Burke-Gilman trail experience the most overall traffic, on average? The most pedestrian traffic?
::*''Hint: it will help to know that January 1, 2014 was a Wednesday!''
:2. Which gets more inbound bike traffic per day, on average—the Burke-Gilman trail or the Mountain to Sound trail?
::*By 'inbound' I mean ''towards'' central Seattle. For BGT, inbound means southbound, for MTS, inbound means westbound.
::*To answer this question, you will need to alter <code>bgt_traffic.py</code> so that it takes BOTH <code>bgt_bike_and_peds.csv</code> and <code>mts_bike_and_peds.csv</code> as input, and converts them into TWO separate dictionaries. <code>mts_bike_and_peds.csv</code> is also included in the <code>bgt-traffic.zip</code> file you downloaded for this week's challenges.  
::*The two files are in the same format, BUT there are a few differences that you need to account for (Hint: make sure to check the column titles and the order of the rows in each dataset!)


;Here's a much more complicated challenge but a fun one that you know enough to solve: Check out the game [http://kevan.org/catfishing.php Catfishing] which shows you categories and has you guess an article. Write a version that uses the Wikipedia API. For example, pick 5 articles and write a program that will randomly show the categories for one of those articles and ask you to guess the article. Read the guess with <code>input()</code> and let the user know if they got it right or wrong!
-->
[[Category:DS4UX (Spring 2016)]]

Revision as of 03:10, 18 April 2016

Each of the challenges this week will ask you to modify and work with code in the Seattle traffic project, which you should have installed and begun working with in class.

This week, you will be required to attempt the first FIXME challenges in this list, and upload your solution scripts via Canvas. You will not be graded on whether your solutions are correct, efficient, or even functional. But you will be marked down if you don't submit your solutions—so be sure to spend time attempting these challenges!

Being able to work through most of these challenges is a very good sign that you have mastered the important Python concepts we've covered so far. As always, it is fine to collaborate or work together on these problem sets, as long as you submit your solutions separately. And this week, please don't broadcast your responses via Canvas before Sunday night.

Challenges

  1. What day between January 1, 2014 and March 31, 2016 saw the most total traffic on the Burke-Gilman trail?
  2. What was the busiest hour of any day for northbound bike traffic? How about southbound pedestrian traffic?
  3. How much southbound traffic does the Burke-Gilman get, on average, during morning commute hours? How much does it get during evening commute hours?
  • Note: for this question, assume morning commute hours start at 7am and end at 10am, and that evening commute hours start at 4pm and end at 7pm. You can also consider these hours to be "commute hours" even on weekends, since our data doesn't contain days of the week.
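
The commute-hour windows in the note for challenge 3 can be sketched as a filter on the hour of day. This is only an illustration on made-up numbers, not the course solution: the names <code>average_for_hours</code> and <code>sample_counts</code> are hypothetical, and it assumes each record is labeled with the hour in which it starts, so a window that "ends at 10am" includes the 9am hour but not the 10am hour. You will need to adapt it to however <code>bgt_traffic.py</code> actually stores its data.

```python
# Hypothetical sketch: average hourly counts inside a commute window.
# Assumption: each record is (hour_of_day, count), with hour_of_day in 0-23,
# and a record labeled 9 covers 9:00 up to 10:00.

MORNING_HOURS = range(7, 10)   # the 7am, 8am, and 9am hours
EVENING_HOURS = range(16, 19)  # the 4pm, 5pm, and 6pm hours


def average_for_hours(records, hours):
    """Return the mean count over records whose hour falls in `hours`."""
    selected = [count for hour, count in records if hour in hours]
    if not selected:
        return 0.0  # avoid dividing by zero when nothing matches
    return sum(selected) / len(selected)


# Made-up sample data, NOT taken from the real CSV file.
sample_counts = [
    (6, 10), (7, 40), (8, 60), (9, 50), (10, 20),
    (16, 30), (17, 90), (18, 60), (19, 5),
]

morning_avg = average_for_hours(sample_counts, MORNING_HOURS)
evening_avg = average_for_hours(sample_counts, EVENING_HOURS)
print("morning:", morning_avg)
print("evening:", evening_avg)
```

Using <code>range</code> objects for the windows keeps the boundary decision (is the 10am hour in or out?) in one place, which makes it easy to change if you read the note differently.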

Bonus challenges

You don't need to complete these to get a "complete" grade for this assignment, but you should attempt them anyway!

1. What day of the week does the Burke-Gilman trail experience the most overall traffic, on average? The most pedestrian traffic?
  • Hint: it will help to know that January 1, 2014 was a Wednesday!
2. Which gets more inbound bike traffic per day, on average—the Burke-Gilman trail or the Mountain to Sound trail?
  • By 'inbound' I mean towards central Seattle. For BGT, inbound means southbound, for MTS, inbound means westbound.
  • To answer this question, you will need to alter bgt_traffic.py so that it takes BOTH bgt_bike_and_peds.csv and mts_bike_and_peds.csv as input, and converts them into TWO separate dictionaries. mts_bike_and_peds.csv is also included in the bgt-traffic.zip file you downloaded for this week's challenges.
  • The two files are in the same format, BUT there are a few differences that you need to account for (Hint: make sure to check the column titles and the order of the rows in each dataset!)
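
The hint in bonus challenge 1 (January 1, 2014 was a Wednesday) is enough to work out the weekday of any later date with modular arithmetic. The sketch below is one possible way to do that, not the required approach: the names <code>DAY_NAMES</code>, <code>START_DATE</code>, and <code>day_of_week</code> are hypothetical, and it assumes you have already turned each date in the file into a <code>datetime.date</code>.

```python
from datetime import date

# Hypothetical sketch: derive the weekday from the known starting point.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

START_DATE = date(2014, 1, 1)                # first day in the dataset
START_INDEX = DAY_NAMES.index("Wednesday")   # given by the hint


def day_of_week(d):
    """Return the weekday name for date `d`, counting from START_DATE."""
    days_elapsed = (d - START_DATE).days
    return DAY_NAMES[(START_INDEX + days_elapsed) % 7]


print(day_of_week(date(2014, 1, 1)))
print(day_of_week(date(2014, 1, 5)))
```

Python's <code>date.weekday()</code> method gives the same answer directly (Monday is 0), so it is a handy way to check your own arithmetic.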