Community Data Science Course (Spring 2023)/Week 5 coding challenges
Revision as of 01:47, 25 April 2023 by Benjamin Mako Hill
There's actually nothing to download this time, so you can simply start with a fresh Jupyter notebook! Be sure to give it a nice descriptive name, as always.
Although there's nothing to download, you will likely want to look at the following resources when working through the first half of these challenges:
- Community Data Science Course (Spring 2023)/Week 5 lecture notes
- The [Week 5 lecture notebook]
- The [Week 5 lecture video]
#1 Wikipedia Page View API
- Identify a famous person that you are interested in and collect page view data on that person. Generate a time-series visualization and include a link to it in your notebook.
- Identify two other language editions of Wikipedia that have articles on that person. Collect page view data on the article in those languages and create a single visualization that shows how the dynamics are similar and/or different.
- Collect page view data on Marvel Comics and DC Comics in Wikipedia. (If you'd rather replace these examples with some other comparison of popular rivals, that's fine.)
  - Which has more total page views in 2022?
  - Can you draw a visualization of this?
  - Were there years since 2015 when the less viewed page was viewed more? How many and which ones?
  - Were there any months when this was true? How many and which ones?
  - How about any days? How many?
- I've made a file available which contains a list of several hundred titles of Wikipedia articles about Harry Potter [Forthcoming]. Can you download this file, read it in, and request monthly page view data for all of them?
  - Once you've done this, sum up all of the page views from all of the pages and print out a TSV file with these total numbers.
  - Make a time series graph of these numbers and include a link in your notebook.
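
If you're not sure where to start, here is a minimal sketch of the building blocks these challenges need. It assumes the Wikimedia REST API's per-article page view endpoint and uses only the Python standard library. All of the function names (`pageviews_url`, `fetch_items`, `views_by_period`, and so on) are hypothetical helpers made up for this example, not something from the lecture notebook, and the User-Agent string is a placeholder you should replace with your own contact details.

```python
import json
import urllib.parse
import urllib.request
from collections import defaultdict

API_ROOT = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(title, lang="en", granularity="monthly",
                  start="20150701", end="20221231"):
    """Build a per-article page view URL for one language edition."""
    # Article titles use underscores instead of spaces and must be URL-encoded.
    article = urllib.parse.quote(title.replace(" ", "_"), safe="")
    return (f"{API_ROOT}/{lang}.wikipedia/all-access/all-agents/"
            f"{article}/{granularity}/{start}/{end}")


def fetch_items(url):
    """Request the URL and return the list stored under the "items" key."""
    # Placeholder User-Agent (an assumption): put your own contact info here.
    headers = {"User-Agent": "week5-challenge-example (you@example.edu)"}
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["items"]


def views_by_period(items, prefix_length=4):
    """Sum views by timestamp prefix: 4 = year, 6 = month, 8 = day."""
    totals = defaultdict(int)
    for item in items:
        # Timestamps look like "2022010100" (YYYYMMDDHH).
        totals[item["timestamp"][:prefix_length]] += item["views"]
    return dict(totals)


def periods_where_second_wins(first, second):
    """Return the sorted periods in which `second` has more views than `first`."""
    return sorted(p for p in first if second.get(p, 0) > first[p])


def combine_totals(list_of_totals):
    """Add up several {period: views} dictionaries into one."""
    combined = defaultdict(int)
    for totals in list_of_totals:
        for period, views in totals.items():
            combined[period] += views
    return dict(combined)


def tsv_lines(totals):
    """Turn a {period: views} dictionary into TSV lines with a header row."""
    lines = ["period\tviews"]
    for period in sorted(totals):
        lines.append(f"{period}\t{totals[period]}")
    return lines
```

For the comparison questions you might call something like `fetch_items(pageviews_url("Marvel Comics", granularity="daily"))` for each page, pass the results through `views_by_period` with a prefix length of 4, 6, or 8, and then compare the two dictionaries. For the Harry Potter list, you could loop over the titles, collect one monthly dictionary per title, merge them with `combine_totals`, and write the output of `tsv_lines` to a file. The visualizations are left to you.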