Editing Community Data Science Course (Spring 2023)/Week 5 coding challenges
From CommunityData
Latest revision | Your text | ||
Line 4: | Line 4: | ||
* [[../Week 5 lecture notes]] | * [[../Week 5 lecture notes]] | ||
* | * The [Week 5 lecture notebook] {{forthcoming}} | ||
* The [Week 5 lecture video] {{forthcoming}} | |||
* The [ | |||
== #1 Wikipedia Page View API == | == #1 Wikipedia Page View API == | ||
Line 17: | Line 14: | ||
## Which has more total page views in 2022? | ## Which has more total page views in 2022? | ||
## Can you draw a visualization in a spreadsheet that shows this? (Again, provide a link.) | ## Can you draw a visualization in a spreadsheet that shows this? (Again, provide a link.) | ||
## | ## Were there years since 2015 when the less-viewed page was viewed more? How many and which ones? | ||
## | ## Were there any months when this was true? How many and which ones? | ||
## How about any days? How many? | ## How about any days? How many? | ||
# I've made [https://github.com/kayleachampion/spr23_CDSW/blob/main/curriculum/week5/list_of_washington_alternative_rocks_bands_wikipedia-2023-04-25.jsonl this file available], which includes a list of more than 100 Wikipedia articles about alternative rock bands from Washington state that I built from [https://en.wikipedia.org/wiki/Category:Alternative_rock_groups_from_Washington_(state) this category in Wikipedia].[*] It's a <code>.jsonl</code> file. Download the file (click "raw" and then save the file onto your drive). Now read it in, and request monthly page view data for all of them. If you need some help with loading it in, I've included some sample code at the bottom of this page. | # I've made [https://github.com/kayleachampion/spr23_CDSW/blob/main/curriculum/week5/list_of_washington_alternative_rocks_bands_wikipedia-2023-04-25.jsonl this file available], which includes a list of more than 100 Wikipedia articles about alternative rock bands from Washington state that I built from [https://en.wikipedia.org/wiki/Category:Alternative_rock_groups_from_Washington_(state) this category in Wikipedia].[*] It's a <code>.jsonl</code> file. Download the file (click "raw" and then save the file onto your drive). Now read it in, and request monthly page view data for all of them. If you need some help with loading it in, I've included some sample code at the bottom of this page. | ||
Line 26: | Line 23: | ||
== #2 Starting on your projects == | == #2 Starting on your projects == | ||
{{notice|If you are planning on collecting data from Reddit, please look into using the [https://pushshift.io Pushshift API] instead of the default Reddit API. The Pushshift API is not as up-to-date but it is targeted toward data scientists, not app-makers, and is | {{notice|If you are planning on collecting data from Reddit, please look into using the [https://pushshift.io Pushshift API] instead of the default Reddit API. The Pushshift API is not as up-to-date but it is targeted toward data scientists, not app-makers, and is much better suited to our needs in the class.}} | ||
In this section, you will take your first steps towards working with your project API. Many of these questions will not involve code, so just mark down your answers in cells in your notebook. | In this section, you will take your first steps towards working with your project API. Many of these questions will not involve code, so just mark down your answers in "markdown" cells in your notebook. Feel free to document any findings you think might be useful as you continue to work on your project; you might thank yourself later! | ||
Feel free to document any findings you think might be useful as you continue to work on your project; you might thank yourself later! | |||
# Identify an API you will (or might!) want to use for your project. | # Identify an API you will (or might!) want to use for your project. | ||
Line 46: | Line 39: | ||
== Notes == | == Notes == | ||
[*] You will probably not be shocked to hear that I collected this data from an API! I've included a Jupyter Notebook with the code to grab that data from | [*] You will probably not be shocked to hear that I collected this data from an API! I've included a Jupyter Notebook with the code to grab that data from the PetScan API online here. {{forthcoming}} | ||
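As a rough starting point for challenge #1, here is one way the pieces might fit together: reading a <code>.jsonl</code> file, building a request URL for the Wikimedia Pageviews REST API, and comparing two pages period by period. This is only a sketch under some assumptions, not the course's own sample code. The helper names (<code>load_jsonl</code>, <code>pageviews_url</code>, <code>fetch_items</code>, <code>views_by_period</code>, <code>periods_where_second_wins</code>) are made up for illustration, and the choice of <code>all-access</code> and <code>user</code> traffic is just one reasonable setting. Check the keys in the downloaded file yourself, since the field that holds each article title is not specified here.

```python
import json
from collections import defaultdict
from urllib.parse import quote
from urllib.request import Request, urlopen

# Base of the Wikimedia Pageviews REST API "per-article" endpoint.
API_ROOT = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def load_jsonl(path):
    """Read a .jsonl file: one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def pageviews_url(title, start, end, granularity="monthly",
                  project="en.wikipedia.org"):
    """Build a per-article URL; start and end are YYYYMMDD strings."""
    # Wikipedia titles use underscores, and characters like "/" must be escaped.
    article = quote(title.replace(" ", "_"), safe="")
    return (f"{API_ROOT}/{project}/all-access/user/"
            f"{article}/{granularity}/{start}/{end}")


def fetch_items(url, user_agent):
    """Request the URL and return the list under the "items" key.

    Wikimedia asks API clients to identify themselves, so pass a
    User-Agent string that includes a way to contact you.
    """
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request) as response:
        return json.load(response)["items"]


def views_by_period(items, prefix_len=4):
    """Sum views per period, keyed by a prefix of each item's timestamp.

    Timestamps look like "2022010100", so prefix_len=4 groups by year,
    6 by month, and 8 by day.
    """
    totals = defaultdict(int)
    for item in items:
        totals[item["timestamp"][:prefix_len]] += item["views"]
    return dict(totals)


def periods_where_second_wins(items_a, items_b, prefix_len=4):
    """List the periods in which page B had more views than page A."""
    a = views_by_period(items_a, prefix_len)
    b = views_by_period(items_b, prefix_len)
    return sorted(p for p in a.keys() & b.keys() if b[p] > a[p])
```

To use it, you might call <code>load_jsonl</code> on the downloaded file, pass each article title to <code>pageviews_url</code>, and hand the result to <code>fetch_items</code> along with your own User-Agent string. Nothing here contacts the network until <code>fetch_items</code> is called, so the other helpers can be tried out on small hand-made lists first.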