Editing Intro to Programming and Data Science (Spring 2020)/Day 8 Coding Challenges
Latest revision | Your text | ||
Line 3: | Line 3: | ||
__NOTOC__ | __NOTOC__ | ||
In this project, we will explore a few ways to gather data using the Twitter API. Once we've done that, we will extend the example code to create our own dataset of tweets. | In this project, we will explore a few ways to gather data using the Twitter API. Once we've done that, we will extend the example code to create our own dataset of tweets. In next week's assignment, we will ask and answer questions with the data we've collected. | ||
== Goals == | == Goals == | ||
Line 18: | Line 18: | ||
===Download the Twitter API project=== | ===Download the Twitter API project=== | ||
* Download the following zip file: https://github.com/CommunityDataScienceCollective/twitter-cdsw/archive/master.zip | * Download the following zip file: https://github.com/CommunityDataScienceCollective/twitter-cdsw/archive/master.zip | ||
Line 38: | Line 36: | ||
=== Making your own notebooks === | ===Making your own notebooks=== | ||
We are using [http://www.tweepy.org/ tweepy], a Python library that simplifies accessing the Twitter API. | We are using [http://www.tweepy.org/ tweepy], a Python library that simplifies accessing the Twitter API. | ||
Line 54: | Line 52: | ||
This will enable your authenticated Twitter API calls via the variable <code>api</code>. | This will enable your authenticated Twitter API calls via the variable <code>api</code>. | ||
== Potential exercises == | |||
'''Topics and Trends''' | '''Topics and Trends''' | ||
Line 70: | Line 66: | ||
This section will require you to investigate the filter function in example 2 in more detail. | This section will require you to investigate the filter function in example 2 in more detail. | ||
# Get the last 50 tweets from | # Get the last 50 tweets from Ballard. | ||
# Get the last 50 tweets from Times Square. | # Get the last 50 tweets from Times Square. | ||
# Using timestamps, can you estimate whether people tweet more often in | # Using timestamps, can you estimate whether people tweet more often in Ballard or Times Square? | ||
# A | # A baseball game happened today (May 11) between the Seattle Mariners and the Tampa Bay Rays. Using two geo searches, see if you can tell which city hosted the game. Note: if you do this some other day, you should pick a new sporting event. | ||
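One way to approach the timestamp question above is to turn each batch of tweets into a rate. The sketch below assumes you have already pulled the creation time of each tweet into a list of <code>datetime</code> objects (with tweepy this is usually the <code>created_at</code> attribute of each status, but check your own results). The function name <code>tweets_per_hour</code> and the sample timestamps are made up for illustration.

```python
from datetime import datetime


def tweets_per_hour(timestamps):
    """Estimate tweets per hour from a list of datetime objects."""
    if len(timestamps) < 2:
        return None  # not enough data to measure a time span
    span_seconds = (max(timestamps) - min(timestamps)).total_seconds()
    if span_seconds == 0:
        return None  # all tweets share one timestamp, so no rate can be computed
    # N tweets define N - 1 gaps between the earliest and latest tweet
    return (len(timestamps) - 1) / (span_seconds / 3600)


# Hypothetical timestamps standing in for two geo searches
ballard = [
    datetime(2020, 5, 11, 12, 0),
    datetime(2020, 5, 11, 13, 0),
    datetime(2020, 5, 11, 14, 0),
]
times_square = [
    datetime(2020, 5, 11, 13, 30),
    datetime(2020, 5, 11, 13, 45),
    datetime(2020, 5, 11, 14, 0),
]

print("Ballard:", tweets_per_hour(ballard))
print("Times Square:", tweets_per_hour(times_square))
```

With real data, 50 tweets that span ten minutes suggest a much busier area than 50 tweets that span a day, which is the comparison the exercise is asking for.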
'''Geolocation in the streaming | '''Geolocation in the streaming API''' | ||
# Alter the streaming algorithm to include a "locations" filter. You need to use the order sw_lng, sw_lat, ne_lng, ne_lat for the four coordinates. (Recall the stop button will stop an active process like the stream.) | # Alter the streaming algorithm to include a "locations" filter. You need to use the order sw_lng, sw_lat, ne_lng, ne_lat for the four coordinates. (Recall the stop button will stop an active process like the stream.) | ||
# What are people tweeting about in Times Square today? (Bonus points: set up a bounding box around TS and around NYC as a whole.) | # What are people tweeting about in Times Square today? (Bonus points: set up a bounding box around TS and around NYC as a whole.) | ||
# Can you find words that are more likely to appear in | # Can you find words that are more likely to appear in Times Square (hint: you'll need two bounding boxes)? | ||
# | # Oregon State is playing basketball against UC Berkeley. Set up a bounding box around Berkeley and Corvallis, Oregon. Can you identify tweets about basketball? Who tweets more about the game? Can you tell which team is the home team? | ||
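The coordinate order for the "locations" filter is easy to get wrong, so it can help to build the box with a small helper. This is only a sketch: the function names, the box size, and the approximate Times Square coordinates are assumptions for illustration, and the box is a simple rectangle in degrees rather than a true distance.

```python
def bounding_box(center_lat, center_lng, half_height_deg, half_width_deg):
    """Return [sw_lng, sw_lat, ne_lng, ne_lat] around a center point."""
    sw_lng = center_lng - half_width_deg
    sw_lat = center_lat - half_height_deg
    ne_lng = center_lng + half_width_deg
    ne_lat = center_lat + half_height_deg
    return [sw_lng, sw_lat, ne_lng, ne_lat]


def in_box(box, lat, lng):
    """Check whether a (lat, lng) point falls inside a bounding box."""
    sw_lng, sw_lat, ne_lng, ne_lat = box
    return sw_lat <= lat <= ne_lat and sw_lng <= lng <= ne_lng


# Approximate coordinates for Times Square (assumed; look up your own)
times_square_box = bounding_box(40.758, -73.985, 0.01, 0.01)
print(times_square_box)

# Assumed usage: pass this list as the "locations" filter of the stream in example 2
print(in_box(times_square_box, 40.758, -73.985))
print(in_box(times_square_box, 47.668, -122.384))
```

For the exercises that compare two places, build one box per place and use <code>in_box</code> to decide which place each tweet's coordinates belong to.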
Geolocation hint: You can use <code>d = api.search(geocode='[lat],[lng],5mi')</code> to get tweets from a 5-mile radius around a point (the Twitter search API expects latitude first, then longitude). Use Google or Bing maps to get a similar bounding box around Fenway Park. | Geolocation hint: You can use <code>d = api.search(geocode='[lat],[lng],5mi')</code> to get tweets from a 5-mile radius around a point (the Twitter search API expects latitude first, then longitude). Use Google or Bing maps to get a similar bounding box around Fenway Park. | ||
Line 91: | Line 87: | ||
# Identify the follower you have that also follows the most of your followers. | # Identify the follower you have that also follows the most of your followers. | ||
# How many handles follow you but none of your followers? | # How many handles follow you but none of your followers? | ||
# Repeat this for people you follow, rather than | # Repeat this for people you follow, rather than people who follow you. | ||
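Once the follower lists are downloaded, the overlap questions above reduce to set operations. The sketch below assumes you have already collected, for each of your followers, the set of handles that follower follows; the handles, the <code>following</code> dictionary, and the function name <code>overlap_counts</code> are all invented for illustration.

```python
def overlap_counts(my_followers, following):
    """For each of my followers, count how many of my other followers they follow."""
    my_followers = set(my_followers)
    counts = {}
    for handle in my_followers:
        follows = set(following.get(handle, ()))
        # Intersect with my followers, ignoring the handle itself
        counts[handle] = len((follows & my_followers) - {handle})
    return counts


# Hypothetical data: who follows me, and who each of them follows
my_followers = {"ana", "ben", "cat", "dev"}
following = {
    "ana": {"me", "ben", "cat"},
    "ben": {"me", "ana"},
    "cat": {"me"},
    "dev": {"me", "zed"},
}

counts = overlap_counts(my_followers, following)
top = max(counts, key=counts.get)
isolated = sorted(h for h, c in counts.items() if c == 0)

print("Follows the most of my followers:", top)
print("Follow me but none of my followers:", isolated)
```

To repeat the exercise for people you follow, swap in the set of handles you follow and, for each of them, the set of handles they follow.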
== Congratulations!!!!== | == Congratulations!!!!== | ||
You now know how to capture data from Twitter that you can use in your research!!! Next workshop we'll play with some fun analytical tools. In the meantime, here are [[Twitter words of warning|a few words of caution about using Twitter data for science]]. | You now know how to capture data from Twitter that you can use in your research!!! Next workshop we'll play with some fun analytical tools. In the meantime, here are [[Twitter words of warning|a few words of caution about using Twitter data for science]]. | ||
[[Category:Spring_2016_series]] |