Community Data Science Course (Spring 2015)/Day 6 Coding Challenges: Difference between revisions

Latest revision as of 02:10, 8 May 2015

Who are my followers?

Write a program to find out how many people a particular user follows.
For each of your followers, find out how many followers they have.
Make a "famous ratio" for a given user, which I'll define as the number of followers a person has divided by the number of people they follow. Try out @makoshark and @pontifex (the Pope). Whose ratio is higher?
~~Identify the follower you have that also follows the most of your followers.~~
~~How many users follow you but none of your followers?~~
~~Repeat these analyses for people you follow, rather than that follow you.~~
Identify the "famous ratio" for every one of your followers or friends. Who has the highest one?
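
The famous ratio described in the list above can be sketched as a small helper. This is a minimal sketch rather than the course's solution: it assumes each user is an object exposing `followers_count` and `friends_count` (the field names carried by Twitter user objects), and the names `famous_ratio` and `highest_ratio` as well as the sample numbers are made up for illustration.

```python
from types import SimpleNamespace


def famous_ratio(user):
    """Return followers divided by people followed for one user object."""
    # Assumption: `user` has followers_count and friends_count attributes,
    # as Twitter user objects do.
    if user.friends_count == 0:
        # Avoid dividing by zero for accounts that follow nobody.
        return float("inf")
    return user.followers_count / user.friends_count


def highest_ratio(users):
    """Return the user with the largest famous ratio."""
    return max(users, key=famous_ratio)


# Made-up numbers for illustration only, not real account data.
alice = SimpleNamespace(screen_name="alice", followers_count=300, friends_count=150)
bob = SimpleNamespace(screen_name="bob", followers_count=50, friends_count=200)

print(famous_ratio(alice))                      # 2.0
print(highest_ratio([alice, bob]).screen_name)  # alice
```

If the user objects returned by your Twitter library carry those two fields, they could probably be passed to `famous_ratio` directly; the stand-in objects here only keep the sketch runnable without credentials.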

Topics and Trends

Modify twitter3.py to produce a list of 1000 tweets about a topic of your choice.
Look at those tweets. How does Twitter interpret a two-word query like "data science"?
Do the previous step but eliminate retweets [hint: look at the tweet object!]
For each original tweet, list the number of times you see it retweeted.
Get a list of the URLs that are associated with your topic using Twitter.
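
The retweet and URL steps in the list above could be approached along these lines. This is a hedged sketch, not the contents of twitter3.py: it assumes each tweet is available as a dictionary shaped like Twitter's JSON, where a retweet carries a `retweeted_status` entry and links appear under `entities` and `urls`. The helper names and the sample tweets are invented for illustration.

```python
from collections import Counter


def is_retweet(tweet):
    """Treat a tweet dict as a retweet if it carries retweeted_status."""
    return "retweeted_status" in tweet


def count_retweets(tweets):
    """Count how many retweets of each original tweet id appear in tweets."""
    counts = Counter()
    for tweet in tweets:
        if is_retweet(tweet):
            counts[tweet["retweeted_status"]["id"]] += 1
    return counts


def collect_urls(tweets):
    """Gather the expanded URLs listed in each tweet's entities, if any."""
    urls = []
    for tweet in tweets:
        for url in tweet.get("entities", {}).get("urls", []):
            urls.append(url["expanded_url"])
    return urls


# Made-up sample tweets, shaped like the assumed JSON.
sample = [
    {"id": 1, "text": "Intro to data science",
     "entities": {"urls": [{"expanded_url": "http://example.com/a"}]}},
    {"id": 2, "text": "RT @someone: Intro to data science",
     "retweeted_status": {"id": 1}, "entities": {"urls": []}},
    {"id": 3, "text": "RT @someone: Intro to data science",
     "retweeted_status": {"id": 1}},
]

originals = [t for t in sample if not is_retweet(t)]
print(len(originals))             # 1
print(count_retweets(sample)[1])  # 2
print(collect_urls(sample))       # ['http://example.com/a']
```

Note that this only counts retweets that show up in your own sample of tweets, which is what the exercise asks for; it says nothing about how often a tweet was retweeted overall.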

Geolocation

Alter the streaming code to include a "locations" filter. You need to use the order sw_lng, sw_lat, ne_lng, ne_lat for the four coordinates.
What are people tweeting about in Times Square today?
Set up a bounding box around Times Square and another around NYC as a whole.
Do "static" (i.e., not using the streaming API) geolocation search using code like this:

d = api.search(geocode='37.781157,-122.398720,1mi')
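
The coordinate order for the "locations" filter is easy to get backwards, so a small helper may help. The following is a sketch under stated assumptions: the function names are hypothetical, and the Times Square and NYC coordinates are rough, hand-picked placeholders rather than official boundaries.

```python
def bounding_box(sw_lng, sw_lat, ne_lng, ne_lat):
    """Return the corners as a flat list: sw_lng, sw_lat, ne_lng, ne_lat."""
    # The south-west corner must lie below and to the left of the north-east one.
    if sw_lng >= ne_lng or sw_lat >= ne_lat:
        raise ValueError("south-west corner must be south and west of north-east corner")
    return [sw_lng, sw_lat, ne_lng, ne_lat]


def contains(box, lng, lat):
    """Check whether a (longitude, latitude) point falls inside the box."""
    sw_lng, sw_lat, ne_lng, ne_lat = box
    return sw_lng <= lng <= ne_lng and sw_lat <= lat <= ne_lat


# Approximate placeholder coordinates; adjust them before relying on the results.
TIMES_SQUARE = bounding_box(-73.99, 40.75, -73.98, 40.76)
NYC = bounding_box(-74.26, 40.49, -73.70, 40.92)

print(TIMES_SQUARE)                             # [-73.99, 40.75, -73.98, 40.76]
print(contains(TIMES_SQUARE, -73.9855, 40.758)) # True
print(contains(TIMES_SQUARE, -73.95, 40.78))    # False
print(contains(NYC, -73.95, 40.78))             # True
```

The flat list is in the order the exercise specifies (longitude before latitude), which is the reverse of the latitude-first order used in the `geocode` string above. The `contains` check can also be used to sort collected tweets into "Times Square" and "rest of NYC" after the fact.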

@@ Line 1: / Line 1: @@
-=== Potential exercises ===
+== Who are my followers? ==
-'''Who are my followers?'''
+# Write a program to find out how many people a particular user follows?
+# For each of your followers, find out how many followers they have.
+# Make a "famous ratio" for a given user which I'll define as '''number of followers a person has divided by number of people they follow.'' Try out @makoshark, and @pontifex (the Pope). Who is higher?
+# <strike>Identify the follower you have that also follows the most of your followers.</strike>
+# <strike>How many users follow you but none of your followers?</strike>
+# <strike>Repeat these analyses for people you follow, rather than that follow you.</strike>
+# Identify the "famous ratio" for every one of your followers or friends? Who has the highest one?
-) Use sample 2 to get your followers.
+== Topics and Trends ==
-) For each of your followers, get *their* followers (investigate time.sleep to throttle your computation)
+# Modify <code>twitter3.py</code> to produce a list of 1000 tweets about a topic of your choice.
+# Look at those tweets. How does twitter interpret a two word query like "data science"
+# Do the previous step but eliminate retweets [hint: look at the tweet object!]
+# For each tweet original tweet, list the number of times you see it retweeted.
+# Get a list of the URLs that are associated with your topic using Twitter.
-) Identify the follower you have that also follows the most of your followers.
+== Geolocation ==
-) How many handles follow you but none of your followers?
+# Alter the streaming code to include a "locations" filter. You need to use the order sw_lng, sw_lat, ne_lng, ne_lat for the four coordinates.
+# What are people tweeting about in Times Square today?
+# Set up a bounding box around TS and around NYC as a whole.
+# Do "static" (i.e., not using the streaming API) geolocation search using code like this:
-) Repeat this for people you follow, rather than that follow you.
+ d = api.search(geocode='37.781157,-122.398720,1mi')
-'''Topics and Trends'''
-) Use sample 3 to produce a list of 1000 tweets about a topic.
-) Look at those tweets. How does twitter interpret a two word query like "data science"
-) Eliminate retweets [hint: look at the tweet object!]
-) For each tweet original tweet, list the number of times you see it retweeted.
-) Get a list of the URLs that are associated with your topic.
-'''Geolocation'''
-) Alter the streaming algorithm to include a "locations" filter. You need to use the order sw_lng, sw_lat, ne_lng, ne_lat for the four coordinates.
-) What are people tweeting about in Times Square today?
-.5) Bonus points: set up a bounding box around TS and around NYC as a whole.
-Can you find words that are more likely to appear in TS?
-) UW is playing Arizona in football today. Set up a bounding box around the Arizona stadium and around UW. Can you identify tweets about football? Who tweets more about the game?
-# you can use d = api.search(geocode='37.781157,-122.398720,1mi')  to do
-# static geo search.