Community Data Science Course (Spring 2019)/Day 3 Notes

Online Data Sets: An Important Question
Can you get bulk access to data?

Bad Signs

You must authenticate as a particular user in order to access data, and you can only see data for that user.

For example: you must log into Instagram's API as a particular user.

Look at this link!

Good Signs

The organization that owns the data wants everyone to access it, as with Wikipedia or most government data.

You may have to authenticate as a particular user, but you can access general data.

For example: once you log into Twitter, you can get all tweets about a place.

Twitter API Docs

Dictionaries

 * Use dictionaries to store key/value pairs.
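 * Do not rely on the order of items in a dictionary. Before Python 3.7, dictionaries did not guarantee any ordering; from 3.7 on, they keep insertion order.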
 * A given key can only have one value, but multiple keys can have the same value.
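
The last point can be checked directly. Here is a minimal sketch, using made-up names and flavors (the variable `flavors` is only for illustration):

```python
# Hypothetical example data: two different keys share the same value.
flavors = {"Alice": "chocolate", "Bob": "chocolate"}

# Assigning to an existing key replaces its value;
# it does not add a second value for that key.
flavors["Alice"] = "vanilla"

print(len(flavors))      # number of keys is unchanged
print(flavors["Alice"])  # the key now maps to the new value
```

The dictionary still has two keys after the assignment, because "Alice" can only point to one value at a time.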

Initialization
>>> my_dict = {}
>>> my_dict
{}
>>> your_dict = {"Alice" : "chocolate", "Bob" : "strawberry", "Cara" : "mint chip"}
>>> your_dict
{'Bob': 'strawberry', 'Cara': 'mint chip', 'Alice': 'chocolate'}

Adding elements to a dictionary
>>> your_dict["Dora"] = "vanilla"
>>> your_dict
{'Bob': 'strawberry', 'Cara': 'mint chip', 'Dora': 'vanilla', 'Alice': 'chocolate'}

Accessing elements of a dictionary
>>> your_dict["Alice"]
'chocolate'
>>> your_dict.get("Alice")
'chocolate'

>>> your_dict["Eve"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'Eve'
>>> "Eve" in your_dict
False
>>> "Alice" in your_dict
True
>>> your_dict.get("Eve")
>>> person = your_dict.get("Eve")
>>> print(person)
None
>>> print(type(person))
<class 'NoneType'>
>>> your_dict.get("Alice")
'chocolate'
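
The `get` method also accepts an optional second argument, which is returned instead of `None` when the key is missing. Here is a small sketch reusing the names above; the fallback string is an arbitrary choice for illustration:

```python
your_dict = {"Alice": "chocolate", "Bob": "strawberry", "Cara": "mint chip"}

# Missing key, no default: get returns None instead of raising KeyError.
missing = your_dict.get("Eve")

# Missing key, with a default: get returns the second argument.
fallback = your_dict.get("Eve", "no favorite recorded")

# Present key: the default is ignored and the stored value is returned.
present = your_dict.get("Alice", "no favorite recorded")

print(missing)
print(fallback)
print(present)
```

This can be handy when a missing key is expected and should not stop the program.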

Changing elements of a dictionary
>>> your_dict["Alice"] = "coconut"
>>> your_dict
{'Bob': 'strawberry', 'Cara': 'mint chip', 'Dora': 'vanilla', 'Alice': 'coconut'}

Histograms
Challenge: using the wordplay example from last week, count the number of words that start with each letter.

This kind of problem is very common in data science, and it is easy to solve with a dictionary.

(note: I will post the solution after class)
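
In the meantime, one possible approach is sketched below. It is not the official solution: the short `words` list is a stand-in for the wordplay word list, and the variable names are only illustrative.

```python
# Stand-in data: the real exercise would use last week's wordplay word list.
words = ["apple", "avocado", "banana", "blueberry", "cherry", "apricot"]

# Keys are first letters, values are how many words start with that letter.
counts = {}
for word in words:
    first_letter = word[0]
    if first_letter in counts:
        # Seen this letter before: add one to its count.
        counts[first_letter] = counts[first_letter] + 1
    else:
        # First word starting with this letter.
        counts[first_letter] = 1

print(counts)
```

The `if`/`else` is needed because looking up a letter that is not yet in the dictionary would raise a `KeyError`.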

For-loops and dictionaries
There are two common ways to iterate through dictionaries:

>>> ages = {'Tommy': 34, 'Heather': 30, 'Joanna': 20}
>>> for key in ages:
...     print(key + " is " + str(ages[key]) + " years old")

>>> for key, value in ages.items():
...     print(key + " is " + str(value) + " years old")
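
Since the order of a dictionary should not be relied on, it is sometimes useful to loop over the keys in sorted order. Here is a sketch using the built-in `sorted`, which returns the keys as a sorted list; the `lines` list is only there to collect the results:

```python
ages = {'Tommy': 34, 'Heather': 30, 'Joanna': 20}

# sorted(ages) gives the keys in alphabetical order as a new list;
# the dictionary itself is not changed.
lines = []
for name in sorted(ages):
    lines.append(name + " is " + str(ages[name]) + " years old")

for line in lines:
    print(line)
```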