| == Review of some important Week 2 concepts ==
| | [[File:Highfivekitten.jpeg|200px|thumb|In which you learn how to use Python and web APIs to meet the likes of her!]] |
| ===Lists===
| |
|
| |
|
| * Use lists to store data where order matters.
| | == Lecture Outline == |
| * Lists are indexed starting with 0.
| | ;Introduction and context |
|
| |
|
| ====List initialization====
| | * You can write some tools in Python now. Congratulations! |
| | * Today we'll learn how to find/create data sets |
| | * Next week we'll get into data science (asking and answering questions) |
|
| |
|
| >>> my_list = []
| |
| >>> my_list
| |
| []
| |
| >>> your_list = ["a", "b", "c", 1, 2, 3]
| |
| >>> your_list
| |
| ['a', 'b', 'c', 1, 2, 3]
| |
|
| |
|
| ====Accessing and adding elements to a list====
| | ;Outline: |
|
| |
|
| >>> len(my_list)
| | * What is an API? |
| 0
| | * How do we use one to fetch interesting datasets? |
| >>> my_list[0]
| | * How do we write programs that use the internet? |
| Traceback (most recent call last):
| | * How can we use the placekitten API to fetch kitten pictures? |
| File "<stdin>", line 1, in <module>
| | * Introduction to structured data (JSON) |
| IndexError: list index out of range
| | * How do we use APIs in general? |
| >>> my_list.append("Alice")
| |
| >>> my_list
| |
| ['Alice']
| |
| >>> len(my_list)
| |
| 1
| |
| >>> my_list[0]
| |
| 'Alice'
| |
| >>> my_list.insert(0, "Amy")
| |
| >>> my_list
| |
| ['Amy', 'Alice']
| |
|
| |
|
| >>> my_list = ['Amy', 'Alice']
| |
| >>> 'Amy' in my_list
| |
| True
| |
| >>> 'Bob' in my_list
| |
| False
| |
|
| |
|
| ====Changing elements in a list====
| | ;What is a (web) API? |
|
| |
|
| >>> your_list = []
| | * API: a structured way for programs to talk to each other (aka an interface for programs) |
| >>> your_list.append("apples")
| | * Web APIs: like a website your programs can visit (you:a website::your program:a web API) |
| >>> your_list[0]
| |
| 'apples'
| |
| >>> your_list[0] = "bananas"
| |
| >>> your_list
| |
| ['bananas']
| |
|
| |
|
| ====Slicing lists====
| |
|
| |
|
| >>> her_list = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
| | ; How do we use an API to fetch datasets? |
| >>> her_list[0]
| |
| 'a'
| |
| >>> her_list[0:3]
| |
| ['a', 'b', 'c']
| |
| >>> her_list[:3]
| |
| ['a', 'b', 'c']
| |
| >>> her_list[-1]
| |
| 'h'
| |
| >>> her_list[5:]
| |
| ['f', 'g', 'h']
| |
| >>> her_list[:]
| |
| ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
| |
|
| |
|
| ==== Sorting lists ====
| | Basic idea: your program sends a request, the API sends data back |
| | * Where do you direct your request? The site's API endpoint. |
| | ** For example: Wikipedia's web API endpoint is http://en.wikipedia.org/w/api.php |
| | * How do I write my request? Put together a URL; it will be different for different web APIs. |
| | ** Check the documentation, look for code samples |
| | * How do you send a request? |
| | ** Python has modules you can use, like <code>requests</code> (they make HTTP requests) |
| | * What do you get back? |
| | ** Structured data (usually in the JSON format) |
| | * How do you understand (i.e. parse) the data? |
| | ** There's a module for that! |
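The request/response steps above can be sketched without touching the network. This is a minimal illustration, not the lecture's own script: the endpoint is the one named in the notes, but the query parameters and the sample response text are assumptions made up for the example, so check the API documentation before relying on them.

```python
import json
from urllib.parse import urlencode

# Endpoint named in the notes above.
ENDPOINT = "http://en.wikipedia.org/w/api.php"

# Illustrative parameters only (an assumption); the API docs list the real options.
params = {"action": "query", "titles": "Python", "format": "json"}

# Step 1: put together the request URL.
url = ENDPOINT + "?" + urlencode(params)
print(url)

# Step 2 (not run here): send the request, for example with requests.get(url).

# Step 3: parse the JSON text that comes back.
# This response string is invented for the example, not real API output.
sample_response = '{"query": {"pages": {"1": {"title": "Python"}}}}'
data = json.loads(sample_response)
print(data["query"]["pages"]["1"]["title"])
```

Once the URL looks right in a browser, the same string can be handed to <code>requests</code>.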
|
| |
|
| Use <code>.sort()</code> to sort a list:
| |
|
| |
|
| >>> names = ["Eliza", "Joe", "Henry", "Harriet", "Wanda", "Pat"]
| | ; How do we write Python programs that make web requests? |
| >>> names.sort()
| |
| >>> names
| |
| ['Eliza', 'Harriet', 'Henry', 'Joe', 'Pat', 'Wanda']
| |
| >>> names.sort(reverse=True)
| |
| >>> names
| |
| ['Wanda', 'Pat', 'Joe', 'Henry', 'Harriet', 'Eliza']
| |
|
| |
|
| ==== Getting the maximum and minimum values from a list ====
| | To use APIs to build a dataset we will need: |
| | * all our tools from last session: variables, etc |
| | * the ability to open URLs on the web |
| | * the ability to create custom URLs |
| | * the ability to save to files |
| | * the ability to understand (i.e., parse) JSON data that APIs usually give us |
|
| |
|
| >>> numbers = [0, 3, 10, -1]
| |
| >>> max(numbers)
| |
| 10
| |
| >>> min(numbers)
| |
| -1
| |
|
| |
|
| | ; New programming concepts: |
|
| |
|
| | * interpolate variables into a string using % and %()s |
| | * requests |
| | * open files and write to them |
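As a small sketch of two of these concepts, the block below interpolates variables into strings with <code>%</code> and <code>%()s</code>, then writes the results to a file and reads them back. The names, the message text, and the file name are all made up for illustration, and the temporary folder is used only so the example cleans up after itself.

```python
import os
import tempfile

name = "Frances"
count = 3

# Positional interpolation: values are filled in from left to right.
greeting = "Hello, %s! You have %d new messages." % (name, count)

# Named interpolation: each %(key)s is looked up in a dictionary.
report = "%(user)s has %(total)d new messages" % {"user": name, "total": count}

# Open a file for writing ("w"), write two lines, then read them back.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "messages.txt")
    with open(path, "w") as output_file:
        output_file.write(greeting + "\n")
        output_file.write(report + "\n")
    with open(path) as input_file:
        saved_text = input_file.read()

print(saved_text)
```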
|
| |
|
| == New concepts for Week 3 exercises and challenges ==
| |
|
| |
|
| ===More string functions===
| | ; How do we use an API to fetch kitten pictures? |
|
| |
|
| ==== Formatting strings ====
| | [http://placekitten.com/ placekitten.com] |
| Formatting strings makes it much easier to combine text with other types of object (like ints, floats, and bools) and then do something with the result, such as printing it.
| | * API that takes specially crafted URLs and gives appropriately sized picture of kittens |
| | * Exploring placekitten in a browser: |
| | ** visit the API documentation |
| | ** kittens of different sizes |
| | ** kittens in greyscale or color |
| | * Now we write a small program to grab an arbitrary square from placekitten by asking for the size on standard in: [http://mako.cc/teaching/2014/cdsw-autumn/placekitten_raw_input.py placekitten_raw_input.py] |
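A minimal sketch of building placekitten URLs in Python follows. It assumes the URL patterns described in the placekitten documentation (width/height, with a <code>g/</code> prefix for greyscale); the helper name <code>kitten_url</code> is hypothetical and is not taken from the linked script.

```python
def kitten_url(width, height, grey=False):
    """Build a placekitten URL for a picture of the given size.

    Assumes the documented patterns /WIDTH/HEIGHT and /g/WIDTH/HEIGHT.
    """
    prefix = "g/" if grey else ""
    return "http://placekitten.com/%s%d/%d" % (prefix, width, height)


# A colour kitten, 200 pixels wide and 300 pixels tall.
print(kitten_url(200, 300))

# A square greyscale kitten.
print(kitten_url(200, 200, grey=True))
```

To download the picture itself, the URL could be passed to <code>requests.get()</code> and the response's <code>.content</code> written to a file opened in <code>"wb"</code> mode.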
|
| |
|
| >>> x = 1
| |
| >>> y = 1.234
| |
| >>> z = True
| |
| >>> w = "elevator"
| |
| >>> all_together_now = "You can put ints like %d, floating point numbers like %f, boolean values like %s, and other strings like %s into a string without changing them to strings first!" % (x,y,z,w)
| |
| >>> print(all_together_now)
| |
|
| |
|
| ==== Dealing with whitespace ====
| | ; Introduction to structured data (JSON, JavaScript Object Notation) |
| >>> text = " this is a text string with lots of extra spaces "
| |
| >>> text.strip()
| |
| 'this is a text string with lots of extra spaces'
| |
| >>> text.split()
| |
| ['this', 'is', 'a', 'text', 'string', 'with', 'lots', 'of', 'extra', 'spaces']
| |
| >>> " ".join(text.split())
| |
| 'this is a text string with lots of extra spaces'
| |
|
| |
|
| | * JSON is a text format that is useful for more structured data |
| | * Load it in Python with <code>import json</code> and <code>json.loads()</code> |
| | * It looks like Python data, except that strings must use double quotes (no single quotes) |
| | * It can represent simple lists and dictionaries |
| | * It can also reflect more complicated, nested data structures |
| | * Example file at http://mako.cc/cdsw.json |
| | * You can parse a response directly by calling <code>.json()</code> on the result of a <code>requests</code> call |
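The points above can be tried out with the standard <code>json</code> module. The JSON text here is made up for the example; it is not the content of the example file linked above.

```python
import json

# JSON text: double-quoted strings, lowercase true/false, and null.
text = '{"name": "cdsw", "sessions": [1, 2, 3], "online": false, "room": null}'

# json.loads() turns JSON text into Python dictionaries, lists, and values.
data = json.loads(text)
print(data["sessions"][0])
print(data["online"])  # JSON false becomes Python False
print(data["room"])    # JSON null becomes Python None

# Single-quoted strings are valid Python but not valid JSON.
single_quotes_rejected = False
try:
    json.loads("{'name': 'cdsw'}")
except json.JSONDecodeError:
    single_quotes_rejected = True
print(single_quotes_rejected)

# json.dumps() goes the other way, from Python data back to JSON text.
round_trip = json.loads(json.dumps(data))
```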
|
| |
|
| ==== Tuples ====
| | ; Using other APIs |
| Tuples are similar to lists, but unlike lists, once they're created ("assigned") they can't be changed. Since most of our work involves reading and writing files and building and manipulating sets of data, we might not have too much cause to use tuples. But Python uses them a lot "behind the scenes", and they're useful for other types of programming, so we'll go over them briefly here.
| |
|
| |
|
| You can create a tuple just like a list...
| | * every API is different, so read the documentation! |
| >>> my_tuple = ("John", "Terry", "Terry", "Graham", "Eric")
| | * If the documentation isn't helpful, search online |
| | * For popular APIs, there are Python modules that help you make requests and parse JSON |
|
| |
|
| You can find items by index...
| | Possible issues: |
| >>> my_tuple[1]
| | * rate limiting |
| 'Terry'
| | * authentication |
| | * text encoding issues |
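One common way to stay under a rate limit is simply to pause between requests. The sketch below is an assumption-laden illustration: <code>fetch_politely</code> and <code>fake_fetch</code> are hypothetical helpers, and the stand-in fetch function does not contact any server. How long to wait depends on the API, so check its documentation.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=1.0):
    """Call fetch(url) for each URL, sleeping between calls."""
    results = []
    for position, url in enumerate(urls):
        if position > 0:
            # Wait before every request except the first one.
            time.sleep(delay_seconds)
        results.append(fetch(url))
    return results


def fake_fetch(url):
    # Stand-in for a real request such as requests.get(url).
    return "data from " + url


pages = fetch_politely(
    ["http://example.com/a", "http://example.com/b"],
    fake_fetch,
    delay_seconds=0.1,
)
print(pages)
```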
|
| |
|
| BUT you can't edit them...
| | == Other Potentially Useful Resources == |
| >>> my_tuple[1] = "John"
| |
| ---------------------------------------------------------------------------
| |
| TypeError Traceback (most recent call last)
| |
| <ipython-input-63-2dfac7e646ea> in <module>()
| |
| ----> 1 my_tuple[1] = "John"
| |
|
| |
|
| TypeError: 'tuple' object does not support item assignment
| | My friend Frances gave a version of this lecture last year and created slides. They were written for Python 2, so not all of the code will work as-is (remember to use <code>print()</code> with parentheses), but the basic ideas might be helpful: |
|
| |
|
| | * [http://mako.cc/teaching/2014/cdsw-autumn/lecture2-web_apis.pdf Slides (PDF)] — For viewing |
| | * [http://mako.cc/teaching/2014/cdsw-autumn/lecture2-web_apis.odp Slides (ODP Libreoffice Slides Format)] — For editing and modification |
|
| |
|
| ====Generating a list of numbers easily with <code>range()</code>====
| | [[Category:DS4UX (Spring 2016)]] |
| | |
| <pre>
| |
| >>> list(range(5))
| |
| [0, 1, 2, 3, 4]
| |
| >>> for i in range(5):
| |
| ...     print("Hi" * i)
| |
| ...
| |
| | |
| Hi
| |
| HiHi
| |
| HiHiHi
| |
| HiHiHiHi</pre>
| |
| |
| In Python 3, the <code>range()</code> function returns a sequence of numbers that is generated on demand rather than a list. This is handy when you want a run of numbers on the fly instead of typing them out yourself. Wrap it in <code>list()</code> to see the numbers:
| |
| |
| >>> list(range(5))
| |
| [0, 1, 2, 3, 4]
| |
| | |
| Use <code>range</code> when you want to loop over a bunch of numbers in a list, or perform an operation a certain number of times:
| |
| | |
| >>> numbers = range(5)
| |
| >>> for number in numbers:
| |
| ...     print(number * number)
| |
| ...
| |
| 0
| |
| 1
| |
| 4
| |
| 9
| |
| 16
| |
| | |
| We could rewrite the above example like this:
| |
| | |
| >>> for number in range(5):
| |
| ...     print(number * number)
| |
| ...
| |
| 0
| |
| 1
| |
| 4
| |
| 9
| |
| 16
| |
| | |
| You can also set the start, end, and increment value (called "step") for a range.
| |
| >>> for i in range(2,20,2):
| |
| ...     print(i)
| |
| 2
| |
| 4
| |
| 6
| |
| 8
| |
| 10
| |
| 12
| |
| 14
| |
| 16
| |
| 18
| |
| | |
| === Using break statements to halt execution ===
| |
| word_list = ["the", "quick", "brown", "fox", "jumped", "over", "the", "lazy", "dog"]
| |
| letter = "z"
| |
| seen_letter = False
| |
| for word in word_list:
| |
|     if letter in word:
| |
|         seen_letter = True
| |
|         print("%s contains the letter %s" % (word, letter))
| |
|         break  # stop looping as soon as the letter has been found
| |
|     else:
| |
|         print("no %s in %s" % (letter, word))
| |
| | |
| === Get user input with <code>input()</code> ===
| |
| | |
| >>> for i in range(100):
| |
| ...     my_input = input("Please type something> ")
| |
| ...     if my_input == "Quit":
| |
| ...         print("Goodbye!")
| |
| ...         break
| |
| ...     else:
| |
| ...         print("You said: " + my_input)
| |
| ...
| |
| Please type something> Hello
| |
| You said: Hello
| |
| Please type something> How are you?
| |
| You said: How are you?
| |
| Please type something> Quit
| |
| Goodbye!
| |
| >>>
| |
| | |
| === Iterating an indeterminate number of times with <code>while</code> loops ===
| |
| | |
| grocery_list = []
| |
| testAnswer = input('Press y if you want to enter more groceries: ')
| |
| while testAnswer == 'y':
| |
|     food = input('Next item:')
| |
|     grocery_list.append(food)
| |
|     testAnswer = input('Press y if you want to enter more groceries: ')
| |
| |
| print('Your grocery list:')
| |
| for food in grocery_list:
| |
|     print(food)
| |
| | |
| ===Dictionaries===
| |
| | |
| * Use dictionaries to store key/value pairs.
| |
| * Dictionaries keep their items in insertion order in Python 3.7 and later; older versions did not guarantee any ordering.
| |
| * A given key can only have one value, but multiple keys can have the same value.
| |
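A short sketch of the rule about keys and values, using made-up names and flavors: two keys may share a value, and assigning to an existing key replaces its value rather than adding a second entry.

```python
# Two different keys may share the same value.
favorites = {"Alice": "chocolate", "Bob": "chocolate"}

# Assigning to an existing key replaces its value; no second "Alice" is added.
favorites["Alice"] = "mint chip"

print(favorites)
print(len(favorites))
```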
| | |
| ====Initialization====
| |
| | |
| >>> my_dict = {}
| |
| >>> my_dict
| |
| {}
| |
| >>> your_dict = {"Alice" : "chocolate", "Bob" : "strawberry", "Cara" : "mint chip"}
| |
| >>> your_dict
| |
| {'Alice': 'chocolate', 'Bob': 'strawberry', 'Cara': 'mint chip'}
| |
| | |
| ====Types====
| |
| | |
| >>> type(my_dict)
| |
| <class 'dict'>
| |
| | |
| ====Adding and removing elements ====
| |
| | |
| >>> your_dict["Dora"] = "vanilla"
| |
| >>> your_dict
| |
| {'Alice': 'chocolate', 'Bob': 'strawberry', 'Cara': 'mint chip', 'Dora': 'vanilla'}
| |
| | |
| >>> del your_dict["Dora"]
| |
| >>> your_dict
| |
| {'Alice': 'chocolate', 'Bob': 'strawberry', 'Cara': 'mint chip'}
| |
| | |
| ====Accessing elements of a dictionary====
| |
| | |
| >>> your_dict["Alice"]
| |
| 'chocolate'
| |
| >>> your_dict.get("Alice")
| |
| 'chocolate'
| |
| | |
| >>> your_dict["Eve"]
| |
| Traceback (most recent call last):
| |
| File "<stdin>", line 1, in <module>
| |
| KeyError: 'Eve'
| |
| >>> "Eve" in your_dict
| |
| False
| |
| >>> "Alice" in your_dict
| |
| True
| |
| >>> your_dict.get("Eve")
| |
| >>> person = your_dict.get("Eve")
| |
| >>> print(person)
| |
| None
| |
| >>> print(type(person))
| |
| <class 'NoneType'>
| |
| >>> your_dict.get("Alice")
| |
| 'chocolate'
| |
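<code>.get()</code> also accepts a second argument, which is returned instead of <code>None</code> when the key is missing. A minimal sketch, reusing the dictionary from this section; the fallback text is made up for the example:

```python
your_dict = {"Alice": "chocolate", "Bob": "strawberry", "Cara": "mint chip"}

# The second argument is the value to return when the key is not present.
flavor = your_dict.get("Eve", "no favorite recorded")
print(flavor)

# This avoids a KeyError when looping over names that may be missing.
for name in ["Alice", "Eve"]:
    print("%s: %s" % (name, your_dict.get(name, "no favorite recorded")))
```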
| | |
| ==== Dictionary keys can be integers, and their values can be any data type ====
| |
| | |
| >>> mixed_dict = {1:3, 2:'two', 3:False, 'four':['john','terry','graham']}
| |
| >>> print(mixed_dict[1])
| |
| 3
| |
| >>> print(mixed_dict[2])
| |
| two
| |
| >>> print(mixed_dict[3])
| |
| False
| |
| >>> print(mixed_dict['four'][2])
| |
| graham
| |
| | |
| ====Changing elements of a dictionary====
| |
| | |
| >>> your_dict["Alice"] = "coconut"
| |
| >>> your_dict
| |
| {'Alice': 'coconut', 'Bob': 'strawberry', 'Cara': 'mint chip'}
| |
| | |
| ====Looping through a dictionary====
| |
| The dictionary methods <code>.items()</code>, <code>.keys()</code>, and <code>.values()</code> give you a lot of flexibility when iterating through dictionaries.
| |
| |
| >>> for i in your_dict.items():
| |
| ...     print(i)
| |
| ('Alice', 'coconut')
| |
| ('Bob', 'strawberry')
| |
| ('Cara', 'mint chip')
| |
| |
| >>> for i_key in your_dict.keys():
| |
| ...     print(i_key + " is a key in this dictionary")
| |
| Alice is a key in this dictionary
| |
| Bob is a key in this dictionary
| |
| Cara is a key in this dictionary
| |
| |
| >>> for i_val in your_dict.values():
| |
| ...     print(i_val + " is a value in this dictionary")
| |
| coconut is a value in this dictionary
| |
| strawberry is a value in this dictionary
| |
| mint chip is a value in this dictionary
| |
| |
| >>> for i_key, i_val in your_dict.items():
| |
| ...     print(i_key + " is the key for " + i_val)
| |
| ...     print(i_val + " is the value for " + i_key)
| |
| ...     print("\n")
| |
| Alice is the key for coconut
| |
| coconut is the value for Alice
| |
| ...
| |
| Bob is the key for strawberry
| |
| strawberry is the value for Bob
| |
| ...
| |
| Cara is the key for mint chip
| |
| mint chip is the value for Cara
| |
| | |
| ==== Sorting dictionaries with <code>operator</code> and <code>itemgetter</code> ====
| |
| |
| We've already seen how <code>.sort()</code> sorts a list in place. The builtin <code>sorted()</code> function returns a new sorted list instead, and it accepts an optional <code>key</code> argument that tells it what to sort on. You can combine <code>sorted()</code> with the dictionary method <code>.items()</code> and the <code>itemgetter</code> function from the <code>operator</code> module to get a sorted list of a dictionary's key/value pairs!
| |
| | |
| >>> import operator
| |
| >>> family = {'ozy':2, 'jonathan':34, 'portia':10, 'eva':6, 'dana':28}
| |
| >>> sorted(family.items(), key=operator.itemgetter(1), reverse=True)
| |
| [('jonathan', 34), ('dana', 28), ('portia', 10), ('eva', 6), ('ozy', 2)]
| |
| | |
| You can also use this approach to sort other complex data structures:
| |
| | |
| >>> family = [['ozy',2], ['portia',10], ['jonathan',34], ['dana', 28], ['eva', 6]]
| |
| >>> sorted(family, key=operator.itemgetter(1))
| |
| [['ozy', 2], ['eva', 6], ['portia', 10], ['dana', 28], ['jonathan', 34]]
| |
| >>> sorted(family, key=operator.itemgetter(0), reverse=True)
| |
| [['portia', 10], ['ozy', 2], ['jonathan', 34], ['eva', 6], ['dana', 28]]
| |
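As a hedged aside, a <code>lambda</code> can stand in for <code>itemgetter</code>, and the sorted pairs can be turned back into a dictionary with <code>dict()</code>. The sketch reuses the <code>family</code> data from above and assumes Python 3.7 or later, where dictionaries keep insertion order.

```python
import operator

family = {'ozy': 2, 'jonathan': 34, 'portia': 10, 'eva': 6, 'dana': 28}

# itemgetter(1) picks the second element (the age) of each (name, age) pair.
by_age = sorted(family.items(), key=operator.itemgetter(1))

# A lambda that does the same job as itemgetter(1).
by_age_lambda = sorted(family.items(), key=lambda pair: pair[1])

# sorted() returns a new list of tuples and leaves the dictionary untouched.
# dict() rebuilds a dictionary whose keys follow the sorted order.
youngest_first = dict(by_age)
print(list(youngest_first.keys()))
```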
| | |
| == Exercise ==
| |
| <big>'''[http://jtmorgan.net/ds4ux/week3/notifications.zip Click here to download the scripts for this week's in-class exercise]'''</big>
| |