[[File:Highfivekitten.jpeg|200px|thumb|In which you learn how to use Python and web APIs to meet the likes of her!]]
== Lecture Outline ==
;Introduction and context
* You can write some tools in Python now. Congratulations!
* Today we'll learn how to find or create data sets
* Next week we'll get into data science (asking and answering questions)

;Outline:

* What is an API?
* How do we use one to fetch interesting datasets?
* How do we write programs that use the internet?
* How can we use the placekitten API to fetch kitten pictures?
* Introduction to structured data (JSON)
* How do we use APIs in general?

;What is a (web) API?

* API: a structured way for programs to talk to each other (aka an interface for programs)
* Web APIs: like a website your programs can visit (you are to a website as your program is to a web API)

; How do we use an API to fetch datasets?

Basic idea: your program sends a request, the API sends data back
* Where do you direct your request? The site's API endpoint.
** For example: Wikipedia's web API endpoint is http://en.wikipedia.org/w/api.php
* How do you write your request? Put together a URL; it will be different for different web APIs.
** Check the documentation, look for code samples
* How do you send a request?
** Python has modules you can use, like <code>requests</code> (they make HTTP requests)
* What do you get back?
** Structured data (usually in the JSON format)
* How do you understand (i.e. parse) the data?
** There's a module for that: <code>json</code>

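The request-and-response cycle above can be sketched roughly as follows. This is a minimal sketch, not the course's own code: the helper names <code>build_url</code> and <code>fetch_json</code> are made up for illustration, and the query parameters (<code>action=query</code>, <code>prop=info</code>, <code>titles</code>, <code>format=json</code>) are just one plausible request to the Wikipedia endpoint. Check the API documentation for the parameters you actually need.

```python
from urllib.parse import urlencode

# Wikipedia's web API endpoint, as given in the outline above.
ENDPOINT = "http://en.wikipedia.org/w/api.php"


def build_url(endpoint, params):
    """Put together a request URL from an endpoint and a dict of parameters."""
    return endpoint + "?" + urlencode(params)


def fetch_json(url):
    """Send the request and parse the JSON that comes back."""
    # Imported here so that build_url() can be tried without installing requests.
    import requests

    response = requests.get(url)
    response.raise_for_status()  # stop early on HTTP errors such as 404 or 500
    return response.json()


# Hypothetical example query: basic page information for one article.
params = {"action": "query", "prop": "info", "titles": "Kitten", "format": "json"}
url = build_url(ENDPOINT, params)
print(url)
```

Calling <code>fetch_json(url)</code> would then send the request and return the parsed data as ordinary Python dictionaries and lists.
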
; How do we write Python programs that make web requests?

To use APIs to build a dataset we will need:
* all our tools from last session: variables, etc.
* the ability to open URLs on the web
* the ability to create custom URLs
* the ability to save to files
* the ability to understand (i.e., parse) JSON data that APIs usually give us

; New programming concepts:

* interpolate variables into a string using <code>%</code> and <code>%()s</code>
* <code>requests</code>
* open files and write to them

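As a rough illustration of two of these concepts, the sketch below interpolates values into a string and then writes the result to a file. The file name <code>kitten_urls.txt</code> and the variable names are invented for this example.

```python
import os
import tempfile

# Interpolate a single value with %d (for integers) or %s (for anything).
width = 200
print("width is %d pixels" % width)

# Interpolate several values by name with %(name)s and a dictionary.
url = "http://placekitten.com/%(width)s/%(height)s" % {"width": 200, "height": 300}
print(url)

# Open a file for writing and save some text to it.
# The file name is hypothetical; the temporary directory keeps the example tidy.
path = os.path.join(tempfile.gettempdir(), "kitten_urls.txt")
with open(path, "w") as output_file:
    output_file.write(url + "\n")

# Read the file back to confirm what was written.
with open(path) as input_file:
    saved = input_file.read()
print(saved)
```

The <code>with</code> statement closes the file automatically, even if something goes wrong while writing.
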
; How do we use an API to fetch kitten pictures?

[http://placekitten.com/ placekitten.com]
* An API that takes specially crafted URLs and returns appropriately sized pictures of kittens
* Exploring placekitten in a browser:
** visit the API documentation
** kittens of different sizes
** kittens in greyscale or color
* Now we write a small program to grab an arbitrary square from placekitten by asking for the size on standard input: [http://mako.cc/teaching/2014/cdsw-autumn/placekitten_raw_input.py placekitten_raw_input.py]

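A program along these lines might look like the sketch below. It is not a copy of the linked <code>placekitten_raw_input.py</code>: the function names and the output file name are assumptions made for illustration. The URL pattern (<code>/WIDTH/HEIGHT</code>, with a <code>/g/</code> prefix for greyscale) follows the scheme described on the placekitten site.

```python
def kitten_url(size, grey=False):
    """Build the placekitten URL for a square picture of the given size."""
    if grey:
        return "http://placekitten.com/g/%d/%d" % (size, size)
    return "http://placekitten.com/%d/%d" % (size, size)


def save_kitten(size, filename):
    """Download a square kitten picture and save it to a file."""
    # Third-party module; imported here so kitten_url() works without it.
    import requests

    response = requests.get(kitten_url(size))
    response.raise_for_status()
    # Pictures are binary data, so open the file in "wb" mode
    # and write response.content (bytes) rather than response.text.
    with open(filename, "wb") as image_file:
        image_file.write(response.content)


def main():
    # Ask for the size on standard input (Python 3's input() replaces raw_input()).
    size = int(input("How big should the kitten be? "))
    # The .jpeg extension is an assumption about the image format returned.
    save_kitten(size, "kitten_%d.jpeg" % size)
```

Calling <code>main()</code> runs the whole program: it prompts for a size, fetches the picture, and saves it in the current directory.
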
; Introduction to structured data (JSON, JavaScript Object Notation)

* What is JSON? A text format that is useful for more structured data
* <code>import json</code>; <code>json.loads()</code>
* looks like Python (except that strings cannot use single quotes)
* simple lists, dictionaries
* can reflect more complicated data structures
* Example file at http://mako.cc/cdsw.json
* You can parse data directly by calling <code>.json()</code> on the response returned by a <code>requests</code> call

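The points above can be tried out with a short sketch like the following. The JSON text here is made up for the example and is not the content of the example file linked above.

```python
import json

# A made-up JSON document: a dictionary holding a string, a number, and a list.
text = '{"name": "Mittens", "age": 3, "toys": ["yarn", "laser pointer"]}'

cat = json.loads(text)  # JSON text -> Python dictionary
print(cat["name"])
print(cat["toys"][0])

# JSON requires double quotes, so Python-style single quotes are rejected.
try:
    json.loads("{'name': 'Mittens'}")
    single_quotes_ok = True
except json.JSONDecodeError:
    single_quotes_ok = False
print(single_quotes_ok)

# json.dumps() goes the other way: Python data -> JSON text.
print(json.dumps({"kittens": [1, 2, 3]}))
```

Once parsed, the data behaves like any other Python dictionary or list, so everything from the previous session (indexing, loops, and so on) still applies.
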
; Using other APIs

* every API is different, so read the documentation!
* If the documentation isn't helpful, search online
* for popular APIs, there are Python modules that help you make requests and parse JSON

Possible issues:
* rate limiting
* authentication
* text encoding issues

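Of the issues listed, rate limiting is the easiest to illustrate in a few lines. The sketch below is one possible approach and does not come from the lecture: the function name <code>get_with_retries</code> and its parameters are hypothetical, and it assumes the API signals rate limiting with HTTP status 429 ("Too Many Requests"), which many but not all APIs do.

```python
import time


def get_with_retries(fetch, url, max_tries=3, wait_seconds=1.0):
    """Call fetch(url), pausing and retrying while the API says "slow down".

    fetch is any function that returns an object with a status_code
    attribute, for example requests.get.
    """
    for attempt in range(max_tries):
        response = fetch(url)
        # 429 is the standard HTTP status code for "Too Many Requests".
        if response.status_code != 429:
            return response
        if attempt < max_tries - 1:
            # Wait a little longer after each rate-limited attempt.
            time.sleep(wait_seconds * (attempt + 1))
    # Still rate limited after max_tries attempts; hand back the last response.
    return response
```

With <code>requests</code> installed, this could be used as <code>get_with_retries(requests.get, url)</code>. Authentication and text encoding vary too much between APIs for a generic example, so the API's own documentation is the place to look.
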
== Other Potentially Useful Resources ==

My friend Frances gave a version of this lecture last year and created slides. They are written for Python 2, so the code might not all work (remember, use <code>print()</code> with parentheses) but the basic ideas might be helpful:

* [http://mako.cc/teaching/2014/cdsw-autumn/lecture2-web_apis.pdf Slides (PDF)] (for viewing)
* [http://mako.cc/teaching/2014/cdsw-autumn/lecture2-web_apis.odp Slides (ODP LibreOffice Slides Format)] (for editing and modification)

[[Category:DS4UX (Spring 2016)]]