DS4UX (Spring 2016)/Day 4 lecture: Difference between revisions

From CommunityData
(Created page with "200px|thumb|In which you learn how to use Python and web APIs to meet the likes of her! == Lecture Outline == ;Introduction and context * You ca...")
 
(Replaced content with "*Database concepts (re-use slides) *Introduction to Wikipedia data *Introduction to MySQL and Quarry *Querying with Socrata SOQL API Category:DS4UX (Spring 2016)")
[[File:Highfivekitten.jpeg|200px|thumb|In which you learn how to use Python and web APIs to meet the likes of her!]]
*Database concepts (re-use slides)
 
*Introduction to Wikipedia data
== Lecture Outline ==
*Introduction to MySQL and Quarry
;Introduction and context
*Querying with Socrata SOQL API
 
* You can write some tools in Python now. Congratulations!
* Today we'll learn how to find/create data sets
* Next week we'll get into data science (asking and answering questions)
 
 
;Outline:
 
* What is an API?
* How do we use one to fetch interesting datasets?
* How do we write programs that use the internet?
* How can we use the placekitten API to fetch kitten pictures?
* Introduction to structured data (JSON)
* How do we use APIs in general?
 
 
;What is a (web) API?
 
* API: a structured way for programs to talk to each other (aka an interface for programs)
* Web APIs: like a website your programs can visit (you:a website::your program:a web API)
 
 
; How do we use an API to fetch datasets?
 
Basic idea: your program sends a request, and the API sends data back.
* Where do you direct your request? The site's API endpoint.
** For example: Wikipedia's web API endpoint is http://en.wikipedia.org/w/api.php
* How do you write your request? Put together a URL; its format will be different for different web APIs.
** Check the documentation, look for code samples
* How do you send a request?
** Python has modules you can use, like <code>requests</code> (they make HTTP requests)
* What do you get back?
** Structured data (usually in the JSON format)
* How do you understand (i.e. parse) the data?
** There's a module for that!
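
The request/response cycle above can be sketched as follows. This is only an illustration, not the code used in class: the function names are made up for this example, the query parameters are one plausible MediaWiki query among many, and <code>fetch()</code> assumes the third-party <code>requests</code> module is installed and that you have network access.

```python
from urllib.parse import urlencode

# Endpoint taken from the outline above.
ENDPOINT = "http://en.wikipedia.org/w/api.php"


def build_query_url(title):
    """Build a URL asking the MediaWiki API for basic info about one page."""
    params = {
        "action": "query",  # which API module to call
        "titles": title,    # the page we are asking about
        "prop": "info",     # what we want to know about it
        "format": "json",   # ask for the answer as JSON
    }
    # urlencode joins the pairs as key=value&key=value and escapes
    # characters that are not allowed in a URL.
    return ENDPOINT + "?" + urlencode(params)


def fetch(title):
    """Send the request and return the parsed JSON (needs network access)."""
    import requests  # third-party module; imported here so the rest runs without it
    response = requests.get(build_query_url(title))
    response.raise_for_status()  # stop early on an HTTP error
    return response.json()       # parse the JSON body into Python objects


url = build_query_url("Seattle")
print(url)
```

Pasting the printed URL into a browser is a quick way to see what the API sends back before writing any parsing code.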
 
 
; How do we write Python programs that make web requests?
 
To use APIs to build a dataset we will need:
* all our tools from last session: variables, etc.
* the ability to open URLs on the web
* the ability to create custom URLs
* the ability to save to files
* the ability to understand (i.e., parse) JSON data that APIs usually give us
 
 
; New programming concepts:
 
* interpolate variables into a string using <code>%</code> and <code>%(name)s</code>
* requests
* open files and write to them
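
A minimal sketch of two of these concepts, string interpolation and writing to a file. The variable names and the file name <code>ds4ux_urls.txt</code> are hypothetical, chosen only for this example; the endpoint is the Wikipedia one mentioned earlier.

```python
import os
import tempfile

endpoint = "http://en.wikipedia.org/w/api.php"
title = "Seattle"

# % with a tuple fills the placeholders in order.
by_position = "%s?action=query&titles=%s" % (endpoint, title)

# %(name)s with a dictionary fills the placeholders by name.
by_name = "%(site)s?action=query&titles=%(page)s" % {"site": endpoint, "page": title}

# Open a file for writing ("w"), write one line, and let `with` close it.
# The temporary directory just keeps this sketch from cluttering your folder.
path = os.path.join(tempfile.gettempdir(), "ds4ux_urls.txt")
with open(path, "w") as f:
    f.write(by_position + "\n")

# Read the file back to confirm what was saved.
with open(path) as f:
    saved = f.read()

print(by_name)
print(saved)
```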
 
 
; How do we use an API to fetch kitten pictures?
 
[http://placekitten.com/ placekitten.com]
* An API that takes specially crafted URLs and returns appropriately sized pictures of kittens
* Exploring placekitten in a browser:
** visit the API documentation
** kittens of different sizes
** kittens in greyscale or color
* Now we write a small program to grab an arbitrary square picture from placekitten by asking for the size on standard input: [http://mako.cc/teaching/2014/cdsw-autumn/placekitten_raw_input.py placekitten_raw_input.py]
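
For reference, here is a Python 3 sketch of what such a program might look like. It is not the linked <code>placekitten_raw_input.py</code> script: the function names and the output file name are invented for this example, it uses <code>input()</code> where Python 2 used <code>raw_input()</code>, and it assumes the <code>requests</code> module is installed and that placekitten.com is still reachable.

```python
def kitten_url(width, height, grey=False):
    """Build a placekitten URL; the /g/ prefix asks for a greyscale picture."""
    prefix = "http://placekitten.com/g" if grey else "http://placekitten.com"
    return "%s/%d/%d" % (prefix, width, height)


def save_kitten(size, filename):
    """Download a square kitten picture and save it (needs network access)."""
    import requests  # third-party module; imported here so the rest runs without it
    response = requests.get(kitten_url(size, size))
    response.raise_for_status()
    # Pictures are binary data, so open the file in "wb" mode and
    # write response.content (bytes) rather than response.text.
    with open(filename, "wb") as f:
        f.write(response.content)


def main():
    """Ask for a size on standard input, then fetch a kitten of that size."""
    size = int(input("How big should the kitten be (in pixels)? "))
    save_kitten(size, "kitten_%d.jpg" % size)


print(kitten_url(200, 200))
print(kitten_url(200, 300, grey=True))
```

Calling <code>main()</code> at the end of the file would turn this into the interactive program described above.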
 
 
; Introduction to structured data (JSON, JavaScript Object Notation)

* What is JSON? A text format that is useful for more structured data
* <code>import json; json.loads()</code>
* Looks like Python literals (except that strings must use double quotes, not single quotes)
* simple lists, dictionaries
* can reflect more complicated data structures
* Example file at http://mako.cc/cdsw.json
* You can parse data directly by calling <code>.json()</code> on the response object that a <code>requests</code> call returns
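
A small sketch of parsing JSON with the standard <code>json</code> module. The sample text is made up for this example and is not the contents of the example file linked above.

```python
import json

# JSON text: looks like Python, but strings need double quotes and the
# special values are spelled true / false / null.
text = '{"course": "DS4UX", "sessions": [1, 2, 3, 4], "online": false, "room": null}'

# json.loads turns JSON text into ordinary Python dictionaries and lists.
data = json.loads(text)
print(data["course"])
print(data["sessions"][-1])
print(data["online"], data["room"])  # false and null become False and None

# json.dumps goes the other way: Python objects back to JSON text.
back = json.dumps(data, indent=2)

# Single-quoted strings are valid Python but not valid JSON.
try:
    json.loads("{'course': 'DS4UX'}")
    single_quotes_ok = True
except json.JSONDecodeError:
    single_quotes_ok = False
print(single_quotes_ok)
```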
 
; Using other APIs
 
* every API is different, so read the documentation!
* If the documentation isn't helpful, search online
* for popular APIs, there are Python modules that help you make requests and parse JSON
 
Possible issues:
* rate limiting
* authentication
* text encoding issues
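
One common way to cope with rate limiting is to wait and retry when the API answers with HTTP status 429 (Too Many Requests), doubling the wait each time. The sketch below is an assumption about how you might do that, not something a particular API requires: <code>get_with_backoff</code> and <code>FakeResponse</code> are hypothetical names, and the fake responses stand in for real <code>requests.get(...)</code> calls so the example runs without a network connection.

```python
import time


def get_with_backoff(fetch, max_tries=4, first_wait=1.0, sleep=time.sleep):
    """Call fetch() until it stops reporting HTTP 429, waiting longer each time.

    fetch is any function that returns an object with a .status_code
    attribute, for example: lambda: requests.get(url)
    """
    wait = first_wait
    for attempt in range(max_tries):
        response = fetch()
        if response.status_code != 429:
            return response
        if attempt < max_tries - 1:
            sleep(wait)  # pause before asking again
            wait *= 2    # double the pause for the next retry
    return response      # still rate limited after max_tries attempts


class FakeResponse:
    """Stand-in for a requests response, used only for this demonstration."""

    def __init__(self, status_code):
        self.status_code = status_code


# Pretend the API rejects the first two requests and accepts the third.
replies = iter([FakeResponse(429), FakeResponse(429), FakeResponse(200)])
waits = []  # record the pauses instead of actually sleeping
result = get_with_backoff(lambda: next(replies), sleep=waits.append)
print(result.status_code, waits)
```

Many APIs document their own limits and the headers they send when you exceed them, so check the documentation before settling on wait times.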
 
== Other Potentially Useful Resources ==
 
My friend Frances gave a version of this lecture last year and created slides. They are written for Python 2, so the code might not all work (remember, use <code>print()</code> with parentheses), but the basic ideas might be helpful:
 
* [http://mako.cc/teaching/2014/cdsw-autumn/lecture2-web_apis.pdf Slides (PDF)] — For viewing
* [http://mako.cc/teaching/2014/cdsw-autumn/lecture2-web_apis.odp Slides (ODP Libreoffice Slides Format)] — For editing and modification


[[Category:DS4UX (Spring 2016)]]

Revision as of 03:56, 26 March 2016

  • Database concepts (re-use slides)
  • Introduction to Wikipedia data
  • Introduction to MySQL and Quarry
  • Querying with Socrata SOQL API