Community Data Science Workshops (Spring 2015)/Day 2 Lecture

From CommunityData
Latest revision as of 00:15, 17 September 2015

In which you learn how to use Python and web APIs to meet the likes of these guys!

Lecture Outline

Introduction and context
  • You can write some tools in Python now. Congratulations!
  • Today we'll learn how to find/create data sets
  • Next week we'll get into data science (asking and answering questions)


Outline
  • What did we learn in Session 1?
  • What is an API?
  • How do we use one to fetch interesting datasets?
  • How do we write programs that use the internet?
  • How can we use the placebear API to fetch bear pictures?
  • Introduction to structured data (JSON)
  • How do we use APIs in general?


What is a (web) API?
  • API: a structured way for programs to talk to each other (aka an interface for programs)
  • Web APIs: like a website your programs can visit (a web API is to your program what a website is to you)


How do we use an API to fetch datasets?

Basic idea: your program sends a request, the API sends data back

  • Where do you direct your request? The site's API endpoint.
  • How do you write your request? Put together a URL; it will be different for different web APIs.
    • Check the documentation, look for code samples
  • How do you send a request?
    • Python has modules you can use, like requests (they make HTTP requests)
  • What do you get back?
    • Structured data (usually in the JSON format)
  • How do you understand (i.e. parse) the data?
    • The response that the requests module gives back can do that with its .json() method!
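
The request-and-response cycle above can be sketched in a few lines of Python. This is a minimal sketch, not code from the lecture: the endpoint https://api.example.com/search and its q and limit options are made up for illustration, and building the URL as endpoint?key=value is only one common pattern, so check each API's documentation for the real form.

```python
from urllib.parse import urlencode


def build_url(endpoint, options):
    """Put together a request URL: the endpoint, a "?", then key=value pairs."""
    return endpoint + "?" + urlencode(options)


def fetch(url):
    """Send the request and return the parsed JSON the API sends back."""
    # requests is a third-party module (pip install requests). It is imported
    # inside the function so build_url() can be tried without installing it.
    import requests

    response = requests.get(url)    # send an HTTP GET request
    response.raise_for_status()     # raise an error on a 4xx or 5xx reply
    return response.json()          # parse the JSON text into Python data


# Hypothetical endpoint and options, for illustration only.
url = build_url("https://api.example.com/search", {"q": "bears", "limit": 5})
print(url)
```

Calling fetch(url) against a real endpoint would hand back ordinary Python lists and dictionaries that the rest of a program can loop over.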


How do we write Python programs that make web requests?

To use APIs to build a dataset we will need:

  • all our tools from last session: variables, etc
  • the ability to open URLs on the web
  • the ability to create custom URLs
  • the ability to save to files
  • the ability to understand (i.e., parse) JSON data that APIs usually give us


Session 1 review
  • Navigating in the terminal and using it to run programs
  • Writing Python:
    • using variables to manipulate data
    • types of data: strings, integers, lists, dictionaries
    • if statements
    • for loops
    • printing
    • importing modules, so you can use code other people have written for you!
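
As a quick refresher, the Session 1 ideas fit together in a short program like the one below. It is only an illustrative sketch; the animal names are made-up sample data.

```python
import random                     # importing a module someone else wrote

animals = ["bear", "kitten", "bear", "owl"]   # a list of strings
counts = {}                                    # an empty dictionary

for animal in animals:                         # a for loop
    if animal in counts:                       # an if statement
        counts[animal] = counts[animal] + 1
    else:
        counts[animal] = 1

print(counts)                     # printing a dictionary
print(random.choice(animals))     # printing one animal picked at random
```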


New programming concepts
  • interpolate variables into a string using the format() method
  • the requests module (http://docs.python-requests.org/en/latest/)
  • open files and write to them
  • parsing a string (turning the string into a data structure we can manipulate) using the json module
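
Two of these concepts, string formatting and writing to files, can be tried without any network access. The sketch below is illustrative only: the file name, the message, and the save_text helper are hypothetical and do not come from the workshop code.

```python
# Positional fields: {0} is the first argument to format(), {1} the second.
filename = "{0}-{1}.txt".format("bear", 200)

# Named fields can make longer strings easier to read.
message = "A {animal} picture, {size} pixels wide".format(animal="bear", size=200)


def save_text(path, text):
    """Open the file at path for writing and store text in it."""
    # "w" creates the file, or replaces its contents if it already exists.
    with open(path, "w") as output_file:
        output_file.write(text)


print(filename)
print(message)
```

Calling save_text(filename, message) would then write the message to bear-200.txt in the current directory.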


How do we use an API to fetch pictures of bears?

placebear.com

  • API that takes specially crafted URLs and returns appropriately sized pictures of bears
  • Exploring placebear in a browser:
    • visit the API documentation
    • bears of different sizes
    • bears in greyscale or color
  • Now we write a small program that asks for a size on standard input and grabs a square picture of that size from placebear (placebear_input.py: http://mako.cc/teaching/2015/cdsw-spring/placebear_input.py)
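
A program along these lines might look like the sketch below. It is not the contents of placebear_input.py. It assumes placebear's URL pattern is http://placebear.com/WIDTH/HEIGHT, with /g/ added for greyscale, so confirm that against the site's documentation; the function names and the output file name are hypothetical.

```python
def placebear_url(size, grey=False):
    """Build the URL for a square bear picture, size pixels on each side."""
    # Assumed URL pattern: /WIDTH/HEIGHT, or /g/WIDTH/HEIGHT for greyscale.
    if grey:
        return "http://placebear.com/g/{0}/{0}".format(size)
    return "http://placebear.com/{0}/{0}".format(size)


def save_bear(size, filename):
    """Download a square bear picture and save it to filename."""
    # requests is third-party (pip install requests); imported here so that
    # placebear_url() can be tried without it.
    import requests

    response = requests.get(placebear_url(size))
    response.raise_for_status()
    # A picture is binary data, so open the file in "wb" mode and write
    # response.content (bytes) rather than response.text.
    with open(filename, "wb") as image_file:
        image_file.write(response.content)


def main():
    """Ask for a size on standard input, then fetch a bear of that size."""
    size = int(input("How many pixels wide should the bear be? "))
    save_bear(size, "bear-{0}.jpg".format(size))
```

Calling main() prompts for the size and needs network access; placebear_url() can be tried on its own without either.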


Introduction to structured data (JSON, JavaScript Object Notation)
  • what is JSON: a text format that is useful for more structured data
  • import json; json.loads(), or, even easier, do it directly with requests using the response's .json() method!
  • looks like Python lists and dictionaries (except that strings must use double quotes, not single quotes)
  • simple lists, dictionaries
  • can reflect more complicated data structures
  • Example file at http://mako.cc/cdsw.json
  • download it and parse it (e.g., with a program like parse_cdswjson.py: http://mako.cc/teaching/2015/cdsw-spring/parse_cdswjson.py)
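
To see how parsing turns JSON text into lists and dictionaries, here is a small sketch that uses only the standard json module. The JSON text in it is made up for illustration and is not the contents of cdsw.json; the is_valid_json helper is hypothetical.

```python
import json

# Made-up JSON text for illustration; it is not the contents of cdsw.json.
text = '{"workshop": "CDSW", "sessions": [1, 2, 3], "lecture": {"day": 2, "topic": "web APIs"}}'

data = json.loads(text)           # a string goes in, a dictionary comes out
print(data["workshop"])           # a simple value
print(data["sessions"][0])        # an item in a list
print(data["lecture"]["topic"])   # a dictionary nested inside a dictionary


def is_valid_json(candidate):
    """Return True if candidate parses as JSON, False otherwise."""
    try:
        json.loads(candidate)
    except ValueError:            # json.loads raises a ValueError subclass
        return False
    return True


print(is_valid_json('{"animal": "bear"}'))   # double-quoted strings
print(is_valid_json("{'animal': 'bear'}"))   # single-quoted strings
```

The last two lines print True and then False, which is the "no single quotes" rule in action.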


Using other APIs
  • every API is different, so read the documentation!
  • If the documentation isn't helpful, search online
  • for popular APIs, there are Python modules that help you make requests and parse JSON

Possible issues:

  • rate limiting
  • authentication
  • text encoding issues
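
One way these three issues can show up in code is sketched below, under stated assumptions. Status 429 and the Retry-After header are standard HTTP, but whether a particular API uses them is up to that API. The Authorization: Bearer header is a common scheme rather than a universal one, and forcing UTF-8 is only right when the API really sends UTF-8. The function names are hypothetical.

```python
import time


def seconds_to_wait(headers, default=5):
    """Work out how long to pause after a "too many requests" reply."""
    # Retry-After may hold a number of seconds or a date; this sketch only
    # understands the number form and otherwise falls back to the default.
    try:
        return int(headers.get("Retry-After", default))
    except ValueError:
        return default


def polite_get(url, api_key=None, tries=3):
    """Fetch url as text, pausing and retrying if the API says to slow down."""
    import requests   # third-party (pip install requests)

    headers = {}
    if api_key is not None:
        # Assumed authentication scheme; check the API's documentation.
        headers["Authorization"] = "Bearer {0}".format(api_key)

    for attempt in range(tries):
        response = requests.get(url, headers=headers)
        if response.status_code != 429:    # 429 means "Too Many Requests"
            break
        time.sleep(seconds_to_wait(response.headers))

    response.raise_for_status()
    response.encoding = "utf-8"   # assumption: the API sends UTF-8 text
    return response.text
```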

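Lecture Slides (From Fall 2014)
  • Slides (PDF), for viewing: http://mako.cc/teaching/2014/cdsw-autumn/lecture2-web_apis.pdf
  • Slides (ODP, LibreOffice slides format), for editing and modification: http://mako.cc/teaching/2014/cdsw-autumn/lecture2-web_apis.odp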