Community Data Science Workshops (Spring 2015)/Day 2 lecture
Lecture Outline
- Introduction to APIs
- definition of API: just an interface for programs
- definition of web API
- way to ask for data (almost always a URL)
- way to get data back (almost always in a format called JSON)
- every API is different, and each one has its own documentation
- to use APIs to build a dataset we will need:
- all our tools from last session: variables, etc
- the ability to open urls on the web
- the ability to create custom URLs
- the ability to save to files
- the ability to understand (i.e., parse) JSON data that APIs usually give us
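The ingredients listed above fit together roughly as follows. This is only a minimal sketch in Python 3, where `urllib.request` plays the role of Python 2's `urllib2`. The endpoint `http://api.example.com/search`, its parameters, and the sample response are all made up for illustration; a real API will have its own URLs and its own response layout.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(base, params):
    """Create a custom URL by appending query parameters to a base address."""
    return base + "?" + urlencode(params)


def fetch_json(url):
    """Open a URL and parse the JSON text that comes back."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def save_json(data, filename):
    """Write parsed data back out to a file as JSON."""
    with open(filename, "w") as output_file:
        json.dump(data, output_file)


# Hypothetical endpoint and parameters, for illustration only.
url = build_url("http://api.example.com/search", {"q": "kittens", "limit": 5})
print(url)

# Parsing works the same on any JSON text, whether or not it came from the web.
sample = '{"results": [{"name": "Tom"}, {"name": "Felix"}]}'
for item in json.loads(sample)["results"]:
    print(item["name"])
```

With a real API, `fetch_json(url)` followed by `save_json(data, "results.json")` would cover the "open", "parse", and "save" steps; neither is called here because the endpoint does not exist.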
- Review material from last session
- variables
- lists
- dictionaries
- if statements
- for loops
- printing
- modules
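As a quick refresher, the review items above can all appear in one short program. The variable names and values below are invented for this sketch and are not taken from the earlier session's materials.

```python
import math  # a module: code someone else wrote that we can reuse

# Variables hold values.
workshop = "CDSW"
session = 2

# Lists hold ordered items; dictionaries map keys to values.
topics = ["variables", "lists", "dictionaries"]
lengths = {}

# A for loop visits each item; an if statement chooses what to do with it.
for topic in topics:
    lengths[topic] = len(topic)
    if len(topic) > 5:
        print(topic, "is a long word")
    else:
        print(topic, "is a short word")

# Printing shows values on the screen.
print(workshop, "session", session)
print(math.sqrt(16))
```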
- New programming concepts:
- urllib2 and urlopen
- interpolate variables into a string using % and %()s
- open files and write to them
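String interpolation and file writing might look like the sketch below. It assumes that placekitten URLs take the form `/width/height`, and it writes to a file named `cdsw_urls.txt` in the system's temporary directory, a name chosen only for this example. Opening the URL itself is left out here; in Python 3 that job belongs to `urllib.request.urlopen`, the successor to `urllib2.urlopen`.

```python
import os
import tempfile

width = 200
height = 300

# % with a tuple fills in the placeholders in order.
by_position = "http://placekitten.com/%s/%s" % (width, height)

# %(name)s with a dictionary fills in the placeholders by name.
by_name = "http://placekitten.com/%(w)s/%(h)s" % {"w": width, "h": height}

print(by_position)
print(by_name)

# Open a file for writing ("w"), write one line, and close it automatically.
filename = os.path.join(tempfile.gettempdir(), "cdsw_urls.txt")
with open(filename, "w") as output_file:
    output_file.write(by_position + "\n")

# Read the file back to confirm what was written.
with open(filename) as input_file:
    contents = input_file.read()
print(contents)
```

The named form is easier to read when a string has many placeholders or uses the same value more than once.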
- placekitten.com
- API that takes specially crafted URLs and returns appropriately sized pictures of kittens
- example of placekitten in browser
- visit the API documentation
- kittens of different sizes
- kittens in greyscale or color
- show how to use placekitten
- write a small program that asks for a size on standard input and grabs a square kitten picture of that size from placekitten
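One possible shape for that small program is sketched below in Python 3. It assumes the URL patterns `/width/height` for color and `/g/width/height` for greyscale, and it guesses that the picture can be saved with a `.jpg` extension; check the placekitten documentation before relying on either. The function names are invented for this example.

```python
import sys
from urllib.request import urlopen


def kitten_url(size, grey=False):
    """Build the URL for a square kitten picture, size pixels on each side."""
    if grey:
        return "http://placekitten.com/g/%d/%d" % (size, size)
    return "http://placekitten.com/%d/%d" % (size, size)


def read_size(stream):
    """Read one line from a stream (such as standard input) as a whole number."""
    return int(stream.readline().strip())


def save_kitten(size, filename):
    """Download the picture and write its bytes to a file."""
    with urlopen(kitten_url(size)) as response:
        image_bytes = response.read()
    # "wb" because an image is binary data, not text.
    with open(filename, "wb") as output_file:
        output_file.write(image_bytes)


def main():
    print("How many pixels wide should the kitten be?")
    size = read_size(sys.stdin)
    save_kitten(size, "kitten_%d.jpg" % size)
```

Adding a call to `main()` at the bottom of the file runs the program. It needs network access and only works while the site is reachable.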
- JSON file (JavaScript Object Notation)
- what is JSON: useful for more structured data
- import json; json.loads()
- looks like Python lists and dictionaries (except that strings must use double quotes, not single quotes)
- simple lists, dictionaries
- can reflect more complicated data structures
- Example file at http://mako.cc/cdsw.json
- download it and parse it
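Parsing JSON with `json.loads()` can be tried without downloading anything. The JSON text below is made up for illustration and is not the contents of the example file; the same calls would apply to text read from a downloaded file.

```python
import json

# Made-up JSON text for illustration; not the contents of any real file.
text = '{"name": "Mittens", "age": 3, "toys": ["ball", "string"], "owner": {"name": "Ada"}}'

# json.loads turns JSON text into Python dictionaries, lists, strings, and numbers.
data = json.loads(text)
print(data["name"])
print(data["toys"][0])
print(data["owner"]["name"])

# Single-quoted strings are valid in Python but not in JSON.
try:
    json.loads("{'name': 'Mittens'}")
    single_quotes_ok = True
except ValueError:
    single_quotes_ok = False
print(single_quotes_ok)

# json.dumps goes the other way: Python data to JSON text.
print(json.dumps(["ball", "string"]))
```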
- Other APIs
- every API is different, so read the documentation!
- for popular APIs, there are Python modules that help you make requests and parse JSON!
- rate limiting
- authentication
- text encoding issues
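Two of these issues, rate limiting and text encoding, can be illustrated with a small sketch. It assumes a fixed pause between requests is acceptable and that responses are encoded as UTF-8; real APIs document their own limits and encodings. The `fake_fetch` function is a stand-in invented so the example runs offline. Authentication is not shown because it differs from one API to the next.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=1.0):
    """Call fetch(url) for each URL, pausing between requests."""
    pages = []
    for position, url in enumerate(urls):
        if position > 0:
            time.sleep(delay_seconds)  # wait before every request after the first
        raw_bytes = fetch(url)
        # Data arrives as bytes; decode it into text. UTF-8 is an assumption here.
        pages.append(raw_bytes.decode("utf-8"))
    return pages


def fake_fetch(url):
    """Stand-in for a real download so the example runs offline."""
    return ("caf\u00e9 from " + url).encode("utf-8")


pages = fetch_politely(["http://example.com/a", "http://example.com/b"],
                       fake_fetch, delay_seconds=0.1)
print(pages)
```

Decoding with the wrong encoding is a common source of garbled characters or errors, so it is worth checking what encoding an API says it returns.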