Community Data Science Course (Spring 2023)/Week 4 lecture notes
From CommunityData
== Using APIs to download data from the internet ==

An '''API (Application Programmer Interface)''' is a structured way for two programs to communicate. Think of it like a contract or a secret handshake. APIs exist both on the Internet and within Python; in a sense, we've already been using some APIs in Python.

An interface typically has two parts:
* A description of ''how to request something''
* A description of ''what one will get in return''

Once you understand those two things, you know the API.

An API ''within Python'' typically includes a set of functions. A web API is a lot like a set of Python functions, but it describes how a program running on your computer can talk to another computer running a website. Basically, it's like a website your programs can visit (you:a website::your program:a web API).

Examples:
* The API for Twitter describes how to read tweets, write tweets, and follow people. See details here: https://dev.twitter.com/
* Yelp has an API described here: https://www.yelp.com/developers
* Zillow's API: https://www.zillow.com/howto/api/APIOverview.htm

=== Questions to consider when choosing an API ===

# Where is the documentation? Are there examples or code samples?
# Are there any rate limits or restrictions on use? For instance, Twitter restricts how many tweets you can download, and Zillow forbids storing bulk results. (Why?)
# Is there a Python package that will help me? For instance, Twitter has a great Python package called tweepy that simplifies access.
# All the things on the checklist below!

=== Checklist: How do we use an API to fetch datasets? ===

Basic idea: your program sends a request, and the API sends data back.
* Where do you direct your request? (i.e., what are the site's API ''endpoints'')
** For example: Wikipedia's web API endpoint is http://en.wikipedia.org/w/api.php
* How do you write your request? Put together a URL; the format differs between web APIs.
** Check the documentation and look for code samples.
* How do you send a request?
** Often the simplest way is to try it in your browser.
** Python has modules you can use to make HTTP requests; the <code>requests</code> library has excellent documentation [http://docs.python-requests.org/en/latest/api/ here].
* What do you get back?
** Structured data (usually in the JSON format).
*** JSON is ''JavaScript Object Notation''. JSON data looks like Python lists and dictionaries, and we'll see that it's easy to turn it into a Python variable that is a list or dictionary.
* How do you understand (i.e., parse) the data?
** Some browsers, like Firefox, will display it in a readable form automatically.
** We can pretty-print it with https://jsonformatter.curiousconcept.com/
** When it's time to do it in Python, we can call the <code>.json()</code> method on the response object that <code>requests</code> gives back!
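The "write your request" and "parse the data" steps of the checklist can be sketched in Python without any network access. This is a minimal sketch: the parameter names below follow MediaWiki API conventions for Wikipedia's endpoint, and the JSON payload is a hand-made sample shaped like an API response, not real output from the API.

```python
import json
from urllib.parse import urlencode

# Wikipedia's web API endpoint, as named in the checklist above
ENDPOINT = "http://en.wikipedia.org/w/api.php"

# "How do you write your request?" -- assemble a URL from the endpoint
# plus parameters (these follow MediaWiki API conventions).
params = {"action": "query", "titles": "Seattle", "format": "json"}
url = ENDPOINT + "?" + urlencode(params)
print(url)
# http://en.wikipedia.org/w/api.php?action=query&titles=Seattle&format=json

# "What do you get back?" -- JSON text. This payload is a made-up
# sample shaped like an API response, not real output.
json_text = '{"query": {"pages": {"5391": {"title": "Seattle"}}}}'

# json.loads() (like the .json() method on a requests response)
# turns JSON text into nested Python dictionaries and lists.
data = json.loads(json_text)
print(data["query"]["pages"]["5391"]["title"])  # Seattle
```

Sending the assembled URL to the real endpoint (in a browser, or with the <code>requests</code> library) would return JSON with this same nested-dictionary shape.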