Community Data Science Course (Spring 2023)/Week 6 lecture notes

Goals
Three goals for today's lecture:


 * 1) Talk about projects
 * 2) Walk through example code that grabs data from the MediaWiki API and introduces a small number of new concepts
 * 3) Walk through example code that grabs data from the Yelp API (and uses a module and authentication)

Final Projects
Your next major milestone is May 15, when your Final Project Proposal is due. I'm hoping that these proposals will be a milestone at which everybody has: (a) a clear description of your questions, (b) a clear sense of how you are going to get data to answer these questions (and maybe even the data itself), and (c) confidence that your project will be doable.

A few points to talk through in class:


 * What are the components of a successful project proposal? (e.g., text! dummy figures, etc.)
 * A strong sense of whether your work is going to be doable.
 * Class assignments will continue to shift toward project work.

Wikipedia Edit Data from the API
Walk through some code and introduce some new concepts:


 * MediaWiki: The software that runs many wikis including basically every website on https://fandom.com
 * MediaWiki API with documentation in various places
 * Walk through some example code that I've written in these notebooks:

This introduces a few new concepts:


 * continuations (i.e., what do you do when you don't know how much data you have before you start?)
 * loops
 * updating your parameters to "get the next chunk"
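
The continuation pattern above can be sketched roughly as follows. This is a minimal illustration and not the code from the class notebooks: the function names `fetch_json` and `get_all_revisions`, the choice of English Wikipedia as the endpoint, the User-Agent string, and the particular `rvprop` and `rvlimit` values are all assumptions made for the example. The key idea is the loop: ask for a chunk, save it, and if the response contains a `continue` entry, merge it into the parameters and ask again.

```python
import json
import urllib.parse
import urllib.request

# Assumption for this sketch: we query English Wikipedia's MediaWiki API.
API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_json(params):
    """Send one GET request to the API and return the decoded JSON."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    # Hypothetical User-Agent string; identify your own script here.
    request = urllib.request.Request(
        url, headers={"User-Agent": "cdsc-lecture-example/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def get_all_revisions(title, fetch=fetch_json):
    """Collect every revision of a page, one chunk at a time."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp|user",
        "rvlimit": 50,
        "format": "json",
    }
    revisions = []
    while True:
        result = fetch(params)
        # Pages come back as a dictionary keyed by page ID.
        for page in result.get("query", {}).get("pages", {}).values():
            revisions.extend(page.get("revisions", []))
        # No "continue" entry means we have reached the last chunk.
        if "continue" not in result:
            break
        # Update our parameters to "get the next chunk".
        params.update(result["continue"])
    return revisions
```

Calling `get_all_revisions("Seattle")` would make real requests over the network, so the number of loop iterations depends on how many revisions the page has. The `fetch` argument is there only so the loop can be tried out with a stand-in function instead of the live API.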

Yelp API
I also want to walk through an example of an API that is both (a) authenticated and (b) accessed through a Python module:


 * Finding new Python modules
 * Installing new Python modules with

The Yelp API is authenticated. Authentication can come in one of several forms including:


 * keys that are embedded into your normal parameters (like )
 * OAuth authentication, bearer tokens, and so on...

Yelp is the latter kind, as is typically any API that lets you post and/or interact in ways that are non-passive.
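
To make the difference between the two forms concrete, here is a small sketch that builds (but does not send) one request of each kind. The helper names, the `api_key` parameter name, the `example.com` URL, and the placeholder key are hypothetical. The Yelp search URL and the `Authorization: Bearer ...` header reflect how I understand the Yelp Fusion API to work, so check Yelp's own documentation before relying on them.

```python
import urllib.parse
import urllib.request


def request_with_key_in_params(base_url, params, api_key, key_name="api_key"):
    """Form 1: the key travels in the URL alongside your normal parameters."""
    all_params = dict(params)
    all_params[key_name] = api_key  # key_name is a hypothetical parameter name
    return urllib.request.Request(
        base_url + "?" + urllib.parse.urlencode(all_params)
    )


def request_with_bearer_token(base_url, params, api_key):
    """Form 2: the key travels in an HTTP header, not in the URL."""
    request = urllib.request.Request(
        base_url + "?" + urllib.parse.urlencode(params)
    )
    request.add_header("Authorization", "Bearer " + api_key)
    return request


# Placeholder key; a real one comes from the app you create at Yelp.
yelp_request = request_with_bearer_token(
    "https://api.yelp.com/v3/businesses/search",
    {"term": "coffee", "location": "Seattle"},
    "YOUR_API_KEY",
)
```

Nothing is sent over the network here. Passing `yelp_request` to `urllib.request.urlopen` would actually send it, and that would only succeed with a valid key.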

That means you need to sign up for an API key. To do so at Yelp (and many other places) requires:


 * creating an app ("wait... I'm creating an app?!")
 * My app: https://www.yelp.com/developers/v3/manage_app

Some things to keep in mind include:


 * Keeping your API keys outside of your notebook:
 * in a JSON file in your directory!
 * e.g., in a separate Python module
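
One way to keep a key in a JSON file might look like the sketch below. The file name `yelp_credentials.json`, the field name `api_key`, and the function name `load_api_key` are all made up for illustration. The file would sit next to your notebook and contain something like `{"api_key": "..."}`.

```python
import json


def load_api_key(path="yelp_credentials.json", field="api_key"):
    """Read an API key from a small JSON file kept outside the notebook."""
    with open(path) as f:
        credentials = json.load(f)
    return credentials[field]
```

In a notebook you would then write `api_key = load_api_key()` instead of pasting the key itself into a cell, which means you can share the notebook without sharing the key.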

Now let's end by walking through two examples:


 * Yelp example notebook #1 (direct nonmodule version)
 * Yelp example notebook #2 (version using the yelpapi module)