Community Data Science Course (Spring 2023)/Week 5 lecture notes

From CommunityData
* Defining functions
* <code>import json</code> and <code>json.loads()</code> and <code>json.dumps()</code>
* Reading ''from'' files
* Breaking projects into multiple notebooks and steps
* Waiting... <code>time.sleep(1)</code>


== Stage 0: Coming up with a plan ==


I want to download page view data for three universities and present the total for each.

I'm going to split the work into three steps:

* collect the data from the web and write the raw JSON "payload" to a file
* read the data from the file and do whatever data extraction, cleaning, counting, etc. is needed; then write a TSV file
* open the TSV file and make a graph
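
As a rough sketch of the hand-off between the second and third steps, the code below writes a small TSV file and reads it back using Python's standard <code>csv</code> module. The function names, the column names (<code>page</code>, <code>views</code>), and the file name in the usage note are all made up for illustration; they are not taken from the course notebooks.

```python
import csv


def write_tsv(rows, filename):
    """Write (page, views) pairs to a tab-separated file with a header row."""
    with open(filename, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f, delimiter='\t')
        writer.writerow(['page', 'views'])  # hypothetical column names
        writer.writerows(rows)


def read_tsv(filename):
    """Read the file back into a list of (page, views) tuples."""
    with open(filename, 'r', newline='', encoding='utf-8') as f:
        reader = csv.DictReader(f, delimiter='\t')
        # Everything in a TSV file is text, so convert views back to int.
        return [(row['page'], int(row['views'])) for row in reader]
```

For example, <code>write_tsv([('Page A', 10)], 'totals.tsv')</code> in one notebook and <code>read_tsv('totals.tsv')</code> in the next would pass the counts along without rerunning the earlier steps.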


== Stage 1: Getting data ==


Between that and the interactive material in [https://www.mediawiki.org/wiki/Wikimedia_REST_API Wikimedia REST API], I was able to construct a URL.
We will '''build up something like file 1, version 1''':
* setting the header
* <code>json.dumps()</code> [mention that I'll skip this until we have an error]
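
A minimal sketch of what a "file 1" along these lines might look like is below. It is not the version built in class. The URL pattern is my assumption about the per-article pageviews endpoint, the <code>User-Agent</code> value is a placeholder, and the function names are hypothetical. It uses the standard library's <code>urllib</code> so that it has no outside dependencies.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the per-article pageviews endpoint follows this pattern.
BASE_URL = 'https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article'

# Placeholder header: identify yourself to the API with your own contact info.
HEADERS = {'User-Agent': 'week5-example (your_email@example.com)'}


def build_url(article, start, end, project='en.wikipedia'):
    """Construct a daily pageviews URL; start and end look like 'YYYYMMDD'."""
    title = urllib.parse.quote(article.replace(' ', '_'), safe='')
    return f'{BASE_URL}/{project}/all-access/user/{title}/daily/{start}/{end}'


def fetch_payload(url):
    """Request the URL with our header set and parse the JSON response."""
    request = urllib.request.Request(url, headers=HEADERS)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode('utf-8'))


def save_payload(payload, filename):
    """Write the raw payload to a file; json.dumps() turns it into a string."""
    with open(filename, 'w', encoding='utf-8') as f:
        f.write(json.dumps(payload))
```

Writing a dictionary straight to a file raises a <code>TypeError</code> because <code>f.write()</code> expects a string, which is the kind of error that <code>json.dumps()</code> resolves.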
== Stage 2: Reading in data ==
Walk through building '''file 2, version 1''' with a focus on:
* opening files with <code>open(filename, 'r')</code>
* <code>f.read()</code> which reads the whole file in
* <code>json.loads()</code>
* outputting days and views
* try to graph... we'll have an error when we try to graph
* write some new code to create better formatted date strings...
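
The steps above can be sketched roughly as follows. This assumes the saved payload has an <code>items</code> list whose entries carry a <code>timestamp</code> string of the form <code>YYYYMMDDHH</code> and a <code>views</code> count; that structure and the function names are assumptions for illustration, and the date reformatting is only one possible way to get friendlier date strings.

```python
import json


def load_payload(filename):
    """Read the whole file in with f.read() and parse it with json.loads()."""
    with open(filename, 'r', encoding='utf-8') as f:
        raw_text = f.read()
    return json.loads(raw_text)


def format_date(timestamp):
    """Turn an assumed 'YYYYMMDDHH' timestamp into 'YYYY-MM-DD'."""
    return f'{timestamp[0:4]}-{timestamp[4:6]}-{timestamp[6:8]}'


def days_and_views(payload):
    """Return a list of (date string, views) pairs, one per day."""
    return [(format_date(item['timestamp']), item['views'])
            for item in payload['items']]
```

Looping over <code>days_and_views(load_payload(filename))</code> and printing each pair gives the days-and-views output in a form that is easier to read and to plot.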
== Stages 3 and 4: let's extend to multiple things ==
* let's build a couple of functions: maybe one for dates, and maybe one for getting pageview data? Then let's refactor the old code to use them.
* let's build in waiting for a second with <code>time.sleep(1)</code>
* let's count with a dictionary
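
One way these pieces might fit together is sketched below: a small function that sums views, and a loop that pauses between pages and keeps the totals in a dictionary. The names are hypothetical, and <code>get_payload</code> stands in for whatever function you wrote to fetch or load the data for one page. The payload structure (<code>items</code> with <code>views</code>) is assumed.

```python
import time


def total_views(payload):
    """Add up the views across every item in one payload."""
    return sum(item['views'] for item in payload['items'])


def count_views(pages, get_payload, delay=1):
    """Build a dictionary mapping each page name to its total views.

    get_payload is any function that takes a page name and returns a payload.
    """
    totals = {}
    for page in pages:
        payload = get_payload(page)
        # .get(page, 0) starts a page at zero the first time we see it.
        totals[page] = totals.get(page, 0) + total_views(payload)
        time.sleep(delay)  # wait between requests to be polite to the server
    return totals
```

Passing the fetching function in as an argument keeps the counting logic separate from the download step, so it can be tried out on saved files before making any real requests.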