Editing Harry Potter on Wikipedia
From CommunityData
Line 7:
=== Download and test the HPWP project ===
# Right-click the following file, click "Save Target as..." or "Save link as...", and save it to your Desktop directory: http://mako.cc/teaching/2015/cdsw-spring/harrypotter-wikipedia-cdsw.zip
# The ".zip" extension on the above file indicates that it is a compressed Zip archive. We need to "extract" its contents.
# Start up your terminal and navigate to the new directory you unpacked from <code>harrypotter-wikipedia-cdsw.zip</code>.
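If your system has no built-in "extract" option, the unpacking step above can also be done from Python. This is only a minimal sketch using the standard <code>zipfile</code> module; the function name <code>extract_archive</code> and the destination directory are assumptions made for illustration, not part of the HPWP project.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the sorted member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())


if __name__ == "__main__":
    # Assumes the archive was saved to the Desktop, as in the step above.
    desktop = Path.home() / "Desktop"
    for name in extract_archive(desktop / "harrypotter-wikipedia-cdsw.zip", desktop):
        print(name)
```

Printing the member names shows which directory the archive created, which is the one to navigate to in your terminal.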
Line 16:
# Run the program <code>build_hpwp_dataset.py</code>, which will download the data from the Wikipedia API. This will take 10 minutes or so.
# Alternatively, you can download a "pre-made" version that I generated on my computer by using the right-click, "Save link as..." approach for this URL: http://communitydata.cc/~mako/hp_wiki.tsv ([http://communitydata.cc/~mako/hp_wiki.csv original CSV version used in class])
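Once you have the dataset, a quick way to check it is to load it and look at its columns. The sketch below assumes the file is saved as <code>hp_wiki.tsv</code> in the current directory and that its first row is a header; the helper name <code>read_tsv</code> is hypothetical, and the actual column names are whatever the file contains.

```python
import csv


def read_tsv(fileobj):
    """Read a tab-separated file with a header row.

    Returns (column_names, rows), where each row is a dict keyed by column name.
    """
    reader = csv.DictReader(fileobj, delimiter="\t")
    rows = list(reader)
    return reader.fieldnames, rows


if __name__ == "__main__":
    # Assumption: the dataset sits in the current directory under this name.
    with open("hp_wiki.tsv", newline="", encoding="utf-8") as f:
        columns, rows = read_tsv(f)
    print(columns)
    print(len(rows), "rows")
```

Using <code>csv.DictReader</code> with <code>delimiter="\t"</code> keeps the code independent of column order, at the cost of holding every row in memory.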
=== Test Code ===
Line 42:
# Instead of "binning" your dataset by day, bin it by month for each of the two previous questions.
# Pick a different topic in Wikipedia and download a new dataset. Answer the questions above for this other dataset.
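The day-versus-month binning exercise above can be sketched as follows. It assumes each edit has an ISO 8601 timestamp string such as <code>2015-04-11T10:20:30Z</code>, which is the form the Wikipedia API uses for revision timestamps; whether the dataset stores timestamps in exactly that form is an assumption, and the function names are hypothetical.

```python
from collections import Counter


def bin_key(timestamp, granularity="month"):
    """Return the "YYYY-MM-DD" (day) or "YYYY-MM" (month) prefix of a timestamp."""
    prefix_length = {"day": 10, "month": 7}
    return timestamp[:prefix_length[granularity]]


def count_by_bin(timestamps, granularity="month"):
    """Count how many timestamps fall into each day or month bin."""
    return Counter(bin_key(ts, granularity) for ts in timestamps)


if __name__ == "__main__":
    # Made-up timestamps for illustration only.
    sample = [
        "2015-04-11T10:20:30Z",
        "2015-04-12T08:00:00Z",
        "2015-05-01T00:00:01Z",
    ]
    for month, edits in sorted(count_by_bin(sample, "month").items()):
        print(month, edits)
```

Switching from day bins to month bins only changes how much of the timestamp is kept as the key, so the same counting code answers both versions of each question.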