Community Data Science Course (Spring 2016)/Day 6 Coding Challenges
From CommunityData
There is only 1 question for this week. I expect it to take 2 to 6 hours.
How many people have edited more than one unique page in the category "Category:Cities_in_Washington_(state)"?
How many people have edited only one page?
Please treat each IP address as a separate user.
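To make the counting rule concrete, here is a minimal sketch on made-up data. The (user, page) pairs and the names toy_edits and count_editors are hypothetical and are not the output of any Wikipedia call. The sketch only illustrates that repeated edits to the same page count once, and that an IP address is counted like any other user name.

```python
from collections import defaultdict


def count_editors(edits):
    """Return (users who edited more than one unique page, users who edited exactly one)."""
    pages_by_user = defaultdict(set)  # user name -> set of unique page titles
    for user, page in edits:
        pages_by_user[user].add(page)  # a set ignores repeat edits to the same page
    multi = sum(1 for pages in pages_by_user.values() if len(pages) > 1)
    single = sum(1 for pages in pages_by_user.values() if len(pages) == 1)
    return multi, single


# Hypothetical (user, page) pairs, invented for illustration only.
toy_edits = [
    ("Alice", "Seattle"),
    ("Alice", "Seattle"),  # second edit to the same page, still one unique page
    ("Alice", "Tacoma"),
    ("Bob", "Spokane"),
    ("192.0.2.1", "Seattle"),  # each IP address is its own user
    ("192.0.2.7", "Seattle"),
    ("192.0.2.7", "Olympia, Washington"),
]

multi, single = count_editors(toy_edits)
print(multi, single)
```

How you collect the real (user, page) pairs from Wikipedia is the main work of the challenge and is not shown here.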
Hints
Many of you will want to save the output of some call to Wikipedia to a file using open("my_output.tsv", "w"). You can read the file back into Python using the code below. We will cover this in more detail on Wednesday.
file_handle = open("my_output.tsv", "r")  # the "r" means you are opening the file to read from it, not to write to it. Be careful about the difference!
for line in file_handle:
    line_clean = line.strip()  # remove whitespace, including the newline character, from both ends of the line
    line_parts = line_clean.split('\t')  # make a list by splitting the string on tab characters
    print(len(line_parts))  # print the number of tab-separated fields on each line
file_handle.close()  # close the file once you are done reading
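The writing half of the hint might look like the following. This is only a sketch under assumptions: the rows are made up, the file name example_output.tsv is arbitrary, and write_rows and read_rows are hypothetical helper names. It writes one tab-separated line per row using open(..., "w") and then reads the lines back in the same way as the loop above.

```python
def write_rows(filename, rows):
    """Write each row (a list of strings) as one tab-separated line."""
    # "w" creates the file, or erases its contents first if it already exists.
    with open(filename, "w", encoding="utf-8") as output_file:
        for row in rows:
            output_file.write("\t".join(row) + "\n")


def read_rows(filename):
    """Read a tab-separated file back into a list of lists of strings."""
    rows = []
    with open(filename, "r", encoding="utf-8") as input_file:
        for line in input_file:
            # Remove only the trailing newline, then split on tab characters.
            rows.append(line.rstrip("\n").split("\t"))
    return rows


# Hypothetical rows, invented for illustration only.
example_rows = [["Seattle", "Alice"], ["Tacoma", "192.0.2.7"]]

write_rows("example_output.tsv", example_rows)
print(read_rows("example_output.tsv"))
```

The with statement closes the file automatically when the block ends, so no explicit close() call is needed.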