Community Data Science Course (Spring 2016)/Day 6 Coding Challenges
There is only 1 question for this week. I expect it to take 2 to 6 hours.
How many people have edited more than one unique page in the category "Category:Cities_in_Washington_(state)"?
How many people have edited only one page?
Please treat IP addresses as separate users.
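To make the counting rule concrete, here is a minimal sketch on made-up data. The `edits` list and the `count_editors` function are hypothetical names used only for illustration. The sketch assumes you have already collected (user, page title) pairs; getting them from Wikipedia is the real work of the challenge and is not shown.

```python
from collections import defaultdict

# Hypothetical (user, page title) pairs; real data would come from Wikipedia.
edits = [
    ("Alice", "Seattle"),
    ("Alice", "Tacoma"),
    ("Alice", "Seattle"),      # a repeat edit to the same page
    ("Bob", "Spokane"),
    ("192.0.2.1", "Seattle"),  # an IP address is treated as its own user
]


def count_editors(edits):
    """Return (users who edited 2+ unique pages, users who edited exactly 1)."""
    pages_by_user = defaultdict(set)
    for user, page in edits:
        # A set ignores repeat edits to the same page.
        pages_by_user[user].add(page)
    multi = sum(1 for pages in pages_by_user.values() if len(pages) > 1)
    single = sum(1 for pages in pages_by_user.values() if len(pages) == 1)
    return multi, single


multi_page, single_page = count_editors(edits)
print(multi_page, single_page)
```

Using a set per user is one reasonable choice here because it keeps each page title only once, so a user who edits the same page many times still counts as editing one unique page.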
Many of you will want to save the output of a call to Wikipedia to a file using open("my_output.tsv", "w"). You can read the file back into Python using the code below. We will cover this in more detail on Wednesday.
file_handle = open("my_output.tsv", "r")  # the "r" means you are opening the file to read from it, not to write to it. Be careful about the difference!
for line in file_handle:
    line_clean = line.strip()  # remove the newline char at the end of the line.
    line_parts = line_clean.split('\t')  # make a list by splitting the string on tab chars.
    print(len(line_parts))  # print the number of tab-separated fields on each line.
file_handle.close()  # close the file when you are done reading.
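For the writing half, a minimal sketch might look like the following. The `edits` list and the `write_tsv` function are hypothetical and stand in for whatever you collect from Wikipedia. It assumes every field is already a string and contains no tab or newline characters.

```python
# Hypothetical (user, page title) pairs standing in for real results.
edits = [
    ("Alice", "Seattle"),
    ("Alice", "Tacoma"),
    ("192.0.2.1", "Spokane"),
]


def write_tsv(rows, filename):
    """Write each row as one line, with fields separated by tab chars."""
    # The "w" mode creates the file, or overwrites it if it already exists.
    # utf-8 is chosen because usernames and titles may contain non-ASCII text.
    with open(filename, "w", encoding="utf-8") as output_file:
        for row in rows:
            output_file.write("\t".join(row) + "\n")


write_tsv(edits, "my_output.tsv")
```

The `with` statement closes the file automatically when the block ends, which makes sure everything is actually written to disk before you try to read the file back.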