Community Data Science Course (Spring 2016)/Day 6 Coding Challenges
Latest revision as of 00:50, 7 May 2016
There is only 1 question for this week. I expect it to take 2 to 6 hours.
How many people have edited more than one unique page in the category "Category:Cities_in_Washington_(state)"?
How many people have edited only one page?
Please treat IP addresses as separate users.
Hints
Many of you will want to save the output of a call to Wikipedia to a file using open("file.tsv", "w"). You can read the file back into Python using the code below. We will cover this in more detail on Wednesday.
file_handle = open("my_output.tsv", "r")  # the "r" means you are opening the file to read from it, not to write to it. Be careful about the difference!
for line in file_handle:
    line_clean = line.strip()  # remove the newline char (and any other surrounding whitespace) at the end of the line.
    line_parts = line_clean.split('\t')  # make a list by splitting the string on tab chars.
    print(len(line_parts))  # print the number of tab-separated fields on each line.
file_handle.close()  # close the file once you are done reading.
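The write side of this hint, and one possible way to count editors once the data is in a file, might look roughly like the sketch below. It does not call Wikipedia at all: SAMPLE_EDITS is made-up data, and the two-column layout (page title, then user name), the function names write_edits and count_editors, and the temporary file location are all assumptions for illustration, not part of the assignment.

```python
import os
import tempfile
from collections import defaultdict

# Made-up (page title, user) pairs standing in for real revision data.
SAMPLE_EDITS = [
    ("Seattle", "Alice"),
    ("Seattle", "Alice"),      # a repeat edit to the same page
    ("Tacoma", "Alice"),
    ("Spokane", "Bob"),
    ("Seattle", "192.0.2.1"),  # each IP address is treated as its own user
    ("Tacoma", "192.0.2.7"),
]


def write_edits(path, edits):
    """Write one tab-separated line per edit: page title, then user."""
    with open(path, "w") as file_handle:  # "w" opens the file for writing
        for page, user in edits:
            file_handle.write(page + "\t" + user + "\n")


def count_editors(path):
    """Return (users with more than one unique page, users with exactly one)."""
    pages_by_user = defaultdict(set)  # user -> set of unique page titles
    with open(path, "r") as file_handle:
        for line in file_handle:
            line_parts = line.rstrip("\n").split("\t")
            if len(line_parts) != 2:
                continue  # skip blank or malformed lines
            page, user = line_parts
            pages_by_user[user].add(page)
    multi = sum(1 for pages in pages_by_user.values() if len(pages) > 1)
    single = sum(1 for pages in pages_by_user.values() if len(pages) == 1)
    return multi, single


# Write the sample data to a file in a fresh temporary directory, then count.
path = os.path.join(tempfile.mkdtemp(), "my_output.tsv")
write_edits(path, SAMPLE_EDITS)
multi, single = count_editors(path)
print(multi, single)
```

Storing each user's pages in a set means repeat edits to the same page are only counted once, which is what "unique page" asks for.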