==== Code Example ====
<syntaxhighlight lang="python">
#!/usr/bin/env python3

##########################################################################
## This script runs the ORES scorer against revision ids by assembling many examples of the following shell command:
## echo -e '{"rev_id": 456789}\n{"rev_id": 3242342}\n{"rev_id": 618882377}' | ores score_revisions https://ores.wikimedia.org enwiki wp10 > thatfile.txt
##
## Inspired by the documentation located here: https://www.mediawiki.org/wiki/ORES
##
## Assumptions:
##
## This script assumes a tab-delimited file with a header, and that one element of that header is 'revid' -- a valid wikipedia revision id
##
##########################################################################
## Warnings:
##
## (A) You will need to edit the command line to reflect the wiki whose revisions you want to score -- enwiki, frwiki, etc. See comment marked (A).
##
## (B) The code is designed to allow ORES to load-balance your queries on your behalf. A group of 100 revids will likely result in two
## parallel threads of 50 revids each, which is the current recommended load. Don't change the way you throttle load without guidance
## from the development team.
##
##########################################################################
## Components:
## (0) Modal Configs and Process Args
## (1) Read in Revision IDs
## (2) Assemble shell command and run repeatedly on groups of IDs.
##

## (0) Modal Configs and Process Args
#DEBUG=1
DEBUG=0

import argparse
import os
import csv

theList = []

parser = argparse.ArgumentParser(description='Generates a kajillion shell commands and runs them.')
parser.add_argument('-i', help="Infile containing revision IDs to look up.", required=True)
args = parser.parse_args()

## (1) Read in Revision IDs
givenInfile = args.i

with open(givenInfile, 'r') as infileHandle:
    theInfile = csv.DictReader(infileHandle, delimiter="\t", quotechar='"')
    for currentLine in theInfile:
        theList.append(currentLine["revid"])  # makes a list of all the revids in the file

chunkSize = 100  # see note (B); it's not recommended to change this chunk size without guidance
for i in range(0, len(theList), chunkSize):  # iterates over theList in 100-revid chunks
    chunk = theList[i:i+chunkSize]
    if DEBUG:  # change the modal config to DEBUG=1 if you want to see these messages, leave it 0 if you don't
        print(chunk)

    uglyString = ""  # ORES is expecting a JSON format; we fake it here in a string I call uglyString
    for revid in chunk:
        uglyString = uglyString + "{\"rev_id\": " + revid
        uglyString = uglyString + "}\\n"

    if DEBUG:
        print(uglyString[-2])
    if uglyString[-2] == "\\":  # we don't need the trailing linebreak
        uglyString = uglyString[:-2]
    if DEBUG:
        print(uglyString)

    # see note (A); this is where you can change the language
    #theCommand = '''echo '%s' | ores score_revisions https://ores.wikimedia.org enwiki damaging >> predictDamaging.txt''' % uglyString
    #theCommand = '''echo '%s' | ores score_revisions https://ores.wikimedia.org ruwiki damaging >> predictDamaging.txt''' % uglyString
    theCommand = '''echo '%s' | ores score_revisions https://ores.wikimedia.org frwiki damaging >> predictDamaging.txt''' % uglyString
    if DEBUG:
        print(theCommand)

    ## (2) Assemble shell command and run repeatedly on groups of IDs.
    os.system(theCommand)
</syntaxhighlight>
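
The chunking and JSON-building steps in the script above can also be sketched without hand-escaped strings or a shell <code>echo</code>. This is a minimal sketch, not the script's own method: the function names <code>chunked</code>, <code>to_json_lines</code>, and <code>score_chunk</code> are hypothetical, and <code>score_chunk</code> assumes the <code>ores</code> command-line tool is on the PATH and reads one JSON object per line on standard input, as the shell command in the script's header suggests.

```python
import json
import subprocess


def chunked(items, size=100):
    """Yield successive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def to_json_lines(rev_ids):
    """Serialize revision ids as one JSON object per line, no trailing newline."""
    return "\n".join(json.dumps({"rev_id": int(rev_id)}) for rev_id in rev_ids)


def score_chunk(rev_ids, wiki="enwiki", model="damaging",
                host="https://ores.wikimedia.org"):
    """Pipe one chunk to the `ores` CLI on stdin and return its stdout.

    Assumption: `ores score_revisions <host> <wiki> <model>` is invoked the
    same way as in the shell command the original script assembles.
    """
    result = subprocess.run(
        ["ores", "score_revisions", host, wiki, model],
        input=to_json_lines(rev_ids),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the payload on standard input avoids shell quoting and any dependence on how <code>echo</code> treats <code>\n</code>. The default chunk size of 100 is kept from the script, in line with its warning about not changing the load without guidance.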