Editing Tor and Wikipedia

From CommunityData



Latest revision Your text
Line 1: Line 1:
== When can a Tor exit node IP edit Wikipedia? ==
* '''Theory''' — It takes some time for an exit node to be added to the list of blocked IPs. As a result, Tor users who happen to be routed through a recently added exit node (in the period before Wikipedia has blocked the exit node's IP) are sometimes allowed to edit.
* '''Theory''' — Some exit nodes don't get added to Wikipedia's block list automatically through TorBlock. Tor users who are routed through these exit nodes are allowed to edit Wikipedia until an administrator or bot notices and blocks the IP address.
* '''Theory''' — Some forms of blocking expire after a certain amount of time and, if a Tor node is blocked with an expiry time, then traffic may be allowed through until it is blocked again. This could account for on-and-off patterns of editing coming from Tor nodes.
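All three theories amount to gaps between the time an IP is an exit node and the time it is actually blocked. A minimal sketch of how such gaps could be computed for one IP is below. The function name and the input shapes (one exit node "spell" and a list of block intervals, each a start/end pair) are assumptions made for illustration; they are not the format of our data or of Wikipedia's block log.

```python
from datetime import datetime

def unblocked_windows(spell, blocks):
    """Return the parts of `spell` that no block interval covers.

    `spell` is a (start, end) pair for the time an IP was an exit node.
    `blocks` is a list of (block_start, block_expiry) pairs for that IP.
    Both shapes are hypothetical and chosen only for this sketch.
    """
    spell_start, spell_end = spell
    windows = []
    cursor = spell_start  # earliest time not yet accounted for
    for block_start, block_end in sorted(blocks):
        if block_end <= cursor:
            continue  # block ended before the part we still care about
        if block_start >= spell_end:
            break  # block starts after the spell is over
        if block_start > cursor:
            windows.append((cursor, block_start))  # gap before this block
        cursor = max(cursor, block_end)
        if cursor >= spell_end:
            break
    if cursor < spell_end:
        windows.append((cursor, spell_end))  # gap after the last block
    return windows

# Example: a delay before the first block, then a block that expires
# and is re-applied a few days later.
spell = (datetime(2013, 1, 1), datetime(2013, 1, 31))
blocks = [
    (datetime(2013, 1, 3), datetime(2013, 1, 10)),
    (datetime(2013, 1, 15), datetime(2013, 1, 20)),
]
gaps = unblocked_windows(spell, blocks)
```

An empty `blocks` list returns the whole spell, which corresponds to the second theory (an exit node that is never blocked).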
== Identifying IPs that are/were Tor exit nodes ==


Line 16: Line 10:
What we can tell about how it works:


* during a period from XXXX to 2013 it read from the Tor Project bulk list service (https://check.torproject.org/cgi-bin/TorBulkExitList.py?ip=)
* during a period from XXXX to XXXX it read from the Tor Project bulk list service (link?)
* after Jan 2013, it pulls from the newer "Onionoo" service (https://onionoo.torproject.org/details?type=relay&running=true&flag=Exit)
* after XXXX, it pulls from the newer "Onionoo" service
* pulls periodically, typically from a cronjob
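For our own analysis, a list of exit IPs could be extracted from an Onionoo details document roughly as sketched below. This is not TorBlock's code. The sample document is made up (it uses documentation-only IP ranges), and the sketch assumes the `relays`, `exit_addresses`, and `or_addresses` fields of the Onionoo details format, falling back to the OR address when no exit address is listed.

```python
import json

# Hypothetical sample shaped like an Onionoo details response.
SAMPLE = """
{"relays": [
  {"nickname": "exampleA",
   "or_addresses": ["192.0.2.1:9001"],
   "exit_addresses": ["192.0.2.10"]},
  {"nickname": "exampleB",
   "or_addresses": ["198.51.100.7:443"]}
]}
"""

def exit_ips(details_json):
    """Collect exit IPs from a details document given as a JSON string."""
    doc = json.loads(details_json)
    ips = set()
    for relay in doc.get("relays", []):
        addrs = relay.get("exit_addresses")
        if addrs:
            ips.update(addrs)
        else:
            # Assumption: with no exit address listed, use the OR address
            # with its port (and any IPv6 brackets) stripped.
            for or_addr in relay.get("or_addresses", []):
                host = or_addr.rsplit(":", 1)[0]
                ips.add(host.strip("[]"))
    return ips

ips = exit_ips(SAMPLE)
```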


Questions:
=== Blocked by Administrators MediaWiki ===
 
* Although we know when the commits that added or switched features in Extension:TorBlock were made to the git repository, we don't know exactly when those changes were deployed to WMF servers. Deployment likely lagged the commits, but probably not by much, so the commit dates at least give a lower bound on when each change could have taken effect.
* We don't know when the cronjob was run or whether its timing has been consistent over time.
 
=== Blocked by Administrators "by hand" ===


* Blocks are recorded in Special:Log (e.g., [https://en.wikipedia.org/w/index.php?title=Special%3ALog&type=block&user=&page=&year=&month=-1&tagfilter=&hide_thanks_log=1&hide_patrol_log=1&hide_tag_log=1&hide_review_log=1 ENWP]).
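The same block log can also be read through the MediaWiki API rather than the Special:Log page. The sketch below only builds the request URL for one IP and does not fetch anything. The helper name and the example IP are hypothetical; the parameters are those of the `list=logevents` query module, and the assumption that block log entries are titled `User:<ip>` should be checked against real log output.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def block_log_url(ip, api="https://en.wikipedia.org/w/api.php", limit=50):
    """Build a logevents query for block log entries that target `ip`."""
    params = {
        "action": "query",
        "list": "logevents",
        "letype": "block",          # only block/unblock/reblock entries
        "letitle": f"User:{ip}",    # assumption: entries are titled User:<ip>
        "leprop": "timestamp|user|comment|details",
        "lelimit": limit,
        "format": "json",
    }
    return f"{api}?{urlencode(params)}"

url = block_log_url("192.0.2.10")
query = parse_qs(urlsplit(url).query)  # decoded parameters, for inspection
```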


=== Blocked by a Bot ===
* As above, blocks are recorded in Special:Log (e.g., [https://en.wikipedia.org/w/index.php?title=Special%3ALog&type=block&user=&page=&year=&month=-1&tagfilter=&hide_thanks_log=1&hide_patrol_log=1&hide_tag_log=1&hide_review_log=1 ENWP]).


There are at least some bots that seem to automatically find and block open proxies:


* [https://en.wikipedia.org/wiki/User:ProcseeBot ProcseeBot] (apparently closed source but we could contact the author/operator [https://en.wikipedia.org/wiki/User:Slakr Slakr])
* [https://en.wikipedia.org/wiki/User:ProcseeBot ProcseeBot] (apparently closed source but we could contact the author/operator [https://en.wikipedia.org/wiki/User:Slakr Slakr]
* [https://en.wikipedia.org/w/index.php?title=User:TorNodeBot TorNodeBot] is a bot that blocked people who were editing over Tor. It was designed to block Tor nodes that the TorBlock extension failed to notice due to a technical error. In particular, TorBlock only detected and blocked ''current'' Tor nodes (nodes that were active at the time of checking), so an exit node that was detected and blocked but later disabled would subsequently show up as not being a Tor node (it may also have been overlooked because the TorBlock extension stores the blacklist in a cache). [https://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval/TorNodeBot Bots/Request for approval] explains this issue in some detail. TorNodeBot was deactivated in 2014, having blocked 32123 users.


== Structure of Special:Log XML file ==
Line 44: Line 30:
== Open questions ==


* There are notes in Special:Log that suggest that some IP addresses are "confirmed Tor Nodes" (e.g., ???) that are blocked by hand. Why were these not caught by TorBlock?
* There are notes in the log that suggest that some IP addresses are "confirmed Tor Nodes" (e.g., ???) that are blocked through MediaWiki. Why are these not caught? Are they in CollectTor?
** Are these exit nodes present in our CollectTor data as well?
* Why does the distribution of edits over the time periods that IPs are marked as exit nodes in our dataset of Tor exit node "spells" not bunch up near the beginning of each period, when the IP is a new Tor node and seems less likely to be blocked? Why are there Tor exit nodes that seem to have been listed in CollectTor for long periods of time without being blocked by Wikipedia?
* If TorBlock identifies a Tor exit node, are these IP addresses added to or reflected in the Special:Log block log?
* Does an IP address need to make an edit first in order to be blocked as an open proxy? Can we find examples of this happening? If so, is it always bots that do the blocking?
* Does an IP address need to make an edit first in order to be blocked as an open proxy? Can we find examples of this happening?
* What bots are involved in detecting and blocking IP addresses that are open proxies (especially Tor)?
* It seems that at some point in 2014 the number of users blocked by Wikipedia dropped dramatically, and the stated reasons for blocking stopped mentioning Tor. Did Wikipedia change its blocking method as well as its policy? Did the new method help reduce the number of successful revision attempts through Tor, or were Tor users allowed to start editing Wikipedia pages? Note that TorNodeBot was also deactivated during this time.
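One way to examine the question above about whether edits bunch up near the start of a spell is to express each edit as a fraction of the way through its spell, then look at how those fractions are distributed. The sketch below assumes a spell is a (start, end) pair and that edit timestamps are already matched to the spell's IP; both the function name and these shapes are hypothetical.

```python
from datetime import datetime

def relative_positions(spell, edit_times):
    """Map each edit inside `spell` to a value in [0, 1].

    0 means the edit happened when the spell began and 1 when it ended.
    Edits outside the spell are skipped.
    """
    start, end = spell
    length = end - start
    positions = []
    for t in edit_times:
        if start <= t <= end:
            positions.append((t - start) / length)
    return positions

# Example: a 10-day spell with one early edit, one late edit,
# and one edit that falls outside the spell.
spell = (datetime(2013, 1, 1), datetime(2013, 1, 11))
edits = [datetime(2013, 1, 2), datetime(2013, 1, 10), datetime(2013, 2, 1)]
positions = relative_positions(spell, edits)
```

If edits mostly happened before blocks caught up with new exit nodes, the fractions pooled across spells would pile up near 0.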