=== Pattern matching arguments ===
Users can search for patterns in the text of each revision. The matches found in each revision are output as a list, with one column per pattern (the columns are named using the label arguments below). Users may provide multiple revision patterns and accompanying labels. The patterns and the labels must be provided in the same order so that wikiq can label the output columns correctly.

<code>-RP</code> <code>--revision-pattern</code>: a regular expression

<code>-RPl</code> <code>--revision-pattern-label</code>: a label for the columns output based on matching revisions against the pattern.

Pattern matching is also supported against revision ''summaries'' (comments), with corresponding command line arguments:

<code>-CP</code> <code>--comment-pattern</code>: a regular expression

<code>-CPl</code> <code>--comment-pattern-label</code>: a label for the columns output based on matching revision summaries against the pattern.

==== A note on named capture groups in pattern matching ====
The regular expressions in <code>-RP</code> and <code>-CP</code> may include one or more [https://docs.python.org/3.7/howto/regex.html#non-capturing-and-named-groups named capture groups]. If the <code>pattern</code> matches, it also captures a value for each named capture group. If a <code>pattern</code> has one or more ''named capture groups'', wikiq outputs a new column for each named capture group to store these values, with the column named <code><pattern-label>_<capture-group-name></code>. Since a <code>pattern</code> can match a revision more than once, a column may hold more than one value, whether or not named capture groups are used.

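The column naming described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not wikiq's actual code: the helper <code>match_columns</code>, the sample text, and the labels <code>link</code> and <code>url</code> are all hypothetical, and the use of <code>None</code> for a column without matches is an assumption.

```python
import re

def match_columns(label, pattern, text):
    """Hypothetical helper (not part of wikiq): map one label/pattern pair to columns."""
    regex = re.compile(pattern)
    matches = list(regex.finditer(text))
    group_names = list(regex.groupindex)  # named capture groups, in definition order

    if not group_names:
        # No named groups: a single column named after the label.
        found = [m.group(0) for m in matches]
        return {label: found or None}

    # One column per named group: <pattern-label>_<capture-group-name>.
    columns = {}
    for name in group_names:
        values = [m.group(name) for m in matches if m.group(name) is not None]
        columns[f"{label}_{name}"] = values or None
    return columns

# Patterns and labels are paired by position, so their order must agree.
patterns = [r"\[\[(?P<namespace>\w+):(?P<title>[^\]]+)\]\]", r"https?://\S+"]
labels = ["link", "url"]
text = "See [[Talk:Main Page]] and [[User:Example]]."

row = {}
for label, pattern in zip(labels, patterns):
    row.update(match_columns(label, pattern, text))
print(row)
```

Here the first pattern matches twice, so each of its two columns holds a list of two values, while the second pattern has no named groups and no matches in the sample text.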
When a <code>-RP</code> or <code>-CP</code> pattern has more than one named capture group and part of the string being searched could match more than one of them, only the first capture group will record the match, because matching consumes characters in Python. For example, if the regular expression is <code>r"(?P<three_letters>\b\w{3}\b)|(?P<number>\b\d+\b)"</code> and the string being searched is <code>dog and 500 bits of kibble</code>, then <code>500</code> fits both <code>three_letters</code> and <code>number</code>. However, the capture group listed first (<code>three_letters</code>) consumes <code>500</code> when it matches, so the <code>three_letters</code> column will contain the list <code>[dog, and, 500]</code> while the <code>number</code> column will simply contain <code>None</code>. To avoid this, consider the order of the capture groups, or supply separate regular expression and label pairs.
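The effect can be reproduced with Python's <code>re</code> module directly. The sketch below is not wikiq's implementation; the helper <code>collect</code> is hypothetical. The first group is named <code>three_letters</code> because Python requires group names to be valid identifiers, so a name cannot begin with a digit.

```python
import re

def collect(pattern, text):
    """Hypothetical helper: gather the values captured by each named group."""
    regex = re.compile(pattern)
    found = {name: [] for name in regex.groupindex}
    for match in regex.finditer(text):
        for name, value in match.groupdict().items():
            if value is not None:
                found[name].append(value)
    # Assumption: a group that never captured anything is reported as None.
    return {name: values or None for name, values in found.items()}

text = "dog and 500 bits of kibble"

# Both groups in one pattern: the first alternative claims "500".
combined = collect(r"(?P<three_letters>\b\w{3}\b)|(?P<number>\b\d+\b)", text)
print(combined)

# Listing the number group first changes which column receives "500".
reordered = collect(r"(?P<number>\b\d+\b)|(?P<three_letters>\b\w{3}\b)", text)
print(reordered)

# Separate patterns are matched independently, so both columns see "500".
separate = {}
separate.update(collect(r"(?P<three_letters>\b\w{3}\b)", text))
separate.update(collect(r"(?P<number>\b\d+\b)", text))
print(separate)
```

Passing the two expressions as separate pattern and label pairs corresponds to the last case, where neither pattern can take matches away from the other.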