Posted to users@spamassassin.apache.org by Jason Frisvold <xe...@gmail.com> on 2005/12/27 14:30:55 UTC
Re: What's does m{} do ?
On 12/27/05, Mark R. London <mr...@psfc.mit.edu> wrote:
> What does m{} do, like in the following test?
>
> body DRUG_DOSAGE m{[\d\.]+ *\$? *(?:[\\/]|per) *d.?o.?s.?e}i
Looks like a case insensitive match .. Let's see..
[\d\.]+ matches a digit or a period one or more times
* (that's space asterisk) matches 0 or more spaces
\$? matches a dollar sign 0 or 1 time
* (that's space asterisk) matches 0 or more spaces
(?:[\\/]|per) I'm not 100% sure on.. It looks like it matches either
:V or per ...
* (that's space asterisk) matches 0 or more spaces
d.?o.?s.?e matches d followed by 0 or 1 period, o followed by 0 or 1
period, s followed by 0 or 1 period, and e
Standard Perl regex. Check out these sites:
http://www.intuitive.com/spam-assassin-rule-help.html
http://www.english.uga.edu/humcomp/perl/regex2a.html
http://www.troubleshooters.com/codecorn/littperl/perlreg.htm
--
Jason 'XenoPhage' Frisvold
XenoPhage0@gmail.com
Re: What's does m{} do ?
Posted by Matt Kettler <mk...@comcast.net>.
At 09:34 AM 12/27/2005, Mark London wrote:
>rather than simply //, or are they identical? (There are only a couple of
>tests which use m{} in Spamassassin).
They are identical, but m{} does have one advantage: you can use / inside
the rule text without having to escape it.
It makes things like http:// much more readable, since in a normal / delimited
rule you'd have to write http:\/\/
The rules that use m{} likely contain many slashes in the text, so this was
done for readability.
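
To illustrate the delimiter point, here is a minimal sketch in plain Perl (not an actual SpamAssassin rule). The sample text and the example.com URL are made up for illustration; the only claim is that both forms match the same thing, and the m{} form needs no escaped slashes.

```perl
use strict;
use warnings;

# Hypothetical sample text, not taken from any real rule or message.
my $text = 'Visit http://example.com/pills for details';

# Slash-delimited: every literal / in the pattern must be escaped.
my $slash_hit = $text =~ /http:\/\/example\.com\/pills/ ? 1 : 0;

# Brace-delimited: the same pattern, with / written literally.
my $brace_hit = $text =~ m{http://example\.com/pills} ? 1 : 0;

print "slash-delimited matched: $slash_hit\n";
print "brace-delimited matched: $brace_hit\n";
```

Both patterns compile to the same regex; only the readability of the source differs.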
Re: What's does m{} do ?
Posted by Mark London <mr...@psfc.mit.edu>.
Sorry, I wasn't clear about my question, which is: why is m{} used in that test
rather than simply //, or are they identical? (There are only a couple of
tests which use m{} in SpamAssassin.)
Re: What's does m{} do ?
Posted by Jason Frisvold <xe...@gmail.com>.
On 12/27/05, Loren Wilton <lw...@earthlink.net> wrote:
> Close, but not quite.
>
> (?:[\\/]|per)
>
> The (?:) is bracketing. A normal pair of parends would be 'capturing' and
> keep track of what was found within the grouping. The ?: modifier tells
> Perl to not bother capturing the contents, since it won't be used later.
> This is an efficiency concern.
Ahh, I was not aware of that.. That does come in handy.. Thanks for
that info :)
> The [\\/] is a character set match. It is looking for either / or \. The
> other side of the alternation is 'per'. Thus it is looking for 'per', or a
> slash or backslash as in $1.25/dose.
Heh.. font issue.. I could have *sworn* that was \V and not \\/ .. I
had no idea what \V meant and couldn't find a reference to it.. *grin*
> d.?o.?s.?e matches d followed by 0 or 1 *any character*, followed by o, etc.
> A bare dot in a regex is a 'match any character except newline' character.
> So this is looking for 'dose', 'd ose', 'd*o*s*e', or any other random form
> of one-character obfuscation.
Typo on my part.. I meant any character... Sorry bout that.. :)
> Loren
Thanks for clearing everything else up.. My regex-fu is still a little weak..
--
Jason 'XenoPhage' Frisvold
XenoPhage0@gmail.com
Re: What's does m{} do ?
Posted by Loren Wilton <lw...@earthlink.net>.
> [\d\.]+ matches a digit or a period one or more times
> * (that's space asterisk) matches 0 or more spaces
> \$? matches a dollar sign 0 or 1 time
> * (that's space asterisk) matches 0 or more spaces
> (?:[\\/]|per) I'm not 100% sure on.. It looks like it matches either
> :V or per ...
> * (that's space asterisk) matches 0 or more spaces
> d.?o.?s.?e matches d followed by 0 or 1 period, o followed by 0 or 1
> period, s followed by 0 or 1 period, and e
Close, but not quite.
(?:[\\/]|per)
The (?:) is grouping. A normal pair of parentheses would be 'capturing' and
keep track of what was found within the grouping. The ?: modifier tells
Perl not to bother capturing the contents, since they won't be used later.
This is an efficiency concern.
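
A small sketch of the capturing difference, in plain Perl. The sample string and the extra parentheses around the "dose" part are assumptions added for illustration; the real rule has no capturing group at all.

```perl
use strict;
use warnings;

# Hypothetical sample text.
my $text = '2.5 per dose';

# Capturing group: the alternation is saved as the first capture,
# and the "dose" part as the second.
my @captured = $text =~ m{([\\/]|per) *(d.?o.?s.?e)}i;

# Non-capturing group: (?:...) still groups the alternation, but only
# the "dose" part is saved.
my @uncaptured = $text =~ m{(?:[\\/]|per) *(d.?o.?s.?e)}i;

print scalar(@captured),   " saved with (...):   @captured\n";
print scalar(@uncaptured), " saved with (?:...): @uncaptured\n";
```

In list context a match returns its captures, so the first list holds two items and the second only one.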
The [\\/] is a character set match. It is looking for either / or \. The
other side of the alternation is 'per'. Thus it is looking for 'per', or a
slash or backslash as in $1.25/dose.
d.?o.?s.?e matches d followed by 0 or 1 *any character*, followed by o, etc.
A bare dot in a regex is a 'match any character except newline' character.
So this is looking for 'dose', 'd ose', 'd*o*s*e', or any other random form
of one-character obfuscation.
Loren
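
Putting the pieces together, here is a rough sketch that runs the pattern from the DRUG_DOSAGE rule against a few strings in plain Perl. The sample lines are invented for illustration, and this is only the bare regex, not how SpamAssassin itself evaluates body rules.

```perl
use strict;
use warnings;

# The pattern from the DRUG_DOSAGE rule quoted in this thread.
my $re = qr{[\d\.]+ *\$? *(?:[\\/]|per) *d.?o.?s.?e}i;

# Hypothetical sample lines (single-quoted, so $ and \ are literal).
my @samples = (
    '$1.25/dose',           # number, slash, plain "dose"
    'only 2.50 per DOSE',   # "per" alternative, upper case
    '1.25\dose',            # backslash instead of slash
    '3 $ / d*o*s*e',        # one filler character between letters
    'per dose',             # no leading digits or periods
    '1.25/doooose',         # more than one filler character
);

my @results;
for my $line (@samples) {
    my $matched = $line =~ $re ? 1 : 0;
    push @results, $matched;
    printf "%-20s %s\n", $line, $matched ? 'match' : 'no match';
}
```

The first four lines match and the last two do not: the pattern requires at least one digit or period up front, and each .? allows at most one extra character between the letters of "dose".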