Posted to users@spamassassin.apache.org by Reindl Harald <h....@thelounge.net> on 2015/01/03 21:08:45 UTC

regex: chars to escape besides @

while writing some custom rules like the one below i found out that @ needs
to be escaped in addition to what preg_quote() covers:
http://php.net/manual/de/function.preg-quote.php

are there other chars which need special handling?

header    CUST_MANY_SPAM_TO  X-Local-Envelope-To =~ /^(\<h\.reindl\@thelounge\.net\>)$/i
score     CUST_MANY_SPAM_TO  -4.0
describe  CUST_MANY_SPAM_TO  Custom Scoring



Re: regex: chars to escape besides @

Posted by RW <rw...@googlemail.com>.
On Sun, 04 Jan 2015 01:10:17 +0100
Reindl Harald wrote:

> 
> Am 04.01.2015 um 00:55 schrieb Dave Funk:
> > On Sat, 3 Jan 2015, Reindl Harald wrote:
> >
> >> while writing some custom rules like the one below i found out that
> >> @ needs to be escaped in addition to what preg_quote() covers:
> >> http://php.net/manual/de/function.preg-quote.php
> >>
> >> are there other chars which need special handling?
> >>
> >> header    CUST_MANY_SPAM_TO  X-Local-Envelope-To =~
> >> /^(\<h\.reindl\@thelounge\.net\>)$/i
> >> score     CUST_MANY_SPAM_TO  -4.0
> >> describe  CUST_MANY_SPAM_TO  Custom Scoring
> >
> > Umm, SA is written in Perl, not PHP. So you should look at Perl
> > regex documentation, not PHP docs
> 
> so what - @ is not a char that needs escaping in whatever language and
> hence this is SA specific, and it doesn't matter in what language SA is
> written if you write a *backend* in PHP - guess what "preg" means even
> if you are too lazy to click on the link -> "Perl Compatible Regular
> Expressions (PCRE)"
> 

Escaping often happens at more than one level. Whilst @ doesn't have a
special meaning in Perl regular expressions, it does in Perl itself, where
it introduces an array variable. A Perl RE tutorial would have mentioned
this, so it is a fair point.

> so do me a favor: if you don't have an answer leave me in peace
> because i am tired of answers with no content at all

Re: regex: chars to escape besides @

Posted by Reindl Harald <h....@thelounge.net>.
Am 05.01.2015 um 19:14 schrieb Bowie Bailey:
> On 1/4/2015 5:42 AM, Reindl Harald wrote:
>> Am 04.01.2015 um 06:43 schrieb Bob Proulx:
>>> The additional issue was that you were referencing PHP documentation.
>>> Might as well have been referencing Lisp documentation for all of the
>>> relevance it had. That was the point I saw being addressed at that
>>> point. PHP is a similar syntax that came after Perl but it very much
>>> is its own thing and Perl does not derive from it
>>
>> we are talking about PCRE and the PHP function preg_quote() to escape
>> a string for *perl* regular expressions, not the language syntax, and
>> since the backends are already there i just asked *what* chars
>> need to be escaped *in addition* to that function *besides* the @ i
>> found out myself
>>
>> that is a short and simple question and not referring to the PHP
>> function used would not have been helpful at all
>
> Ok.  Here is an attempt at a simple answer off the top of my head. (Not
> guaranteed to be complete)
>
> The following characters may need to be escaped in a Perl regex (as used
> in SA) if intended to be used as literal characters:
>
> $%@/[{*+?\
>
> ] and } may need to be escaped as well, but I don't think it is required.
>
> You can avoid having to escape the slash (/) by using a different
> separator for the regex.  This can avoid "leaning toothpick syndrome."

thanks, last night i ended up with the wrapper function below; / was already
being escaped since that is also needed for postfix header_checks

so in the end only % was missing, which i became aware of through this
thread; @ was already handled after i noticed it, and the rest is done
properly by preg_quote()

[harry@srv-rhsoft:/downloads]$ cat test.php
#!/usr/bin/php
<?php
  $chars = '$ % @ / [ ] { } \ * + ? ! \\';
  echo $chars . "\n";
  echo preg_quote($chars) . "\n";
  echo sa_quote($chars) . "\n";
  function sa_quote($input)
  {
   $input = str_replace(array("\n", "\r", "\0"), '', trim($input));
   $input = str_replace("\t", ' ', $input);
   $input = preg_quote($input);
   $input = str_replace('/', '\/', $input);
   $input = str_replace('@', '\@', $input);
   $input = str_replace('%', '\%', $input);
   return $input;
  }
?>

[harry@srv-rhsoft:/downloads]$ ./test.php
$ % @ / [ ] { } \ * + ? ! \
\$ % @ / \[ \] \{ \} \\ \* \+ \? \! \\
\$ \% \@ \/ \[ \] \{ \} \\ \* \+ \? \! \\



Re: regex: chars to escape besides @

Posted by Reindl Harald <h....@thelounge.net>.
Am 05.01.2015 um 22:13 schrieb John Hardin:
> On Mon, 5 Jan 2015, Bowie Bailey wrote:
>
>> You can avoid having to escape the slash (/) by using a different
>> separator for the regex.  This can avoid "leaning toothpick syndrome."
>>
>> For example:
>>   m#http://match/this/url/#
>
> Ouch. # won't work for that (in SA at least) as it comments out the rest
> of the RE.

which means # needs to be escaped too... thanks!

[harry@srv-rhsoft:~]$ php -r 'echo preg_quote("#");'
#

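function sa_quote($input)
{
  // strip line breaks and NUL bytes, turn tabs into spaces
  $input = str_replace(array("\n", "\r", "\0"), '', trim($input));
  $input = str_replace("\t", ' ', $input);
  // regex metacharacters
  $input = preg_quote($input);
  // the / delimiter and the chars SA needs escaped on top of preg_quote()
  $input = str_replace('/', '\/', $input);
  $input = str_replace('@', '\@', $input);
  $input = str_replace('%', '\%', $input);
  // preg_quote() quotes # itself as of PHP 7.3; only add the backslash if it did not
  if (preg_quote('#') === '#')
  {
   $input = str_replace('#', '\#', $input);
  }
  return $input;
}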
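
A minimal sketch of a variant that escapes each character exactly once, assuming the generated pattern is placed between / delimiters. The names sa_quote_once() and sa_header_rule() are hypothetical and exist neither in PHP nor in SpamAssassin. Going character by character avoids stacking a second backslash on a character that preg_quote() already handled, which matters for # on PHP 7.3 and later, where preg_quote() quotes it by itself.

```php
<?php
// Hypothetical helpers for illustration; neither function exists in PHP or SpamAssassin.

// Quote a literal string for use inside /.../ in a SpamAssassin rule.
function sa_quote_once(string $input): string
{
    // Same clean-up as sa_quote(): drop newlines and NUL bytes, tabs become spaces.
    $input = str_replace(array("\n", "\r", "\0"), '', trim($input));
    $input = str_replace("\t", ' ', $input);

    // Characters from this thread that need a backslash on top of preg_quote().
    $extra = array('/', '@', '%', '#');

    $out = '';
    foreach (str_split($input) as $char) {
        // Each character gets at most one backslash, whatever preg_quote() does with it.
        $out .= in_array($char, $extra, true) ? '\\' . $char : preg_quote($char);
    }
    return $out;
}

// Build the three rule lines; SpamAssassin expects each directive on a single line.
function sa_header_rule(string $name, string $header, string $literal, float $score): string
{
    $regex = '/^(' . sa_quote_once($literal) . ')$/i';
    return "header    $name  $header =~ $regex\n"
         . "score     $name  " . number_format($score, 1, '.', '') . "\n"
         . "describe  $name  Custom Scoring\n";
}

echo sa_header_rule('CUST_MANY_SPAM_TO', 'X-Local-Envelope-To', '<h.reindl@thelounge.net>', -4.0);
```

The generated file should still be run through spamassassin --lint before it goes live, since the list of extra characters is only what came up in this thread.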


Re: regex: chars to escape besides @

Posted by Daniel Staal <DS...@usa.net>.
--As of January 5, 2015 4:38:03 PM -0800, John Hardin is alleged to have 
said:

> On Mon, 5 Jan 2015, Bowie Bailey wrote:
>
>> On 1/5/2015 4:13 PM, John Hardin wrote:
>>>  On Mon, 5 Jan 2015, Bowie Bailey wrote:
>>>
>>> >  You can avoid having to escape the slash (/) by using a different
>>> >  separator for the regex.  This can avoid "leaning toothpick
>>> >  syndrome."
>>> >
>>> >  For example:
>>> >    m#http://match/this/url/#
>>>
>>>  Ouch. # won't work for that (in SA at least) as it comments out the
>>>  rest of the RE.
>>
>> Ack!  Forgot about that minor difference with SA.  # is my general go-to
>> character for that in normal Perl scripts.
>>
>> This should illustrate the same point with the minor improvement of
>> actually  *working* in SA:
>>   m^http://match/this/url/^
>
> I tend to avoid using symbols that are syntactically significant in REs
> for that purpose. In your example, you can't then anchor the RE at the
> beginning of the URL because ^ has been repurposed as the RE delimiter.

--As for the rest, it is mine.

Since we've already established this is Perl...

I like to use braces.  Perl handles them (and brackets or parens) 
specially: Open with the opening brace and you close with the closing 
brace.  I think Perl will parse for balance as well, but I haven't checked 
at the moment.

  m{http://match/this/url}

In general though I do tend to stick with slashes unless it's going to be a 
problem; it's just more common and easier for people to recognize.

Daniel T. Staal

---------------------------------------------------------------
This email copyright the author.  Unless otherwise noted, you
are expressly allowed to retransmit, quote, or otherwise use
the contents for non-commercial purposes.  This copyright will
expire 5 years after the author's death, or in 30 years,
whichever is longer, unless such a period is in excess of
local copyright law.
---------------------------------------------------------------

Re: regex: chars to escape besides @

Posted by John Hardin <jh...@impsec.org>.
On Mon, 5 Jan 2015, Bowie Bailey wrote:

> On 1/5/2015 4:13 PM, John Hardin wrote:
>>  On Mon, 5 Jan 2015, Bowie Bailey wrote:
>> 
>> >  You can avoid having to escape the slash (/) by using a different 
>> >  separator for the regex.  This can avoid "leaning toothpick syndrome."
>> > 
>> >  For example:
>> >    m#http://match/this/url/#
>>
>>  Ouch. # won't work for that (in SA at least) as it comments out the rest
>>  of the RE.
>
> Ack!  Forgot about that minor difference with SA.  # is my general go-to 
> character for that in normal Perl scripts.
>
> This should illustrate the same point with the minor improvement of actually 
> *working* in SA:
>   m^http://match/this/url/^

I tend to avoid using symbols that are syntactically significant in REs 
for that purpose. In your example, you can't then anchor the RE at the 
beginning of the URL because ^ has been repurposed as the RE delimiter.


-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   The social contract exists so that everyone doesn't have to squat
   in the dust holding a spear to protect his woman and his meat all
   day every day. It does not exist so that the government can take
   your spear, your meat, and your woman because it knows better what
   to do with them.                           -- Dagny @ Ace of Spades
-----------------------------------------------------------------------
  12 days until Benjamin Franklin's 309th Birthday

Re: regex: chars to escape besides @

Posted by Reindl Harald <h....@thelounge.net>.
Am 05.01.2015 um 23:14 schrieb Bowie Bailey:
> On 1/5/2015 5:02 PM, Reindl Harald wrote:
>> Am 05.01.2015 um 22:57 schrieb Bowie Bailey:
>>> On 1/5/2015 4:13 PM, John Hardin wrote:
>>>>> For example:
>>>>>   m#http://match/this/url/#
>>>>
>>>> Ouch. # won't work for that (in SA at least) as it comments out the
>>>> rest of the RE.
>>>
>>> Ack!  Forgot about that minor difference with SA.  # is my general go-to
>>> character for that in normal Perl scripts.
>>>
>>> This should illustrate the same point with the minor improvement of
>>> actually *working* in SA:
>>> m^http://match/this/url/^
>>
>> i would not play such games because it makes it even more likely that
>> a common escape function is missing something, besides that
>>
>> header    CUST_MANY_SPAM_TO  X-Local-Envelope-To =~
>> /^(\<h\.reindl\@thelounge\.net\>)$/i
>
> It is mainly useful for creating a regex to match a URI without needing
> to escape every slash in the string.  Trying to sort out something like:
> /http:\/\/domain\.com\/xxx\/blah\/blah\// gets old fast.  So you pick a
> character that you don't need in the regex and use it as a separator
> character instead so you don't need nearly as many escapes.

agreed but normally one would use a common escape function and that 
becomes complicated if you have to escape different chars here and there

> Now that I think about it, there is at least one other character that
> needs escaping in a regex: the pipe symbol |

well, that's luckily covered, i just did not expect that i would need to
handle @, % and # manually, especially because @ does not need to be
escaped for grep or postfix-pcre rules

[harry@srv-rhsoft:~]$ php -r "echo preg_quote('|');"
\|


Re: regex: chars to escape besides @

Posted by Bowie Bailey <Bo...@BUC.com>.
On 1/5/2015 5:02 PM, Reindl Harald wrote:
>
>
> Am 05.01.2015 um 22:57 schrieb Bowie Bailey:
>> On 1/5/2015 4:13 PM, John Hardin wrote:
>>> On Mon, 5 Jan 2015, Bowie Bailey wrote:
>>>
>>>> You can avoid having to escape the slash (/) by using a different
>>>> separator for the regex.  This can avoid "leaning toothpick syndrome."
>>>>
>>>> For example:
>>>>   m#http://match/this/url/#
>>>
>>> Ouch. # won't work for that (in SA at least) as it comments out the
>>> rest of the RE.
>>
>> Ack!  Forgot about that minor difference with SA.  # is my general go-to
>> character for that in normal Perl scripts.
>>
>> This should illustrate the same point with the minor improvement of
>> actually *working* in SA:
>> m^http://match/this/url/^
>
> i would not play such games because it makes it even more likely that 
> a common escape function is missing something, besides that
>
> header    CUST_MANY_SPAM_TO  X-Local-Envelope-To =~ 
> /^(\<h\.reindl\@thelounge\.net\>)$/i

It is mainly useful for creating a regex to match a URI without needing 
to escape every slash in the string.  Trying to sort out something like: 
/http:\/\/domain\.com\/xxx\/blah\/blah\// gets old fast.  So you pick a 
character that you don't need in the regex and use it as a separator 
character instead so you don't need nearly as many escapes.

Now that I think about it, there is at least one other character that 
needs escaping in a regex: the pipe symbol |.

-- 
Bowie

Re: regex: chars to escape besides @

Posted by Reindl Harald <h....@thelounge.net>.

Am 05.01.2015 um 22:57 schrieb Bowie Bailey:
> On 1/5/2015 4:13 PM, John Hardin wrote:
>> On Mon, 5 Jan 2015, Bowie Bailey wrote:
>>
>>> You can avoid having to escape the slash (/) by using a different
>>> separator for the regex.  This can avoid "leaning toothpick syndrome."
>>>
>>> For example:
>>>   m#http://match/this/url/#
>>
>> Ouch. # won't work for that (in SA at least) as it comments out the
>> rest of the RE.
>
> Ack!  Forgot about that minor difference with SA.  # is my general go-to
> character for that in normal Perl scripts.
>
> This should illustrate the same point with the minor improvement of
> actually *working* in SA:
> m^http://match/this/url/^

i would not play such games because it makes it even more likely that a 
common escape function is missing something, besides that

header    CUST_MANY_SPAM_TO  X-Local-Envelope-To =~ 
/^(\<h\.reindl\@thelounge\.net\>)$/i




Re: regex: chars to escape besides @

Posted by Bowie Bailey <Bo...@BUC.com>.
On 1/5/2015 4:13 PM, John Hardin wrote:
> On Mon, 5 Jan 2015, Bowie Bailey wrote:
>
>> You can avoid having to escape the slash (/) by using a different 
>> separator for the regex.  This can avoid "leaning toothpick syndrome."
>>
>> For example:
>>   m#http://match/this/url/#
>
> Ouch. # won't work for that (in SA at least) as it comments out the 
> rest of the RE.

Ack!  Forgot about that minor difference with SA.  # is my general go-to 
character for that in normal Perl scripts.

This should illustrate the same point with the minor improvement of 
actually *working* in SA:
   m^http://match/this/url/^

-- 
Bowie


Re: regex: chars to escape besides @

Posted by Martin Gregorie <ma...@gregorie.org>.
On Mon, 2015-01-05 at 13:13 -0800, John Hardin wrote:
> Ouch. # won't work for that (in SA at least) as it comments out the rest 
> of the RE.
> 
But at least you can escape the # if you need it in a regex.

Martin






Re: regex: chars to escape besides @

Posted by John Hardin <jh...@impsec.org>.
On Mon, 5 Jan 2015, Bowie Bailey wrote:

> You can avoid having to escape the slash (/) by using a different 
> separator for the regex.  This can avoid "leaning toothpick syndrome."
>
> For example:
>   m#http://match/this/url/#

Ouch. # won't work for that (in SA at least) as it comments out the rest 
of the RE.

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Where are my space habitats? Where is my flying car?
   It's 2010 and all I got from the SF books of my youth
   is the lousy dystopian government.                      -- perlhaqr
-----------------------------------------------------------------------
  12 days until Benjamin Franklin's 309th Birthday

Re: regex: chars to escape besides @

Posted by Bowie Bailey <Bo...@BUC.com>.
On 1/4/2015 5:42 AM, Reindl Harald wrote:
>
> Am 04.01.2015 um 06:43 schrieb Bob Proulx:
>> The additional issue was that you were referencing PHP documentation.
>> Might as well have been referencing Lisp documentation for all of the
>> relevance it had. That was the point I saw being addressed at that
>> point. PHP is a similar syntax that came after Perl but it very much
>> is its own thing and Perl does not derive from it
>
> we are talking about PCRE and the PHP function preg_quote() to escape
> a string for *perl* regular expressions, not the language syntax, and
> since the backends are already there i just asked *what* chars
> need to be escaped *in addition* to that function *besides* the @ i
> found out myself
>
> that is a short and simple question and not referring to the PHP
> function used would not have been helpful at all
>

Ok.  Here is an attempt at a simple answer off the top of my head. (Not 
guaranteed to be complete)

The following characters may need to be escaped in a Perl regex (as used 
in SA) if intended to be used as literal characters:

$%@/[{*+?\

] and } may need to be escaped as well, but I don't think it is required.

You can avoid having to escape the slash (/) by using a different 
separator for the regex.  This can avoid "leaning toothpick syndrome."

For example:
    m#http://match/this/url/#
instead of
    /http:\/\/match\/this\/url\//

Note that whatever character you use will then need to be escaped if 
used within the regex.
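
A hedged sketch of that last note in PHP, the language of the rule-generating backend in this thread: whichever delimiter is picked gets a backslash inside the pattern, and # is refused as a delimiter because, as reported elsewhere in this thread, SA treats it as the start of a comment. sa_wrap_regex() is a hypothetical name, only single-character delimiters are handled, and the set of extra characters comes from this thread rather than from SA documentation.

```php
<?php
// Hypothetical helper for illustration; not part of PHP or SpamAssassin.
function sa_wrap_regex(string $literal, string $delim = '/'): string
{
    if ($delim === '#') {
        // Reported in this thread: SA treats an unescaped # as the start of a comment.
        throw new InvalidArgumentException('# cannot be used as a delimiter in SA rules');
    }
    // Extra characters named in this thread, plus whichever delimiter was picked.
    $extra = array('@', '%', '#', $delim);

    $out = '';
    foreach (str_split($literal) as $char) {
        // One backslash at most per character, whatever preg_quote() does with it.
        $out .= in_array($char, $extra, true) ? '\\' . $char : preg_quote($char);
    }
    // The plain /.../ form needs no leading m; any other delimiter does.
    return ($delim === '/' ? '' : 'm') . $delim . $out . $delim;
}

echo sa_wrap_regex('http://match/this/url/') . "\n";
echo sa_wrap_regex('http://match/this/url/', '!') . "\n";
```

With ! as the delimiter the slashes stay readable, at the price of an escape function that has to know which delimiter each rule uses.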

-- 
Bowie

Re: regex: chars to escape besides @

Posted by Reindl Harald <h....@thelounge.net>.
Am 04.01.2015 um 06:43 schrieb Bob Proulx:
> The additional issue was that you were referencing PHP documentation.
> Might as well have been referencing Lisp documentation for all of the
> relevance it had. That was the point I saw being addressed at that
> point. PHP is a similar syntax that came after Perl but it very much
> is its own thing and Perl does not derive from it

we are talking about PCRE and the PHP function preg_quote() to escape a
string for *perl* regular expressions, not the language syntax, and since
the backends are already there i just asked *what* chars need to be
escaped *in addition* to that function *besides* the @ i found out myself

that is a short and simple question and not referring to the PHP
function used would not have been helpful at all


Re: regex: chars to escape besides @

Posted by Reindl Harald <h....@thelounge.net>.

Am 04.01.2015 um 11:21 schrieb Tom Hendrikx:
> On 04-01-15 11:03, Reindl Harald wrote:
>
>
>> Am 04.01.2015 um 09:44 schrieb Henrik K:
>>> On Sat, Jan 03, 2015 at 10:43:49PM -0700, Bob Proulx wrote:
>>>>
>>>> Maybe someone else will come up with a better documentation
>>>> pointer for variables expanded inside Perl strings.
>>>
>>> Umm.. (sorry) for once Reindl is somewhat correct. We are writing
>>> rules using _SpamAssassin_, not coding Perl.  What low-level
>>> regex/variables do in any language is meaningless in this context
>>> as SpamAssassin might manipulate things in any number of ways.
>>> Quoting requirements and other strange things should be
>>> documented in SpamAssassin, but at a quick glance nothing is
>>> mentioned about @, only # is referred as needing quoting.  So
>>> documentation could use an update.
>
>> and the biggest issue is that the test mail from gmail hit
>> "MISSING_HEADERS" and "MISSING_SUBJECT" while both were present
>> and so it looks like the whole rule engine is going crazy because of
>> one unescaped @
>
>
> If you add a custom rule that doesn't pass a lint test, you pretty much
> screwed it up yourself. You can't blame spamassassin for that

i can and will blame it for not simply skipping a broken rule, the way
postfix does for an invalid pcre regex, in the same way as any of my
users blame me for unexpected behavior

but the *one and only* question is in the subject and *NO* i do not need
pointers to the damned perl docs - that i can do myself and for that i don't
need a mailing-list - the whole purpose of a mailing list is to ask if
somebody has already done something similar, ran into the same issues,
and it is *not* unlikely that somebody out there already wrote a php
backend for generating SA rules


Re: regex: chars to escape besides @

Posted by Tom Hendrikx <to...@whyscream.net>.
On 04-01-15 11:03, Reindl Harald wrote:
> 
> 
> Am 04.01.2015 um 09:44 schrieb Henrik K:
>> On Sat, Jan 03, 2015 at 10:43:49PM -0700, Bob Proulx wrote:
>>> 
>>> Maybe someone else will come up with a better documentation
>>> pointer for variables expanded inside Perl strings.
>> 
>> Umm.. (sorry) for once Reindl is somewhat correct. We are writing
>> rules using _SpamAssassin_, not coding Perl.  What low-level
>> regex/variables do in any language is meaningless in this context
>> as SpamAssassin might manipulate things in any number of ways.
>> Quoting requirements and other strange things should be
>> documented in SpamAssassin, but at a quick glance nothing is
>> mentioned about @, only # is referred as needing quoting.  So
>> documentation could use an update.
> 
> and the biggest issue is that the test mail from gmail hit
> "MISSING_HEADERS" and "MISSING_SUBJECT" while both were present
> and so it looks like the whole rule engine is going crazy because of
> one unescaped @
> 

If you add a custom rule that doesn't pass a lint test, you pretty much
screwed it up yourself. You can't blame spamassassin for that.

Regards,

	Tom

Re: regex: chars to escape besides @

Posted by Reindl Harald <h....@thelounge.net>.

Am 04.01.2015 um 09:44 schrieb Henrik K:
> On Sat, Jan 03, 2015 at 10:43:49PM -0700, Bob Proulx wrote:
>>
>> Maybe someone else will come up with a better documentation pointer
>> for variables expanded inside Perl strings.
>
> Umm.. (sorry) for once Reindl is somewhat correct. We are writing rules
> using _SpamAssassin_, not coding Perl.  What low-level regex/variables do in
> any language is meaningless in this context as SpamAssassin might manipulate
> things in any number of ways.  Quoting requirements and other strange things
> should be documented in SpamAssassin, but at a quick glance nothing is
> mentioned about @, only # is referred as needing quoting.  So documentation
> could use an update.

and the biggest issue is that the test mail from gmail hit
"MISSING_HEADERS" and "MISSING_SUBJECT" while both were present and so
it looks like the whole rule engine is going crazy because of one
unescaped @


Re: regex: chars to escape besides @

Posted by Henrik K <he...@hege.li>.
On Sat, Jan 03, 2015 at 10:43:49PM -0700, Bob Proulx wrote:
> 
> Maybe someone else will come up with a better documentation pointer
> for variables expanded inside Perl strings.

Umm.. (sorry) for once Reindl is somewhat correct. We are writing rules
using _SpamAssassin_, not coding Perl.  What low-level regex/variables do in
any language is meaningless in this context as SpamAssassin might manipulate
things in any number of ways.  Quoting requirements and other strange things
should be documented in SpamAssassin, but at a quick glance nothing is
mentioned about @, only # is referred to as needing quoting.  So documentation
could use an update.

Anyway, at least --lint complains for me if using bad syntax, so it should be
a clear indicator if something requires attention..

$ spamassassin --lint
Jan  4 10:34:06.520 [6260] warn: Possible unintended interpolation of @baz in string at /usr/local/perl/etc/mail/spamassassin/test.cf, rule TEST_1, line 1, <GEN0> line 17.
Jan  4 10:34:06.527 [6260] warn: rules: failed to compile Mail::SpamAssassin::Plugin::Check::_body_tests_0_3, skipping:
Jan  4 10:34:06.528 [6260] warn:  (Global symbol "@baz" requires explicit package name at /usr/local/perl/etc/mail/spamassassin/test.cf, rule TEST_1, line 1, <GEN0> line 17.)
Jan  4 10:34:06.770 [6260] warn: lint: 1 issues detected, please rerun with debug enabled for more information
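
Before a generated rule file ever reaches spamassassin --lint, a backend could run a cheap check of its own. The sketch below is an illustration under stated assumptions and not how SA parses rules: sa_unescaped_chars() is a hypothetical helper that reports every @, % or # that has no backslash in front of it. That is probably stricter than Perl itself, since not every @ in a pattern is interpolated.

```php
<?php
// Hypothetical helper for illustration; this is not how SpamAssassin parses rules.
// Returns offset => character for every risky character without a backslash before it.
function sa_unescaped_chars(string $pattern, string $risky = '@%#'): array
{
    $found = array();
    $len = strlen($pattern);
    for ($i = 0; $i < $len; $i++) {
        if ($pattern[$i] === '\\') {
            $i++; // skip the character that this backslash escapes
            continue;
        }
        if (strpos($risky, $pattern[$i]) !== false) {
            $found[$i] = $pattern[$i];
        }
    }
    return $found;
}

print_r(sa_unescaped_chars('foo@baz'));   // the @ at offset 3 is reported
print_r(sa_unescaped_chars('foo\@baz'));  // nothing is reported
```

An empty result only says these three characters are escaped; it is a complement to the lint run, not a replacement for it.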


Re: regex: chars to escape besides @

Posted by Bob Proulx <bo...@proulx.com>.
Reindl Harald wrote:
> schrieb Bob Proulx:
> >Reindl, Please play nice.  Dave was exactly correct and friendly with
> >his response to you.  Liberty, tolerance and respect are not zero sum
> >concepts.  (Stealing the excellent phrase from Judge Robert Hinkle.)
> 
> friendly would have been without the "Umm" and with a link

"Umm..." is a friendly way to say something in English.  I usually
spell it "Uhm...".  It is a sound not a word.  It is a very casual way
to soften the words.  I realize that translations can be hard but
please file that one away as being *soft* and *friendly*.  If someone
says "uhm" or "umm" or other spellings (it is a sound not a word and
so has no exact spelling) then it means they are simply trying to put
something forward *without* making a strong statement of the
imperative about it.  You can read it as "Please excuse me but..."

> the question was "are there additional chars to escape" and not "should i
> write all my admin backends in perl"

The additional issue was that you were referencing PHP documentation.
Might as well have been referencing Lisp documentation for all of the
relevance it had.  That was the point I saw being addressed at that
point.  PHP is a similar syntax that came after Perl but it very much
is its own thing and Perl does not derive from it.

> >As Dave said, SpamAssassin is written in Perl not PHP.  The *Perl*
> >docs are the ones you should reference not the PHP docs.
> >
> >   http://perldoc.perl.org/perlre.html
> 
> that doesn't explain why @ inside a regex rule needs escaping, nor are we at perl

No.  That was in the second reference perldata.html which describes
perl data structures including @arrayvariable constructs.  The above
reference covered the regular expressions being used by SpamAssassin.

> low-level when writing SA rules, otherwise "blacklist_from" would need @
> escaped too but it doesn't - frankly i would even call it a bug

No.  It is not a bug.  It is standard Perl syntax.  And instead of
trying to write all of the documentation again in random emails
and getting half of it wrong and creating new misunderstandings it
makes the most sense simply to reference the perl documentation that
covers the topic.

If there is something that isn't clear in the reference documentation
then it is more than welcome to ask questions about it.  That is the
best place for mailing list discussion.  But at least start with the
docs and then work from there.  Otherwise we will all be doing
duplicate work by having documentation and also having mailing list
messages that say almost the same thing but, being written quickly,
will contain errors and will create misconceptions.

> >The reason @ needs to be escaped is because in Perl "@array" is a
> >string containing an array variable that expands to an array value.
> >To be a literal at sign in the string it needs "\@array" and that is
> >the Perl syntax.
> >
> >   http://perldoc.perl.org/perldata.html
> 
> that also doesn't show an example of \@, and again i call it a bug to expose
> that *low* level to SA rules in an inconsistent way

Feel free to complain about my choice of documentation pointer in the
above.  I admit I didn't do very well with that one.  I was trying to
find the relevant documentation but that was what I found in the time
I had available to respond.  Perl is a popular language and has a
*lot* of documentation available for it.  So much that it can be
difficult to locate specific things sometimes.  That was the best I
could come up with at the moment.

Looking a little closer I see this:

  http://perldoc.perl.org/perlop.html#Regexp-Quote-Like-Operators

It states there that m// and s/// are double-quote like operators and
therefore will expand variables.  Variables are $variable and
@variable and other expansions unless quoted.  Therefore @ is one of
the items that need to be quoted.  Along with $ and %.  And since I am
simply summarizing Perl documentation I may have that wrong.  Which is
why it is better to read the official documentation.

Maybe someone else will come up with a better documentation pointer
for variables expanded inside Perl strings.

Bob

Re: regex: chars to escape besides @

Posted by Reindl Harald <h....@thelounge.net>.
Am 04.01.2015 um 02:38 schrieb Bob Proulx:
> Reindl Harald wrote:
>> schrieb Dave Funk:
>>> Umm, SA is written in Perl, not PHP. So you should look at Perl
>>> regex documentation, not PHP docs
>>
>> so what - @ is not a char that needs escaping in whatever language and
>> hence this is SA specific, and it doesn't matter in what language SA is
>> written if you write a *backend* in PHP - guess what "preg" means even if
>> you are too lazy to click on the link -> "Perl Compatible Regular
>> Expressions (PCRE)"
>>
>> so do me a favor: if you don't have an answer leave me in peace because i am
>> tired of answers with no content at all
>>
>> http://php.net/manual/en/ref.pcre.php
>
> Reindl, Please play nice.  Dave was exactly correct and friendly with
> his response to you.  Liberty, tolerance and respect are not zero sum
> concepts.  (Stealing the excellent phrase from Judge Robert Hinkle.)

friendly would have been without the "Umm" and with a link

the question was "are there additional chars to escape" and not "should 
i write all my admin backends in perl"

> As Dave said, SpamAssassin is written in Perl not PHP.  The *Perl*
> docs are the ones you should reference not the PHP docs.
>
>    http://perldoc.perl.org/perlre.html

that doesn't explain why @ inside a regex rule needs escaping, nor are we
at perl low-level when writing SA rules, otherwise "blacklist_from" would
need @ escaped too but it doesn't - frankly i would even call it a bug

> The reason @ needs to be escaped is because in Perl "@array" is a
> string containing an array variable that expands to an array value.
> To be a literal at sign in the string it needs "\@array" and that is
> the Perl syntax.
>
>    http://perldoc.perl.org/perldata.html

that also doesn't show an example of \@, and again i call it a bug to
expose that *low* level to SA rules in an inconsistent way


Re: regex: chars to escape besides @

Posted by Bob Proulx <bo...@proulx.com>.
Reindl Harald wrote:
> schrieb Dave Funk:
> > Umm, SA is written in Perl, not PHP. So you should look at Perl
> > regex documentation, not PHP docs
> 
> so what - @ is not a char that needs escaping in whatever language and
> hence this is SA specific, and it doesn't matter in what language SA is
> written if you write a *backend* in PHP - guess what "preg" means even if
> you are too lazy to click on the link -> "Perl Compatible Regular
> Expressions (PCRE)"
> 
> so do me a favor: if you don't have an answer leave me in peace because i am
> tired of answers with no content at all
> 
> http://php.net/manual/en/ref.pcre.php

Reindl, Please play nice.  Dave was exactly correct and friendly with
his response to you.  Liberty, tolerance and respect are not zero sum
concepts.  (Stealing the excellent phrase from Judge Robert Hinkle.)

As Dave said, SpamAssassin is written in Perl not PHP.  The *Perl*
docs are the ones you should reference not the PHP docs.

  http://perldoc.perl.org/perlre.html

The reason @ needs to be escaped is because in Perl "@array" is a
string containing an array variable that expands to an array value.
To be a literal at sign in the string it needs "\@array" and that is
the Perl syntax.

  http://perldoc.perl.org/perldata.html

Bob

Re: regex: chars to escape besides @

Posted by Reindl Harald <h....@thelounge.net>.
Am 04.01.2015 um 00:55 schrieb Dave Funk:
> On Sat, 3 Jan 2015, Reindl Harald wrote:
>
>> while writing some custom rules like the one below i found out that @
>> needs to be escaped in addition to what preg_quote() covers:
>> http://php.net/manual/de/function.preg-quote.php
>>
>> are there other chars which need special handling?
>>
>> header    CUST_MANY_SPAM_TO  X-Local-Envelope-To =~
>> /^(\<h\.reindl\@thelounge\.net\>)$/i
>> score     CUST_MANY_SPAM_TO  -4.0
>> describe  CUST_MANY_SPAM_TO  Custom Scoring
>
> Umm, SA is written in Perl, not PHP. So you should look at Perl
> regex documentation, not PHP docs

so what - @ is not a char that needs escaping in whatever language and
hence this is SA specific, and it doesn't matter in what language SA is
written if you write a *backend* in PHP - guess what "preg" means even if
you are too lazy to click on the link -> "Perl Compatible Regular
Expressions (PCRE)"

so do me a favor: if you don't have an answer leave me in peace because i
am tired of answers with no content at all

http://php.net/manual/en/ref.pcre.php



Re: regex: chars to escape besides @

Posted by Dave Funk <db...@engineering.uiowa.edu>.
On Sat, 3 Jan 2015, Reindl Harald wrote:

> while writing some custom rules like the one below i found out that @
> needs to be escaped in addition to what preg_quote() covers:
> http://php.net/manual/de/function.preg-quote.php
>
> are there other chars which need special handling?
>
> header    CUST_MANY_SPAM_TO  X-Local-Envelope-To =~ 
> /^(\<h\.reindl\@thelounge\.net\>)$/i
> score     CUST_MANY_SPAM_TO  -4.0
> describe  CUST_MANY_SPAM_TO  Custom Scoring

Umm, SA is written in Perl, not PHP. So you should look at Perl
regex documentation, not PHP docs.

-- 
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{

Re: regex: chars to escape besides @

Posted by Martin Gregorie <ma...@gregorie.org>.
On Sat, 2015-01-03 at 21:08 +0100, Reindl Harald wrote:
> while writing some custom rules like the one below i found out that @
> needs to be escaped in addition to what preg_quote() covers:
> http://php.net/manual/de/function.preg-quote.php
> 
> are there other chars which need special handling?
> 
> header    CUST_MANY_SPAM_TO  X-Local-Envelope-To =~ 
> /^(\<h\.reindl\@thelounge\.net\>)$/i
> score     CUST_MANY_SPAM_TO  -4.0
> describe  CUST_MANY_SPAM_TO  Custom Scoring
> 
One more, I think: # as in  the HTML character encoding &#959; which
needs to become \&\#959; in a regex if you don't want SA lint to
complain.
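
Whether # needs manual handling can depend on the PHP version the backend runs on: the PHP manual's changelog for preg_quote() says # is quoted as of 7.3.0. A small sketch, using the hypothetical helper preg_quote_gaps(), that asks the running PHP which of the characters from this thread it leaves alone:

```php
<?php
// Hypothetical helper for illustration; not part of PHP or SpamAssassin.
// Returns the characters from $candidates that preg_quote() returns unchanged.
function preg_quote_gaps(string $candidates): array
{
    $gaps = array();
    foreach (str_split($candidates) as $char) {
        if (preg_quote($char) === $char) {
            $gaps[] = $char;
        }
    }
    return $gaps;
}

// Characters discussed in this thread; the result for # depends on the PHP version.
echo implode(' ', preg_quote_gaps('$%@/#|&')) . "\n";
```

Whatever the list shows, the characters in it are the ones a wrapper around preg_quote() has to escape by hand for SA rules.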


Martin