Posted to users@spamassassin.apache.org by Pedro David Marco <pe...@yahoo.com> on 2018/12/06 17:52:13 UTC
Understanding header ALL
Hi,
I need some wisdom from the SA monks, please...
Can anyone briefly explain how header ALL works?
If I try a rule like this:
header TESTRULE1 ALL =~ /.+/ism
then in -D debug mode I only "see" the first header of the email. Shouldn't I see all headers?
It works nicely if I check for something slightly more complex, such as:
header TESTRULE2 ALL =~ /From=.*pedro.* To=.*pedro.*/ism
but I am trying to understand how it works, and why I only see one line in debug mode.
Thx,
--------PedroD
Re: Understanding header ALL
Posted by John Hardin <jh...@impsec.org>.
On Thu, 6 Dec 2018, Pedro David Marco wrote:
> Hi,
> i need some wisdom from SA monks please...
> Can anyone explain briefly how header ALL work?
> if i try a rule like this:
> header TESTRULE1 ALL =~ /.+/ism
> Using -D debug mode i only "see" the first header of the email... shouldn't i see all headers?
>
> it works nice if i check for something slightly more complex, such as....
> header TESTRULE2 ALL =~ /From=.*pedro.* To=.*pedro.*/ism
> but i am trying to understand how it works... and why i only see one line in Debug mode...
> Thx,
"." apparently doesn't match line breaks (I'm sure that's documented
somewhere in the RE language spec but I can't be bothered to dig it up
right now :) ).
There are two ways to do this:
# All headers, one per hit
header __ALL_HEADERS ALL =~ /.+/sm
tflags __ALL_HEADERS multiple
# All headers together in one hit
header __ALL_HEADERS_ALL ALL =~ /(?:.+$)+/sm
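As a rough illustration of how "." interacts with line breaks, here is a small sketch using Python's re module rather than Perl. The "s" and "m" flags behave the same way for this case, but the header text is made up, and this is not how SpamAssassin itself evaluates rules.

```python
import re

# Made-up header block standing in for what an ALL rule would see.
headers = "Received: from mx.example.org\nFrom: pedro@example.org\nTo: pedro@example.com\n"

# Without the "s" flag, "." stops at the first newline.
first_line = re.search(r".+", headers, re.M | re.I).group(0)

# With the "s" flag (re.S in Python), "." also matches "\n",
# so a greedy ".+" swallows the whole block in one match.
whole_block = re.search(r".+", headers, re.S | re.M | re.I).group(0)

print(repr(first_line))
print(whole_block == headers)
```

In other words, whether ".+" yields one line or the whole block depends on the "s" flag, not on "m".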
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
USMC Rules of Gunfighting #6: If you can choose what to bring
to a gunfight, bring a long gun and a friend with a long gun.
-----------------------------------------------------------------------
Tomorrow: The 77th anniversary of Pearl Harbor
Re: Understanding header ALL
Posted by Benny Pedersen <me...@junc.eu>.
Pedro David Marco skrev den 2018-12-06 22:29:
> if your rule worked, it would only match FROM or TO... the great
> advantage of the ALL is that i "sees" all headers in one string so we
> can match FROM 'and' TO at the same time
I know from my own rules that it is sometimes a good thing to limit the
data to what is wanted :=)
All headers together could be up to 64 KB, if I remember the SMTP specs
well, so I would check this to see if I am missing something. I am too
tired to test it today. My own interest is in starting to write SMTP test
rules for SpamAssassin, inspired by rspamd. I will not disclose them,
since spammers might be listening here; I stand behind open source, but
I will not help spammers game the system.
I used rspamd but lost interest in it: it was too complicated for me to
manage, and UCL was and is not well supported on Linux. Why did they not
use XML, where there are plenty of tools to build, edit, and manage it?
Thanks to SpamAssassin, it is not that complicated to get things working.
Re: Understanding header ALL
Posted by Henrik K <he...@hege.li>.
Why do you need to match them at the same time? Using meta would be more
effective. Also, this doesn't care what order From and To appear in the headers.
header __FOO1 From =~ /pedro/i
header __FOO2 To =~ /pedro/i
meta FOO __FOO1 && __FOO2
But just to put it out there, if you want to capture something from one
header and match it in another, the proper way would be with positive
lookaheads; header order won't matter then. This captures pedro from From:
and requires the same text in To:.
header FOO ALL =~ /^(?=.*?\nFrom:[^\n]*(pedro))(?=.*?\nTo:[^\n]*\1)/si
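To see that the lookahead form really is independent of header order, here is a small sketch that runs the same pattern in Python's re module (Perl's /si written as re.S | re.I). The header blocks are made up for the example, and this only exercises the regex, not SpamAssassin.

```python
import re

# The pattern from the rule above, with Perl's /si flags as re.S | re.I.
pattern = re.compile(r"^(?=.*?\nFrom:[^\n]*(pedro))(?=.*?\nTo:[^\n]*\1)", re.S | re.I)

# Made-up header blocks; To: comes before From: in the first one.
to_first = "Received: by mx.example.org\nTo: pedro@example.com\nFrom: pedro@example.org\n"
from_first = "Received: by mx.example.org\nFrom: pedro@example.org\nTo: pedro@example.com\n"
other_rcpt = "Received: by mx.example.org\nFrom: pedro@example.org\nTo: alice@example.com\n"

# Both lookaheads start from the beginning of the block, so order is irrelevant.
results = [bool(pattern.search(h)) for h in (to_first, from_first, other_rcpt)]
print(results)
```

The first two blocks match and the third does not. Note that because the pattern looks for "\nFrom:" and "\nTo:", neither header can be the very first line of the string being matched.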
On Thu, Dec 06, 2018 at 09:29:40PM +0000, Pedro David Marco wrote:
> Thanks Benny,
>
> if your rule worked, it would only match FROM or TO... the great advantage of
> the ALL is that i "sees" all headers in one string so we can match FROM 'and'
> TO at the same time
>
> ------
> PedroD
>
>
>
> On Thursday, December 6, 2018, 10:23:17 PM GMT+1, Benny Pedersen <me...@junc.eu>
> wrote:
>
>
> Pedro David Marco skrev den 2018-12-06 21:25:
>
>
> > header TESTRULE2 ALL =~ /From=.*pedro.*
> > To=.*pedro.*/ism
> > This is a mistery... :-?
>
>
> header TESTRULE (From|To) =~ /\.*pedro\.*/ism
>
> dont know if it works, just my silly thinking right now
>
Re: Understanding header ALL
Posted by Pedro David Marco <pe...@yahoo.com>.
Thanks Benny,
If your rule worked, it would only match FROM or TO. The great advantage of ALL is that it "sees" all headers in one string, so we can match FROM 'and' TO at the same time.
------PedroD
On Thursday, December 6, 2018, 10:23:17 PM GMT+1, Benny Pedersen <me...@junc.eu> wrote:
Pedro David Marco skrev den 2018-12-06 21:25:
> header TESTRULE2 ALL =~ /From=.*pedro.*
> To=.*pedro.*/ism
> This is a mistery... :-?
header TESTRULE (From|To) =~ /\.*pedro\.*/ism
dont know if it works, just my silly thinking right now
Re: Understanding header ALL
Posted by Benny Pedersen <me...@junc.eu>.
Pedro David Marco skrev den 2018-12-06 21:25:
> header TESTRULE2 ALL =~ /From=.*pedro.*
> To=.*pedro.*/ism
> This is a mistery... :-?
header TESTRULE (From|To) =~ /\.*pedro\.*/ism
I don't know if it works; just my silly thinking right now.
Re: Understanding header ALL
Posted by John Hardin <jh...@impsec.org>.
On Fri, 7 Dec 2018, Bill Cole wrote:
> This is entirely a debug message artifact. In fact, '/.+/' will match the
> entire header block, however the 'dbg()' function won't print all of that,
> apparently due to an expansion artifact in Mail::SpamAssassin::Logger
Aha! Thanks for explaining that!
Re: Understanding header ALL
Posted by Pedro David Marco <pe...@yahoo.com>.
$BillCole++ ; # :-)
Thanks Bill, that was my concern and what I suspected...
----------Pedro.D
On Saturday, December 8, 2018, 3:59:12 AM GMT+1, Bill Cole <sa...@billmail.scconsult.com> wrote:
On 6 Dec 2018, at 15:25, Pedro David Marco wrote:
> Thanks Bill and John...
> Your words make sense to me. It seems that ALL means that SA puts all
> headers into a Perl string (including \n chars) and tries the regex...
> As John Hardin correctly states, a dot does not match the \n but
> this is changed with the "s" regex flag.
> In fact it works like a charm if i try a rule like this:
> header TESTRULE2 ALL =~
> /From=.*pedro.* To=.*pedro.*/ism
> This is a mistery... :-?
No mystery: misunderstanding. I thought you were expecting multiple
hits, but now I realize that you are just asking about the debug
message.
This is entirely a debug message artifact. In fact, '/.+/' will match
the entire header block, however the 'dbg()' function won't print all of
that, apparently due to an expansion artifact in
Mail::SpamAssassin::Logger
--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Available For Hire: https://linkedin.com/in/billcole
Re: Understanding header ALL
Posted by Bill Cole <sa...@billmail.scconsult.com>.
On 6 Dec 2018, at 15:25, Pedro David Marco wrote:
> Thanks Bill and John...
> Your words make sense to me. It seems that ALL means that SA puts all
> headers into a Perl string (including \n chars) and tries the regex...
> As John Hardin correctly states, a dot does not match the \n but
> this is changed with the "s" regex flag.
> In fact it works like a charm if i try a rule like this:
> header TESTRULE2 ALL =~
> /From=.*pedro.* To=.*pedro.*/ism
> This is a mistery... :-?
No mystery: misunderstanding. I thought you were expecting multiple
hits, but now I realize that you are just asking about the debug
message.
This is entirely a debug message artifact. In fact, '/.+/' will match
the entire header block, however the 'dbg()' function won't print all of
that, apparently due to an expansion artifact in
Mail::SpamAssassin::Logger
Re: Understanding header ALL
Posted by Pedro David Marco <pe...@yahoo.com>.
Thanks Bill and John...
Your words make sense to me. It seems that ALL means that SA puts all headers into a Perl string (including \n chars) and tries the regex...
As John Hardin correctly states, a dot does not match \n, but this is changed by the "s" regex flag.
In fact, it works like a charm if I try a rule like this:
header TESTRULE2 ALL =~ /From=.*pedro.* To=.*pedro.*/ism
This is a mystery... :-?
Thanks to all...
---PedroD
On Thursday, December 6, 2018, 8:32:46 PM GMT+1, Bill Cole <sa...@billmail.scconsult.com> wrote:
On 6 Dec 2018, at 13:36, Pedro David Marco wrote:
> Thanks a lot Bill..
> i already considered the "multiple" flag and it did not work
> either... i mean... the rule works but i only see the first line
> in Debug mode...
> ----Pedrod
Having pondered this for a bit and looked at unhelpful docs, I *think* I
understand what's going on.
You cannot get multiple hits from an ALL rule because the regex is
matched against the whole block of headers. Once it matches, the test is
done.
It might make sense to add an "ANY" pseudo-header that tests against
each header, rather than "ALL" which tests against the whole text of all
the headers.
Re: Understanding header ALL
Posted by RW <rw...@googlemail.com>.
On Fri, 07 Dec 2018 09:14:11 -0500
Bill Cole wrote:
> On 7 Dec 2018, at 8:33, RW wrote:
>
> > On Thu, 06 Dec 2018 14:32:37 -0500
> > Bill Cole wrote:
> >
> >> You cannot get multiple hits from an ALL rule because the regex is
> >> matched against the whole block of headers. Once it matches, the
> >> test is done.
> >
> > Just for the record, that isn't a limitation of "multiple"
>
> Right. It's inherent in the logic of the "ALL" pseudo-header: an
> aggregate of all headers, not an array of discrete headers.
I wouldn't expect it to make any difference. In the body each paragraph
is a separate string (unfortunately), and hit counts are aggregated
across all of them.
Re: Understanding header ALL
Posted by Bill Cole <sa...@billmail.scconsult.com>.
On 7 Dec 2018, at 8:33, RW wrote:
> On Thu, 06 Dec 2018 14:32:37 -0500
> Bill Cole wrote:
>
>> You cannot get multiple hits from an ALL rule because the regex is
>> matched against the whole block of headers. Once it matches, the test
>> is done.
>
> Just for the record, that isn't a limitation of "multiple"
Right. It's inherent in the logic of the "ALL" pseudo-header: an
aggregate of all headers, not an array of discrete headers.
Re: Understanding header ALL
Posted by RW <rw...@googlemail.com>.
On Thu, 06 Dec 2018 14:32:37 -0500
Bill Cole wrote:
> You cannot get multiple hits from an ALL rule because the regex is
> matched against the whole block of headers. Once it matches, the test
> is done.
Just for the record, that isn't a limitation of "multiple"
header T_TEST1 Subject =~ /\w+/
tflags T_TEST1 multiple
$ echo "Subject: Mary had a little lamb" | spamassassin -D 2>&1 | grep -o 'T_TEST1.*'
T_TEST1 ======> got hit: "Mary"
T_TEST1 ======> got hit: "had"
T_TEST1 ======> got hit: "a"
T_TEST1 ======> got hit: "little"
T_TEST1 ======> got hit: "lamb"
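The behaviour in that debug output is conceptually similar to scanning one string repeatedly and resuming after each match. The sketch below mimics that in Python; the hits() helper and its cap are hypothetical stand-ins, with the cap playing the role of the 'maxhits' option mentioned elsewhere in this thread, and none of this is SpamAssassin's actual code.

```python
import re

subject = "Mary had a little lamb"

# Hypothetical helper: scan the same string repeatedly, resuming after
# each match, and stop once an optional cap (like maxhits) is reached.
def hits(pattern, text, maxhits=None):
    found = []
    for m in re.finditer(pattern, text):
        found.append(m.group(0))
        if maxhits is not None and len(found) >= maxhits:
            break
    return found

print(hits(r"\w+", subject))
print(hits(r"\w+", subject, maxhits=3))
```

Each hit consumes only one word, so there is always more of the string left to match until the end is reached or the cap stops the scan.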
Re: Understanding header ALL
Posted by Bill Cole <sa...@billmail.scconsult.com>.
On 6 Dec 2018, at 13:36, Pedro David Marco wrote:
> Thanks a lot Bill..
> i already considered the "multiple" flag and it did not work
> either... i mean... the rule works but i only see the first line
> in Debug mode...
> ----Pedrod
Having pondered this for a bit and looked at unhelpful docs, I *think* I
understand what's going on.
You cannot get multiple hits from an ALL rule because the regex is
matched against the whole block of headers. Once it matches, the test is
done.
It might make sense to add an "ANY" pseudo-header that tests against
each header, rather than "ALL" which tests against the whole text of all
the headers.
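A minimal sketch of the distinction, in Python and with a made-up header block: the "ALL"-style check runs the regex over one big string, while the "ANY"-style check (purely hypothetical, as it is only a suggestion here) runs it over each header line. It assumes unfolded, one-line headers and a greedy /.+/s pattern.

```python
import re

# Made-up header block; in an ALL rule the regex sees it as one string.
block = "From: pedro@example.org\nTo: pedro@example.com\nSubject: hello\n"
pattern = re.compile(r".+", re.S)

# "ALL"-style: matching runs over the whole block. The greedy match
# consumes everything, so there is nothing left for a second hit.
all_hits = [m.group(0) for m in pattern.finditer(block)]

# Hypothetical "ANY"-style: test each header line separately.
any_hits = [line for line in block.splitlines() if pattern.search(line)]

print(len(all_hits), len(any_hits))
```

With this particular pattern the whole-block scan yields a single hit and the per-header scan yields three; a pattern that consumes less per match would behave differently.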
Re: Understanding header ALL
Posted by Pedro David Marco <pe...@yahoo.com>.
Thanks a lot Bill..
I already considered the "multiple" flag and it did not work either. I mean, the rule works, but I only see the first line in debug mode...
----Pedrod
On Thursday, December 6, 2018, 7:21:46 PM GMT+1, Bill Cole <sa...@billmail.scconsult.com> wrote:
On 6 Dec 2018, at 12:52, Pedro David Marco wrote:
> Hi,
> i need some wisdom from SA monks please...
> Can anyone explain briefly how header ALL work?
> if i try a rule like this:
> header TESTRULE1 ALL =~ /.+/ism
> Using -D debug mode i only "see" the first header of the email...
> shouldn't i see all headers?
>
> it works nice if i check for something slightly more complex, such
> as....
> header TESTRULE2 ALL =~
> /From=.*pedro.* To=.*pedro.*/ism
> but i am trying to understand how it works... and why i only see one
> line in Debug mode...
> Thx,
> --------PedroD
For a rule to match more than once per message, it needs to have the
'multiple' tflag set, e.g.:
tflags TESTRULE1 multiple maxhits=50
(It's generally wise to set *some* 'maxhits' value on a 'multiple' rule,
since it can save you from runaway scanning of pathological messages.)
Re: Understanding header ALL
Posted by Bill Cole <sa...@billmail.scconsult.com>.
On 6 Dec 2018, at 12:52, Pedro David Marco wrote:
> Hi,
> i need some wisdom from SA monks please...
> Can anyone explain briefly how header ALL work?
> if i try a rule like this:
> header TESTRULE1 ALL =~ /.+/ism
> Using -D debug mode i only "see" the first header of the email...
> shouldn't i see all headers?
>
> it works nice if i check for something slightly more complex, such
> as....
> header TESTRULE2 ALL =~
> /From=.*pedro.* To=.*pedro.*/ism
> but i am trying to understand how it works... and why i only see one
> line in Debug mode...
> Thx,
> --------PedroD
For a rule to match more than once per message, it needs to have the
'multiple' tflag set, e.g.:
tflags TESTRULE1 multiple maxhits=50
(It's generally wise to set *some* 'maxhits' value on a 'multiple' rule,
since it can save you from runaway scanning of pathological messages.)