Posted to users@spamassassin.apache.org by Dave Funk <db...@engineering.uiowa.edu> on 2021/05/03 15:18:51 UTC

Counting number of instances of a particular header

I'm trying to create a rule to count the number of instances of a particular 
header.
That is, an email message could contain zero or more instances of a particular 
header, and I want to know how many there are so I can use that count in a meta 
rule to detect a spam sign.

I first crafted a rule:
header L_MY_HEADER   X-My-Header !~ /^UNSET$/ [if-unset: UNSET]
describe L_MY_HEADER Has X-My-Header
score L_MY_HEADER    0.1

Which did correctly detect the existence of 'X-My-Header'. Then to count the 
number of them I added a 'tflags':
tflags L_MY_HEADER  multiple maxhits=10

But that would always fire 10 times if there were any instances of 'X-My-Header' 
(even if there was only one).

So I modified the pattern match part of the rule:
header L_MY_HEADER  X-My-Header =~ /./

Which had the same effect as the first form (i.e. either zero or 10 firings).

As the header would have at least 6 characters but fewer than 150, I then tried:
header L_MY_HEADER  X-My-Header =~ /^.{5,200}/

Which would fire only once, even if there were 5 or more instances of the 
header.

What am I doing wrong? How should I craft a rule to count the number of 
instances of that header?

Thanks,
Dave

-- 
Dave Funk                               University of Iowa
<dbfunk (at) engineering.uiowa.edu>     College of Engineering
319/335-5751   FAX: 319/384-0549        1256 Seamans Center, 103 S Capitol St.
Sys_admin/Postmaster/cell_admin         Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{

Re: Counting number of instances of a particular header

Posted by RW <rw...@googlemail.com>.
On Mon, 3 May 2021 10:18:51 -0500 (CDT)
Dave Funk wrote:

> I'm trying to create a rule to count the number of instances of a
> particular header.
...
> What am I doing wrong? How should I craft a rule to count the number
> of instances of that header?

It's important to understand that when headers are repeated, the match
runs against a single string with multiple lines, not multiple strings. 

So

  header L_MY_HEADER  X-My-Header =~ /^.{5,200}/

can only match once because, by default, ^ matches the beginning of
the string. 

For header tests involving multiple lines, the /m and /s modifiers
can be useful, but are often not essential as you can test for newline
characters instead. 
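
To illustrate the point above, here is a minimal sketch using Python's re
module rather than SpamAssassin itself. It assumes the repeated header values
reach the regex as one string joined with newlines (no trailing newline); the
sample values and the names `values` and `joined` are made up for illustration.
For this case Python's ^ and re.MULTILINE behave like Perl's ^ and /m.

```python
import re

# Hypothetical values of three X-My-Header instances, joined the way the
# thread describes: a single string with one line per header.
values = ["first value", "second value", "third value"]
joined = "\n".join(values)

# Without the multiline flag, ^ anchors only at the start of the string,
# so the pattern can match at most once.
anchored_once = re.findall(r"^.{5,200}", joined)

# With re.MULTILINE (Perl's /m), ^ also matches right after each newline,
# so there is one match per header line.
anchored_per_line = re.findall(r"^.{5,200}", joined, flags=re.MULTILINE)

print(len(anchored_once))      # 1
print(len(anchored_per_line))  # 3
```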

You only need tflags multiple if you need a numerical value for
meta-rules. A test for N or more headers can be written as a single
header rule.

e.g. to test for two or more headers (if empty headers aren't a
concern) you can simply use:

header L_MULTIPLE_MY_HEADER X-My-Header =~ /\n./
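
A quick check of that pattern, again sketched with Python's re module and
made-up header values. The helper name has_two_or_more is hypothetical, and
joining the values with newlines is an assumption about how they reach the
regex.

```python
import re

def has_two_or_more(header_values):
    """Return True if a newline followed by any character appears, i.e.
    the joined header string has a second, non-empty line."""
    joined = "\n".join(header_values)
    return re.search(r"\n.", joined) is not None

print(has_two_or_more(["only one"]))    # False
print(has_two_or_more(["one", "two"]))  # True
print(has_two_or_more(["one", ""]))     # False: the second header is empty
```

The last case is the "empty headers" caveat: an empty second value leaves
nothing after the newline for the dot to match.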

Re: Counting number of instances of a particular header

Posted by Henrik K <he...@hege.li>.
https://cwiki.apache.org/confluence/display/SPAMASSASSIN/WritingRulesAdvanced

You need the /m modifier. The matched string is all of the header values
separated by newlines, so you want to match at the start of every line.

header L_MY_HEADER  X-My-Header =~ /^/m
tflags L_MY_HEADER multiple
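
As a rough model of what 'tflags multiple' counts, the sketch below uses
Python's re module to count successive matches of a pattern against the joined
header string, capped at a maxhits value. count_hits is a hypothetical helper,
not SpamAssassin code, and it assumes the values are joined with newlines with
no trailing newline.

```python
import re
from itertools import islice

def count_hits(pattern, joined, flags=0, maxhits=10):
    """Count successive matches of pattern in joined, stopping at maxhits."""
    matches = re.finditer(pattern, joined, flags)
    return sum(1 for _ in islice(matches, maxhits))

# Three made-up header values, one per line.
joined = "\n".join(["alpha-0001", "bravo-0002", "charlie-0003"])

print(count_hits(r".", joined))                # 10: one hit per character, capped
print(count_hits(r".", "alpha-0001"))          # 10: a single header also reaches the cap
print(count_hits(r"^", joined, re.MULTILINE))  # 3: one hit per line start
```

This is consistent with the behaviour reported in the original message: /./
hits once per character, so any header value of 10 or more characters reaches
maxhits=10. The per-line count is the number a meta rule would then compare
against, for example a hypothetical `meta L_MANY_MY_HEADER (L_MY_HEADER >= 3)`.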



Re: Counting number of instances of a particular header

Posted by RW <rw...@googlemail.com>.
On Mon, 03 May 2021 13:17:59 -0400
Bill Cole wrote:

> On 3 May 2021, at 11:18, Dave Funk wrote:
> >
> > I first crafted a rule:
> > header L_MY_HEADER   X-My-Header !~ /^UNSET$/ [if-unset: UNSET]
> >
> > But that would always fire 10 times if there were any instances of
> > 'X-My-Header' (even if there was only one).
>
> I guess that's an artifact of combining the 'if-unset' functionality
> with 'tflags multiple' or possibly the negative match test or both.


Probably the combination of !~ with tflags multiple.


> I'm not sure that it is exactly a bug, because I can't say how SA
> "should" deal with that combination of syntax.

It could be seen as a bug since without 'maxhits' it would
presumably have looped until timeout. 
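
The looping hypothesis above can be modelled with a toy loop. This is only a
Python sketch of one way a repeated negative match could behave, not
SpamAssassin's actual implementation: count_negative_hits is a hypothetical
helper that records a hit for as long as the pattern fails to match, and stops
only when maxhits is reached.

```python
import re

def count_negative_hits(pattern, text, maxhits):
    """Toy model: a 'does not match' test never consumes input, so once it
    is true it stays true, and only maxhits ends the loop."""
    hits = 0
    while re.search(pattern, text) is None:
        hits += 1
        if hits >= maxhits:
            break
    return hits

# Header present (any value other than the if-unset placeholder).
print(count_negative_hits(r"^UNSET$", "some header value", maxhits=10))  # 10
# Header absent: the if-unset placeholder matches, so the test never hits.
print(count_negative_hits(r"^UNSET$", "UNSET", maxhits=10))              # 0
```

Under this model the result is either zero or maxhits, which matches what was
observed with the original rule.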

Re: Counting number of instances of a particular header

Posted by Bill Cole <sa...@billmail.scconsult.com>.
On 3 May 2021, at 11:18, Dave Funk wrote:

> I'm trying to create a rule to count the number of instances of a 
> particular header.
> IE in email messages there could be zero or more instances of a 
> particular header and I want to know how many there are so I can use 
> that info in a meta to detect a spam sign.
>
> I first crafted a rule:
> header L_MY_HEADER   X-My-Header !~ /^UNSET$/ [if-unset: UNSET]

????
That's a deeply weird rule.

Try just this:

header L_MY_HEADER   X-My-Header =~ /^./m
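
One difference between /^./m and /^/m shows up with empty header values. A
small sketch with Python's re module, using made-up values joined with
newlines; whether SpamAssassin passes empty values through in exactly this
form is an assumption.

```python
import re

# Three hypothetical header instances; the middle one has an empty value.
joined = "\n".join(["first", "", "third"])

line_starts = len(re.findall(r"^", joined, flags=re.MULTILINE))
non_empty_lines = len(re.findall(r"^.", joined, flags=re.MULTILINE))

print(line_starts)      # 3: every line start counts, empty or not
print(non_empty_lines)  # 2: ^. needs at least one character on the line
```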


> describe L_MY_HEADER has X-My_header
> score L_MY_HEADER    0.1
>
> Which did correctly detect the existence of 'X-My-Header'. Then to 
> count the number of them I added a 'tflags':
> tflags L_MY_HEADER  multiple maxhits=10
>
> But that would always fire 10 times if there were any instances of 
> 'X-My-Header' (even if there was only one).

I guess that's an artifact of combining the 'if-unset' functionality 
with 'tflags multiple' or possibly the negative match test or both. I'm 
not sure that it is exactly a bug, because I can't say how SA "should" 
deal with that combination of syntax.





-- 
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire