Posted to users@spamassassin.apache.org by Pedro David Marco <pe...@yahoo.com> on 2018/09/17 15:29:48 UTC

Rule for multiple paragraphs

Hi!
Is there any trick to make a rule match across different body paragraphs? Or is the only way via plugins?
Regards,

-----PedroD

Re: Rule for multiple paragraphs

Posted by RW <rw...@googlemail.com>.
On Tue, 18 Sep 2018 08:20:38 +0100
Marisa Clardy wrote:

> Hello,
> 
> How I'd do it is with this regex:
> 
> /^(apache(\r|\n)+)+$/s

Owing to the way the normalized body is stored, this kind of thing won't
work with body rules - that's what the thread is about. It can work with
rawbody rules, but they have problems of their own.

> The 's' flag would be necessary here.

It wouldn't since there's no '.' in the rule.

>  Technically there should always be a single \r\n at the end of the
> line,

That's required during an SMTP transaction, but emails are usually
converted to native format before being stored. Most SpamAssassin rules
operate on derived data stored with native line endings.

Re: Rule for multiple paragraphs

Posted by RW <rw...@googlemail.com>.
On Mon, 17 Sep 2018 18:04:00 +0100
RW wrote:

> If the normalized body were
> stored as single-line paragraphs separated by newlines (perhaps broken
> into large blocks), it would make it possible to write more
> reliable body rules without changing the behaviour of existing rules.

I'll rephrase that as: without much impact on existing rules.

Re: Rule for multiple paragraphs

Posted by RW <rw...@googlemail.com>.
On Mon, 17 Sep 2018 16:33:53 +0000 (UTC)
Pedro David Marco wrote:

> On Monday, September 17, 2018, 6:29:33 PM GMT+2, RW
> <rw...@googlemail.com> wrote:
> > If that actually occurred in the body it would be normalized to
> > apache apache apache
> >
> > If you mean
> >
> > apache
> >
> > apache
> >
> > apache
> >
> > then my understanding is that a body rule would run independently
> > on each instance of 'apache', not on 'apache\napache\napache\n'.
>
> Yes you are right... the question is how to "regex" along different
> paragraphs...

You can sometimes work around it with rawbody rules.

I don't know why body rules work that way. If the normalized body were
stored as single-line paragraphs separated by newlines (perhaps broken
into large blocks), it would make it possible to write more
reliable body rules without changing the behaviour of existing rules.
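
As a rough illustration of the rawbody workaround mentioned above (a sketch
only, not tested against a live SpamAssassin install; the rule names
__REPEATED_APACHES_RAW and REPEATED_APACHES_RAW and the score are made up
for this example), something along these lines might catch four or more
consecutive lines that contain nothing but "apache":

```
# Hypothetical rule names and score, for illustration only.
# /m lets ^ match at the start of every line inside the rawbody text.
# \r? tolerates CRLF line endings in case they were not converted.
rawbody   __REPEATED_APACHES_RAW   /(?:^apache\r?\n){4}/m
meta      REPEATED_APACHES_RAW     __REPEATED_APACHES_RAW
describe  REPEATED_APACHES_RAW     Four or more consecutive lines reading "apache"
score     REPEATED_APACHES_RAW     0.1
```

The usual rawbody caveats apply: the text is decoded but otherwise
unprocessed, so in an HTML part the lines would carry markup and this
pattern would not match, and as far as I know rawbody text is tested in
chunks, so a run of lines that straddles a chunk boundary could be missed.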

Re: Rule for multiple paragraphs

Posted by Pedro David Marco <pe...@yahoo.com>.
 

On Monday, September 17, 2018, 6:29:33 PM GMT+2, RW <rw...@googlemail.com> wrote:
> If that actually occurred in the body it would be normalized to
> apache apache apache
>
> If you mean
>
> apache
>
> apache
>
> apache
>
> then my understanding is that a body rule would run independently on each instance of 'apache', not on 'apache\napache\napache\n'.

Yes you are right... the question is how to "regex" along different paragraphs...   

Re: Rule for multiple paragraphs

Posted by RW <rw...@googlemail.com>.
On Mon, 17 Sep 2018 15:47:20 +0000 (UTC)
Pedro David Marco wrote:

>  >On Monday, September 17, 2018, 5:34:48 PM GMT+2, Antony Stone
>  ><An...@spamassassin.open.source.it> wrote: Give us a bit
>  >more of a clue what you are trying / hoping to do?
> >In what way do you want to identify different paragraphs in an
> >email, and how should the rules be applied differently?  
> 
> Sure, thanks Antony...
> I want to detect multiple consecutive strings separated by a
> carriage return, like this:
> 
> 
> apache
> apache
> apache
>  

If that actually occurred in the body it would be normalized to  

apache apache apache

If you mean 

apache

apache

apache

then my understanding is that a body rule would run independently on
each instance of 'apache', not on  'apache\napache\napache\n'.


Re: Rule for multiple paragraphs

Posted by Pedro David Marco <pe...@yahoo.com>.
 >On Monday, September 17, 2018, 5:34:48 PM GMT+2, Antony Stone <An...@spamassassin.open.source.it> wrote:
 >Give us a bit more of a clue what you are trying / hoping to do?
>In what way do you want to identify different paragraphs in an email, and how should the rules be applied differently?

Sure, thanks Antony...
I want to detect multiple consecutive strings separated by a carriage return, like this:


apache
apache
apache
apache



something like:
body      __REPEATED_APACHES     /^apache$/
tflags    __REPEATED_APACHES     multiple
meta      REPEATED_APACHES       __REPEATED_APACHES > 3
will do the job BUT is not valid, because it will also trigger on a body like this:



apache
apache
Groucho
apache
apache


where apaches are not consecutive...


-----PedroD







  

Re: Rule for multiple paragraphs

Posted by Antony Stone <An...@spamassassin.open.source.it>.
On Monday 17 September 2018 at 17:29:48, Pedro David Marco wrote:

> Hi!
> is there any trick to make a rule work along different body paragraphs?? or
> maybe the only way is via plugins...

Give us a bit more of a clue what you are trying / hoping to do?

In what way do you want to identify different paragraphs in an email, and how 
should the rules be applied differently?

> Regards,
> 
> -----PedroD


Antony.

-- 
Why is "dylexia" so difficult to spell, and why can I never remember "aphasia" 
when I want to?

                                                   Please reply to the list;
                                                         please *don't* CC me.