Posted to users@spamassassin.apache.org by Pedro David Marco <pe...@yahoo.com> on 2017/10/21 07:45:43 UTC

Preventing duplicated matches

Hi everybody...
Is there any way to avoid duplicate matches when tflags is set to "multiple"?
Thanks!
---Pedro

Re: Preventing duplicated matches

Posted by John Hardin <jh...@impsec.org>.
On Mon, 23 Oct 2017, Pedro David Marco wrote:

>> Can you provide a concrete example of *why* you would want to set "tflags 
>> multiple" in the first place if you do not want duplicate/multiple
>> matches for that rule?
>
> Actually multiple counts the sum of matches in text and html parts. So a 
> value of 2 (for example) means either that a match is found in text and 
> in html parts or that the rule matched twice in either of them...

Ah, yes, indeed, I was aware of that...

> a  "multiple_uniq"  flag is not the perfect solution but may help a lot!

The problem it solves is different.

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Your mouse has moved. Your Windows Operating System must be
   relicensed due to this hardware change. Please contact Microsoft
   to obtain a new activation key. If this hardware change results in
   added functionality you may be subject to additional license fees.
   Your system will now shut down. Thank you for choosing Microsoft.
-----------------------------------------------------------------------
  207 days since the first commercial re-flight of an orbital booster (SpaceX)

Re: Preventing duplicated matches

Posted by Pedro David Marco <pe...@yahoo.com>.
>Can you provide a concrete example of *why* you would want to set "tflags 
>multiple" in the first place if you do not want duplicate/multiple 
>matches for that rule?
Actually, "multiple" counts the sum of matches in the text and HTML parts. So a
value of 2 (for example) means either that a match was found in both the text
and HTML parts, or that the rule matched twice in one of them...
A "multiple_uniq" flag is not the perfect solution, but it may help a lot!

-------Pedro

Re: Preventing duplicated matches

Posted by John Hardin <jh...@impsec.org>.
On Sat, 21 Oct 2017, Kevin A. McGrail wrote:

> Neat idea.

Potentially...

> On October 21, 2017 8:37:41 AM EDT, RW <rw...@googlemail.com> wrote:
>> On Sat, 21 Oct 2017 13:09:24 +0200
>> Matus UHLAR - fantomas wrote:
>>
>>> On 21.10.17 07:45, Pedro David Marco wrote:
>>>> is there any way to avoid duplicated matches when tflag is set to
>>>> "multiple"?

Can you provide a concrete example of *why* you would want to set "tflags 
multiple" in the first place if you do not want duplicate/multiple 
matches for that rule?

The only plausible scenario I can see for this is to override the behavior 
of base rules having "tflags multiple" set. That's done for a reason, and 
turning off the multiple-hits behavior will break those rules.

You may be able to achieve that by setting "tflags maxhits=1" for that 
rule...

>>> that's the whole point of multiple. you can limit it to some number by 
>>> "maxhits" option.

...as suggested by Matus.

If you don't want the score to multiply, then put the "tflags multiple" on 
a __subrule, and score a meta that checks that the number of hits exceeds 
some minimum. There are rules like that in the base ruleset and the 
documentation if you want examples of how to do it.
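
A minimal sketch of the subrule-plus-meta approach described above. The rule names, the pattern, the threshold, and the score are hypothetical and chosen only for illustration; they are not taken from the base ruleset.

```
# Hypothetical subrule: the double-underscore prefix means it carries no score itself.
body      __EXAMPLE_PILLS      /\bpills\b/i
# Count repeated hits, but stop at 4, since the meta below only asks "more than 3".
tflags    __EXAMPLE_PILLS      multiple maxhits=4

# Scored meta: fires once when the subrule has hit more than 3 times.
meta      EXAMPLE_MANY_PILLS   __EXAMPLE_PILLS > 3
describe  EXAMPLE_MANY_PILLS   Body mentions "pills" more than 3 times
score     EXAMPLE_MANY_PILLS   0.5
```

Because the score is attached to the meta rather than to the subrule, it should be applied once no matter how many times the subrule hits. Running "spamassassin --lint" after adding rules like these is a reasonable way to check the syntax.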

>> I think that the question was about counting the distinct strings, so
>> if you got the matches:
>>
>>  "Spam", "Sausage", "Spam", "Egg", "Spam"
>>
>> it would count as 3 rather than 5.
>>
>> This would be useful, but AFAIK it's not possible.

I *think* I understand that - one hit counted for "Spam", one for 
"Sausage" and one for "Egg"? Right now it doesn't capture the matches; 
that would need to change. As KAM said, open a Bugzilla new feature 
request, ideally with a useful example.

Re: Preventing duplicated matches

Posted by "Kevin A. McGrail" <ke...@mcgrail.com>.
Neat idea. I would open a Bugzilla ticket with any thoughts on a patch, along with a real-world sample that shows how it might help and an explanation of the FN and FP reasons.
Regards,
KAM

On October 21, 2017 8:37:41 AM EDT, RW <rw...@googlemail.com> wrote:
>On Sat, 21 Oct 2017 13:09:24 +0200
>Matus UHLAR - fantomas wrote:
>
>> On 21.10.17 07:45, Pedro David Marco wrote:
>> >is there any way to avoid duplicated matches when tflag is set to
>> >"multiple"?  
>> 
>> that's the whole point of multiple. you can limit it to some number by
>> "maxhits" option.
>
>I think that the question was about counting the distinct strings, so
>if you got the matches:
>
>  "Spam", "Sausage", "Spam", "Egg", "Spam"
>
>it would count as 3 rather than 5.
>
>This would be useful, but AFAIK it's not possible. 

Re: Preventing duplicated matches

Posted by RW <rw...@googlemail.com>.
On Sat, 21 Oct 2017 13:09:24 +0200
Matus UHLAR - fantomas wrote:

> On 21.10.17 07:45, Pedro David Marco wrote:
> >is there any way to avoid duplicated matches when tflag is set to
> >"multiple"?  
> 
> that's the whole point of multiple. you can limit it to some number by
> "maxhits" option.

I think that the question was about counting the distinct strings, so
if you got the matches:

  "Spam", "Sausage", "Spam", "Egg", "Spam"

it would count as 3 rather than 5.

This would be useful, but AFAIK it's not possible. 
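
One possible workaround, not proposed in the thread and only applicable when the strings of interest can be listed in advance, is to give each string its own subrule without "tflags multiple" and add the subrules up in a meta. The rule names, the threshold, and the score below are hypothetical.

```
# One subrule per distinct string. Without "tflags multiple", each subrule
# counts at most once, however often its string repeats.
body      __EXAMPLE_SPAM          /\bspam\b/i
body      __EXAMPLE_SAUSAGE       /\bsausage\b/i
body      __EXAMPLE_EGG           /\begg\b/i

# The sum is the number of listed strings that appear at least once.
meta      EXAMPLE_DISTINCT_WORDS  (__EXAMPLE_SPAM + __EXAMPLE_SAUSAGE + __EXAMPLE_EGG) >= 3
describe  EXAMPLE_DISTINCT_WORDS  At least 3 distinct listed words in the body
score     EXAMPLE_DISTINCT_WORDS  0.5
```

For the example matches above, the sum would be 3 rather than 5. This does not cover the general case of counting distinct matches of a single open-ended pattern, which is what a new feature would have to address.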

Re: Preventing duplicated matches

Posted by Matus UHLAR - fantomas <uh...@fantomas.sk>.
On 21.10.17 07:45, Pedro David Marco wrote:
>is there any way to avoid duplicated matches when tflag is set to "multiple"?

That's the whole point of "multiple". You can limit it to some number with
the "maxhits" option.
-- 
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety. -- Benjamin Franklin, 1759