Posted to users@spamassassin.apache.org by Michael Monnerie <mi...@is.it-management.at> on 2011/12/13 08:34:39 UTC
Re: Using ZMI_GERMAN ruleset
On Monday, 31 October 2011 Axb wrote:
> tried it and dumped due to low hit rate
>
> stuff like
>
> body ZMIde_JOBSEARCH6 /Dank sehr grossen Angagement, aber auch
> der Umsetzung verschiedener Inovationen, konnte unsere Firma schon
> nach vier Jahren auf die internationalen Ebene hinaufsteigen/
>
> is not efficient
It's "efficient" in terms of "filtering only spam with zero false
positives", which is the top priority for this ruleset. And you picked a
very old and very long rule. Most rules nowadays are just one sentence or
even only part of a sentence, and that proves very efficient. Stuff like
the __ZMIde_JOBEARN1-28 rules moves false positives to 0, and I'm
constantly adding stuff.
I've now tried to remove all the old cruft, that is, the single-line rules.
Rule size went from 350KB to 296KB, which should save some RAM and CPU.
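As a rough illustration of how short, partial-sentence rules can still keep false positives low: in SpamAssassin, rules whose names start with a double underscore are unscored sub-rules, and they are typically combined in a meta rule. The Python sketch below mimics that idea. The phrases, the threshold of 2 and the names `SUBRULES` and `meta_hit` are invented for this example and are not taken from the ZMI_GERMAN ruleset.

```python
import re

# Hypothetical stand-ins for short sub-rules.
# These are NOT the real __ZMIde_JOBEARN patterns.
SUBRULES = [
    re.compile(r"von zu Hause aus", re.IGNORECASE),
    re.compile(r"Nebenverdienst", re.IGNORECASE),
    re.compile(r"\d+ Euro pro Woche", re.IGNORECASE),
]


def meta_hit(body: str, threshold: int = 2) -> bool:
    """Fire only when at least `threshold` sub-rules match the body."""
    hits = sum(1 for rx in SUBRULES if rx.search(body))
    return hits >= threshold


spam = "Nebenverdienst gesucht? Arbeiten Sie von zu Hause aus, 500 Euro pro Woche!"
ham = "Ich arbeite morgen von zu Hause aus."

print(meta_hit(spam))  # True: three sub-rules match
print(meta_hit(ham))   # False: only one sub-rule matches
```

A single short phrase can appear in legitimate mail, but requiring several of them together makes an accidental hit much less likely.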
--
With kind regards,
Michael Monnerie, Ing. BSc
it-management Internet Services: Protéger
http://proteger.at [pronounced: Prot-e-schee]
Tel: +43 660 / 415 6531
Re: Using ZMI_GERMAN ruleset
Posted by Michael Monnerie <mi...@is.it-management.at>.
On Tuesday, 13 December 2011 Axb wrote:
> patterns with >120 characters are not really efficient, in terms of
> speed and hit rate. They are very specific to certain campaigns and
> minimal template changes will render them useless as in:
>
> body __ZMIde_STOCK34 /Wir sind .{0,2}berzeugt, dass der
> Zeitpunkt sich an einem Unternehmen zu beteiligen, welches
> erfolgreich im Edelmetallhandel t.{0,2}tig ist, nicht besser sein
> k.{0,2}nnte/
>
> or
>
> body __ZMIde_SALE5 /In den letzten 5 Jahren hatte ich .{0,2}ber
> drei dutzend gut funktionierende Strategien, um die Zahl meiner
> Webseitenbesucher drastisch zu erh.{0,2}hen und dadurch meinen
> Umsatz anzukurbeln/
Since they get hits, there is no need to change them. Once I get a report
about a sentence that has been modified, I apply that change to the rules.
I need feedback for those, of course.
And it should still be fast in terms of CPU: if (for rule __ZMIde_SALE5)
there's no "In den l" in the message, the regex shouldn't have to search
much, right? At least I'd guess it's an optimized search that compares in
64-bit steps, i.e. 8 chars at once?
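The idea behind that guess can be sketched in a few lines of Python. Regex engines commonly look for a fixed literal part of a pattern first and only attempt the full match where that literal occurs; whether Perl really compares in 64-bit steps is not confirmed here. The sketch does the literal pre-check by hand with a plain substring test. The pattern is a shortened form of __ZMIde_SALE5, and `PREFIX`, `PATTERN` and `rule_hits` are names invented for this example.

```python
import re

# Fixed literal start of the (shortened) __ZMIde_SALE5 pattern.
PREFIX = "In den letzten 5 Jahren hatte ich "

PATTERN = re.compile(
    r"In den letzten 5 Jahren hatte ich .{0,2}ber drei dutzend "
    r"gut funktionierende Strategien"
)


def rule_hits(body: str) -> bool:
    """Run the full regex only if the cheap literal check succeeds."""
    if PREFIX not in body:  # plain substring search, no regex involved
        return False
    return PATTERN.search(body) is not None


spam = ("In den letzten 5 Jahren hatte ich über drei dutzend "
        "gut funktionierende Strategien, um die Zahl zu erhöhen")
ham = "Das Treffen findet am Montag statt."

print(rule_hits(spam))  # True
print(rule_hits(ham))   # False: rejected by the literal check alone
```

For a message without the literal, the cost is one substring scan of the body, independent of how long the rest of the pattern is.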
Look at the "Krankenkassa" ruleset, which has been very active these
last weeks. The senders keep modifying their text; I get reports and
modify the rules accordingly.
And not to forget: long sentences mean the chance of a false positive drops.
--
With kind regards,
Michael Monnerie, Ing. BSc
it-management Internet Services: Protéger
http://proteger.at [pronounced: Prot-e-schee]
Tel: +43 660 / 415 6531
Re: Using ZMI_GERMAN ruleset
Posted by Axb <ax...@gmail.com>.
On 2011-12-13 8:34, Michael Monnerie wrote:
> On Monday, 31 October 2011 Axb wrote:
>> tried it and dumped due to low hit rate
>>
>> stuff like
>>
>> body ZMIde_JOBSEARCH6 /Dank sehr grossen Angagement, aber auch
>> der Umsetzung verschiedener Inovationen, konnte unsere Firma schon
>> nach vier Jahren auf die internationalen Ebene hinaufsteigen/
>>
>> is not efficient
>
> It's "efficient" in terms of "filtering only spam with zero false
> positives", which is the top priority for this ruleset. And you picked a
> very old and very long rule. Most rules nowadays are just one sentence or
> even only part of a sentence, and that proves very efficient. Stuff like
> the __ZMIde_JOBEARN1-28 rules moves false positives to 0, and I'm
> constantly adding stuff.
>
> I've now tried to remove all the old cruft, that is, the single-line rules.
> Rule size went from 350KB to 296KB, which should save some RAM and CPU.
>
Patterns with >120 characters are not really efficient in terms of
speed and hit rate. They are very specific to certain campaigns, and
minimal template changes will render them useless, as in:
body __ZMIde_STOCK34 /Wir sind .{0,2}berzeugt, dass der Zeitpunkt
sich an einem Unternehmen zu beteiligen, welches erfolgreich im
Edelmetallhandel t.{0,2}tig ist, nicht besser sein k.{0,2}nnte/
or
body __ZMIde_SALE5 /In den letzten 5 Jahren hatte ich .{0,2}ber drei
dutzend gut funktionierende Strategien, um die Zahl meiner
Webseitenbesucher drastisch zu erh.{0,2}hen und dadurch meinen Umsatz
anzukurbeln/
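To make the point about template changes concrete, here is a small Python sketch using a shortened form of the __ZMIde_STOCK34 pattern. The sample messages are invented for illustration. The `.{0,2}` placeholders absorb different umlaut spellings, but a single swapped word defeats the whole pattern.

```python
import re

# Shortened form of the __ZMIde_STOCK34 pattern quoted above.
STOCK34 = re.compile(
    r"Wir sind .{0,2}berzeugt, dass der Zeitpunkt sich an einem "
    r"Unternehmen zu beteiligen"
)

# Invented sample bodies.
original = ("Wir sind überzeugt, dass der Zeitpunkt sich an einem "
            "Unternehmen zu beteiligen, nicht besser sein könnte")
ascii_variant = ("Wir sind ueberzeugt, dass der Zeitpunkt sich an einem "
                 "Unternehmen zu beteiligen, nicht besser sein koennte")
reworded = ("Wir sind überzeugt, dass der Moment sich an einem "
            "Unternehmen zu beteiligen, nicht besser sein könnte")

for name, body in [("original", original),
                   ("ascii_variant", ascii_variant),
                   ("reworded", reworded)]:
    print(name, bool(STOCK34.search(body)))
# original True
# ascii_variant True
# reworded False
```

The longer the literal run in a pattern, the more single-word edits there are that make it miss, which is the trade-off against the lower false-positive risk.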