Posted to users@spamassassin.apache.org by Loren Wilton <lw...@earthlink.net> on 2004/02/11 12:22:00 UTC

Score generation for repeated patterns

A number of rules seem to have the general form (blah){5,25}, and if that
gets a hit it is scored with some fixed value, say 1.0.  Then there is
(blah){25,30} scored at 2 points, etc.  The intent is clearly to have a
minimum threshold and then an increasing score for an increasing number of
repetitions.

Now, my question is, can a rule be written in such a way that it can
self-generate a variable score?  I know diddly about regexps and Perl, but
reading the docs last night it appears that you can embed Perl code inside a
regexp and do things like count the number of hits on a repetition clause.
So at least in theory, in Perl you could count the repetitions and do
something with the count.
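
To make the counting idea concrete, here is a minimal sketch in Python
(not Perl, and not anything SA itself provides).  The function name
count_repetitions and the sample text are made up for illustration; it just
finds the longest run of back-to-back copies of a literal string.

```python
import re


def count_repetitions(text, unit):
    """Return the longest run of back-to-back copies of `unit` in `text`."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    # One or more consecutive copies of the literal unit.
    run = re.compile("(?:" + re.escape(unit) + ")+")
    longest = 0
    for match in run.finditer(text):
        # Each match is a whole number of copies, so integer division is exact.
        longest = max(longest, len(match.group(0)) // len(unit))
    return longest


if __name__ == "__main__":
    sample = "xx" + "blah" * 7 + "yy blahblah"
    print(count_repetitions(sample, "blah"))
```

This treats (blah) as a literal string rather than a general sub-pattern,
which is a simplification of what a real rule would need.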

What I don't know is whether that would be valid in an SA rule, and how you
could extract the accumulated count (possibly scaled by some factor) into the
score value for that rule.
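
As a sketch of what "scaled by some factor" might look like, again in
Python and purely hypothetical: the function scaled_score and its
threshold, per_hit and cap parameters are assumptions chosen so that 5
repetitions give 1.0 and 25 give 2.0, matching the example values above.
Whether SA can take a score from a rule this way is exactly the open
question.

```python
def scaled_score(count, threshold=5, per_hit=0.05, cap=3.0):
    """Map a repetition count to a score (all parameters are illustrative)."""
    if count < threshold:
        # Below the minimum threshold the rule does not fire at all.
        return 0.0
    # Base score of 1.0 at the threshold, growing linearly, limited by cap.
    return min(cap, 1.0 + (count - threshold) * per_hit)


if __name__ == "__main__":
    for n in (3, 5, 25, 100):
        print(n, scaled_score(n))
```

The cap is there so that one very long run cannot dominate the total on
its own; the value 3.0 is arbitrary.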

Anyone know if such a thing is possible?  If so, this seems like something
that could somewhat cut down on the number of rules, and possibly give some
finer-grained control over results in some cases.

        Loren