Posted to dev@spamassassin.apache.org by Marc Perkel <ma...@perkel.com> on 2004/04/06 01:37:50 UTC
Rule Classifications - I like it!
I like the new 20_drugs.cf file and I'm wondering if we should create
other classifications of rules like this. One for porn - one for
finance/mortgage/creditcards - etc.
Also - have the ability to declare a default score for everything in
that file. So that before it's scored - you can say give it a 3 instead
of the default 1.
I've also thought that rule classifications could be scored in a way
that they had independent totals - the drug score - the sex score - the
credit card scam score - etc - with the idea of maybe being able to
apply a scaling factor to the classification. A church might want to
scale up the porn score. A realty company might want to scale down the
financial scores.
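To make the idea concrete, here is a rough sketch of per-classification defaults, independent totals, and scaling factors. It is not SpamAssassin code: the rule names, the RULES table, CLASS_DEFAULT, and the function names are all made up for illustration, and a real implementation would live in the rule parser and scorer.

```python
from collections import defaultdict

# Hypothetical rule table: rule name -> (classification, explicit score or None).
# None means "no score given, fall back to the classification's default".
RULES = {
    "DRUG_RULE_A": ("drugs", None),
    "DRUG_RULE_B": ("drugs", 1.5),
    "PORN_RULE_A": ("porn", None),
    "DEBT_RULE_A": ("finance", None),
}

# Hypothetical per-classification defaults ("give it a 3 instead of the default 1").
CLASS_DEFAULT = {"drugs": 3.0}
GLOBAL_DEFAULT = 1.0


def rule_score(name):
    """Return (classification, score) for one rule, applying the defaults."""
    cls, explicit = RULES[name]
    if explicit is not None:
        return cls, explicit
    return cls, CLASS_DEFAULT.get(cls, GLOBAL_DEFAULT)


def class_totals(hits):
    """Sum the scores of the rules that hit, keeping one total per classification."""
    totals = defaultdict(float)
    for name in hits:
        cls, score = rule_score(name)
        totals[cls] += score
    return dict(totals)


def scaled_total(hits, scale):
    """Combine the per-classification totals, each multiplied by a site-chosen factor."""
    return sum(total * scale.get(cls, 1.0)
               for cls, total in class_totals(hits).items())


hits = ["DRUG_RULE_A", "DRUG_RULE_B", "PORN_RULE_A", "DEBT_RULE_A"]
print(class_totals(hits))                   # independent totals per classification
print(scaled_total(hits, {"porn": 2.0}))    # a church scaling porn up
print(scaled_total(hits, {"finance": 0.5})) # a realty company scaling finance down
```

A classification with no entry in the scale table keeps a factor of 1.0, so a site that sets nothing gets the same total as today.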
Anyhow - in the interest of fine-grained controls and manageability - I
make this suggestion.
Re: Rule Classifications - I like it!
Posted by Pete McNeil <ma...@microneil.com>.
At 07:37 PM 4/5/2004, Marc wrote:
>I like the new 20_drugs.cf file and I'm wondering if we should create
>other classifications of rules like this. One for porn - one for
>finance/mortgage/creditcards - etc.
>
>Also - have the ability to declare a default score for everything in that
>file. So that before it's scored - you can say give it a 3 instead of the
>default 1.
>
>I've also thought that rule classifications could be scored in a way that
>they had independent totals - the drug score - the sex score - the credit
>card scam score - etc - with the idea of maybe being able to apply a
>scaling factor to the classification. A church might want to scale up the
>porn score. A realty company might want to scale down the financial scores.
We do precisely this with our Message Sniffer product with mixed results.
It turns out that rules that score highly for drugs (snakeoil) frequently
match credit card (debt) and even porn (adult) classifications. In practice
there is little distinction except perhaps for porn/adult. Spammers tend to
reuse domains and other header & obfuscation patterns across these three
categories in particular.
It turns out that most of the time if a customer ranks one of the groups
higher it is not because they have a particular filtering classification in
mind, but rather because a particular classification tends to have higher
accuracy in general... Due to the way we source our rules the porn/adult
group tends to be slightly more accurate than some general rules - but
about the same as drugs. Debt can sometimes be less accurate but not often.
Frequently the slight distinction is amplified in the mind of the end user
more than the statistics really support...
I suspect that similar classifications implemented directly in SA would
have similar statistics.
$0.02
_M
Ref Classifications:
http://www.sortmonster.com/MessageSniffer/Help/ResultCodesHelp.html
Ref SA Plugin:
http://www.sortmonster.com/MessageSniffer/Installation/SpamAssassin.html