Posted to users@spamassassin.apache.org by Thomas Harold <th...@nybeta.com> on 2009/12/01 03:27:59 UTC

Scoring for DATE_IN_FUTURE_96_XX

While looking at the scores in 50_scores.cf, I noticed the following:

score DATE_IN_FUTURE_03_06 2.303 0.416 1.461 0.274
score DATE_IN_FUTURE_06_12 3.099 3.099 2.136 1.897
score DATE_IN_FUTURE_12_24 3.300 3.299 3.000 2.189
score DATE_IN_FUTURE_24_48 3.599 2.800 3.599 3.196
score DATE_IN_FUTURE_48_96 3.199 3.182 3.199 3.199
score DATE_IN_FUTURE_96_XX 3.899 3.899 2.598 1.439

Why does the 96+ hour rule score so much lower than the 48-96 hour rule
in the last two entries?

(I'm also wondering if there should be an even higher score rule for 
stuff over 168 hours in the future or past.)
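
For reference, the four numbers on each score line correspond to SpamAssassin's four score sets: set 0 (no Bayes, no network tests), set 1 (network tests only), set 2 (Bayes only) and set 3 (Bayes plus network tests). So "the last two entries" are the Bayes-enabled sets. A minimal sketch, assuming the scores quoted above, that lines the two rules up per score set (the names SCORE_SETS, RULES and parse_scores are made up for illustration and are not part of SpamAssassin):

```python
# Labels for the four columns of a "score" line, in order.
SCORE_SETS = (
    "set 0: no Bayes, no network tests",
    "set 1: no Bayes, network tests",
    "set 2: Bayes, no network tests",
    "set 3: Bayes, network tests",
)

# The two rules being compared, copied from 50_scores.cf as quoted above.
RULES = """\
score DATE_IN_FUTURE_48_96 3.199 3.182 3.199 3.199
score DATE_IN_FUTURE_96_XX 3.899 3.899 2.598 1.439
"""


def parse_scores(text):
    """Return {rule_name: [four floats]} from 'score NAME a b c d' lines."""
    scores = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 6 and fields[0] == "score":
            scores[fields[1]] = [float(x) for x in fields[2:]]
    return scores


scores = parse_scores(RULES)

# Per score set: how much higher (or lower) 96_XX scores than 48_96.
diffs = [
    round(later - earlier, 3)
    for earlier, later in zip(
        scores["DATE_IN_FUTURE_48_96"], scores["DATE_IN_FUTURE_96_XX"]
    )
]

for label, diff in zip(SCORE_SETS, diffs):
    print(f"{label}: 96_XX minus 48_96 = {diff:+.3f}")
```

The 96+ hour rule scores higher in sets 0 and 1 and lower in sets 2 and 3, that is, only once Bayes is enabled.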

Re: Scoring for DATE_IN_FUTURE_96_XX

Posted by Matt Kettler <mk...@verizon.net>.
Thomas Harold wrote:
> On 11/30/2009 9:27 PM, Thomas Harold wrote:
>> While looking at the scores in 50_scores.cf, I noticed the following:
>>
>> score DATE_IN_FUTURE_03_06 2.303 0.416 1.461 0.274
>> score DATE_IN_FUTURE_06_12 3.099 3.099 2.136 1.897
>> score DATE_IN_FUTURE_12_24 3.300 3.299 3.000 2.189
>> score DATE_IN_FUTURE_24_48 3.599 2.800 3.599 3.196
>> score DATE_IN_FUTURE_48_96 3.199 3.182 3.199 3.199
>> score DATE_IN_FUTURE_96_XX 3.899 3.899 2.598 1.439
>>
>> Why does the 96+ hour rule score so much lower than the 48-96 hour rule
>> in the last two entries?
>>
>> (I'm also wondering if there should be an even higher score rule for
>> stuff over 168 hours in the future or past.)
>
> I did dig up the following thread from back in Oct '06...
>
> http://mail-archives.apache.org/mod_mbox/spamassassin-users/200611.mbox/browser
>
>
> I'm guessing that what it boils down to is contained in the wiki page?
> The spam is better off caught by another rule once network tests are
> allowed?
Yep, since SA is scored as a set, "score stealing" between rules is
pretty common when there's a lot of overlap between two rules and one
performs slightly better than the other. It's also possible for there to
be more complicated cascades where one rule affects another, which in
turn affects a third, which affects a fourth...

Also, looking at the above scores, there are likely no network spam tests
that cover the same mail as 48_96, because its score is pretty much the
same with or without them.

"On average", the scores of all non-network spam rules should go down a
little when the network tests are enabled, because there are more rules in
the set competing for score. However, since the distribution of hits
across rules is distinctly not random, you'll see a lot of non-average
cases, which means some rules will be:
    staying the same because they cover mail the network tests don't
    going down radically due to heavy overlap
    going up because they correct false negatives in some of the
    non-spam network tests.
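
The three cases can be checked against the scores quoted at the top of the thread by comparing the third column (Bayes, no network tests) with the fourth (Bayes plus network tests). This is only a rough sketch: the 0.05 tolerance for "pretty much the same" and the names SCORES, TOLERANCE and net_effect are arbitrary choices for illustration, not anything SpamAssassin defines.

```python
# Scores as quoted from 50_scores.cf earlier in the thread.
SCORES = """\
score DATE_IN_FUTURE_03_06 2.303 0.416 1.461 0.274
score DATE_IN_FUTURE_06_12 3.099 3.099 2.136 1.897
score DATE_IN_FUTURE_12_24 3.300 3.299 3.000 2.189
score DATE_IN_FUTURE_24_48 3.599 2.800 3.599 3.196
score DATE_IN_FUTURE_48_96 3.199 3.182 3.199 3.199
score DATE_IN_FUTURE_96_XX 3.899 3.899 2.598 1.439
"""

# Hypothetical cut-off: changes smaller than this count as "same".
TOLERANCE = 0.05


def net_effect(text, tolerance=TOLERANCE):
    """Map rule name -> 'same', 'down' or 'up' when network tests are added.

    Compares score set 2 (Bayes, no network tests) with score set 3
    (Bayes plus network tests), i.e. the last two columns.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[0] != "score":
            continue
        without_net, with_net = float(fields[4]), float(fields[5])
        delta = with_net - without_net
        if abs(delta) <= tolerance:
            result[fields[1]] = "same"
        elif delta < 0:
            result[fields[1]] = "down"
        else:
            result[fields[1]] = "up"
    return result


effects = net_effect(SCORES)
for rule, effect in effects.items():
    print(rule, effect)
```

With these numbers only DATE_IN_FUTURE_48_96 stays the same; the other five go down, and none of them happens to go up.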
>
> http://wiki.apache.org/spamassassin/HowScoresAreAssigned
>


Re: Scoring for DATE_IN_FUTURE_96_XX

Posted by Thomas Harold <th...@nybeta.com>.
On 11/30/2009 9:27 PM, Thomas Harold wrote:
> While looking at the scores in 50_scores.cf, I noticed the following:
>
> score DATE_IN_FUTURE_03_06 2.303 0.416 1.461 0.274
> score DATE_IN_FUTURE_06_12 3.099 3.099 2.136 1.897
> score DATE_IN_FUTURE_12_24 3.300 3.299 3.000 2.189
> score DATE_IN_FUTURE_24_48 3.599 2.800 3.599 3.196
> score DATE_IN_FUTURE_48_96 3.199 3.182 3.199 3.199
> score DATE_IN_FUTURE_96_XX 3.899 3.899 2.598 1.439
>
> Why does the 96+ hour rule score so much lower than the 48-96 hour rule
> in the last two entries?
>
> (I'm also wondering if there should be an even higher score rule for
> stuff over 168 hours in the future or past.)

I did dig up the following thread from back in Oct '06...

http://mail-archives.apache.org/mod_mbox/spamassassin-users/200611.mbox/browser

I'm guessing that what it boils down to is contained in the wiki page? 
The spam is better off caught by another rule once network tests are 
allowed?

http://wiki.apache.org/spamassassin/HowScoresAreAssigned