Posted to users@spamassassin.apache.org by "Paul R. Ganci" <pr...@mric.coop> on 2005/05/16 03:31:11 UTC

Using 1st server Spamassassin score as starting value for 2nd server.

I run the Email servers for a small, rural mountain WISP and have a 
situation where all Email 1st comes into a "scrubber" server (RaQ 550) 
and once it passes an initial set of virus/spam tests is sent on to a 
2nd server (also a RaQ 550) where the actual user accounts reside. On 
the first server spam and viruses are stopped in 3 steps:

1.) Custom blocklists
2.) Greylister
3.) MailScanner/Spamassassin 3.0.3 - Network checks/General rules.

These stop spam with as little work as possible and as soon as possible. 
Anything flagged with a high score at this point is quarantined just in 
case, but not sent to the user. Messages with scores lower than the high 
score threshold are sent on to the 2nd server.

On the 2nd server only MailScanner (virus checks bypassed since they 
were done already)/Spamassassin 3.0.2 (Spamd/Spamc) are run. Since Spamc 
runs as the Email recipient I can use each user's custom settings and 
bayes databases to do a final check on spam so that they may do what 
they want with such messages. These tests tend to be more expensive than 
those done on the 1st server and vary from user to user, but nonetheless 
I don't want to repeat the tests from the 1st server. Can I set up a 
header with the spam score from the 1st server and then set up the 2nd 
server's spamassassin to use that header as the initial score for any 
subsequent tests? My idea is to turn off as many body checks as possible 
on the 1st server and to put off the bayes stuff until the last possible 
moment on the 2nd server, without duplicating any 1st server tests. The 
final Spam score would be the accumulated sum from both servers.

It wasn't clear from the documentation or google searches that I could 
do this. It seemed that I could create a header and then give a score 
based upon its presence. But it wasn't obvious to me that I could 
actually read the header's value and set the initial spam score from the 
value.  Any hints are graciously accepted and greatly appreciated.

-- 
Paul (prganci@mric.coop)


Re: Using 1st server Spamassassin score as starting value for 2nd server.

Posted by Loren Wilton <lw...@earthlink.net>.
>
> Again thanks so much Loren!

Quite welcome.  It was a fun hack to think through.

There is one fix required in that file - change the word 'hits' to 'score'
in the rules.  I wrote those based off a 2.64 example, and the word changed
in 3.0.

        Loren


Re: Using 1st server Spamassassin score as starting value for 2nd server.

Posted by "Paul R. Ganci" <pr...@mric.coop>.
Loren Wilton wrote:

>>Any hints are graciously accepted and greatly appreciated.
>
>You can't, so far as I know, do exactly what you want.  However, you may be
>able to come close.
>
><snip>
>
>Oh heck.  I started the ruleset above, I just finished the thing.  File
>attached.
>Note these rules are UNTESTED, and may not work as expected.  They may not
>even lint for that matter.  But they might be useful if they do work.

Geez, I wanted some ideas but didn't expect anyone to do it for me! :)

As I said I would graciously accept and greatly appreciate some hints. 
Thanks so much for getting me going on this ... it was way beyond my 
expectations. I will try these out as time allows and if I find problems 
I will repost them.

From the information given in your post I see I was not thinking in the 
spamassassin way. I am new to the rule writing part so I learned 
something. Your idea will do pretty much exactly what I want just not in 
the arithmetic way I was thinking.

Again thanks so much Loren!

-- 
Paul (prganci@mric.coop)


Re: Using 1st server Spamassassin score as starting value for 2nd server.

Posted by Loren Wilton <lw...@earthlink.net>.
> It wasn't clear from the documentation or google searches that I could
> do this. It seemed that I could create a header and then give a score
> based upon its presence. But it wasn't obvious to me that I could
> actually read the header's value and set the initial spam score from the
> value.  Any hints are graciously accepted and greatly appreciated.

You can't, so far as I know, do exactly what you want.  However, you may be
able to come close.

SA uses text pattern matching for rules, rather than any concept of
arithmetic numbers.  You can make multiple rules with various scores and use
them to look at the score lines in the header, at least if you are on 3.0 or
later.  (In 2.6x these lines will be stripped before you can look at them).

I think you will have to use 'full' rules to look at this data (or write a
plugin, which might be the better idea) since it will probably be removed
from 'header' rule data before you can look at it.

I would suggest a simple collection of rules that look at the number of
stars in the report line, and score each one at 1 point.  This will give you
the score rounded to the nearest point, which might be "good enough".  You
could actually make a whole series of rules (it would take 40 rules) to look
at the actual score value and provide a decimal result.  For instance,
grabbing part of the summary line from your message, I see

X-Spam-Status: No, hits=-94.4

(This is on 2.64, check the format for 3.0 in case it changed.)

Now I could write a bunch of rules like

full    SA_SCORE_100    /^X-Spam-Status:.{0,20}hits=1\d\d/s
score    SA_SCORE_100    100
full    SA_SCORE_10    /^X-Spam-Status:.{0,20}hits=\d*1\d[^\d]/s
score    SA_SCORE_10    10
full    SA_SCORE_20    /^X-Spam-Status:.{0,20}hits=\d*2\d[^\d]/s
score    SA_SCORE_20    20
full    SA_SCORE_30    /^X-Spam-Status:.{0,20}hits=\d*3\d[^\d]/s
score    SA_SCORE_30    30
full    SA_SCORE_40    /^X-Spam-Status:.{0,20}hits=\d*4\d[^\d]/s
score    SA_SCORE_40    40
full    SA_SCORE_1    /^X-Spam-Status:.{0,20}hits=\d*1[^\d]/s
score    SA_SCORE_1    1
full    SA_SCORE_2    /^X-Spam-Status:.{0,20}hits=\d*2[^\d]/s
score    SA_SCORE_2    2
full    SA_SCORE_3    /^X-Spam-Status:.{0,20}hits=\d*3[^\d]/s
score    SA_SCORE_3    3
full    SA_SCORE_4    /^X-Spam-Status:.{0,20}hits=\d*4[^\d]/s
score    SA_SCORE_4    4
full    SA_SCORE_point1    /^X-Spam-Status:.{0,20}hits=\d*\.1/s
score    SA_SCORE_point1    0.1
full    SA_SCORE_point2    /^X-Spam-Status:.{0,20}hits=\d*\.2/s
score    SA_SCORE_point2    0.2
full    SA_SCORE_point3    /^X-Spam-Status:.{0,20}hits=\d*\.3/s
score    SA_SCORE_point3    0.3
full    SA_SCORE_point4    /^X-Spam-Status:.{0,20}hits=\d*\.4/s
score    SA_SCORE_point4    0.4

Obviously you need 0-9 in each case.
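
Rather than typing out every digit by hand, the set can be generated. The
following Python is only a sketch of that idea and is not the attached file:
the function name generate_rules and its keyword parameter are made up for
illustration, the digit 0 is skipped because it would score nothing, and the
output is just as untested against SpamAssassin as the rules above. The
keyword parameter is there because the label in X-Spam-Status is 'hits' in
2.64 and may differ in other versions.

```python
def generate_rules(keyword="hits"):
    """Return rule lines for the hundreds, tens, ones and tenths digits.

    keyword is the label that precedes the score in X-Spam-Status
    (hypothetical parameter; 'hits' matches the 2.64 example).
    """
    prefix = r"/^X-Spam-Status:.{0,20}" + keyword + "="
    lines = []

    def add(name, pattern, score):
        # One 'full' line and one 'score' line per rule, as in the post.
        lines.append(f"full    {name}    {prefix}{pattern}/s")
        lines.append(f"score    {name}    {score}")

    # Scores of 100-199: a leading 1 followed by two more digits.
    add("SA_SCORE_100", r"1\d\d", 100)
    for d in range(1, 10):
        # Tens digit: d, then exactly one more digit, then a non-digit.
        add(f"SA_SCORE_{d}0", rf"\d*{d}\d[^\d]", d * 10)
    for d in range(1, 10):
        # Ones digit: d immediately followed by a non-digit.
        add(f"SA_SCORE_{d}", rf"\d*{d}[^\d]", d)
    for d in range(1, 10):
        # Tenths digit: d right after the decimal point.
        add(f"SA_SCORE_point{d}", rf"\d*\.{d}", f"0.{d}")
    return lines


print("\n".join(generate_rules()))
```

That comes to 28 rules (56 lines) for positive scores up to 199.9.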

That will handle positive scores and ignore negative scores.  (If you can't
have 3 digit scores, you can simplify the regex in the first case above to
something like "=1\d/s" on the end).  If you want to deal with negative
scores also, you will need a bunch more rules to deal with them.
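
To see how the per-digit rules add back up to the first server's score, the
patterns can be tried outside SpamAssassin. The sketch below uses Python's re
module as a stand-in for Perl regexes and feeds it the header line on its
own, so it says nothing about whether '^' anchors as intended when a 'full'
rule runs over a whole message. The names build_rules and carried_score are
hypothetical, and the keyword parameter is an assumption for trying a label
other than 'hits'.

```python
import re


def build_rules(keyword="hits"):
    """Return (name, pattern, score) tuples mirroring the per-digit rules."""
    prefix = r"^X-Spam-Status:.{0,20}" + keyword + "="
    rules = [("SA_SCORE_100", prefix + r"1\d\d", 100.0)]
    for d in range(1, 10):
        rules.append((f"SA_SCORE_{d}0", prefix + rf"\d*{d}\d[^\d]", d * 10.0))
        rules.append((f"SA_SCORE_{d}", prefix + rf"\d*{d}[^\d]", float(d)))
        rules.append((f"SA_SCORE_point{d}", prefix + rf"\d*\.{d}", d / 10.0))
    return rules


def carried_score(header_line, keyword="hits"):
    """Sum the scores of every rule whose pattern matches the header line."""
    matched = [
        (name, score)
        for name, pattern, score in build_rules(keyword)
        if re.search(pattern, header_line, re.DOTALL)  # DOTALL mirrors /s
    ]
    # Round to one decimal to hide floating-point noise in the sum.
    total = round(sum((score for _, score in matched), 0.0), 1)
    return total, [name for name, _ in matched]


# A positive score is rebuilt from its tens, ones and tenths digits.
print(carried_score("X-Spam-Status: No, hits=23.4 required=5.0"))
# A negative score matches nothing, so it contributes 0.
print(carried_score("X-Spam-Status: No, hits=-94.4"))
```

For the first line, three rules fire (SA_SCORE_20, SA_SCORE_3 and
SA_SCORE_point4) and their scores sum to 23.4. For the second, the minus sign
stops every pattern from matching, which is the "ignore negative scores"
behaviour described above.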

Oh heck.  I started the ruleset above, I just finished the thing.  File
attached.
Note these rules are UNTESTED, and may not work as expected.  They may not
even lint for that matter.  But they might be useful if they do work.

        Loren