Posted to users@spamassassin.apache.org by Lucio Chiappetti <lu...@lambrate.inaf.it> on 2009/04/02 19:17:54 UTC

update Re: quirks with bayes ?

We did something yesterday along the lines I described, which sort of 
improved the situation. There was also a mistake in one of the things I 
said:

When I reported sa-learn --dump magic showing 30,000 spam and 300,000 ham, it 
turned out that I was quoting the values for our secondary MX. The primary MX 
has a 1:1 ratio, roughly 250,000 to 250,000!

In fact the daily mail traffic through the two servers is about 2000:1000, 
while the rejected spam is about 1000:150.
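
To put those rough figures side by side, here is a small Python sketch. It assumes the first number of each pair belongs to the primary MX and the second to the secondary; the post does not state that explicitly, so treat the assignment as an assumption.

```python
# Approximate daily figures quoted in the post. Which server each
# number belongs to is an assumption (first = primary, second = secondary).
daily_mail = {"primary": 2000, "secondary": 1000}
daily_rejected = {"primary": 1000, "secondary": 150}


def rejected_fraction(server):
    """Fraction of a server's daily mail that gets rejected as spam."""
    return daily_rejected[server] / daily_mail[server]


for server in daily_mail:
    print(f"{server}: {rejected_fraction(server):.0%} of daily mail rejected")
```

Under that reading the primary rejects about half of what it sees and the secondary about 15%, even though the secondary handles half as much mail.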

Anyhow, what we did yesterday ON THE PRIMARY MX was:

  - clean the AWL of all entries with a single occurrence (including
    those  user@ourdomain|x.y where x.y is NOT our IP)
  - remove from AWL all entries for  user@ourdomain
  - change whitelist_from to whitelist_from_rcvd
  - sa-learn all the quarantined spam of the last 10 days
  - lower the ham learn threshold to -2
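
The two AWL pruning rules in the list above can be sketched in Python as a filter over in-memory records. This is only an illustration of the selection logic: the `AwlEntry` record, the `OUR_DOMAIN` value and the sample data are hypothetical, and the sketch does not read or write the real AWL database, which should be handled with SpamAssassin's own tools.

```python
from typing import NamedTuple


class AwlEntry(NamedTuple):
    """Hypothetical in-memory view of one AWL key (not the real DB format)."""
    address: str    # sender address
    ip_prefix: str  # the "x.y" part of a user@domain|x.y key
    count: int      # how many messages were seen for this key


OUR_DOMAIN = "ourdomain.example"  # placeholder, not from the post


def prune(entries):
    """Drop single-occurrence entries and all entries for our own domain."""
    kept = []
    for entry in entries:
        if entry.count <= 1:
            continue  # rule 1: seen only once
        if entry.address.lower().endswith("@" + OUR_DOMAIN):
            continue  # rule 2: user@ourdomain, whatever the IP
        kept.append(entry)
    return kept


sample = [
    AwlEntry("friend@elsewhere.example", "10.1", 12),
    AwlEntry("oneoff@elsewhere.example", "10.2", 1),
    AwlEntry("user@ourdomain.example", "192.0", 40),
]
print(prune(sample))
```

Only the first sample entry survives: the second was seen once, and the third claims to come from the local domain.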

This seemed to reject more spam (all the "casino" ones, and most of the 
"advertising job in bad italian" ones [I've been told this is actually a scam]; 
in particular the latter are no longer getting BAYES_00 but higher 
probability ranges).

Today we applied the same to the secondary MX, with the difference that it 
sa-learned the last 10 days of spam from BOTH servers.

And although the traffic is still in the same ratio, the secondary is now 
rejecting more of the "advertising job in bad italian" and with even higher 
bayes ranges (BAYES_80 or higher) than the primary.

We hope this will settle in a short time (my colleague will change the 
crontab so that both servers sa-learn the daily quarantine of both)
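
A minimal sketch of what such a daily job could look like, written in Python and meant to be called from cron on each server. The quarantine paths are hypothetical placeholders, and the sketch assumes each server can read the other's quarantine directory (for example through a shared or synced copy), which the post does not describe. It relies only on `sa-learn --spam`, which accepts a directory of messages.

```python
import subprocess

# Hypothetical locations of the daily quarantine of each MX.
QUARANTINE_DIRS = [
    "/var/quarantine/primary/today",
    "/var/quarantine/secondary/today",
]


def learn_commands(dirs):
    """Build one 'sa-learn --spam DIR' command per quarantine directory."""
    return [["sa-learn", "--spam", d] for d in dirs]


def learn_all(dirs, dry_run=True):
    """Print the commands, and run them only when dry_run is False."""
    for cmd in learn_commands(dirs):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


learn_all(QUARANTINE_DIRS)  # dry run: only shows what would be executed
```

Calling `learn_all(QUARANTINE_DIRS, dry_run=False)` from the cron job would do the actual learning; the dry-run default is there so the commands can be checked first.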

Thanks to all for the hints.

-- 
Lucio Chiappetti - INAF/IASF - via Bassini 15 - I-20133 Milano (Italy)
For more info : http://www.iasf-milano.inaf.it/~lucio/personal.html
-----------------------------------------------------------------------
"Nature" on government cuts to research       http://snipurl.com/4erid
"Nature" e i tagli del governo alla ricerca   http://snipurl.com/4erko

Re: update Re: quirks with bayes ?

Posted by LuKreme <kr...@kreme.com>.
On 2-Apr-2009, at 11:17, Lucio Chiappetti wrote:
> - clean the AWL of all entries with a single occurrence (including
>   those  user@ourdomain|x.y where x.y is NOT our IP)


Probably not a good idea.  I would think a lot of those would be SPAM  
indicators you flushed out of your AWL.  However:

> - remove from AWL all entries for  user@ourdomain
> - change whitelist_from to whitelist_from_rcvd

Probably means you have a net win.

-- 
Like the moment when the brakes lock/And you slide towards the big
	truck/You stretch the frozen moments with your fear