Posted to users@spamassassin.apache.org by Bill Polhemus <bi...@gmail.com> on 2013/06/03 18:42:24 UTC

"2" Seems To Be My Sweet Spot

Hello. 

I am not a major admin. I have used a Linux box w/ Sendmail + SpamAssassin off and on for years, just for personal and small-biz email. I have only two dozen or so accounts allocated among three domains.

After many years on a third-party email service that supposedly includes spam filtering, I gradually noticed that, of the ~500 or so mails per account per day, about 40% are spam. In fact, perhaps half again as many spam were getting through as were caught in my email service provider's spam trap (I have no idea what they use).

Decided to take things in hand again. 

After about 3 months of fiddling I've got it to the point where I'm down to maybe two Spam per account per day getting through. 

Typical SA Bayes file sizes are about 650K for Bayes_seen/AWL and 1.2G for Bayes_toks.

Thing is, in order to get this performance I've had to set the spam/ham threshold at an SA score of 2, after all the hand-feeding and tweaking I know how to do. I lowered it gradually over time, by 0.5 every two weeks or so, to this point.

So far I've found maybe 1 or 2 false positives per account per week at this scoring. 

I'm fine with it as is, but thought some folks here might find it interesting to note.

William L. Polhemus, Jr. P.E.
Sent from my iPhone 5

Re: "2" Seems To Be My Sweet Spot

Posted by da...@chaosreigns.com.
The default rule scores are generated with an assumed threshold of 5
and a target of 1 false positive in 2,500 non-spams.  It sounds like you
may be substantially increasing the false positive rate, which you are
certainly entitled to do, but which I would not recommend.

http://wiki.apache.org/spamassassin/ImproveAccuracy
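
For a rough sense of scale, the figures quoted in the original post can be
turned into an approximate false positive rate and compared with the
1-in-2,500 target. This is only a back-of-the-envelope sketch: the constants
are the poster's own estimates, the function names are made up for
illustration, and it only counts false positives that were actually noticed.

```python
# Figures are the rough estimates quoted in the original post.
MAILS_PER_DAY = 500        # "~500 or so mails per account per day"
SPAM_FRACTION = 0.40       # "about 40% are spam"
FP_PER_WEEK = (1, 2)       # "1 or 2 false positives per account per week"
TARGET_FP_RATE = 1 / 2500  # target used when generating the default scores


def weekly_ham(mails_per_day: float, spam_fraction: float) -> float:
    """Non-spam messages per account per week (hypothetical helper)."""
    return mails_per_day * (1 - spam_fraction) * 7


def fp_rate(false_positives: float, ham: float) -> float:
    """Fraction of non-spam messages that were flagged as spam."""
    return false_positives / ham


ham = weekly_ham(MAILS_PER_DAY, SPAM_FRACTION)
for fp in FP_PER_WEEK:
    rate = fp_rate(fp, ham)
    print(f"{fp} FP/week: 1 in {ham / fp:.0f} non-spams, "
          f"{rate / TARGET_FP_RATE:.1f}x the 1-in-2,500 target")
```

On those numbers the observed rate works out to somewhere between 1 in 2,100
and 1 in 1,050 non-spams, or roughly 1.2 to 2.4 times the target. Any false
positives that went unnoticed would push it higher.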

-- 
"Believe nothing, no matter where you read it or who has said it, even
if I have said it, unless it agrees with your own reason and your own
common sense." - Buddha, 563-483 B.C.
http://www.ChaosReigns.com