Posted to users@spamassassin.apache.org by Cris Fuhrman <fu...@gmail.com> on 2005/02/20 20:23:33 UTC

Lightweight setup (a la SpamPal)

Hello,

First off, let me say that I'm an indirect user of SpamAssassin, and
have great respect for the time and energy that has been put into
developing and deploying it.

At my day job, our sysadmins have been using SpamAssassin for over a
year. I'm not sure they feed it ham/spam as needed to keep the
accuracy up to snuff -- I don't know if it's configured to use
block-lists. It blocks offensive spams, probably using some regex
stuff. However, I still get 25+ spams/day. They initially asked us to
send them spams that get through, which I was happy to do. But I
frankly have better things to do now with my time, especially given
the amount of spam I get.

I've already used SpamPal for years on my home ISP's email, which does
no filtering of spam (I disabled it, as they would not give me details
about how they filter spam!). SpamPal works wonderfully in this
regard; few spams get through.

Frustrated with how things are on my work email, I began to use
SpamPal 1.591 on my Windows client connected to this same mail host
with IMAP. With no configuration needed, SpamPal detects all of the
spam that was getting through. I only enabled one additional plug-in
(URL body), which checks the IPs of URLs in the body against black
lists. No content filtering takes place otherwise. By the way, there
are some
quirks with SpamPal, IMAP and Thunderbird 1.0, but that's another
story...

All of this makes me think about SpamAssassin's technique with
Bayesian filters on content, and the Law of Diminishing Returns
(http://www.bartleby.com/65/di/diminish.html). Is the extra time and
energy required to keep the Bayesian filters working well worth the
additional spam that it filters?

SpamPal only uses some basic technique with black lists of IPs in the
SMTP headers (I think). Unless you enable reg-ex or URL filtering in
the body, it doesn't do anything like that. In my case, both on my
home and work accounts, the vast majority of spams are filtered. I
don't have numbers, but I'd be willing to guess it's over 90%.
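
For readers who haven't seen how an IP black list lookup works, here
is a minimal Python sketch of the usual DNS-based convention: reverse
the octets of the address, append the list's zone, and see whether the
name resolves. It is not SpamPal's actual code. The zone name
dnsbl.example.org and the function names are placeholders made up for
illustration.

```python
import ipaddress
import socket

# Hypothetical zone, for illustration only; substitute a real list's zone.
EXAMPLE_ZONE = "dnsbl.example.org"


def dnsbl_query_name(ip: str, zone: str = EXAMPLE_ZONE) -> str:
    """Build the DNS name to query: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str = EXAMPLE_ZONE) -> bool:
    """Return True if the zone answers with an address record for this IP."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        # No such name (or any lookup failure) is treated as "not listed".
        return False
```

Calling dnsbl_query_name("192.0.2.99") builds the name
99.2.0.192.dnsbl.example.org; is_listed() then performs the actual DNS
query, so it needs network access and a real zone to be useful.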

My question is this: can anyone recommend a lightweight,
no-feeding-required setup for SpamAssassin 3.x similar to how SpamPal
works? It would be ideal for getting the best return on investment
from SpamAssassin, I think.

I think our sysadmins (and I'm sure they're not unique) don't have
enough time to feed ham/spam to the Bayesian filters. Given the
effectiveness of SpamPal's simple approach, it seems to me that the
return on investment for correct Bayesian filtering is not worth it in
our case. I'm not criticizing the approach, per se. Just saying it
would be good to know how to set up a lightweight, low-maintenance
version of SpamAssassin.

I assume the underlying technologies of SpamAssassin and SpamPal are
roughly equivalent in this regard, and it's just a question of
configuring SpamAssassin to work with the black lists the way SpamPal
does. Apologies if I'm off base in my assumptions, or I've
misunderstood how SpamAssassin works.

Cheers!

Re: Lightweight setup (a la SpamPal)

Posted by Kai Schaetzl <ma...@conactive.com>.
Cris Fuhrman wrote on Sun, 20 Feb 2005 14:23:33 -0500:

> I think our sysadmins (and I'm sure they're not unique) don't have 
> enough time to feed ham/spam to the Bayesian filters. Given the 
> effectiveness of SpamPal's simple approach, it seems to me that the 
> return on investment for correct Bayesian filtering is not worth it in 
> our case. I'm not criticizing the approach, per se. Just saying it 
> would be good to know how to set up a lightweight, low-maintenance 
> version of SpamAssassin.
>

1. Nobody here knows how your sysadmins have set up your SA or even what 
version it is. If so much spam is getting through, it's either outdated or 
not sufficiently configured. Talk to them, let them read spamassassin.org, 
and let them ask here. It doesn't make sense to tell them through a user 
what they should do.

2. Bayes needs constant training. However, after the initial setup it 
usually trains itself via autolearn. We haven't trained Bayes by hand on 
any of our machines for a long time, and the spam detection rate is 
excellent.

3. It's possible to get very good spam detection with just a few rules or 
some other technique. The problem is that this may work for some people 
but not for others, for instance not in large and diverse environments. 
The false-positive rate in particular can be quite high for one person or 
group and low for another, depending on what mail they get.

4. To answer your question in short: with a few SARE rules and SURBL, 
SA 3.x works just fine out of the box. Once you have an initialized 
Bayes DB, it adds to that.
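
To illustrate the autolearn idea from point 2, here is a rough Python 
sketch, not SA's actual code: mail that scores very high is fed to Bayes 
as spam, mail that scores very low is fed as ham, and everything in 
between is left alone. The two constants are assumed to match the 
documented defaults of bayes_auto_learn_threshold_nonspam and 
bayes_auto_learn_threshold_spam; the function name and the handling of 
scores exactly on a threshold are my own assumptions, and the real logic 
applies further conditions.

```python
from typing import Optional

# Assumed defaults; check the values in your own SpamAssassin configuration.
HAM_THRESHOLD = 0.1    # below this score, learn the message as ham
SPAM_THRESHOLD = 12.0  # above this score, learn the message as spam


def autolearn_decision(score: float) -> Optional[str]:
    """Return "spam", "ham", or None (do not learn) for a message score."""
    if score > SPAM_THRESHOLD:
        return "spam"
    if score < HAM_THRESHOLD:
        return "ham"
    # Borderline scores are too uncertain to train on automatically.
    return None
```

The wide gap between the two thresholds is the point: only mail the 
other rules are already very sure about gets learned, which is why no 
manual feeding is needed once the database has enough messages.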


Kai

-- 
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
IE-Center: http://ie5.de & http://msie.winware.org




Re: Lightweight setup (a la SpamPal)

Posted by Jeff Chan <je...@surbl.org>.
On Sunday, February 20, 2005, 11:23:33 AM, Cris Fuhrman wrote:
> At my day job, our sysadmins have been using SpamAssassin for over a
> year. I'm not sure they feed it ham/spam as needed to keep the
> accuracy up to snuff -- I don't know if it's configured to use
> block-lists. It blocks offensive spams, probably using some regex
> stuff. However, I still get 25+ spams/day. They initially asked us to
> send them spams that get through, which I was happy to do. But I
> frankly have better things to do now with my time, especially given
> the amount of spam I get.

Please ask them to use SURBLs, like you are at home.  That should
catch a lot more spams if they're not using them, even without
Bayes.
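
For anyone wondering what such a check involves, here is a simplified
Python sketch, not the actual SA plugin: pull the host names out of
URLs in the message body and build one DNS query name per host under
the list's zone. The function names and the regular expression are made
up for illustration, and using the full host name is a simplification;
real checkers reduce the host to its registered domain first.

```python
import re
from typing import List, Set
from urllib.parse import urlsplit

# Deliberately simple pattern: http/https URLs up to whitespace or quotes.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def url_hosts(body: str) -> Set[str]:
    """Collect the lower-cased host names of all URLs found in the body."""
    hosts = set()
    for url in URL_PATTERN.findall(body):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # skip URLs the parser rejects
        if host:
            hosts.add(host)
    return hosts


def uri_query_names(body: str, zone: str = "multi.surbl.org") -> List[str]:
    """Build one DNS query name per URL host: "<host>.<zone>", sorted."""
    return sorted(f"{host}.{zone}" for host in url_hosts(body))
```

Each resulting name would then be looked up in DNS; a name that
resolves means the host is listed. No body text is scored, which is why
this catches spam even without Bayes.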

Jeff C.
-- 
Jeff Chan
mailto:jeffc@surbl.org
http://www.surbl.org/