Posted to users@spamassassin.apache.org by Jeff Portwine <jd...@veritime.com> on 2006/04/25 21:32:54 UTC

having trouble with SA

I am running Exim 3.35 on Debian. We were using SpamAssassin 3.0, but we
have been having a lot of trouble with spam getting through. Some gets
caught, but a lot doesn't, and over time it gets worse. The person who
originally set up our mail server and SpamAssassin left the company a
while ago, and it's been nothing but trouble since then. Part of the
reason, from what I've been able to gather, is that the Bayes database
keeps breaking itself. I cleared the database before and retrained it
with a bunch of spam and ham, and it seemed somewhat improved for a while.
However, our system was set up so that whenever anybody got spam, they
would forward it to a spam email address, and sa-learn would
automatically learn from the mail sent there. I learned that this is an
ineffective way to handle it, because the headers all get rewritten when
users forward their spam, and because over time the database gets very
little ham and tons of spam, so it becomes less and less effective.

The spam levels are getting high again, users are complaining, and so today
I did an apt-get spamassassin to upgrade to version 3.1.0. I then used
the configuration tool at http://www.yrex.com/spam/spamconfig.php to create
a new local.cf and replaced the old one, which was outdated even for our
previous version. Now, however, when I try to start the spamassassin
daemon I get the message:   SpamAssassin Mail Filter Daemon: disabled, see
/etc/default/spamassassin   and I'm really not sure what's wrong there.

As you can tell, I'm a complete SA newbie, and my Exim experience is
somewhat limited as well, so I'm pretty much starting at the bottom of the
learning curve. I haven't been able to find any complete or concise
information about SA on the net; even the SA web page has a lot of
scattered and outdated information, so I'm not sure where to go from here
to get this working. Any advice would be very much appreciated.

Thanks!
-Jeff
 


Re: having trouble with SA

Posted by Matt Kettler <mk...@comcast.net>.
Bill Landry wrote:
>
> So, Matt, are you doing something like:
>
>    bayes_ignore_from *@spamassassin.apache.org
I can check the exact syntax, but yes.
>
> How does this compare to:
>
>    bayes_ignore_to users@spamassassin.apache.org
>
> Is one way preferable to the other?
The bayes_ignore_to will fail for:
    messages sent to spamassassin-users@incubator.apache.org (that older address still works)
    messages bcc'ed to the list for some reason or another.

However, the bayes_ignore_to will catch messages sent directly to you
and Cc'ed to the list, which is a good thing.

Ideally you should use both. (which I do, along with a whitelist_from_spf)

>
> Bill
>


Re: having trouble with SA

Posted by Bill Landry <bi...@pointshare.com>.
----- Original Message ----- 
From: "Matt Kettler" <mk...@evi-inc.com>

>>> No.. the only thing I generally whitelist is spam discussion lists
>>> like this one (and I do bayes_ignore_from for them as well). It
>>> would be better to bypass SA entirely, but I don't have that option
>>> in my setup.
>>
>> Matt, just for clarification, shouldn't that be bayes_ignore_to instead
>> of "from" when talking about discussion lists?  For example:
>>
>> bayes_ignore_to users@spamassassin.apache.org
>>
>
> No.. it should be from. I'm matching against the Return-Path header,
> not the To: header.

So, Matt, are you doing something like:

    bayes_ignore_from *@spamassassin.apache.org

How does this compare to:

    bayes_ignore_to users@spamassassin.apache.org

Is one way preferable to the other?

Bill 


Re: having trouble with SA

Posted by Matt Kettler <mk...@evi-inc.com>.
Bill Landry wrote:
> ----- Original Message ----- From: "Matt Kettler" <mk...@evi-inc.com>
> 
>>> Final question for the moment... our old local.cf file had a lengthy
>>> whitelist included.   Is there any reason necessarily to have a
>>> whitelist?
>>
>> No.. the only thing I generally whitelist is spam discussion lists
>> like this one (and I do bayes_ignore_from for them as well). It
>> would be better to bypass SA entirely, but I don't have that option
>> in my setup.
> 
> Matt, just for clarification, shouldn't that be bayes_ignore_to instead
> of "from" when talking about discussion lists?  For example:
> 
> bayes_ignore_to users@spamassassin.apache.org
> 

No.. it should be from. I'm matching against the Return-Path header, not the To:
header.

Re: having trouble with SA

Posted by Bill Landry <bi...@pointshare.com>.
----- Original Message ----- 
From: "Matt Kettler" <mk...@evi-inc.com>

>> Final question for the moment... our old local.cf file had a lengthy
>> whitelist included.   Is there any reason necessarily to have a
>> whitelist?
>
> No.. the only thing I generally whitelist is spam discussion lists
> like this one (and I do bayes_ignore_from for them as well). It would
> be better to bypass SA entirely, but I don't have that option in my
> setup.

Matt, just for clarification, shouldn't that be bayes_ignore_to instead of 
"from" when talking about discussion lists?  For example:

bayes_ignore_to users@spamassassin.apache.org

Bill 


Re: having trouble with SA

Posted by Matt Kettler <mk...@evi-inc.com>.
Jeff Portwine wrote:
>   Is there any good method
> for users to submit email as spam when spam gets through to help SA
> learn it as spam?

Yes, that can be a tough one. If you have a standardized mail client, you might
look and see if it has a reasonable "redirect" or "bounce" feature that
preserves the headers.

Another option would be to have them forward the original message as an
attachment and have a script strip the attachments and feed those to sa-learn.
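
As a rough illustration of the forward-as-attachment idea, here is a minimal
Python sketch, not a tested production script. It assumes the mail client
attaches the original as a message/rfc822 part and that sa-learn is on the
PATH. The function names are made up for this example.

```python
import email
import subprocess
import tempfile


def extract_attached_messages(raw: bytes) -> list[bytes]:
    """Return each message/rfc822 attachment found in `raw` as raw bytes."""
    outer = email.message_from_bytes(raw)
    found = []

    def visit(part):
        if part.get_content_type() == "message/rfc822":
            payload = part.get_payload()
            # A parsed message/rfc822 part holds the attached message
            # as a one-element list.
            if isinstance(payload, list) and payload:
                found.append(payload[0].as_bytes())
            return  # do not descend into the attached message itself
        if part.is_multipart():
            for sub in part.get_payload():
                visit(sub)

    visit(outer)
    return found


def learn_as_spam(message: bytes) -> None:
    """Hand one extracted message to sa-learn via a temporary file."""
    with tempfile.NamedTemporaryFile(suffix=".eml") as tmp:
        tmp.write(message)
        tmp.flush()
        subprocess.run(["sa-learn", "--spam", tmp.name], check=True)
```

A delivery script could read the forwarded mail from stdin, call
extract_attached_messages() on it, and pass each result to learn_as_spam().
The outer wrapper, with the forwarding user's headers, is never learned.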


> 
> Final question for the moment... our old local.cf file had a lengthy
> whitelist included.   Is there any reason necessarily to have a
> whitelist? 

No.. the only thing I generally whitelist is spam discussion lists like this one
(and I do bayes_ignore_from for them as well). It would be better to bypass SA
entirely, but I don't have that option in my setup.


> Since I'm training SA with ham, most of which would be coming
> from  our servers,  that mail should be processed as ham anyway.   Or
> does the whitelist just help to give more security that mail isn't going
> to be marked as spam erroneously?

Well, it would add extra security against false positives (FPs). However,
for your internal mail you should also get ALL_TRUSTED firing. (Note: you
may want to configure trusted_networks manually; SA's guessing is not 100%
reliable here.)



> 
> Thanks again,
> Jeff
> 


Re: having trouble with SA

Posted by Jeff Portwine <jd...@veritime.com>.
----- Original Message ----- 
From: "Matt Kettler" <mk...@evi-inc.com>
To: "Jeff Portwine" <jd...@veritime.com>
Cc: <us...@spamassassin.apache.org>
Sent: Tuesday, April 25, 2006 3:38 PM
Subject: Re: having trouble with SA


> Jeff Portwine wrote:
>
>> The spam levels are getting high again, users are complaining, and so
>> today I did an apt-get spamassassin to upgrade to version 3.1.0.      I
>> then used the configuration tool at
>> http://www.yrex.com/spam/spamconfig.php to create a new local.cf and
>> replaced the old one, which was outdated even for our previous
>> version.     Now however, when I try to start the spamassassin daemon I
>> get the message:   SpamAssassin Mail Filter Daemon: disabled, see
>> /etc/default/spamassassin   and I'm really not sure what's wrong there.
>
> So what does /etc/default/spamassassin look like? My guess is this
> file is a debian-specific file that configures the startup script, and
> it's probably set to disable spamd. However, I'm not a debian user, so
> it's a guess, but it would be helpful to see what's there.
>
>
>
> Also, have you run spamassassin --lint? This checks your config files
> for errors. It should run with no output at all, but if there are
> problems it will complain.
>

You were right about /etc/default/spamassassin.   I looked at it earlier but 
I guess my head was cloudy from the other stuff I'd been looking at because 
the answer to that particular problem was obvious and when I looked again it 
was clear why it was disabling spamd.

So now that I have that running, I'm currently digging through my Exim
config to verify that SA is configured properly there. Once I can
determine that, the next step is to rebuild my Bayes database. At that
point I have some questions, though. Is there any good method for users
to submit email as spam when spam gets through, to help SA learn it as
spam? Currently, mail is received by Exim, passed through SA and either
tagged as spam or left alone, then placed in the user's mailbox, and users
retrieve their mail via POP3. Having them forward spam doesn't work, since
all the headers get rewritten. I can't come up with a good way to do this
other than asking them to manually copy spam into a folder or something
where I could have a script learn the spam, but getting our users to take
the time to do that would be a battle.

Final question for the moment... our old local.cf file had a lengthy
whitelist included. Is there any reason necessarily to have a whitelist?
Since I'm training SA with ham, most of which would be coming from our
servers, that mail should be processed as ham anyway. Or does the
whitelist just help to give more security that mail isn't going to be
marked as spam erroneously?

Thanks again,
Jeff


Re: having trouble with SA

Posted by Stuart Johnston <st...@ebby.com>.
Matt Kettler wrote:
> Jeff Portwine wrote:
> 
>> The spam levels are getting high again, users are complaining, and so
>> today I did an apt-get spamassassin to upgrade to version 3.1.0.      I
>> then used the configuration tool at
>> http://www.yrex.com/spam/spamconfig.php to create a new local.cf and
>> replaced the old one, which was outdated even for our previous
>> version.     Now however, when I try to start the spamassassin daemon I
>> get the message:   SpamAssassin Mail Filter Daemon: disabled, see
>> /etc/default/spamassassin   and I'm really not sure what's wrong there.
> 
> So what does /etc/default/spamassassin look like? My guess is this file is a
> debian-specific file that configures the startup script, and it's probably set
> to disable spamd. However, I'm not a debian user, so it's a guess, but it would
> be helpful to see what's there.

Yes, Matt is right.  There is a line that says 'ENABLED=0'.  Change that 
0 to 1 and it will work.  You can also set options such as max-children 
in this file.
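
For anyone who wants to check this from a script, here is a small Python
sketch. It assumes the file consists of simple shell-style KEY=value lines;
spamd_enabled is a hypothetical helper name, not part of SpamAssassin or the
Debian package.

```python
def spamd_enabled(defaults_text: str) -> bool:
    """Return True if the last ENABLED= assignment in the text is 1."""
    enabled = False
    for line in defaults_text.splitlines():
        line = line.strip()
        # Skip comments and anything that is not an assignment.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "ENABLED":
            enabled = value.strip().strip("\"'") == "1"
    return enabled
```

Reading /etc/default/spamassassin and passing its contents to this function
would tell you whether the init script is expected to start spamd.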

-Stuart

Re: having trouble with SA

Posted by Matt Kettler <mk...@evi-inc.com>.
Jeff Portwine wrote:

> The spam levels are getting high again, users are complaining, and so
> today I did an apt-get spamassassin to upgrade to version 3.1.0.      I
> then used the configuration tool at
> http://www.yrex.com/spam/spamconfig.php to create a new local.cf and
> replaced the old one, which was outdated even for our previous
> version.     Now however, when I try to start the spamassassin daemon I
> get the message:   SpamAssassin Mail Filter Daemon: disabled, see
> /etc/default/spamassassin   and I'm really not sure what's wrong there.

So what does /etc/default/spamassassin look like? My guess is this file is a
debian-specific file that configures the startup script, and it's probably set
to disable spamd. However, I'm not a debian user, so it's a guess, but it would
be helpful to see what's there.



Also, have you run spamassassin --lint? This checks your config files for
errors. It should run with no output at all, but if there are problems it will
complain.
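
If you want to run that check from a script (say, before restarting spamd
after a config change), a minimal Python sketch could look like this. It
follows the description above and treats any output, or a non-zero exit
status, as a problem. The function name and the overridable cmd parameter
are just for illustration.

```python
import subprocess


def lint_is_clean(cmd=("spamassassin", "--lint")) -> bool:
    """Run the lint command; clean means exit status 0 and no output."""
    result = subprocess.run(list(cmd), capture_output=True, text=True)
    output = (result.stdout + result.stderr).strip()
    return result.returncode == 0 and not output
```

Calling lint_is_clean() with no arguments runs spamassassin --lint and
returns False if it printed anything.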