Posted to users@spamassassin.apache.org by Marshall Roch <ma...@exclupen.com> on 2005/08/25 06:04:26 UTC

Training a forwarding filter

Hi all,

This is my first post here.

I have a Gentoo box running Postfix, IMAP, and SpamAssassin.  Each user 
has a "Learn Spam" folder, and a cron job passes anything in that folder 
to sa-learn and then moves it to the Junk folder.

This works great, but the load is too much.  I'm trying to move 
SpamAssassin to a second server that will serve as a relay in front of 
the existing server.  However, I don't want to lose the ability to train 
the filter by simply moving bad mail into a different folder.

As far as I can tell, it's not possible to tell sa-learn to connect to a 
remote server.  So how can I get those spams into sa-learn on the relay?

--
Marshall Roch

Re: Training a forwarding filter

Posted by Loren Wilton <lw...@earthlink.net>.
> This works great, but the load is too much.  I'm trying to move
> SpamAssassin to a second server that will serve as a relay in front of
> the existing server.  However, I don't want to lose the ability to train
> the filter by simply moving bad mail into a different folder.
>
> As far as I can tell, it's not possible to tell sa-learn to connect to a
> remote server.  So how can I get those spams into sa-learn on the relay?

Share the disk that has the training folders on it?  If the share is used
only for spam learning, the overhead shouldn't be a problem.

Use IMAP folders on one of the boxes?

Probably half a dozen other solutions.

        Loren


Re: Training a forwarding filter

Posted by Bob Proulx <bo...@proulx.com>.
Marshall Roch wrote:
> I have a Gentoo box running Postfix, IMAP, and SpamAssassin.  Each user 
> has a "Learn Spam" folder, and a cron job passes anything in that folder 
> to sa-learn and then moves it to the Junk folder.

Okay.  Sounds good.

> This works great, but the load is too much.

The system load is a count of how many processes are in the run queue
at the same time, averaged over an interval.  So if only one process
is doing what you describe, the load it causes should never get above
one, because there is only one process.  If you really have a high
system load then this is telling me that you have multiple processes
running at the same time.  That would be a different problem, because
you probably should not have that many processes running at once.

One common problem with cron tasks is that they can overrun the
previous task if it is still running.  If a cron task runs longer than
the time between cron tasks then you will start to stack up running
cron tasks.  Each task will get less CPU and take even longer.  More
tasks pile up, the situation becomes unstable, and the system melts down.
I don't know if this is your problem but it very well could be.

If that is your problem then what I suggest is putting a semaphore (a
lock) in your script, so that if cron launches a task while the
previous one is still running, the later one exits right away.  This
will prevent tasks from stacking up and will keep your system from
melting down.  If this is your problem and you would like a hint I
would be happy to share some code snippets from my scripts that do this.
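
For illustration only, here is a minimal sketch of that kind of lock in
Python.  It is not taken from the scripts mentioned above.  It assumes a
Unix system where fcntl.flock is available; the names acquire_lock and
run_exclusive and the lock path in the usage note are hypothetical.

```python
import fcntl


def acquire_lock(path):
    """Try to take an exclusive, non-blocking lock on the file at path.

    Returns the open file object (keep it open to hold the lock),
    or None if another run already holds the lock.
    """
    handle = open(path, "w")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Someone else holds the lock: the previous run is still going.
        handle.close()
        return None
    return handle


def run_exclusive(lock_path, task):
    """Run task() only if no other run holds the lock.

    Returns True if the task ran, False if this run was skipped.
    """
    lock = acquire_lock(lock_path)
    if lock is None:
        return False
    try:
        task()
    finally:
        # Closing the file releases the lock, even if task() raised.
        lock.close()
    return True
```

The cron script would wrap its work in something like
run_exclusive("/var/lock/learn-spam.lock", train), where train is
whatever function calls sa-learn.  Because the kernel drops the lock
when the process exits, a crashed run should not leave a stale lock
behind.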

> I'm trying to move SpamAssassin to a second server that will serve
> as a relay in front of the existing server.  However, I don't want
> to lose the ability to train the filter by simply moving bad mail
> into a different folder.
>
> As far as I can tell, it's not possible to tell sa-learn to connect to a 
> remote server.  So how can I get those spams into sa-learn on the relay?

You could always shuffle the mail from the bad folder over to the
other machine.
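
One way that shuffle might look, as a rough sketch rather than a tested
recipe: copy the folder to the relay with rsync, then run sa-learn there
over ssh.  This assumes the mail is stored as one file per message
(Maildir style), that the mail server can reach the relay over ssh
without a password prompt, and that the relay directory path contains
no spaces.  The host name, paths, and function names are all
hypothetical.

```python
import subprocess


def build_copy_command(local_dir, relay_host, relay_dir):
    """rsync command that copies the contents of local_dir to the relay."""
    src = local_dir.rstrip("/") + "/"
    dest = f"{relay_host}:{relay_dir.rstrip('/')}/"
    return ["rsync", "-a", src, dest]


def build_learn_command(relay_host, relay_dir):
    """ssh command that runs sa-learn on the relay over the copied mail."""
    return ["ssh", relay_host, "sa-learn", "--spam", relay_dir]


def ship_and_learn(local_dir, relay_host, relay_dir, runner=subprocess.run):
    """Copy the folder to the relay, then train on it there.

    runner is injectable so the commands can be inspected without
    actually touching the network.
    """
    runner(build_copy_command(local_dir, relay_host, relay_dir), check=True)
    runner(build_learn_command(relay_host, relay_dir), check=True)
```

The existing cron job could call ship_and_learn for each user's
"Learn Spam" folder in place of the local sa-learn call, and then move
the messages to Junk as it does today.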

Bob