Posted to users@spamassassin.apache.org by John Davis <jd...@envy.com> on 2006/03/13 23:53:33 UTC

using sa-learn offline

I am trying to train spamassassin using spam and ham I've collected.  My
problem is the sa-learn script is using too many resources on the server
(my spam folder had ~1000 messages).

Is there a way I can run sa-learn on my PC and then merge the results with 
the server?

-- 
--
jdavis@envy.com        Student: Master, does Emacs have the Buddha Nature?
                       Master:  Why not? It has damn near everything else!


Re: using sa-learn offline

Posted by Matt Kettler <mk...@evi-inc.com>.
John Davis wrote:
> I am trying to train spamassassin using spam and ham I've collected.  My
> problem is the sa-learn script is using too many resources on the server
> (my spam folder had ~1000 messages).
> 
> Is there a way I can run sa-learn on my PC and then merge the results with 
> the server?
> 

Longer answer than before:

No, you cannot run sa-learn on one machine, then later merge the results onto
your server.


HOWEVER, if you are running SpamAssassin 3.1.0 or higher, and you've started
spamd with --allow-tell, you can use spamc -L on your PC and have it feed the
messages to your server's spamd.

This might lighten your overhead a little bit, because it's not going to have to
invoke a new perl instance, but the grunt-work of analyzing for tokens and
placing them into the bayes db is still going to happen on the server side
within spamd.
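
As a rough illustration of the spamc -L approach described above, here is a minimal Python sketch that feeds a folder of messages to a remote spamd. It assumes one message per file (Maildir-style, not mbox), that spamc is installed on the PC, and that spamd on the server was started with --allow-tell and accepts connections from the PC. The function names, the folder path, and the host name mail.example.com are hypothetical.

```python
import subprocess
from pathlib import Path


def spamc_learn_command(host: str, message_class: str) -> list[str]:
    """Build a spamc command line asking a remote spamd to learn one message."""
    if message_class not in ("spam", "ham"):
        raise ValueError("message_class must be 'spam' or 'ham'")
    # -d selects the spamd host, -L selects the learn type.
    return ["spamc", "-d", host, "-L", message_class]


def learn_folder(folder: str, host: str, message_class: str) -> int:
    """Send every file in `folder` to spamd for learning; return how many were sent.

    Assumes each file holds exactly one message.
    """
    command = spamc_learn_command(host, message_class)
    sent = 0
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        with path.open("rb") as message:
            # spamc reads the message from standard input.
            subprocess.run(command, stdin=message, check=False)
        sent += 1
    return sent
```

A call such as learn_folder("spam-folder", "mail.example.com", "spam") would run on the PC. As noted above, the token analysis and bayes db writes still happen inside spamd on the server.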

Re: using sa-learn offline

Posted by Rick Macdougall <ri...@ummm-beer.com>.
Matt Kettler wrote:
> John Davis wrote:
>> I am trying to train spamassassin using spam and ham I've collected.  My
>> problem is the sa-learn script is using too many resources on the server
>> (my spam folder had ~1000 messages).
>>
>> Is there a way I can run sa-learn on my PC and then merge the results with 
>> the server?
> 
> No.
> 

No, BUT!  If you are using mysql for your bayes backend, you can use a 
separate server to run sa-learn on.  Might be a solution for you.

(That's what we do here).

Regards,

Rick


Re: using sa-learn offline

Posted by Matt Kettler <mk...@evi-inc.com>.
John Davis wrote:
> I am trying to train spamassassin using spam and ham I've collected.  My
> problem is the sa-learn script is using too many resources on the server
> (my spam folder had ~1000 messages).
> 
> Is there a way I can run sa-learn on my PC and then merge the results with 
> the server?

No.


Re: using sa-learn offline

Posted by Robert Menschel <Ro...@Menschel.net>.
Hello John,

Monday, March 13, 2006, 2:53:33 PM, you wrote:

JD> I am trying to train spamassassin using spam and ham I've collected.  My
JD> problem is the sa-learn script is using too many resources on the server
JD> (my spam folder had ~1000 messages).

JD> Is there a way I can run sa-learn on my PC and then merge the results with
JD> the server?

As reported, no.  But there is a reasonable solution -- don't feed all
the spam at one shot.  Break up your spam folder into bunches of
200-300 emails, and learn each of those separately.  The more emails
sa-learn tries to learn in one pass, the more resources it will
require.
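
The batching suggestion above could be scripted along these lines. This is only a sketch: it assumes the spam folder stores one message per file (not a single mbox file), that sa-learn is on the PATH, and it picks a batch size of 250 from the suggested 200-300 range. The function names and the folder path are hypothetical.

```python
import subprocess
from pathlib import Path


def batches(items, size):
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def learn_in_batches(folder, batch_size=250, learn_flag="--spam"):
    """Run sa-learn once per batch of message files; return the number of runs.

    Use learn_flag="--ham" for a folder of ham.
    """
    files = sorted(str(p) for p in Path(folder).iterdir() if p.is_file())
    runs = 0
    for batch in batches(files, batch_size):
        # Each sa-learn invocation only sees this batch's files.
        subprocess.run(["sa-learn", learn_flag, *batch], check=True)
        runs += 1
    return runs
```

For a folder of about 1000 messages, learn_in_batches("spam-folder") would invoke sa-learn four times with 250 files each, so no single pass has to handle the whole folder.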