Posted to users@spamassassin.apache.org by ram01 <ra...@yahoo.com> on 2007/03/02 00:01:25 UTC

Re: [2] best method for sa-learn

OK, so if I have a global database but separate IMAP spam boxes, how should I
go about actually running sa-learn? From the cron point of view I would
have to enumerate all the users and do /home/user/isSpam, /home/userb/isSpam,
etc. I could also run it in procmail; this would run more often but on
smaller files. What are my other options, and which is best?


Alexey-12 wrote:
> 
> I prefer a single (global) DB because all users get the same spam, so
> whatever I learn from one user's spam also protects the others.
> IMO it is a good idea.
>> What is the best way to run sa-learn with separate IMAP training folders
>> per user, but a global bayes database?
> -- 
> Best regards, Alexey.
> 
> 

-- 
View this message in context: http://www.nabble.com/best-method-for-sa-learn-tf3329149.html#a9260650
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.


Re: [2] best method for sa-learn

Posted by Alexey <ro...@smallserver.org>.
Personally, I use a simple bash script like this:
...
#!/bin/bash
# Learn from the collected spam; empty the folder only if sa-learn succeeded.
echo "Learning SPAM..."
sa-learn --progress --spam /opt/system/mail/spamfilter/spam/ \
  && rm -rf /opt/system/mail/spamfilter/spam/*
# Same for ham.
echo "Learning HAM..."
sa-learn --progress --ham /opt/system/mail/spamfilter/ham/ \
  && rm -rf /opt/system/mail/spamfilter/ham/*
...
I use my own delivery system, and users can forward mail to addresses like getspam@ and
getham@mydomain.tld. You can configure procmail to do the same thing. You can also use crontab to run the
sa-learn script periodically. In my opinion it isn't a good idea to run sa-learn directly from procmail, because it
may cause high CPU usage if you have a large amount of incoming mail. Maybe I am wrong ;)
> OK, so if I have a global database but separate IMAP spam boxes, how should I go about actually running sa-learn?
> From the cron point of view I would have to enumerate all the users and do /home/user/isSpam, /home/userb/isSpam, etc.
> I could also run it in procmail; this would run more often but on smaller files. What are my other options, and
> which is best?
>
>
> Alexey-12 wrote:
>> I prefer a single (global) DB because all users get the same spam, so whatever I learn from one user's spam
>> also protects the others. IMO it is a good idea.
>>> What is the best way to run sa-learn with separate IMAP training folders per
>>> user, but a global bayes database?
>> --
>> Best regards, Alexey.
-- 
Best regards, Alexey.
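
One thing a learn-then-delete script has to watch for is mail that arrives in the drop folder while sa-learn is still running: a plain "rm" afterwards would delete it unlearned. A minimal sketch of one way around that follows. It is not Alexey's actual script. The function name learn_snapshot and the SA_LEARN variable are made up for illustration, and it assumes the drop folder holds one message per file and that find supports -maxdepth.

```shell
#!/bin/bash
# Sketch only: learn from a snapshot of the drop directory so that mail
# arriving during the run is left alone for the next run.
# SA_LEARN is a hypothetical override, handy for testing with a stub.
SA_LEARN="${SA_LEARN:-sa-learn}"

# learn_snapshot KIND DIR   (KIND is "spam" or "ham", DIR is the drop folder)
learn_snapshot() {
    local kind="$1" dir="$2" work
    work="$(mktemp -d)" || return 1

    # Move the messages present right now into a private work directory.
    find "$dir" -maxdepth 1 -type f -exec mv {} "$work"/ \;

    # Nothing to learn: clean up and stop.
    if [ -z "$(ls -A "$work")" ]; then
        rmdir "$work"
        return 0
    fi

    if "$SA_LEARN" "--$kind" "$work"; then
        rm -rf "$work"                      # learned, snapshot can go
    else
        # Learning failed: put the messages back so the next run retries.
        find "$work" -maxdepth 1 -type f -exec mv {} "$dir"/ \;
        rmdir "$work"
        return 1
    fi
}
```

With the paths from the script above, a cron-driven wrapper would call "learn_snapshot spam /opt/system/mail/spamfilter/spam" and then the same for ham.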

Re: [2] best method for sa-learn

Posted by ram01 <ra...@yahoo.com>.
Thanks, I think this is what I am looking for, though I have not had a chance
to try it yet.


ram01 wrote:
> 
> Yes, the learning folders are stored on the server. The same server is
> running IMAP, MTA, and SA. The site does not have a high load of email,
> so processor load shouldn't be a big issue, which is not to say that I
> don't care about efficiency.
>     
> 
> John D. Hardin wrote:
>> 
>> On Thu, 1 Mar 2007, ram01 wrote:
>> 
>>> OK, so if I have a global database but separate IMAP spam boxes,
>>> how should I go about actually running sa-learn? From the cron
>>> point of view I would have to enumerate all the users and do
>>> /home/user/isSpam, /home/userb/isSpam, etc. I could also run it
>>> in procmail; this would run more often but on smaller files.
>>> What are my other options, and which is best?
>> 
>> Are the folders actually stored locally on the SA box, with IMAP
>> used only for user access?
>> 
>> Take a look at the sa training script in 
>> http://www.impsec.org/~jhardin/antispam/
>> It is based on multiple users, local mailboxes (which the users may 
>> actually be accessing via IMAP) and global Bayes.
>> 



Re: [2] best method for sa-learn

Posted by "John D. Hardin" <jh...@impsec.org>.
On Thu, 1 Mar 2007, ram01 wrote:

> Yes, the learning folders are stored on the server. The same server is
> running IMAP, MTA, and SA. The site does not have a high load of email, so
> processor load shouldn't be a big issue, which is not to say that I don't
> care about efficiency.

Then this script may be very close to what you want.

> > Take a look at the sa training script in 
> > http://www.impsec.org/~jhardin/antispam/
> > It is based on multiple users, local mailboxes (which the users may 
> > actually be accessing via IMAP) and global Bayes.

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Think Microsoft cares about your needs at all?
  "A company wanted to hold off on upgrading Microsoft Office for a
  year in order to do other projects. So Microsoft gave a 'free' copy
  of the new Office to the CEO -- a copy that of course generated
  errors for anyone else in the firm reading his documents. The CEO
  got tired of getting the 'please re-send in XX format' so he
  ordered other projects put on hold and the Office upgrade to be top
  priority."                                    -- Cringely, 4/8/2004
-----------------------------------------------------------------------
 11 days until Albert Einstein's 128th Birthday


Re: [2] best method for sa-learn

Posted by ram01 <ra...@yahoo.com>.
Yes, the learning folders are stored on the server. The same server is
running IMAP, MTA, and SA. The site does not have a high load of email, so
processor load shouldn't be a big issue, which is not to say that I don't
care about efficiency.
    

John D. Hardin wrote:
> 
> On Thu, 1 Mar 2007, ram01 wrote:
> 
>> OK, so if I have a global database but separate IMAP spam boxes,
>> how should I go about actually running sa-learn? From the cron
>> point of view I would have to enumerate all the users and do
>> /home/user/isSpam, /home/userb/isSpam, etc. I could also run it
>> in procmail; this would run more often but on smaller files.
>> What are my other options, and which is best?
> 
> Are the folders actually stored locally on the SA box, with IMAP
> used only for user access?
> 
> Take a look at the sa training script in 
> http://www.impsec.org/~jhardin/antispam/
> It is based on multiple users, local mailboxes (which the users may 
> actually be accessing via IMAP) and global Bayes.
> 



Re: [2] best method for sa-learn

Posted by "John D. Hardin" <jh...@impsec.org>.
On Thu, 1 Mar 2007, ram01 wrote:

> OK, so if I have a global database but separate IMAP spam boxes,
> how should I go about actually running sa-learn? From the cron
> point of view I would have to enumerate all the users and do
> /home/user/isSpam, /home/userb/isSpam, etc. I could also run it
> in procmail; this would run more often but on smaller files.
> What are my other options, and which is best?

Are the folders actually stored locally on the SA box, with IMAP
used only for user access?

Take a look at the sa training script in 
http://www.impsec.org/~jhardin/antispam/
It is based on multiple users, local mailboxes (which the users may 
actually be accessing via IMAP) and global Bayes.

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  The first time I saw a bagpipe, I thought the player was torturing
  an octopus. I was amazed they could scream so loudly.
                                        -- cat_herder_5263 on Y! SCOX
-----------------------------------------------------------------------
 12 days until Albert Einstein's 128th Birthday
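
For the layout discussed in this thread (per-user training folders on the same box, one global Bayes database), the cron approach comes down to a loop over the home directories. The following is a minimal sketch and is not the training script from the impsec.org page. It assumes each training folder is a single mbox file, that the job runs as the account that owns the global database, and that a ham folder called isHam exists next to isSpam (only isSpam is named in the thread). SA_LEARN and HOME_ROOT are hypothetical variables added for illustration.

```shell
#!/bin/bash
# Sketch only: feed every user's training folder into one Bayes database.
# SA_LEARN and HOME_ROOT are illustrative overrides, not SpamAssassin settings.
SA_LEARN="${SA_LEARN:-sa-learn}"
HOME_ROOT="${HOME_ROOT:-/home}"

# learn_all KIND FOLDER   (KIND is "spam" or "ham", FOLDER e.g. "isSpam")
learn_all() {
    local kind="$1" folder="$2" box
    for box in "$HOME_ROOT"/*/"$folder"; do
        # Skip users with no folder (unmatched glob) or an empty one.
        [ -s "$box" ] || continue
        # --mbox tells sa-learn the input is an mbox file.
        "$SA_LEARN" "--$kind" --mbox "$box" \
            || echo "sa-learn failed on $box" >&2
    done
}

# Run only when asked, so the function can also be sourced on its own.
if [ "${1:-}" = "run" ]; then
    learn_all spam isSpam
    learn_all ham  isHam
fi
```

Saved under some name of your choosing and called from crontab with the argument "run", this visits /home/user/isSpam, /home/userb/isSpam and so on in one pass. It leaves the folders untouched; emptying them afterwards, and handling Maildir directories instead of mbox files (the -s test and --mbox would both need changing), are left out of the sketch.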