Posted to users@spamassassin.apache.org by Florian Lindner <ma...@xgm.de> on 2013/10/15 02:07:56 UTC

When/How to train bayes from user mail?

Hey!

I manage the email accounts for my family and a couple of friends. They are 
located on a (virtual) server that I administer myself.

Since we are moving our server (and upgrading from oldstable to stable), I want 
to reconsider how I organize mail on the server side.

Debian, MTA is postfix, MDA maildrop (like procmail), IMAP was courier, will be 
dovecot.

The planned setup is as follows:

All mail is spam-filtered by spamassassin. Mail with a very high score (e.g. 
above 10) is deleted immediately.

Mail classified as spam is delivered to UserMaildir/.Spam (= IMAP Spam 
folder).
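
A minimal sketch of that routing decision, assuming spamassassin has already 
added its usual X-Spam-Status header (of the form "Yes, score=12.3 
required=5.0 ..."). The function name route, the constant DISCARD_SCORE and 
the returned labels are made up for illustration; the real delivery would be 
done by the MDA, not by Python.

```python
import re
from email import message_from_string

# Assumed cut-off for immediate deletion, taken from the "e.g. 10" in the post.
DISCARD_SCORE = 10.0

# Matches the "score=..." field of an X-Spam-Status header.
_SCORE_RE = re.compile(r"score=(-?\d+(?:\.\d+)?)")


def route(raw_message: str) -> str:
    """Return 'discard', '.Spam' or 'INBOX' for an already-scanned message."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    match = _SCORE_RE.search(status)
    # A message without a parsable score is treated as score 0 (assumption).
    score = float(match.group(1)) if match else 0.0
    if score > DISCARD_SCORE:
        return "discard"
    if status.lstrip().lower().startswith("yes"):
        return ".Spam"
    return "INBOX"
```

Keeping the cut-off in one constant makes it easy to raise it while the 
filter is still untrained and false positives are more likely.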

My biggest open question is how to integrate the SA bayes filter, esp. when and 
on what folders to do training.

a) All users use the bayes filter, but I train it only on my own spam/ham. 
Probably not a good idea, I think.

b) Learn spam daily from each user's Spam directory; this may reinforce 
false positives. The open question remains how to learn ham. I could learn 
from every user's inbox. False negatives will be moved to Spam later, and the 
bayes filter will forget that they were ham once they are learned as spam.

c) Learn ham as in b) and spam only from my own mail.
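
For illustration, option b) could be driven by a small daily job along these 
lines. This is only a sketch: it assumes a Maildir++ layout with the spam 
folder at .Spam, it learns only from the cur subdirectories, and the helper 
name learn_commands is hypothetical. It only builds the sa-learn command 
lines (using the real --spam and --ham switches); actually running them, and 
as which user, depends on how the bayes database is stored.

```python
from pathlib import Path
from typing import List


def learn_commands(maildir: Path) -> List[List[str]]:
    """Build (but do not run) the sa-learn calls for one user's Maildir."""
    spam_dir = maildir / ".Spam" / "cur"  # messages filed or moved into Spam
    ham_dir = maildir / "cur"             # inbox messages the client has seen
    return [
        ["sa-learn", "--spam", str(spam_dir)],
        ["sa-learn", "--ham", str(ham_dir)],
    ]


if __name__ == "__main__":
    # Hypothetical path, shown only to illustrate the output.
    for command in learn_commands(Path("/home/alice/Maildir")):
        print(" ".join(command))
```

Each command could then be passed to subprocess.run from a cron job, once 
per user.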

Though I'm quite computer literate, most of my users are not. I do the IMAP 
account setup; it's fine for them to move a message from/to Spam, but not a 
lot more.

What do you think?

Thx!
Florian

Re: When/How to train bayes from user mail?

Posted by Tom Hendrikx <to...@whyscream.net>.
On 10/15/2013 09:03 PM, Florian Lindner wrote:
> On Tuesday, 15 October 2013, 07:19:01, Andreas Schulze wrote:
>> Quoting Florian Lindner <ma...@xgm.de>:
>>> Since we are moving our server (and upgrading from oldstable to stable),
>>> I want to reconsider how I organize mail on the server side.
>>>
>>> Debian, MTA is postfix, MDA maildrop (like procmail), IMAP was
>>> courier, will be dovecot.
>>
>> if you use dovecot, maildrop is obsolete.
>> deliver your mail via LMTP (or dovecot-lda) to dovecot and let
>> dovecot-sieve do the filtering to subfolders.
> 
> I object.
> AFAIK, when using dovecot as an LDA there is a 1:1 relation between mail 
> address and mailbox. When using maildrop I can deliver multiple addresses 
> to a single maildir or one address to multiple maildirs.

I would really keep those requirements at the MTA level using aliasing
and whatnot. Delivering a message to multiple folders within a single
maildir (duplication) can be done using sieve.

> Additionally, sieve cannot call 
> external programs.

Wrong. http://wiki2.dovecot.org/Pigeonhole/Sieve lists several (albeit
experimental) extensions that can run external programs. Setup is not as
easy as with procmail (sorry, no maildrop experience here) but it gives
the administrator control over what disasters can be triggered when
email triggers execution of external tools set up by users.

>  
>> Also consider using amavisd-new + clamav + spamassassin to REJECT
>> mails. (not accept + delete)
>> You may connect amavisd-new as SMTPD_PROXY or using amavisd-milter to
>> your postfix MTA.
>>
>>> My biggest open question is how to integrate the SA bayes filter,
>>> esp. when and on what folders to do training.
>>
>> I train sa only using the autolearn feature.
> 

dovecot has the antispam plugin, which can be used to call sa-learn (or
any other tool) for each message that is moved in/out of the spam
folder. See http://wiki2.dovecot.org/Plugins/Antispam for details
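
The decision such a hook has to make can be sketched as follows. This is not 
the plugin's code or configuration, just an illustration of the rule "moved 
into Spam means learn as spam, moved out of Spam means learn as ham". The 
folder names and the function learn_flag are assumptions; ignoring moves to 
Trash is also an assumption, made so that deleting spam does not train it as 
ham.

```python
from typing import Optional

SPAM_FOLDER = "Spam"        # folder name used in the original post
IGNORED_FOLDERS = {"Trash"}  # assumed: moves to or from these never train


def learn_flag(source: str, destination: str) -> Optional[str]:
    """Return the sa-learn switch for a message move, or None for no training."""
    if source in IGNORED_FOLDERS or destination in IGNORED_FOLDERS:
        return None
    if destination == SPAM_FOLDER and source != SPAM_FOLDER:
        return "--spam"  # user flagged a false negative
    if source == SPAM_FOLDER and destination != SPAM_FOLDER:
        return "--ham"   # user rescued a false positive
    return None
```

With this approach, training happens only on messages a user has explicitly 
corrected, so no daily scan of whole folders is needed.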

Regards,
	Tom


Re: When/How to train bayes from user mail?

Posted by Florian Lindner <ma...@xgm.de>.
On Tuesday, 15 October 2013, 07:19:01, Andreas Schulze wrote:
> Quoting Florian Lindner <ma...@xgm.de>:
> > Since we are moving our server (and upgrading from oldstable to stable),
> > I want to reconsider how I organize mail on the server side.
> > 
> > Debian, MTA is postfix, MDA maildrop (like procmail), IMAP was
> > courier, will be dovecot.
> 
> if you use dovecot, maildrop is obsolete.
> deliver your mail via LMTP (or dovecot-lda) to dovecot and let
> dovecot-sieve do the filtering to subfolders.

I object.
AFAIK, when using dovecot as an LDA there is a 1:1 relation between mail 
address and mailbox. When using maildrop I can deliver multiple addresses to 
a single maildir or one address to multiple maildirs. Additionally, sieve 
cannot call external programs.
 
> Also consider using amavisd-new + clamav + spamassassin to REJECT
> mails. (not accept + delete)
> You may connect amavisd-new as SMTPD_PROXY or using amavisd-milter to
> your postfix MTA.
> 
> > My biggest open question is how to integrate the SA bayes filter,
> > esp. when and on what folders to do training.
> 
> I train sa only using the autolearn feature.

Ok.

Regards,
Florian


Re: When/How to train bayes from user mail?

Posted by Andreas Schulze <sc...@andreasschulze.de>.
Quoting Florian Lindner <ma...@xgm.de>:

> Since we are moving our server (and upgrading from oldstable to stable),
> I want to reconsider how I organize mail on the server side.
>
> Debian, MTA is postfix, MDA maildrop (like procmail), IMAP was  
> courier, will be dovecot.

if you use dovecot, maildrop is obsolete.
deliver your mail via LMTP (or dovecot-lda) to dovecot and let  
dovecot-sieve do the filtering to subfolders.

Also consider using amavisd-new + clamav + spamassassin to REJECT  
mails. (not accept + delete)
You may connect amavisd-new as SMTPD_PROXY or using amavisd-milter to  
your postfix MTA.

> My biggest open question is how to integrate the SA bayes filter,  
> esp. when and on what folders to do training.
I train sa only using the autolearn feature.

Andreas
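
As a rough illustration of what autolearn does: SpamAssassin feeds a message 
to bayes automatically when its score falls below 
bayes_auto_learn_threshold_nonspam (default 0.1) or above 
bayes_auto_learn_threshold_spam (default 12.0). The sketch below models only 
that score comparison; the real feature applies further conditions, and the 
function name autolearn_class is made up for this example.

```python
from typing import Optional

# SpamAssassin defaults for the two autolearn thresholds.
NONSPAM_THRESHOLD = 0.1   # bayes_auto_learn_threshold_nonspam
SPAM_THRESHOLD = 12.0     # bayes_auto_learn_threshold_spam


def autolearn_class(score: float) -> Optional[str]:
    """Simplified model: which class, if any, a score would be learned as."""
    if score < NONSPAM_THRESHOLD:
        return "ham"
    if score > SPAM_THRESHOLD:
        return "spam"
    return None  # scores in between are not learned automatically
```

Messages in the wide middle band are never learned this way, which is why 
manual training from user folders can still be useful alongside autolearn.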