Posted to users@spamassassin.apache.org by "Mathieu R." <ma...@400iso.net> on 2013/09/11 18:25:59 UTC

Antispam plugin / sa-learn

Hello,

Sorry for posting to both the spamassassin and dovecot lists: my question
is about the dovecot antispam plugin, used to train spamassassin with
sa-learn.

I wonder if there is a way to confirm that sa-learn is correctly fed by
the antispam plugin.

dovecot version : 2.1.7
spamassassin version : 3.3.2
 (both packaged in Debian stable, with postfix and amavis)

I configured dovecot's antispam plugin this way:
plugin {
  ...
#Antispam
  antispam_debug_target = syslog
  antispam_verbose_debug = 1
  antispam_backend = pipe
  antispam_trash = Trash
  antispam_spam = Junk
  antispam_allow_append_to_spam = no
  antispam_pipe_program = /srv/datadisk01/bin/sa-learn-pipe.sh
  antispam_pipe_program_spam_arg = --spam
  antispam_pipe_program_notspam_arg = --ham
}

referring to: http://wiki2.dovecot.org/Plugins/Antispam

I use this script to pipe the message to sa-learn:

#!/bin/sh
# Save the message from stdin to a temporary file, then hand it to sa-learn.
msg=/tmp/sendmail-msg-$$.txt
echo /usr/bin/sa-learn "$@" "$msg"
echo "$$-start ($*)" >> /tmp/sa-learn-pipe.log
#echo "$*" > /tmp/sendmail-parms.txt
cat > "$msg"
/usr/bin/sa-learn "$@" "$msg"
rm -f "$msg"
echo "$$-end" >> /tmp/sa-learn-pipe.log
exit 0

Here is what I get when I move a mail to the Junk folder:

Sep 11 18:10:10 effraie01 imap: antispam: plugin initialising (2.0-notgit)
Sep 11 18:10:10 effraie01 imap: antispam: verbose debug enabled
Sep 11 18:10:10 effraie01 imap: antispam: "Junk" is exact match spam folder
Sep 11 18:10:10 effraie01 imap: antispam: no unsure folders
Sep 11 18:10:10 effraie01 imap: antispam: "Trash" is exact match trash folder
Sep 11 18:10:10 effraie01 imap: antispam: pipe backend spam argument = --spam
Sep 11 18:10:10 effraie01 imap: antispam: pipe backend not-spam argument = --ham
Sep 11 18:10:10 effraie01 imap: antispam: pipe backend program = /srv/datadisk01/bin/sa-learn-pipe.sh
Sep 11 18:10:10 effraie01 imap: antispam: pipe backend tmpdir /tmp
Sep 11 18:11:10 effraie01 imap: antispam: plugin initialising (2.0-notgit)
Sep 11 18:11:10 effraie01 imap: antispam: verbose debug enabled
Sep 11 18:11:10 effraie01 imap: antispam: "Junk" is exact match spam folder
Sep 11 18:11:10 effraie01 imap: antispam: no unsure folders
Sep 11 18:11:10 effraie01 imap: antispam: "Trash" is exact match trash folder
Sep 11 18:11:10 effraie01 imap: antispam: pipe backend spam argument = --spam
Sep 11 18:11:10 effraie01 imap: antispam: pipe backend not-spam argument = --ham
Sep 11 18:11:10 effraie01 imap: antispam: pipe backend program = /srv/datadisk01/bin/sa-learn-pipe.sh
Sep 11 18:11:10 effraie01 imap: antispam: pipe backend tmpdir /tmp
Sep 11 18:12:04 effraie01 imap: antispam: mailbox_is_unsure(Junk): 0
Sep 11 18:12:04 effraie01 imap: antispam: mailbox_is_trash(INBOX): 0
Sep 11 18:12:04 effraie01 imap: antispam: mailbox_is_trash(Junk): 0
Sep 11 18:12:04 effraie01 imap: antispam: mail copy: from trash: 0, to trash: 0
Sep 11 18:12:04 effraie01 imap: antispam: mailbox_is_spam(INBOX): 0
Sep 11 18:12:04 effraie01 imap: antispam: mailbox_is_spam(Junk): 1
Sep 11 18:12:04 effraie01 imap: antispam: mailbox_is_unsure(INBOX): 0
Sep 11 18:12:04 effraie01 imap: antispam: mail copy: src spam: 0, dst spam: 1, src unsure: 0
Sep 11 18:12:04 effraie01 imap: antispam: running mailtrain backend program /srv/datadisk01/bin/sa-learn-pipe.sh
Sep 11 18:12:04 effraie01 imap: antispam: running mailtrain backend program /srv/datadisk01/bin/sa-learn-pipe.sh
Sep 11 18:12:04 effraie01 imap: antispam: running mailtrain backend program parameter 1 --spam

And here is what I get in /tmp/sa-learn-pipe.log:

10545-start (--spam)
10545-end

To me it looks like it is working, but when I run sa-learn --backup, I
just get this:

v       3       db_version # this must be the first line!!!
v       0       num_spam
v       0       num_nonspam

It's probably because I'm using ***STANDARD-ANTI-UBE-TEST-EMAIL***, which
probably teaches sa-learn nothing, but I wonder whether there is a log
somewhere, or something else, confirming that sa-learn correctly receives
the email I pipe to it.

Thanks a lot in advance.

--

Mathieu

Re: Antispam plugin / sa-learn

Posted by RW <rw...@googlemail.com>.
On Wed, 11 Sep 2013 18:25:59 +0200
Mathieu R. wrote:

> Hello,
> 
> Sorry for posting to both the spamassassin and dovecot lists: my
> question is about the dovecot antispam plugin, used to train
> spamassassin with sa-learn.
> 
> I wonder if there is a way to confirm that sa-learn is correctly fed
> by the antispam plugin.
> ...
> And here is what I get in /tmp/sa-learn-pipe.log:
> 
> 10545-start (--spam)
> 10545-end
> 
> To me it looks like it is working, but when I run sa-learn --backup,
> I just get this:
> 
> v       3       db_version # this must be the first line!!!
> v       0       num_spam
> v       0       num_nonspam
> 
> It's probably because I'm using ***STANDARD-ANTI-UBE-TEST-EMAIL***,
> which probably teaches sa-learn nothing,

It should still have been learned. Usually this kind of thing is due
to different invocations looking for the Bayes database in
different places.

If I were you, I'd modify the script to run sa-learn with -D bayes and
have it dump stderr to a file. If you are attempting to use per-Unix-user
databases, it might be useful to log $HOME as well.
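
The suggestion above could be sketched roughly as follows. This is only an
illustration, not a tested drop-in: the function name learn_debug and the
SA_LEARN / SA_LEARN_LOG environment overrides are made up for this example,
and the paths are the ones from the original script.

```shell
#!/bin/sh
# Hypothetical debugging wrapper: names and overrides are assumptions.
SA_LEARN="${SA_LEARN:-/usr/bin/sa-learn}"
LOG="${SA_LEARN_LOG:-/tmp/sa-learn-pipe.log}"

# Read one message on stdin, train with Bayes debugging enabled, and
# append the effective user, $HOME and sa-learn's stderr to $LOG.
learn_debug() {
    msg=$(mktemp "${TMPDIR:-/tmp}/sa-learn-msg.XXXXXX") || return 1
    cat > "$msg"
    {
        echo "$$-start ($*) user=$(id -un) HOME=$HOME"
        # -D bayes limits debug output to the Bayes subsystem;
        # 2>&1 sends that output into the log as well.
        "$SA_LEARN" -D bayes "$@" "$msg" 2>&1
        echo "$$-end (exit status $?)"
    } >> "$LOG"
    rm -f "$msg"
}
```

In the pipe script, the body would then end with `learn_debug "$@"` followed
by `exit 0`. The user= and HOME= values in the log, together with the debug
output, should show which Bayes database that invocation actually used.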


I'm sceptical that the Antispam plugin can learn enough ham this way.
As I understand it, the only mail that gets learnt as ham will be
false positives based on the overall spamassassin score, irrespective of
the Bayes result. Bayes needs (by default) 200 spams and 200 hams to even
start classifying, and much more for optimal results. I don't expect to
get 200 FPs in the rest of my life. Unless this is a high-volume server
with a shared database, I'd suggest either learning a few thousand hams
manually, or implementing an unsure folder. You can also mitigate the
problem by autotraining with a high ham threshold, but then you
really need to be careful to move all spam to the spam folder.
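
To see how far a given database is from the 200/200 minimum mentioned above,
the counters printed by sa-learn --dump magic can be read. The helper below
is a small sketch: the function name bayes_counts and the SA_LEARN override
are made up for this example, and it assumes the usual dump layout in which
the count is the third column and the line ends in "nspam" or "nham".

```shell
#!/bin/sh
# Hypothetical helper: the name and the SA_LEARN override are assumptions.
SA_LEARN="${SA_LEARN:-/usr/bin/sa-learn}"

# Print the spam and ham message counts of the Bayes database that the
# *current* user's configuration points at.
bayes_counts() {
    "$SA_LEARN" --dump magic 2>/dev/null | awk '
        $NF == "nspam" { spam = $3 }
        $NF == "nham"  { ham  = $3 }
        END { printf "nspam=%d nham=%d\n", spam, ham }'
}
```

The counts belong to whichever database the invoking user sees, so this is
best run as the same user as the imap process. Ham can be learnt manually
with something like sa-learn --ham --progress followed by the path to a
folder of known-good mail.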