Posted to users@spamassassin.apache.org by Nels Lindquist <nl...@maei.ca> on 2005/10/11 22:51:01 UTC
Problems with sa-learn and Fetchmail
Hi there.
I'm trying to set up an IMAP based Bayesian training system using
fetchmail as per the RemoteIMAPFolder and SingleUserUNIXInstall
sections of the SpamAssassin wiki.
I'm running into difficulty with messages which have been marked up
with report_safe by spamassassin. When I retrieve such messages with
Fetchmail and feed them to sa-learn directly, the markup is not
detected and removed and the messages are learned improperly.
Since I'm testing this with Dovecot using Maildir-based folders, I
was able to change into the appropriate Maildir directory and run
sa-learn on the mail messages directly from the filesystem. When I do
that, the SA markup is properly detected and removed prior to
learning. Similarly, if I tell fetchmail to dump a message to a
text file instead of directly to sa-learn, then the resulting text
file is identical to the Maildir mail message (assuming I use the
--invisible option for fetchmail), and sending it to sa-learn results
in the SA markup once again being properly detected and removed prior
to learning.
The problem then seems to be caused by the fetchmail process. I
noticed when using the "-v" (verbose) option with fetchmail and
"sa-learn -D --spam" that the message header and body are retrieved
separately, and sa-learn seems to start its processing before the
message body is retrieved from the IMAP server:
fetchmail: IMAP> A0010 FETCH 3 RFC822.HEADER
fetchmail: IMAP< * 3 FETCH (RFC822.HEADER {1262}
reading message defang@rapier.smilodon.ca:3 of 3 (1262 header octets)
fetchmail: about to deliver with: sa-learn -D --spam
#
fetchmail: IMAP< )
fetchmail: IMAP< A0010 OK Fetch completed.
fetchmail: IMAP> A0011 FETCH 3 BODY[TEXT]
[6637] dbg: logger: adding facilities: all
[6637] dbg: logger: logging level is DBG
[6637] dbg: generic: SpamAssassin version 3.1.0
[6637] dbg: config: score set 0 chosen.
[ .... lots more SA dbg lines here ... ]
*.****************.*****************.****************.****************
*.****************.*****************.*****************.***************
**
fetchmail: IMAP< )
fetchmail: IMAP< A0011 OK Fetch completed.
[6626] dbg: learn: learning spam
[6626] dbg: dns: dns_available set to yes in config file, skipping test
[6626] dbg: metadata: X-Spam-Relays-Trusted:
[6626] dbg: metadata: X-Spam-Relays-Untrusted:
[6626] dbg: message: ---- MIME PARSER START ----
[6626] dbg: message: main message type: text/plain
[6626] dbg: message: parsing normal part
[6626] dbg: message: added part, type: text/plain
[6626] dbg: message: ---- MIME PARSER END ----
[6626] dbg: message: no encoding detected
[ .... SA processing continues .... ]
At no point is there a "dbg: markup: removing markup" line as there
is when I run sa-learn on the message files directly. My theory is
that fetchmail is feeding the message header and body as two separate
events, and sa-learn isn't detecting them as a single message.
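One way to test this theory would be to put a small buffering wrapper between fetchmail and sa-learn, so that sa-learn only ever sees a complete message file, as it does when run on the Maildir files directly. The following is only a sketch under that assumption; the function name buffer_then_run is hypothetical and not part of fetchmail or SpamAssassin:

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not a fetchmail or SA feature).
# buffer_then_run CMD [ARGS...]
#   Reads all of stdin into a temporary file, then runs CMD ARGS... with the
#   temporary file's path appended as the final argument. CMD therefore never
#   starts until the whole message has arrived.
buffer_then_run() {
    _btr_tmp=$(mktemp) || return 1
    # cat returns only at end of input, i.e. after header AND body are written.
    cat > "$_btr_tmp"
    "$@" "$_btr_tmp"
    _btr_status=$?
    rm -f "$_btr_tmp"
    return "$_btr_status"
}
```

A wrapper script that defines this function and ends with "buffer_then_run sa-learn -D --spam" could then be given to fetchmail as the delivery command in place of calling sa-learn directly. If the "removing markup" debug line appears with the wrapper in place, that would support the theory; if it still does not appear, the cause lies elsewhere.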
Any ideas?
----
Nels Lindquist <*>
Information Systems Manager
Morningstar Air Express Inc.
Re: Problems with sa-learn and Fetchmail
Posted by Michael Monnerie <m....@zmi.at>.
On Tuesday, 11 October 2005 22:51 Nels Lindquist wrote:
> Any ideas?
I use:
sudo -H -u $user fetchmail -a -s -n -p IMAP --folder 'SPAM_yes' \
  --auth 'password' \
  -m "formail -d -I \"From \" -a \"From \" -s >>$checkspam" \
  $imapserver
Possibly you need the "From " Header? Anyway, afterwards I do:
sudo -H -u $user spamassassin -r --mbox $checkspam
and I *could* do:
formail <$checkspam -n 3 -s "tee >(spamc -u $user -L spam)|spamc -u $user -C report"
but that doesn't train the per-user Bayes database, while calling
spamassassin does. Works nicely and as expected.
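Before processing the collected file, it can be worth checking that it really contains one "From " separator line per fetched message. A minimal sketch, assuming an mbox file in which any body lines beginning with "From " have been escaped; the function name count_mbox_messages is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the number of mbox "From " separator lines in FILE.
# Assumes body lines starting with "From " are escaped (e.g. ">From ").
count_mbox_messages() {
    grep -c '^From ' "$1"
}
```

Running "count_mbox_messages $checkspam" after the fetchmail step and comparing the result with the number of messages in the IMAP folder shows whether every message was appended with its separator.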
mfg zmi
--
// Michael Monnerie, Ing.BSc --- it-management Michael Monnerie
// http://zmi.at Tel: 0660/4156531 Linux 2.6.11
// PGP Key: "lynx -source http://zmi.at/zmi2.asc | gpg --import"
// Fingerprint: EB93 ED8A 1DCD BB6C F952 F7F4 3911 B933 7054 5879
// Keyserver: www.keyserver.net Key-ID: 0x70545879