Posted to users@spamassassin.apache.org by "Charles H. Shooshan III" <ch...@snet.net> on 2005/02/16 05:54:32 UTC

bayes_ignore_header help

Hi!

We've just successfully set up SA 3.0.2 on our Apple Xserve (Mac OS X 
10.3.8), using Postfix, procmail, and SquirrelMail.

I have trained the Bayes database with some spam archives, and I would 
like our users to send mail to a specific mailbox for training.

For certain users, I have had them add IsSpam and NotSpam folders in 
their SquirrelMail interfaces, and I have been able to use the 
command:

sa-learn --showdots --spam /var/spool/imap/user/<username>/IsSpam/*.

This works nicely because the Cyrus IMAP spool stores each individual 
message as a file named <some_number><period>, so the trailing "*." 
matches only the messages.
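
The same selection can be sketched in Python for anyone scripting 
the training run. This is only an illustration, assuming the spool 
layout described above; the function name cyrus_message_files is 
hypothetical and not part of SA or Cyrus.

```python
import re
from pathlib import Path

# Assumption from the description above: each message file is named
# "<some_number>." (digits followed by a single period).
MESSAGE_NAME = re.compile(r"^\d+\.$")


def cyrus_message_files(folder):
    """Return the message files in one folder, ordered by message number.

    Anything that does not match the "<number>." pattern is skipped.
    """
    paths = [
        p for p in Path(folder).iterdir()
        if p.is_file() and MESSAGE_NAME.match(p.name)
    ]
    # Sort numerically so that "10." comes after "2.".
    return sorted(paths, key=lambda p: int(p.name[:-1]))
```

The resulting paths could then be handed to sa-learn in place of the 
shell glob.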

I would also like every user to be able to "bounce" mail to a special 
spamreport account. To that end, I have added a SquirrelMail "bounce" 
plug-in, which comes closest to letting our users "bounce" mail 
unscathed. Unfortunately, when a message is bounced, the Return-Path 
header is rewritten to the address of the person who bounced it (our 
own user).

Here are the two sets of headers (the original and after the bounce) 
with a little bit of personal identification removed:

The Original:
==========

Return-Path: <so...@snet.net>
Received: from mail.my_domain.org ([unix socket]) (authenticated 
user=this_is_me bits=0)
     by mail.my_domain.org (Cyrus v2.1.13) with LMTP; Tue, 15 Feb 2005 
23:11:02 -0500
X-Sieve: CMU Sieve 2.2
Received: from smtp812.mail.sc5.yahoo.com (smtp812.mail.sc5.yahoo.com 
[66.163.170.82])
     by mail.my_domain.org (Postfix) with SMTP id 8053D22D779
     for <ch...@my_domain.org>; Tue, 15 Feb 2005 23:10:58 -0500 (EST)
Received: from unknown (HELO ?192.168.2.5?) 
(someone_else@snet.net@66.159.222.222 with plain)
     by smtp812.mail.sc5.yahoo.com with SMTP; 16 Feb 2005 04:10:57 -0000
Mime-Version: 1.0 (Apple Message framework v619.2)
Content-Transfer-Encoding: 7bit
Message-Id: <a7...@snet.net>
Content-Type: text/plain; charset=US-ASCII; format=flowed
To: Charlie Itsme <ch...@my_domain.org>
From: Someone Else <so...@snet.net>
Subject: Test1
Date: Tue, 15 Feb 2005 23:10:56 -0500
X-Mailer: Apple Mail (2.619.2)
X-Spam-Level:
X-Spam-Status: No, score=-1.8 required=4.0 tests=AWL,BAYES_00 
autolearn=ham
     version=3.0.2
X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on 
mail.my_domain.org

After the Bounce:
=============

Return-Path: <Ch...@my_domain.org>
Received: from mail.my_domain.org ([unix socket]) (authenticated 
user=bounce_recipient bits=0)
     by mail.my_domain.org (Cyrus v2.1.13) with LMTP; Tue, 15 Feb 2005 
23:12:28 -0500
X-Sieve: CMU Sieve 2.2
Received: from mail.my_domain.org (localhost [127.0.0.1])
     by mail.my_domain.org (Postfix) with ESMTP id CBF1F22D7D6
     for <jo...@my_domain.org>; Tue, 15 Feb 2005 23:12:26 -0500 (EST)
Received: from 66.159.222.222
     (SquirrelMail authenticated user this_is_me);
     by mail.my_domain.org with HTTP;
     Tue, 15 Feb 2005 23:12:26 -0500 (EST)
X-Received: from mail.my_domain.org ([unix socket]) (authenticated
     user=this_is_me bits=0) by mail.my_domain.org (Cyrus v2.1.13)
     with LMTP; Tue, 15 Feb 2005 23:11:02 -0500
X-Sieve: CMU Sieve 2.2
X-Received: from smtp812.mail.sc5.yahoo.com (smtp812.mail.sc5.yahoo.com
     [66.163.170.82]) by mail.my_domain.org (Postfix) with SMTP id
     8053D22D779 for <ch...@my_domain.org>; Tue,
     15 Feb 2005 23:10:58 -0500 (EST)
X-Received: from unknown (HELO ?192.168.2.5?)
     (someone_else@snet.net@66.159.222.222 with plain) by
     smtp812.mail.sc5.yahoo.com with SMTP; 16 Feb 2005 04:10:57 -0000
Mime-Version: 1.0 (Apple Message framework v619.2)
Content-Transfer-Encoding: 7bit
Message-Id: <a7...@snet.net>
Content-Type: text/plain; charset=US-ASCII; format=flowed
To: Charlie Itsme <ch...@my_domain.org>
From: Someone Else <so...@snet.net>
Subject: Test1
Date: Tue, 15 Feb 2005 23:10:56 -0500
X-Mailer: Apple Mail (2.619.2)
ReSent-Date: Tue, 15 Feb 2005 23:12:26 -0500 (EST)
Resent-From: "Charlie Itsme" <Ch...@my_domain.org>
Resent-To: john@mail.my_domain.org
ReSent-Message-ID: 
<56...@66.159.222.222>
X-Spam-Level:
X-Spam-Status: No, score=-3.0 required=4.0 
tests=ALL_TRUSTED,AWL,BAYES_00,
     BLANK_LINES_70_80 autolearn=ham version=3.0.2
X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on 
mail.my_domain.org

========

I think I should add (in local.cf):

bayes_ignore_header ReSent-Date
bayes_ignore_header ReSent-From
bayes_ignore_header ReSent-To
bayes_ignore_header ReSent-Message-ID

I don't believe I have to add any of the X-Spam headers since SA knows 
about them.

But I am very reluctant to add:

bayes_ignore_header Return-Path

since that header seems important when training on mail from every 
source other than these "bounced" messages.

So my questions:

Is SA smart enough to ignore the Return-Path if it's bounced mail?

Will adding "bayes_ignore_header Return-Path" undermine the training?

Should I somehow copy these bounced messages and strip out the 
Return-Path in these specific messages before training?
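
A minimal sketch of that stripping step in Python, assuming it is run 
on copies of the bounced messages rather than on the spool files 
themselves. The names BOUNCE_HEADERS and strip_bounce_headers are 
hypothetical. It only removes the headers added or rewritten by the 
bounce; it cannot restore the original Return-Path.

```python
from email import message_from_bytes
from email.policy import compat32

# Headers that the bounce adds or rewrites, per the listing above.
# Header lookup is case-insensitive, so "Resent-Date" also covers "ReSent-Date".
BOUNCE_HEADERS = (
    "Return-Path",
    "Resent-Date",
    "Resent-From",
    "Resent-To",
    "Resent-Message-ID",
)


def strip_bounce_headers(raw, headers=BOUNCE_HEADERS):
    """Return the message bytes with the given headers removed."""
    msg = message_from_bytes(raw, policy=compat32)
    for name in headers:
        # Deleting removes every occurrence and is a no-op if absent.
        del msg[name]
    return msg.as_bytes()
```

Each cleaned copy would be written to a scratch directory and that 
directory passed to sa-learn.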

Any other suggestions (forward and forward as attachment didn't seem to 
provide me with any better alternatives)?
[I was going to add shared IMAP folders but I can't seem to get past a 
"Can't locate Cyrus/IMAP/Shell.pm in @INC ..." error trying to start 
cyradm.]

Thanks for any help,
Charlie