Posted to users@spamassassin.apache.org by Rodney Baker <ro...@jeremiah31-10.net> on 2011/08/15 16:57:21 UTC

Inconsistent spam scores between spam headers and rewritten subject line.

Hi all. I'm running SpamAssassin 3.3.1 on my openSUSE 11.2 box at home. Mail 
is collected from multiple ISP mail accounts via fetchmail and delivered to 
local IMAP mail folders via procmail. My user account's .procmailrc file begins 
thus:

   LOGFILE=$HOME/pm.log

   :0fw: spamassassin.lock 
   | spamc
 

   :0
   * ^Subject.*SPAM\([0-9]{1,3}\.[0-9]\).*
   $HOME/Maildir/.Spam//

I'm attempting to filter on the modified subject line (which for some reason 
isn't working - that rule never seems to match and spam never gets moved into 
the Spam folder, even though I've tested the regex manually). I thought of 
filtering on the X-Spam-Status header instead, but when I had a look at a 
message that was marked as Spam (according to the subject line) I found 
something rather strange...

   X-Virus-Flag: no
   X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
         <my.local.mailhost.name.removed>
   X-Spam-Level: *
   X-Spam-Status: No, score=1.5 required=6.5
         tests=BAYES_00,IMPOTENCE,NO_RELAYS
         autolearn=no version=3.3.1
   X-Spam-Virus: No
   Received: from localhost by <my.local.mailhost.name.removed>
         with SpamAssassin (version 3.3.1);
         Mon, 15 Aug 2011 18:58:01 +0930
   From: "Adele Key" <spam.address.removed>
   To: another.user@iinet.net.au
   Subject: ****SPAM(10.1)**** <spam-subject-removed>
   Date: Mon, 15 Aug 2011 18:12:48 +0900
   Message-Id: <16...@spamdomain.removed>
   MIME-Version: 1.0
   Content-Type: multipart/mixed;
         boundary="----------=_4E48E6A1.127A41A2"
   X-Length: 7330
   X-UID: 83487
   X-KMail-Filtered: 61220
   Status: R
   X-Status: N
   X-KMail-EncryptionState: 
   X-KMail-SignatureState: 
   X-KMail-MDN-Sent: 
 
  Spam detection software, running on the system 
  <my.local.mailhost.name.removed>, has
  identified this incoming email as possible spam.  The original message
  has been attached to this so you can view it (if it isn't spam) or label
  similar future email.  If you have any questions, see
  postmaster for details.


  Content preview:  [...]


  Content analysis details:   (10.1 points, 6.5 required)


    pts rule name              description
   ---- ---------------------- ----------------------------------------------
    3.8 KB_DATE_CONTAINS_TAB   KB_DATE_CONTAINS_TAB
    3.0 IMPOTENCE              BODY: Impotence cure
   -0.0 BAYES_20               BODY: Bayes spam probability is 5 to 20%
                               [score: 0.1050]
    2.0 KB_FAKED_THE_BAT       KB_FAKED_THE_BAT
    1.2 RDNS_NONE              Delivered to internal network by a host with no
                               rDNS


I don't get it - the content analysis shows a score of 10.1, the modified 
subject line shows 10.1, but the X-Spam-Status header shows 1.5! What have I 
messed up in my configuration?

My /etc/mail/spamassassin/local.cf looks like this:

   # Add your own customisations to this file.  See 'man Mail::SpamAssassin::Conf'
   # for details of what can be tweaked.
   # 


   # do not change the subject
   # to change the subject, e.g. use
   # rewrite_header Subject ****SPAM(_SCORE_)****
   rewrite_header subject ****SPAM(_SCORE_)****

   # Set the score required before a mail is considered spam.
   # required_score 5.00

   # uncomment, if you do not want spamassassin to create a new message
   # in case of detecting spam
   # report_safe 0

   # Enhance the uridnsbl_skip_domain list with some usefull entries
   # Do not block the web-sites of Novell and SUSE
   ifplugin Mail::SpamAssassin::Plugin::URIDNSBL
   uridnsbl_skip_domain suse.de opensuse.org suse.com suse.org
   uridnsbl_skip_domain novell.com novell.org novell.ru novell.de novell.hu novell.co.uk
   uridnsbl_skip_domain kernel.org
   endif   # Mail::SpamAssassin::Plugin::URIDNSBL
   # Everything above this line is as per the installed openSuSE default
   
   ok_languages en

   #The combination of SpamAssassin + The Bat! as mail client can cause false positives.
   #The reason for the high spam rating is the Reply-To header inserted by mailman,
   #which seems to have more quoting than The Bat! can do.
   #If you have such problem activate the next two lines
   #header IS_MAILMAN exists:X-Mailman-Version
   #score IS_MAILMAN -2
   required_score 6.5
   whitelist_from <multiple mailing daemon addresses>
   [...]
   use_bayes 1
   report_header 1
   fold_headers 1
   report_safe 2

Thanks in advance.
Rodney.
-- 
======================================================
Rodney Baker
rodney@jeremiah31-10.net
web: www.jeremiah31-10.net
======================================================

Re: Inconsistent spam scores between spam headers and rewritten subject line.

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Tue, 2011-08-16 at 22:29 +0930, Rodney Baker wrote:
> On Tue, 16 Aug 2011 05:02:20 John Hardin wrote:

> > Just as a test, if you comment that bit out of your personal .procmailrc
> > does everything work the way you'd expect (i.e. one SA pass, the correct
> > score in the X- headers)?
> 
> Yep, that was the first thing that I did. Somehow spamassassin is still 
> checking the messages, even though they're not being piped through spamc via 
> procmail. I'm sure that fetchmail isn't doing it, so that leaves sendmail, 
> dovecot or kmail. So begins the process of elimination (or maybe I just leave 
> it out of procmailrc and be done with it...).

If you don't use Delivery Control Options with fetchmail (see that
section in the man pages) like an explicit MDA or SMTP, this should not
be where SA gets invoked. You don't, do you? The default is to pass it
on to port 25, which should just be your Sendmail.

A site-wide procmail configuration doesn't exist, as you mentioned in
another reply to this thread.

Dovecot will not filter messages. It's an IMAP server that serves what
has been delivered already. The dovecot MDA could, but you seem to use
procmail for direct delivery into the Maildir store. Another one to rule
out.

KMail, as an MUA, must not modify delivered mail (and doesn't), so while
it could call SA again, you wouldn't see SA headers in the stored message.
Both Dovecot and KMail come after the procmail recipe you initially showed
anyway, so there's no chance they could cause the matching issues you
reported.

That leaves Sendmail as the place in the chain to dig further...

After all, procmail already sees SA headers even with the spamc filter
recipe commented out. What you're hunting for sits before procmail in the
chain.


Regarding "leaving it out of procmail" and being done with it -- maybe.
This is likely to bite later, though. If it is before procmail, odds are
it's using a site-wide user. Which implies Bayes training has to be done
as that user, not the recipient...


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Re: Inconsistent spam scores between spam headers and rewritten subject line.

Posted by Rodney Baker <ro...@jeremiah31-10.net>.
On Tue, 16 Aug 2011 05:02:20 John Hardin wrote:
> On Tue, 16 Aug 2011, Rodney Baker wrote:
> >   :0fw: spamassassin.lock
> >   | spamc
> 
> Just as a test, if you comment that bit out of your personal .procmailrc
> does everything work the way you'd expect (i.e. one SA pass, the correct
> score in the X- headers)?

Yep, that was the first thing that I did. Somehow spamassassin is still 
checking the messages, even though they're not being piped through spamc via 
procmail. I'm sure that fetchmail isn't doing it, so that leaves sendmail, 
dovecot or kmail. So begins the process of elimination (or maybe I just leave 
it out of procmailrc and be done with it...).

Thanks,
Rodney.


Re: Inconsistent spam scores between spam headers and rewritten subject line.

Posted by John Hardin <jh...@impsec.org>.
On Tue, 16 Aug 2011, Rodney Baker wrote:

>   :0fw: spamassassin.lock
>   | spamc

Just as a test, if you comment that bit out of your personal .procmailrc 
does everything work the way you'd expect (i.e. one SA pass, the correct 
score in the X- headers)?

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   ...for a nation to tax itself into prosperity is like a man
   standing in a bucket and trying to lift himself up by the handle.
                                                  -- Winston Churchill
-----------------------------------------------------------------------
  Today: the 66th anniversary of the end of World War II

Re: Inconsistent spam scores between spam headers and rewritten subject line.

Posted by Bowie Bailey <Bo...@BUC.com>.
On 8/16/2011 8:55 AM, Rodney Baker wrote:
> On Tue, 16 Aug 2011 07:36:05 Karsten Bräckelmann wrote:
>
>> After you fixed your mail processing chain to not have SA chew twice on
>> the spam -- you should manually train Bayes, feeding it a lot of hand
>> classified spam, and possibly ham. Check your 'sa-learn --dump magic'
>> numbers. The Bayes score of 0.1 is way out of line.
> Agreed. I do run sa-learn --spam (actually now have it scheduled to run weekly 
> on a folder into which I drop all the non-classified spam messages) and --ham 
> (on a folder with messages that were false-positives).


When you are trying to fix a Bayes problem, it can be useful to feed it
as much as possible.  Put *all* your ham and *all* your spam (properly
classified or not) into those folders and let Bayes learn from it.

-- 
Bowie

Re: Inconsistent spam scores between spam headers and rewritten subject line.

Posted by Rodney Baker <ro...@jeremiah31-10.net>.
On Tue, 16 Aug 2011 07:36:05 Karsten Bräckelmann wrote:
> On Tue, 2011-08-16 at 01:07 +0930, Rodney Baker wrote:
> > On Tue, 16 Aug 2011 00:48:13 Bowie Bailey wrote:
> > > >    * ^Subject.*SPAM\([0-9]{1,3}\.[0-9]\).*
> > > >    $HOME/Maildir/.Spam//
> > > > 
> > > > I'm attempting to filter on the modified subject line (which for some
> > > > reason isn't working - that rule never seems to match and spam never
> > > > gets moved into the Spam folder, even though I've tested the regex
> > > > manually). I thought of filtering on the X-Spam-Status header
> > > > instead, but when I had a look at a message that was marked as Spam
> > > > (according to the subject line) I found something rather strange...
> 
> Yes, filtering on the SA X-Spam Status or Level headers is the way to
> go. After you found and fixed where SA gets called a second time
> (actually the first time), these won't be harmed and overwritten -- and
> useful for filtering.
> 
> Anyway, the secret why the above procmail recipe doesn't work is simply,
> because procmail uses a rather limited sub-set of REs and its own
> flavor. It's not PCRE.
> 
> In particular procmail does not understand {x,y} range quantifiers, but
> treats that part as a plain string to match. Which doesn't.
> (Caveat: From memory, not actually looked it up again for verification.)

Ah, thank you. Despite googling for lots of stuff on procmail I've not been 
able to find a definitive reference for what can and can't be used in a 
procmail recipe. Maybe I just haven't used the right search terms (or maybe I 
just haven't understood what I've read). Anyway, thanks for the clarification.

> 
> > > >     3.8 KB_DATE_CONTAINS_TAB   KB_DATE_CONTAINS_TAB
> > > >     3.0 IMPOTENCE              BODY: Impotence cure
> > > >    -0.0 BAYES_20               BODY: Bayes spam probability is 5 to 20%
> > > >                                [score: 0.1050]
> > > >     2.0 KB_FAKED_THE_BAT       KB_FAKED_THE_BAT
> > > >     1.2 RDNS_NONE              Delivered to internal network by a host with no
> > > >                                rDNS
> 
> Oh, yeah, these do ring quite some bells... ;)
> 
> After you fixed your mail processing chain to not have SA chew twice on
> the spam -- you should manually train Bayes, feeding it a lot of hand
> classified spam, and possibly ham. Check your 'sa-learn --dump magic'
> numbers. The Bayes score of 0.1 is way out of line.

Agreed. I do run sa-learn --spam (actually now have it scheduled to run weekly 
on a folder into which I drop all the non-classified spam messages) and --ham 
(on a folder with messages that were false-positives).
 
> 
> Note though, that a previous site-wide SA filter might use a site-wide
> user, not the one owning the procmail recipe. Thus Bayes scores might
> suddenly change once it's run per user. Check the numbers and
> performance for the user you'll use after fixing the chain issue.
> 
> > > You need to fix whatever is causing the message to be scanned twice.
> > 
> > OK - that makes sense. Now I'm wondering if there is a global mail config
> > somewhere that is routing the message through SA, and then my local
> > .procmailrc is doing it again. Time to go digging...
> 
> Site-wide /etc/procmailrc, SMTP server milter, transport or similar, or
> even something like Amavis in the chain?

There is no /etc/procmailrc, no milter that I'm aware of, running 
fetchmail/sendmail/dovecot. This machine doubles as my home mail server/file 
server and desktop machine. The only reason I'm running IMAP is so that I can 
access the same mail from my laptop or netbook when I need to (and I used to 
run squirrelmail to allow access remotely via https webmail, but not any 
more).
 
> 
> > That then leaves the question as to why my procmail recipe isn't
> > triggering on the rewritten subject, but that is probably not for this
> > list.
> 
> It's sufficiently related. ;)  See above.

Thanks again. :-)


Re: Inconsistent spam scores between spam headers and rewritten subject line.

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Tue, 2011-08-16 at 01:07 +0930, Rodney Baker wrote:
> On Tue, 16 Aug 2011 00:48:13 Bowie Bailey wrote:

> > >    * ^Subject.*SPAM\([0-9]{1,3}\.[0-9]\).*
> > >    $HOME/Maildir/.Spam//
> > > 
> > > I'm attempting to filter on the modified subject line (which for some
> > > reason isn't working - that rule never seems to match and spam never
> > > gets moved into the Spam folder, even though I've tested the regex
> > > manually). I thought of filtering on the X-Spam-Status header instead,
> > > but when I had a look at a message that was marked as Spam (according to
> > > the subject line) I found something rather strange...

Yes, filtering on the SA X-Spam-Status or X-Spam-Level headers is the way to
go. Once you have found and fixed where SA gets called a second time
(actually the first time), these headers will no longer be overwritten and
will be useful for filtering.
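
As a rough sketch of what filtering on those headers could look like
(assuming the Maildir layout from the original post; the .VerySpam folder
name is made up for illustration, and the ten-star threshold is arbitrary):

```procmail
# Hypothetical: file very high scorers separately. X-Spam-Level carries
# one star per whole point of score, so ten stars means score >= 10.
:0
* ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*
$HOME/Maildir/.VerySpam/

# Everything else that SA flagged as spam.
:0
* ^X-Spam-Status: Yes
$HOME/Maildir/.Spam/
```

Neither recipe depends on the Subject rewrite, so changing rewrite_header
later would not break them. The trailing slash on the folder name selects
Maildir delivery.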

Anyway, the reason the above procmail recipe doesn't work is simply that
procmail uses its own, rather limited flavor of regular expressions. It's
not PCRE.

In particular, procmail does not understand {x,y} range quantifiers, but
treats that part as a plain string to match, which then doesn't match.
(Caveat: From memory, not actually looked it up again for verification.)
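
A brace-free variant of the recipe, as a sketch only (untested here, and
relying on the *, + and ? operators that procmailrc(5) documents):

```procmail
# [0-9]+ stands in for [0-9]{1,3}; it accepts any number of digits
# before the decimal point, which is slightly looser than the original.
# For exactly one to three digits, [0-9][0-9]?[0-9]? would do.
:0
* ^Subject:.*SPAM\([0-9]+\.[0-9]\)
$HOME/Maildir/.Spam/
```

The trailing .* of the original is dropped because a procmail condition
matches anywhere in the header and is not anchored at the end.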


> > >     3.8 KB_DATE_CONTAINS_TAB   KB_DATE_CONTAINS_TAB
> > >     3.0 IMPOTENCE              BODY: Impotence cure
> > >    -0.0 BAYES_20               BODY: Bayes spam probability is 5 to 20%
> > >                                [score: 0.1050]
> > >     2.0 KB_FAKED_THE_BAT       KB_FAKED_THE_BAT
> > >     1.2 RDNS_NONE              Delivered to internal network by a host with no
> > >                                rDNS

Oh, yeah, these do ring quite some bells... ;)

Once you have fixed your mail processing chain so that SA no longer chews
on the spam twice, you should manually train Bayes, feeding it a lot of
hand-classified spam, and possibly ham. Check your 'sa-learn --dump magic'
numbers. The Bayes score of 0.1 is way out of line.

Note though, that a previous site-wide SA filter might use a site-wide
user, not the one owning the procmail recipe. Thus Bayes scores might
suddenly change once it's run per user. Check the numbers and
performance for the user you'll use after fixing the chain issue.


> > You need to fix whatever is causing the message to be scanned twice.
> 
> OK - that makes sense. Now I'm wondering if there is a global mail config 
> somewhere that is routing the message through SA, and then my local 
> .procmailrc is doing it again. Time to go digging...

Site-wide /etc/procmailrc, SMTP server milter, transport or similar, or
even something like Amavis in the chain?

> That then leaves the question as to why my procmail recipe isn't triggering on 
> the rewritten subject, but that is probably not for this list. 

It's sufficiently related. ;)  See above.




Re: Inconsistent spam scores between spam headers and rewritten subject line.

Posted by Rodney Baker <ro...@jeremiah31-10.net>.
On Tue, 16 Aug 2011 00:48:13 Bowie Bailey wrote:
> On 8/15/2011 10:57 AM, Rodney Baker wrote:
> > I don't get it - the content analysis shows a score of 10.1, the modified
> > subject line shows 10.1, but the X-Spam-Status header shows 1.5! What
> > have I messed up in my configuration?
> 
> This message is going through SA twice.
> 
> The first time, it is marked as spam and the message is re-written per
> your "report_safe" setting.  This generates the analysis shown in the
> body itself.
> 
> The second time, the re-written message is scanned by SA.  This time,
> all of the incriminating stuff has been hidden by the rewrite, so it is
> not marked as spam.  This is the analysis shown in the header.
> 
> You need to fix whatever is causing the message to be scanned twice.

OK - that makes sense. Now I'm wondering if there is a global mail config 
somewhere that is routing the message through SA, and then my local 
.procmailrc is doing it again. Time to go digging...

That then leaves the question as to why my procmail recipe isn't triggering on 
the rewritten subject, but that is probably not for this list. 

Thanks for the pointer.
Rodney.



Re: Inconsistent spam scores between spam headers and rewritten subject line.

Posted by Rodney Baker <ro...@jeremiah31-10.net>.
On Tue, 16 Aug 2011 01:15:11 Walter Hurry wrote:
> On Mon, 15 Aug 2011 11:18:13 -0400, Bowie Bailey wrote:
> > On 8/15/2011 10:57 AM, Rodney Baker wrote:
> <snip>
> 
> >>    :0
> >>    * ^Subject.*SPAM\([0-9]{1,3}\.[0-9]\).*
> >>    $HOME/Maildir/.Spam//
> 
> <snip>
> 
> > This message is going through SA twice.
> 
> Indeed. And by the way, for what it is worth, my .procmailrc says (inter
> alia)
> 
> :0:
> * ^X-Spam-Status: Yes
> # The trailing slashdot means do it as MH
> # instead of MBOX (the default)
> junk/.
> 
> # Otherwise it falls through
> 
> May I suggest that that's rather simpler than the regex which you are
> using?
> 

Of course, and that's what I wanted to do, except that if you have a look at 
my X-Spam-Status header it says "No", which is the opposite of what I expect 
for a message marked as spam (apparently due, as already suggested, to 
spamassassin processing the message twice). 

> In addition, should I in the future decide for some reason to change or
> revoke the subject rewriting, I won't need to change .procmailrc.

Of course, if I can just get the message flagged as Spam in the headers, I'll 
be able to do the same. ;-)



Re: Inconsistent spam scores between spam headers and rewritten subject line.

Posted by Walter Hurry <wa...@lavabit.com>.
On Mon, 15 Aug 2011 11:18:13 -0400, Bowie Bailey wrote:

> On 8/15/2011 10:57 AM, Rodney Baker wrote:
<snip>
>>    :0
>>    * ^Subject.*SPAM\([0-9]{1,3}\.[0-9]\).*
>>    $HOME/Maildir/.Spam//
<snip>
> This message is going through SA twice.

Indeed. And by the way, for what it is worth, my .procmailrc says (inter 
alia)

:0:
* ^X-Spam-Status: Yes
# The trailing slashdot means do it as MH
# instead of MBOX (the default)
junk/.

# Otherwise it falls through

May I suggest that that's rather simpler than the regex which you are 
using?

In addition, should I in the future decide for some reason to change or 
revoke the subject rewriting, I won't need to change .procmailrc.



Re: Inconsistent spam scores between spam headers and rewritten subject line.

Posted by Bowie Bailey <Bo...@BUC.com>.
On 8/15/2011 10:57 AM, Rodney Baker wrote:
> I don't get it - the content analysis shows a score of 10.1, the modified 
> subject line shows 10.1, but the X-Spam-Status header shows 1.5! What have I 
> messed up in my configuration?

This message is going through SA twice.

The first time, it is marked as spam and the message is re-written per
your "report_safe" setting.  This generates the analysis shown in the
body itself.

The second time, the re-written message is scanned by SA.  This time,
all of the incriminating stuff has been hidden by the rewrite, so it is
not marked as spam.  This is the analysis shown in the header.

You need to fix whatever is causing the message to be scanned twice.
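
While hunting for the second invocation, one possible stopgap (a sketch,
not a fix for the underlying problem) is to have the per-user recipe skip
spamc whenever the message already carries SpamAssassin headers from an
earlier pass:

```procmail
# Assumption: an upstream SA pass has already added X-Spam-Checker-Version.
# The leading "!" inverts the condition, so spamc only runs when the
# header is absent.
:0fw: spamassassin.lock
* !^X-Spam-Checker-Version:
| spamc
```

Note that this trusts the header. If there turns out to be no upstream
scan after all, a sender could forge the header to bypass the local one,
so it is better suited to diagnosis than to permanent use.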

-- 
Bowie