
Posted to users@spamassassin.apache.org by Karsten Bräckelmann <gu...@rudersport.de> on 2008/05/14 00:05:12 UTC

backscatter and (was: Re: AWL putting spam in my inbox)

On Tue, 2008-05-13 at 16:16 -0500, Robotech_Master wrote:
> I'm using SpamAssassin 3.2.3 w/ Perl 5.8.8 on Linux. I'm not the
> sysadmin of the machine, but a user.
> 
> I invoke it through a procmail recipe that says, in part,
> 
> :0fw
> | /usr/bin/spamc

> I am getting an immense amount of backscatter spam, and have trained
> SA on it until SA gives it a reliable Bayes score of 99%.

Please do note that Bayes will be biased if you train a LOT more ham
than spam. Even though 50 times as much has been reported to work, you
should expect to see "spammy looking ham" from excessive, unbalanced
training well before that point. This also depends a lot on your own
ham and its variety in topic.

Also, I'm not convinced that Bayes is the correct tool to fight
backscatter at all... See your other post for a better way, where you
ask about VBounce. :)

Since you are using procmail anyway, let me stress one point about HOW
to handle bounces. Filter them. Into a different folder, for possible
later review. Do not just treat them as spam -- keep in mind, the default
VBounce scores are LOW, and are set only so that the rules are not
disabled (which would be the case with a score of 0).

Now, here goes my favorite quote these days:

$ grep -A 2 procmail /usr/share/spamassassin/20_vbounce.cf

# If you use this, set up procmail or your mail app to spot the
# "ANY_BOUNCE_MESSAGE" rule hits in the X-Spam-Status line, and move
# messages that match that to a 'vbounce' folder.

> However, I'm still ending up getting tons of it passed through into my
> mailbox.
> 
> When I check the headers of some of the spams that end up in my
> mailbox, I see something like the following:
> 
> From MAILER-DAEMON  Tue May 13 13:46:20 2008
> Return-Path: <>
...
> X-Spam-Status: No, score=1.2 required=4.0 tests=AWL,BAYES_99 autolearn=no
>         version=3.2.3

Just a guess, but most likely due to an empty Return-Path. AWL is based
on email address and the originating network block. Thus, you might see
totally different results for mail sent by the same $address (well, the
empty string here) from different net blocks.

AWL is not related to Bayes. It is all about the average score of mail
previously seen from a specific sender (and origin).

See also these and probably other articles in the wiki:
  http://wiki.apache.org/spamassassin/AutoWhitelist
  http://wiki.apache.org/spamassassin/AwlWrongWay
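
To illustrate the averaging idea, here is a rough Python sketch. It is
NOT SpamAssassin's actual code. It assumes the history is keyed by the
sender address plus the first two octets of the originating IP, and that
the score is pulled halfway (a factor of 0.5) towards the stored mean.
The function names and the example IPs are made up for illustration.

```python
# Simplified sketch of the AWL idea; NOT SpamAssassin's actual code.
# Assumptions: history is keyed by (address, first two octets of the
# originating IP), and the score is pulled halfway (factor 0.5)
# towards the stored mean for that key.

def awl_key(address, ip):
    """Build the history key from sender address and network block."""
    block = ".".join(ip.split(".")[:2])
    return (address.lower(), block)

def awl_adjust(score, history, key, factor=0.5):
    """Return the AWL delta for this message and record its score."""
    total, count = history.get(key, (0.0, 0))
    delta = 0.0
    if count > 0:
        mean = total / count
        delta = (mean - score) * factor
    # The score before the AWL adjustment goes into the history.
    history[key] = (total + score, count + 1)
    return delta

history = {}
# Empty sender address, as with a bounce, seen from two net blocks.
key_a = awl_key("", "192.0.2.10")
key_b = awl_key("", "198.51.100.7")

awl_adjust(1.0, history, key_a)            # first mail: no history yet
delta_a = awl_adjust(5.0, history, key_a)  # mean so far is 1.0
delta_b = awl_adjust(5.0, history, key_b)  # unseen block: no adjustment

print(5.0 + delta_a)  # 3.0
print(5.0 + delta_b)  # 5.0
```

The same (empty) address gets a different result per net block, and a
low-scoring history drags a later high score down.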

> So, SA is giving it a BAYES_99, which should result in it hitting 5.0
> right off the bat.
> 
> However, apparently the Auto-Whitelist is knocking it back down to
> where it still ends up in my mailbox.
> 
> Can someone please tell me how to make it stop? I'm getting a LOT of
> these messages that should by all rights be safely filtered into
> spammyland.

Use VBounce. Filter them (using procmail) into bouncy-land. :)

  guenther

-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}

Re: backscatter and (was: Re: AWL putting spam in my inbox)

Posted by Karsten Bräckelmann <gu...@rudersport.de>.

Please keep list posts on list, by either Replying To List or All.

On Tue, 2008-05-13 at 17:43 -0500, Robotech_Master wrote:
> On Tue, May 13, 2008 at 5:05 PM, Karsten Bräckelmann
> <gu...@rudersport.de> wrote:
>  
>         Now, here goes my favorite quote these days:
>         
>         $ grep -A 2 procmail /usr/share/spamassassin/20_vbounce.cf
>         
>         # If you use this, set up procmail or your mail app to spot the
>         # "ANY_BOUNCE_MESSAGE" rule hits in the X-Spam-Status line, and move
>         # messages that match that to a 'vbounce' folder.
> 
> Thanks for your advice. I would like to do that. I'd also like to tell
> it to search the body of the bounce for a
> 
> "Sender: [my gmail address]"
> 
> line, which gmail sticks in when I send as robotech@eyrie.org, and
> pass those on, since I don't think I can inclusively list every GMail
> mail server (since I don't know them).

The SMTP server configured in your MUA should be sufficient, I guess.
If not, you can simply omit the leading hostname or use file-glob-style
patterns. See the docs [1].

  "The hostnames can be file-glob-style patterns, so relay*.isp.com will
  work. Specifically, * and ? are allowed, but all other metacharacters
  are not. Regular expressions are not used for security reasons."
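
To get a feel for what such patterns match, here is a small Python
sketch of glob-style hostname matching where only * and ? are wildcards.
This is an illustration of the documented behaviour, not the plugin's
own code. The function names and the hostnames other than relay*.isp.com
are hypothetical.

```python
import re

def glob_to_regex(pattern):
    """Translate a hostname glob into a case-insensitive regex.

    Only * and ? are treated as wildcards; every other character is
    escaped and matched literally.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def host_matches(pattern, hostname):
    """True if the whole hostname matches the glob pattern."""
    return glob_to_regex(pattern).fullmatch(hostname) is not None

print(host_matches("relay*.isp.com", "relay3.isp.com"))    # True
print(host_matches("relay*.isp.com", "mail.isp.com"))      # False
print(host_matches("*.example.net", "smtp.example.net"))   # True
```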

> The thing is, I'm not real good with coming up with my own recipes. :P
> Can you help me out?

Procmail recipes? Sure.

:0 :
* ^X-Spam-Status: .*ANY_BOUNCE_MESSAGE
spam/bounces

Put that AFTER your SA/spamc filtering recipe, and BEFORE any recipe
that dumps classified spam into its own folder. Also, of course, do
adjust the delivery action's target.

I am not taking on the body grep for Sender (or are you asking about an
SA rule here?), since I don't know the exact details. Anyway, I'd
recommend just starting with the above, and re-evaluating later whether
you actually see any need for that.

However, I am fairly sure that VBounce generally does not result in
FPs at all -- you can check by sending a test mail to a known-to-fail
address. Testing for a marker like the above seems to aim at rescuing
FPs, which is the very purpose of whitelist_bounce_relays. I don't think
any additional body grep would be useful.

> Also, what's the difference between ANY_BOUNCE_MESSAGE and
> BOUNCE_MESSAGE?

BOUNCE_MESSAGE is a general MTA bounce message, not including Challenge-
Response or Virus-Scanner bounces. ANY_BOUNCE_MESSAGE is a meta rule
that aggregates all of these. (Not including legit bounces of course,
which originated at your whitelisted relays.)

See /usr/share/spamassassin/20_vbounce.cf :)

  guenther

[1] http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Plugin_VBounce.html
