Posted to users@spamassassin.apache.org by "Gary G. Taylor" <ga...@donavan.org> on 2006/07/25 17:19:04 UTC

Auto White List problem

I have noticed that several spams are getting through because they have 
entries in the Auto White List, sometimes with very large numbers.

Here is a sample header from a message not flagged as spam:

Return-Path: <Wa...@shop.walmart.com>
 Received: from mailrelay03.walmart.com (161.170.254.40) by wesurvu.net with
 SMTP (Eudora Internet Mail Server X 3.2.8) for <ga...@donavan.org>;
 Tue, 25 Jul 2006 01:48:12 -0700
 Received: from shop.walmart.com (172.29.138.22)
  by mailrelay03.walmart.com with ESMTP; 25 Jul 2006 01:47:37 -0700
 Accreditor: Habeas
 X-Habeas-Report: Please report use of this mark in spam to 
<http://www.habeas.com/report/>
 Message-ID: <16...@172.29.138.26>
 Date: Tue, 25 Jul 2006 01:47:36 -0700 (PDT)
 From: Wal-Mart Wire <wa...@walmart.com>
 To: gary@donavan.org
 Subject: Furnishings 101: Fun & Affordable
 Mime-Version: 1.0
 Content-Type: text/html;
  charset=ISO-8859-1
 Content-Transfer-Encoding: quoted-printable
 X-Mailer-Version: 3.5.14 build 759
 X-Mailer: Accucast
 X-Accutrak: Wal-Mart_Wire-#2.13690.2d322e343633333732384537@shop.walmart.com
 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on 
manduck.donavan.org
 X-Spam-Level: 
 X-Spam-Status: No, score=-13.0 required=5.0 tests=AWL,HTML_90_100,
        HTML_IMAGE_RATIO_04,HTML_MESSAGE,MIME_HTML_ONLY,USER_IN_DEF_WHITELIST 
        autolearn=no version=3.0.4
 X-UID: 
 Status: R
 X-Status: NC
 X-KMail-EncryptionState: 
 X-KMail-SignatureState: 
 X-KMail-MDN-Sent: 

-------------

And here is a header from a beliefnet (gag) message SA caught:

Return-Path: <li...@partner.beliefnet.com>
 Received: from cmn1lsm4.beliefnet.com (129.33.230.138) by wesurvu.net with
 SMTP (Eudora Internet Mail Server X 3.2.8) for <ga...@donavan.org>;
 Mon, 24 Jul 2006 13:07:25 -0700
 Received: from partner.beliefnet.com (10.2.2.61)
  by cmn1lsm4.beliefnet.com with SMTP; 24 Jul 2006 16:06:43 -0400
 From: PetLovers <Pa...@partner.beliefnet.com>
 Subject: [SPAM] Take a survey - Get a $500 Pet Store Gift Card
 Date: 24 Jul 2006 16:06:43 -0400
 To: gary@donavan.org
 Mime-Version: 1.0
 Content-Type: text/html;
  charset=us-ascii
 Content-Transfer-Encoding: 8bit
 Message-ID: <10...@wesurvu.net>
 X-Spam-Prev-Subject: Take a survey - Get a $500 Pet Store Gift Card
 X-Spam-Flag: YES
 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on 
manduck.donavan.org
 X-Spam-Level: *****
 X-Spam-Status: Yes, score=5.4 required=5.0 tests=AWL,HELO_DYNAMIC_DHCP,
        HTML_80_90,HTML_IMAGE_RATIO_02,HTML_MESSAGE,MIME_HTML_ONLY 
        autolearn=no version=3.0.4
 X-Spam-Report: 
        *  2.8 HELO_DYNAMIC_DHCP Relay HELO'd using suspicious hostname (DHCP)
        *  0.0 HTML_80_90 BODY: Message is 80% to 90% HTML
        *  1.7 HTML_IMAGE_RATIO_02 BODY: HTML has a low ratio of text to image 
area
        *  0.0 HTML_MESSAGE BODY: HTML included in message
        *  1.2 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
        * -0.4 AWL AWL: From: address is in the auto white-list
 X-UID: 
 Status: R
 X-Status: NPC
 X-KMail-EncryptionState: 
 X-KMail-SignatureState: 
 X-KMail-MDN-Sent: 

How the fsck did these idiots get into the AWL?!

When I view the AWL file I find probably two or three hundred different URLs 
and email addresses. I am running SpamAssassin 3.0.4 installed as an rpm from 
Mandrakesoft and I have not designated any block of senders as ham.

The questions are:
1) How do I clean out the white list?
2) The installation of SpamAssassin set up KMail with filters for spam. There 
are two actions available: Filter a message as spam, and filter a message as 
ham; each goes into its own separate folder within KMail. Is using these 
manual filters the right thing to do, and then run sa-learn through them at 
the appropriate time?
-- 
Gary G. Taylor * Pomona, CA * 34.074630°N 117.754195°W
gary@donavan.org * http://www.donavan.org
"The two most abundant substances in the Universe are hydrogen
and stupidity." --Frank Zappa, R.A. Heinlein and many others

Re: Auto White List problem

Posted by Matt Kettler <mk...@comcast.net>.
Gary G. Taylor wrote:
>
> And here is a header from a beliefnet (gag) message SA caught:
>   
<snip>
>  X-Spam-Status: Yes, score=5.4 required=5.0 tests=AWL,HELO_DYNAMIC_DHCP,
>         HTML_80_90,HTML_IMAGE_RATIO_02,HTML_MESSAGE,MIME_HTML_ONLY 
>         autolearn=no version=3.0.4
>  
>   
<snip>
> How the fsck did these idiots get into the AWL?!
>   

Step 1: Ditch any preconceptions that the AWL is a whitelist. It's not,
it's a score-averager. It's called AWL due to lack of a better name
that's not absurdly long.

Step 2: The above example is perfectly normal and expected for the AWL.
Note that the message was still tagged as spam.

In the above example, the AWL's running average for this sender said the
score should be 5.0. SA scored the message 5.8 before applying the AWL,
so the AWL split the difference and brought it to 5.4 by subtracting 0.4
points. The message was still tagged as spam, so no problem.
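To make the "split the difference" arithmetic concrete, here is a minimal Python sketch. It is not SpamAssassin's code; the function name, the history values (two earlier messages totalling 10.0 points), and the pull factor of one half are assumptions chosen to reproduce the numbers in the example above.

```python
def awl_adjust(score, history_total, history_count, factor=0.5):
    """Return (adjusted_score, awl_delta) for one message.

    score          -- the message's score before the AWL is applied
    history_total  -- sum of earlier pre-AWL scores for this sender
    history_count  -- number of earlier messages from this sender
    factor         -- how far to pull toward the mean (0.5 = halfway);
                      assumed here because the post says the AWL
                      "split the difference"
    """
    if history_count == 0:
        # No history yet, so there is nothing to average against.
        return score, 0.0
    mean = history_total / history_count
    delta = (mean - score) * factor
    return score + delta, delta


# Hypothetical history giving a mean of 5.0; fresh score of 5.8.
adjusted, delta = awl_adjust(5.8, history_total=10.0, history_count=2)
print(f"AWL delta {delta:+.1f}, final score {adjusted:.1f}")
# prints "AWL delta -0.4, final score 5.4"
```

The same formula explains why the AWL can raise a score as well as lower it: a sender whose past mail averaged higher than the current message gets a positive delta.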

Step 3: Read the wiki to get a better idea of what the AWL really is,
and what it does.

Why the AWL sometimes "Scores the wrong way":

http://wiki.apache.org/spamassassin/AwlWrongWay

What the AWL is and how it works:

http://wiki.apache.org/spamassassin/AutoWhitelist

> When I view the AWL file I find probably two or three hundred different URLs 
> and email addresses. I am running SpamAssassin 3.0.4 installed as an rpm from 
> Mandrakesoft and I have not designated any block of senders as ham.
>
> The questions are:
> 1) How do I clean out the white list?
>   
It's not really a whitelist, as noted above, but you can clean out any
"one off" addresses by using the check_whitelist script that comes in
the tools subdirectory of the tarball.
Note: most RPMs do not come with this, so you WILL have to download the
tarball to get it. However, you don't need to install anything. It's
just a stand-alone Perl script. Read the top of the file to get usage
directions.
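As a rough illustration of what cleaning out "one off" addresses means, the Python sketch below drops every sender that has been seen only once. It is a conceptual example only: the dictionary layout, the `drop_one_offs` name, and the sample entries are hypothetical and do not reflect the script's actual implementation or the on-disk AWL format.

```python
def drop_one_offs(entries, min_count=2):
    """Keep only senders seen at least min_count times.

    entries maps a sender key to a (message_count, score_total) pair.
    """
    return {
        sender: (count, total)
        for sender, (count, total) in entries.items()
        if count >= min_count
    }


# Hypothetical entries, not taken from a real AWL database.
awl = {
    "newsletter@example.com": (12, 30.0),
    "one-time@example.net": (1, 7.2),
    "friend@example.org": (4, -3.5),
}
cleaned = drop_one_offs(awl)
print(sorted(cleaned))
# prints "['friend@example.org', 'newsletter@example.com']"
```

Senders with a real history keep their averages, so only the noise from single-message senders is removed.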
> 2) The installation of SpamAssassin set up KMail with filters for spam. There 
> are two actions available: Filter a message as spam, and filter a message as 
> ham; each goes into its own separate folder within KMail. Is using these 
> manual filters the right thing to do, and then run sa-learn through them at 
> the appropriate time?
>   
Sounds reasonable to me.