Posted to users@spamassassin.apache.org by Bob George <ma...@ttlexceeded.com> on 2004/02/27 22:20:32 UTC

FN: Is DEFAULT whitelist making AWL break? USER_IN_DEF_WHITELIST From: address scored, but it's NOT

 Just when I thought I had whitelist and AWL figured out...

I just got this (headers edited down):

--- begin headers --- 
[...]
Mailing-List: contact incidents-help@securityfocus.com; run by ezmlm
List-Id: <incidents.list-id.securityfocus.com>
List-Post: <ma...@securityfocus.com>
List-Help: <ma...@securityfocus.com>
List-Unsubscribe: <ma...@securityfocus.com>
List-Subscribe: <ma...@securityfocus.com>
Delivered-To: mailing list incidents@securityfocus.com
Delivered-To: moderator for incidents@securityfocus.com
Received: (qmail 30206 invoked from network); 27 Feb 2004 12:06:12 -0000
To: <in...@securityfocus.com>
From: "harley" <XX...@hotmail.com>
Date: Sat, 28 Feb 2004 01:55:39 GMT
Message-Id: <10...@excite.com>
Sender: YYYYYY@hotmail.com
Subject: This Drug puts VlAGRA to shame!!
Content-Type: text/plain;
X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on
server.ttlexceeded.com
X-Spam-Level:
X-Spam-Status: No, hits=-1.3 required=5.0
    tests=BAYES_44,DATE_IN_FUTURE_06_12,FORGED_HOTMAIL_RCVD2,
    FVGT_u_HAS_2LETTERFLDR,J_CHICKENPOX_14,LOCAL_DRUGS_MALEDYSFUNCTION,
    LOCAL_DRUGS_MALEDYSFUNCTION_OBFU,OACYS_DISGUISED_P0RN,
    USER_IN_DEF_WHITELIST,_YM_HS_BAGLE_A autolearn=no version=2.63
X-Spam-Pyzor: Reported 0 times.
X-Spam-Report:
    *  1.0 _YM_HS_BAGLE_A Subject =~ /Hi/i
    *  6.0 OACYS_DISGUISED_P0RN BODY: Tries to slip through filters by
    *      substituting numbers for letters
    *  0.6 J_CHICKENPOX_14 BODY: {1}Letter - punctuation - {4}Letter
    * -0.0 BAYES_44 BODY: Bayesian spam probability is 44 to 50%
    *      [score: 0.4999]
    *  0.1 FVGT_u_HAS_2LETTERFLDR URI: FVGT - URL has a 2 letter folder
    *      like /ab/
    *  2.0 DATE_IN_FUTURE_06_12 Date: is 6 to 12 hours after Received: date
    *  -15 USER_IN_DEF_WHITELIST From: address is in the default white-list
    *  2.5 FORGED_HOTMAIL_RCVD2 hotmail.com 'From' address, but no 'Received:'
    *  0.5 LOCAL_DRUGS_MALEDYSFUNCTION_OBFU LOCAL_DRUGS_MALEDYSFUNCTION_OBFU
    *  1.0 LOCAL_DRUGS_MALEDYSFUNCTION LOCAL_DRUGS_MALEDYSFUNCTION

--- end headers --- 

Notice this line in X-Spam-Report: *  -15 USER_IN_DEF_WHITELIST From: address
is in the default white-list
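
As a sanity check, the per-rule scores in X-Spam-Report can be summed to confirm
that the -15 whitelist hit alone pulls the message under the 5.0 threshold. A
minimal Python sketch, with the scores copied from the report above (the
variable names are illustrative only):

```python
# Per-rule scores copied from the X-Spam-Report header quoted above.
report_scores = {
    "_YM_HS_BAGLE_A": 1.0,
    "OACYS_DISGUISED_P0RN": 6.0,
    "J_CHICKENPOX_14": 0.6,
    "BAYES_44": -0.0,
    "FVGT_u_HAS_2LETTERFLDR": 0.1,
    "DATE_IN_FUTURE_06_12": 2.0,
    "USER_IN_DEF_WHITELIST": -15.0,
    "FORGED_HOTMAIL_RCVD2": 2.5,
    "LOCAL_DRUGS_MALEDYSFUNCTION_OBFU": 0.5,
    "LOCAL_DRUGS_MALEDYSFUNCTION": 1.0,
}

# Total as reported in X-Spam-Status (hits=...), rounded to one decimal.
total = round(sum(report_scores.values()), 1)

# The same message with the whitelist hit removed.
without_whitelist = round(total - report_scores["USER_IN_DEF_WHITELIST"], 1)

print("with whitelist:   ", total)
print("without whitelist:", without_whitelist)
```

The first figure agrees with hits=-1.3 in X-Spam-Status; the second is what the
remaining rules add up to on their own.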

The LIST address IS in the default whitelist
(/usr/share/spamassassin/60_whitelist.cf):
60_whitelist.cf:def_whitelist_from_rcvd  *@securityfocus.com  securityfocus.com

I've double-checked, and neither securityfocus.com nor that sender address is
listed anywhere else under /usr/share/spamassassin, /etc/spamassassin or
~spamd/.spamassassin. Many securityfocus addresses are in spamd's AWL list from
previous posts, but not this (non-securityfocus) address. The fields remotely
like From: seem to be:

From: "harley" <XX...@hotmail.com>
Message-Id: <10...@excite.com>
Sender: YYYYYY@hotmail.com

On a second pass, it rose to 6.2, so AWL is correcting it (and thus eventually
will fix the wayward mystery whitelist entry), and AWL seems to have forgotten
the original whitelist score:

# check_whitelist | grep XXXXXX@hotmail.com
     6.2         (6.2/2)  --  XXXXXX@hotmail.com|ip=205.206

(-7.5 AWL                    AWL: Auto-whitelist adjustment)

So AWL knows it was From: that account. It's figured out the message is
spammy-scoring (trending towards higher scores). But how did it give it a
non-spam STARTING value so high when there was no previous entry for that user?
It seems AWL defaulted to -30 (-30/0) -- XXXXXX@hotmail.com, then corrected
when the first actual message was posted.
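
For reference, the SpamAssassin configuration docs describe the AWL adjustment
as moving the score toward the sender's long-term mean by auto_whitelist_factor
(default 0.5): finalscore = score + (mean - score) * factor. The sketch below
only works through that arithmetic. It assumes, without confirmation, that the
first pass stored the whitelisted -1.3 as the sender's mean and that the second
pass scored 13.7 before the AWL adjustment; the function name is hypothetical.

```python
def awl_adjust(score, mean, factor=0.5):
    """Apply the documented AWL formula.

    Returns (delta, final), where delta = (mean - score) * factor and
    final = score + delta.
    """
    delta = (mean - score) * factor
    return delta, score + delta


# Assumed inputs (not confirmed by the logs): stored mean of -1.3 from the
# first pass, and a pre-AWL score of 13.7 on the second pass.
delta, final = awl_adjust(score=13.7, mean=-1.3)

print("AWL adjustment:", round(delta, 1))
print("final score:   ", round(final, 1))
```

Under those assumptions the figures line up with the -7.5 AWL adjustment and the
6.2 second-pass score shown above, which would not require a -30 starting value.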

To test, I:

1. Edited /usr/share/spamassassin/60_whitelist.cf and commented out the
securityfocus entry.

2. As spamd user, did:
spamassassin --remove-addr-from-whitelist=XXXXXX@hotmail.com

SpamAssassin auto-whitelist: removing address: XXXXXX@hotmail.com

3. Verified it was gone:

$ check_whitelist | grep XXXXXX@hotmail.com
[spamd@server: ~]$

4. Re-ran the message through SA. There was no AWL/whitelist adjustment, and it
scored 13.7

5. Checked to verify the first posting from that user was given an appropriate
score:

$ check_whitelist | grep XXXXXX@hotmail.com
    13.7        (13.7/1)  --  XXXXXX@hotmail.com|ip=205.206

Now I'm the first to admit I may have something wrong here, but it sure seems
that the default whitelist, or some other factor, can have 'interesting' side
effects on initial AWL values. Note that this is apparently causing the same
problem that Michael Shlief observed: a sender address that is in neither the
default nor an added whitelist is affecting mail sent to the list by others,
apparently because of AWL. I checked the wikis, but didn't find a good rundown
on whitelists. This is, I think, a whitelist issue manifesting itself as an AWL
scoring issue. So:

1. How is an entry in the default from whitelist (or at least scored that way)
affecting this scoring if no sender fields (that I can see) relate?

2. Would a sender address of xyz@securityfocus.comicalname.biz match the
default whitelist entry with no anchoring $?

3. Can I tweak the +/- score awarded to black/whitelist entries to be closer to
the spam threshold score?

4. How is AWL initialized when there's no previous entry from a user?

5. Does AWL relate to default whitelists in any non-obvious way?
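
On question 2, the SpamAssassin configuration docs describe whitelist addresses
as file-glob-style patterns rather than regular expressions. If the glob is
matched against the whole address (an assumption here, modeled with Python's
fnmatch module rather than SpamAssassin's own code), a look-alike domain would
not match. The helper name below is hypothetical.

```python
from fnmatch import fnmatchcase


def glob_matches(pattern, address):
    """Whole-string, case-insensitive glob match (a model, not SA's code)."""
    # fnmatchcase only succeeds if the pattern covers the entire string,
    # so there is an implicit anchor at both ends.
    return fnmatchcase(address.lower(), pattern.lower())


pattern = "*@securityfocus.com"
candidates = (
    "incidents@securityfocus.com",
    "xyz@securityfocus.comicalname.biz",
)

for addr in candidates:
    print(addr, "->", glob_matches(pattern, addr))
```

If SpamAssassin's matching behaves like this model, no explicit trailing $ is
needed in the whitelist entry.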

I still like what AWL does for me, but I'm beginning to think
whitelist/blacklist entries (with +/- 100 adjustments) are riskier than
equivalent rules that add +/- < threshold to avoid this "blowing out" on even
obviously spammy messages. Had it not been for that VERY HIGH starting value
for the sender (from the default whitelist, I presume), this would still have
been scored as spam.

Any clarification much appreciated.

- Bob