Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2012/09/26 22:59:04 UTC

[Bug 6844] New: Almost-overlapping rules for "bad" OE version causing FP

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6844

          Priority: P2
            Bug ID: 6844
          Assignee: dev@spamassassin.apache.org
           Summary: Almost-overlapping rules for "bad" OE version causing
                    FP
          Severity: normal
    Classification: Unclassified
                OS: Linux
          Reporter: kdeugau@vianet.ca
          Hardware: PC
            Status: NEW
           Version: 3.3.2
         Component: Rules
           Product: Spamassassin

Created attachment 5091
  --> https://issues.apache.org/SpamAssassin/attachment.cgi?id=5091&action=edit
Sanitized copy of FP

These three rules:

FSL_UA
FSL_XM_419
AXB_XMAILER_MIMEOLE_OL_024C2

are all very similar and overlap a little, looking for the same version string
in the X-Mailer, X-MimeOLE, and User-Agent headers.

I just caught a FP that hit all three of these as well as BAYES_50, pushing it
just over 5 points.
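
To make the arithmetic concrete, here is a rough sketch of how the three
overlapping hits add up. The three rule scores are taken from the rule QA
figures quoted later in this thread, assuming the last numeric column there is
the score. The BAYES_50 value of 0.8 is an assumption for illustration only;
the actual score depends on the local score set.

```python
# Scores for the three overlapping rules, as quoted in the rule QA
# figures later in this thread (assumed to be the score column).
rule_scores = {
    "FSL_XM_419": 1.00,
    "FSL_UA": 1.59,
    "AXB_XMAILER_MIMEOLE_OL_024C2": 2.01,
}

# ASSUMPTION: illustrative BAYES_50 score; the real value depends on the
# score set in use on the reporting system.
BAYES_50_ASSUMED = 0.8

# The "5 points" threshold mentioned in the report.
THRESHOLD = 5.0

overlap_total = sum(rule_scores.values())
total = overlap_total + BAYES_50_ASSUMED

print(f"three OE rules alone: {overlap_total:.2f}")
print(f"with assumed BAYES_50: {total:.2f}")
print(f"over threshold: {total > THRESHOLD}")
```

Under these assumptions the three rules contribute most of the total for what
is effectively a single signal (one mail client version string).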

I've attached a sanitized version of the message (note that this was extracted
from a Request Tracker instance, so the headers are not in original order, and
some may be missing).

The OE version in question is likely one that is thoroughly unpatched (XP
pre-SP1, anyone?), but these rules should probably still be reviewed and
merged.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 6844] Almost-overlapping rules for "bad" OE version causing FP

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6844

--- Comment #7 from Kris Deugau <kd...@vianet.ca> ---
(In reply to comment #5)
> (In reply to comment #4)
> > We'll check FSL_UA / FSL_XM_419
> > 
> > AXB_XMAILER_MIMEOLE_OL_024C2 is autogenerated
> > 
> > These rules are basically dead safe to even use them to reject mail at smtp
> > level.
> > 
> > "user is an ISP customer" is not a reason to compromise spam detection &
> > lower a score  just because a user intentionally runs totally
> > outdated/insecure software.
> 
> I'll disagree. Rules are written for real-world experience and SA is not a
> security product enforcing patches and software upgrades.  If the rule hits
> on ham in real-world experience, it should be scored lower.  
> 
> What's the S/O on all three of these rules like?

0     26.3995     0.0011     1.000     0.97     1.00     FSL_XM_419
0     26.4033     0.0011     1.000     0.97     1.59     FSL_UA
0     26.3462     0.0011     1.000     0.97     2.01     AXB_XMAILER_MIMEOLE_OL_024C2

So they aren't hitting very much ham, but there's still some out there.
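
As a rough check on the figures above, the sketch below recomputes S/O from
the SPAM% and HAM% columns. It assumes the columns are, in order, msecs,
SPAM%, HAM%, S/O, rank, score, and that S/O is simply spam% / (spam% + ham%);
the helper name s_over_o is made up for this example.

```python
def s_over_o(spam_pct, ham_pct):
    """Share of a rule's hits that are spam, computed from percentages.

    ASSUMPTION: S/O = spam% / (spam% + ham%).
    """
    return spam_pct / (spam_pct + ham_pct)


# (rule name, SPAM%, HAM%) copied from the table above.
rows = [
    ("FSL_XM_419", 26.3995, 0.0011),
    ("FSL_UA", 26.4033, 0.0011),
    ("AXB_XMAILER_MIMEOLE_OL_024C2", 26.3462, 0.0011),
]

for name, spam_pct, ham_pct in rows:
    print(f"{name}: S/O = {s_over_o(spam_pct, ham_pct):.3f}")
```

With HAM% this small, all three round to 1.000, which matches the S/O column.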

However, I was more concerned about the overlap;  SA's duplicate rule detection
can't pick up on cases like this.  The FP is basically a side effect of the
overlap.

Looking at the SA log locally, I expect a lot of those 26% of spam hits from
the mass-check info also hit Spamhaus DNSBL rules;  we block with Spamhaus at
the MTA so the full ruleset only gets run on a small percentage of mail. 
Locally, they hit ~2.5% of mail.


Re: [Bug 6844] Almost-overlapping rules for "bad" OE version causing FP

Posted by Axb <ax...@gmail.com>.
On 09/27/2012 04:27 PM, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6844
>
> AXB <ax...@gmail.com> changed:
>
>             What    |Removed                     |Added
> ----------------------------------------------------------------------------
>               Status|NEW                         |RESOLVED
>           Resolution|---                         |FIXED
>
> --- Comment #8 from AXB <ax...@gmail.com> ---
> Most of these are sent thru freemailers and exploited accounts on ISPs' servers.
> More than 90% is 419 type, the rest is diplomas.
>
> Fact is that your sample is not pristine and that FP is absolutely exceptional
> (modified header / negligent user), and I by no means consider this a generic
> issue.
>
> We'll watch the overlap and act if required.
>

Re:

0 26.3995 	0.0011 	1.000 	0.97 	1.00 	FSL_XM_419
0 26.4033 	0.0011 	1.000 	0.97 	1.59 	FSL_UA
0 26.3462 	0.0011 	1.000 	0.97 	2.01 	AXB_XMAILER_MIMEOLE_OL_024C2



Please see

http://ruleqa.spamassassin.org/20120926-r1390330-n/AXB_XMAILER_MIMEOLE_OL_024C2/detail

http://ruleqa.spamassassin.org/20120926-r1390330-n/FSL_XM_419/detail

http://ruleqa.spamassassin.org/20120926-r1390330-n/FSL_UA/detail

The 0.0011 seen in non-spam all comes from the bb-jm corpus, which
is ancient and, imo, should have been removed long ago.

This issue has been raised before, but apparently some people consider stale
data to still be of value even if it has a negative effect.

Axb





[Bug 6844] Almost-overlapping rules for "bad" OE version causing FP

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6844

AXB <ax...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from AXB <ax...@gmail.com> ---
Most of these are sent thru freemailers and exploited accounts on ISPs' servers.
More than 90% is 419 type, the rest is diplomas.

Fact is that your sample is not pristine and that FP is absolutely exceptional
(modified header / negligent user), and I by no means consider this a generic
issue.

We'll watch the overlap and act if required.


[Bug 6844] Almost-overlapping rules for "bad" OE version causing FP

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6844

--- Comment #1 from AXB <ax...@gmail.com> ---
Apparently the sender is using an ancient (insecure) OE version and should be
upgraded. Iirc this version was EOL in 2002.

OE msg should include both headers as:

X-Mailer: Microsoft Outlook Express 6.00.2600.0000
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000

Something removed the X-Mailer header or it was intentionally forged.

I don't see this as a bug but more as a warning to the user that he/she should
update, urgently.


[Bug 6844] Almost-overlapping rules for "bad" OE version causing FP

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6844

--- Comment #4 from AXB <ax...@gmail.com> ---
We'll check FSL_UA / FSL_XM_419

AXB_XMAILER_MIMEOLE_OL_024C2 is autogenerated

These rules are basically dead safe to even use them to reject mail at smtp
level.

"user is an ISP customer" is not a reason to compromise spam detection & lower
a score  just because a user intentionally runs totally outdated/insecure
software.


[Bug 6844] Almost-overlapping rules for "bad" OE version causing FP

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6844

Kris Deugau <kd...@vianet.ca> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kdeugau@vianet.ca

--- Comment #3 from Kris Deugau <kd...@vianet.ca> ---
(In reply to comment #1)
> Apparently the sender is using an ancient (insecure) OE version and should
> be upgraded. Iirc this version was EOL in 2002.
> 
> OE msg should include both headers as:
> 
> X-Mailer: Microsoft Outlook Express 6.00.2600.0000
> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000
> 
> Something removed the X-Mailer header or it was intentionally forged.

Both headers are there, but as noted the attachment was extracted from a
Request Tracker instance, and for reasons beyond my understanding RT does not
preserve RFC822 attachments intact and unaltered - it rewrites the character
set and reorders the headers to various degrees.  X-Mailer is the second header
in the attachment, X-MimeOLE is in between Date: and To:.

> I don't see this as a bug but more as a warning to the user that he/she
> should update, urgently

Not under my control (user is an ISP customer), or it wouldn't have been a
problem in the first place.

Looking more closely at the rules, FSL_UA and FSL_XM_419 will almost always
trigger together;  one subrule in FSL_UA is almost identical to FSL_XM_419:

meta     FSL_UA       (__FSL_UA_1 || __FSL_UA_2)
header   __FSL_UA_1   User-Agent =~ /6\.00\.2600\.000/
header   __FSL_UA_2   X-Mailer   =~ /6\.00\.2600\.000/
header   FSL_XM_419   X-Mailer   =~ /\s+6\.00\.2600\.0000$/

The other subrule in FSL_UA triggers on the same version string in the
User-Agent header - which header I don't remember ever seeing in legitimate OE
mail.  That alone might make a better rule to keep (assuming it hits anything
at all;  a search through my archive of spam reports shows *no* examples of a
User-Agent header with that version number in it).

The AXB_XMAILER_MIMEOLE_OL_024C2 subrules are much more specific in matching on
the complete header value rather than just the version string, and require both
X-Mailer and X-MimeOLE headers to trigger the scored rule instead of one or the
other.
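
To illustrate the overlap, here is a small sketch that runs the two X-Mailer
patterns quoted above against sample header values. It uses Python's re module
as a stand-in for Perl regex matching, which should behave the same for these
simple patterns; the helper name x_mailer_hits and the second sample value are
made up for this example.

```python
import re

# Patterns copied from the rule definitions quoted above.
FSL_UA_2 = re.compile(r"6\.00\.2600\.000")          # __FSL_UA_2, X-Mailer
FSL_XM_419 = re.compile(r"\s+6\.00\.2600\.0000$")   # FSL_XM_419, X-Mailer


def x_mailer_hits(x_mailer):
    """Report which of the two X-Mailer patterns match a header value."""
    return {
        "__FSL_UA_2": bool(FSL_UA_2.search(x_mailer)),
        "FSL_XM_419": bool(FSL_XM_419.search(x_mailer)),
    }


# Header value from comment #1: both patterns match, so FSL_UA and
# FSL_XM_419 both fire on the same string.
print(x_mailer_hits("Microsoft Outlook Express 6.00.2600.0000"))

# Hypothetical value with a different final digit: only the looser
# __FSL_UA_2 pattern matches.
print(x_mailer_hits("Microsoft Outlook Express 6.00.2600.0001"))
```

Any X-Mailer value that matches FSL_XM_419 also matches __FSL_UA_2, since the
latter is an unanchored prefix of the former's version string.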


[Bug 6844] Almost-overlapping rules for "bad" OE version causing FP

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6844

Kevin A. McGrail <km...@pccc.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kmcgrail@pccc.com

--- Comment #5 from Kevin A. McGrail <km...@pccc.com> ---
(In reply to comment #4)
> We'll check FSL_UA / FSL_XM_419
> 
> AXB_XMAILER_MIMEOLE_OL_024C2 is autogenerated
> 
> These rules are basically dead safe to even use them to reject mail at smtp
> level.
> 
> "user is an ISP customer" is not a reason to compromise spam detection &
> lower a score  just because a user intentionally runs totally
> outdated/insecure software.

I'll disagree. Rules are written for real-world experience and SA is not a
security product enforcing patches and software upgrades.  If the rule hits on
ham in real-world experience, it should be scored lower.  

What's the S/O on all three of these rules like?


[Bug 6844] Almost-overlapping rules for "bad" OE version causing FP

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6844

--- Comment #6 from AXB <ax...@gmail.com> ---
S/O speaks for not changing anything,
but I'm working on changes that will reflect less overlap, and if there are
FPs on a 10-year-old MUA which is now considered ratware then we can reconsider
scores.

Give me a few mass-checks to finish this.


[Bug 6844] Almost-overlapping rules for "bad" OE version causing FP

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6844

--- Comment #2 from AXB <ax...@gmail.com> ---
Please see the SPAM% and HAM% for each rule at http://ruleqa.spamassassin.org/.
While there is overlap, it can vary (a lot) depending on the submitted corpus
data.
