Posted to users@spamassassin.apache.org by Ned Slider <ne...@unixmail.co.uk> on 2013/01/07 18:29:33 UTC

FPs on AXB_XMAILER_MIMEOLE_OL_B054A

Hi,

I'd just like to note some FPs on AXB_XMAILER_MIMEOLE_OL_B054A hitting 
some ham.

# grep _OL_B054A *.cf
72_active.cf:##{ AXB_XMAILER_MIMEOLE_OL_B054A
72_active.cf:meta   AXB_XMAILER_MIMEOLE_OL_B054A  (__AXB_XM_OL_B054A && __AXB_MO_OL_B054A)
72_active.cf:##} AXB_XMAILER_MIMEOLE_OL_B054A
72_active.cf:header __AXB_MO_OL_B054A  X-MimeOLE =~ /Produced\ By\ Microsoft\ MimeOLE\ V15\.4\.3555\.308/
72_active.cf:header __AXB_XM_OL_B054A  X-Mailer =~ /Microsoft\ Windows\ Live\ Mail\ 15\.4\.3555\.308/
72_scores.cf:score AXB_XMAILER_MIMEOLE_OL_B054A          3.499 2.121 3.499 2.121


The scores seem pretty high for what looks like a hit against a pretty 
standard X-Mailer and X-MimeOLE type, or am I missing something here? 
I'm not sure I understand the strategy or thinking behind the rule.

Looking back through my mail archives, it seems this rule was scoring 
0.001 until very recently, which is probably why it didn't hit my radar 
until now.
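
For anyone seeing the same hits, a local override is one way to pull the 
rule back to an informational score while it is being discussed. This is 
only a minimal sketch: it assumes the site configuration lives in 
local.cf (the path varies by install) and that 0.001 is the score you 
want. Neither is prescribed anywhere in this thread.

```
# local.cf (site configuration; the location is an assumption,
# e.g. /etc/mail/spamassassin/local.cf on many installs)
# Return the rule to the informational score it carried previously.
# A single value applies to all four score sets.
score AXB_XMAILER_MIMEOLE_OL_B054A 0.001
```

Running "spamassassin --lint" afterwards checks that the configuration 
still parses.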

Re: FPs on AXB_XMAILER_MIMEOLE_OL_B054A

Posted by Ned Slider <ne...@unixmail.co.uk>.

On 08/01/13 16:31, Kevin A. McGrail wrote:
> On 1/8/2013 11:27 AM, Kris Deugau wrote:
>> Ned Slider wrote:
>>> Hi,
>>>
>>> I'd just like to note some FPs on AXB_XMAILER_MIMEOLE_OL_B054A hitting
>>> some ham.
>> Rules in this cluster seem to target "obsolete" versions of MSOE and its
>> descendants. See
>> https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6844 for some
>> discussion around a similar rule.
>>
>> I can see the reasoning, but all too often ISP end users do not update
>> their systems, ever, causing these to be seen in live legitimate traffic.
>
> My $0.02. Rules often will hit on Spam and Ham so a FP should really be
> something that causes a Spam or Ham to be categorized incorrectly as a
> whole.
>
> For example, I may write a rule that scores 0.25 that hits on Spam but
> also some Ham. But I also have rules that are negative to negate the Ham
> impact.
>
> So if a score is particularly high on a single rule or it contributes to
> mismarking an email, it's a good thing to discuss. If it adds a small
> amount to a score, that's really not unexpected.
>
> So when the rule misfires on the Ham, is the ham still being overall not
> marked as Spam? Do you see a good amount of hits from the rule on Spam?
>
> Regards,
> KAM
>

Hi Kevin,

I absolutely take your point about scoring ham vs spam, and in this case 
the ham was indeed not misclassified as spam. Bayes was correctly 
scoring these, either neutrally or as ham. About the only rule hitting 
with any significant score was AXB_XMAILER_MIMEOLE_OL_B054A.

However, in order to improve overall efficiency I do take note and try 
to investigate when any rule hits on ham, especially when that rule is 
scored at anything much higher than an informational score. This rule 
came to my attention as the score has very recently increased from an 
informational score of 0.001 to a not insignificant 2.121 (and even 
higher for those not running network tests and/or bayes). If as you 
suggest it had a score of 0.25 then it almost certainly wouldn't have 
caught my attention.

The fact that it scores more than 40% of the points needed for a spam 
classification doesn't appear justified from examination of my corpus. I see absolutely no hits 
in my spam corpus dating back two years and covering over 10,000 
messages (I grant small by some standards). I see a small number of hits 
against ham dating back to June 2012 (perhaps around the time the rule 
was first introduced?) from a handful of senders.

Ultimately it has to come down to rule efficiency and the efficiency of 
this rule _for me_ is pretty awful even if it's not a huge issue. I see 
it performs a little better in the official corpus:

http://ruleqa.spamassassin.org/20130107-r1429709-n/AXB_XMAILER_MIMEOLE_OL_B054A/detail

It's probably fair to say that neither my corpus nor the SA corpus is 
ideal for judging the true performance of such rules, but in each case 
it's what we have to work with.

Having read the bugzilla Kris referenced I do now at least understand a 
little of the reasoning behind the rule :-)

Thanks for the responses.

Re: FPs on AXB_XMAILER_MIMEOLE_OL_B054A

Posted by "Kevin A. McGrail" <KM...@PCCC.com>.

On 1/8/2013 11:27 AM, Kris Deugau wrote:
> Ned Slider wrote:
>> Hi,
>>
>> I'd just like to note some FPs on AXB_XMAILER_MIMEOLE_OL_B054A hitting
>> some ham.
> Rules in this cluster seem to target "obsolete" versions of MSOE and its
> descendants.  See
> https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6844 for some
> discussion around a similar rule.
>
> I can see the reasoning, but all too often ISP end users do not update
> their systems, ever, causing these to be seen in live legitimate traffic.

My $0.02.  Rules often will hit on Spam and Ham so a FP should really be 
something that causes a Spam or Ham to be categorized incorrectly as a 
whole.

For example, I may write a rule that scores 0.25 that hits on Spam but 
also some Ham.  But I also have rules that are negative to negate the 
Ham impact.
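
As a rough illustration of that pattern, the two rules below are 
entirely hypothetical. The rule names, header patterns, and scores are 
made up for this example and are not in any shipped ruleset. They show a 
low-scoring rule that is allowed to hit some ham, paired with a negative 
rule that offsets it for mail known to be legitimate.

```
# Hypothetical low-score hint; expected to hit spam and some ham.
header   LOCAL_EXAMPLE_HINT    X-Mailer =~ /Example Bulk Mailer/
describe LOCAL_EXAMPLE_HINT    Hypothetical: mailer often seen in spam
score    LOCAL_EXAMPLE_HINT    0.25

# Hypothetical negative rule offsetting the hint for a known-good domain.
header   LOCAL_EXAMPLE_KNOWN   From:addr =~ /\@example\.org$/i
describe LOCAL_EXAMPLE_KNOWN   Hypothetical: sender domain known to be ham
score    LOCAL_EXAMPLE_KNOWN   -0.25
```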

So if a score is particularly high on a single rule or it contributes to 
mismarking an email, it's a good thing to discuss. If it adds a small 
amount to a score, that's really not unexpected.

So when the rule misfires on the Ham, is the ham still being overall not 
marked as Spam?  Do you see a good amount of hits from the rule on Spam?

Regards,
KAM

Re: FPs on AXB_XMAILER_MIMEOLE_OL_B054A

Posted by Ned Slider <ne...@unixmail.co.uk>.

On 08/01/13 16:27, Kris Deugau wrote:
> Ned Slider wrote:
>> Hi,
>>
>> I'd just like to note some FPs on AXB_XMAILER_MIMEOLE_OL_B054A hitting
>> some ham.
>
> Rules in this cluster seem to target "obsolete" versions of MSOE and its
> descendants.  See
> https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6844 for some
> discussion around a similar rule.
>
> I can see the reasoning, but all too often ISP end users do not update
> their systems, ever, causing these to be seen in live legitimate traffic.
>
> -kgd
>

Thanks Kris for the pointer and discussion.

Re: FPs on AXB_XMAILER_MIMEOLE_OL_B054A

Posted by Kris Deugau <kd...@vianet.ca>.

Ned Slider wrote:
> Hi,
> 
> I'd just like to note some FPs on AXB_XMAILER_MIMEOLE_OL_B054A hitting
> some ham.

Rules in this cluster seem to target "obsolete" versions of MSOE and its
descendants.  See
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6844 for some
discussion around a similar rule.

I can see the reasoning, but all too often ISP end users do not update
their systems, ever, causing these to be seen in live legitimate traffic.

-kgd