Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2009/08/07 10:14:59 UTC

[Bug 4892] The abbreviation for Oxfordshire causes high Spam Score

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=4892





--- Comment #21 from Cedric Knight <ce...@gn.apc.org>  2009-08-07 01:14:56 PST ---
Created an attachment (id=4505)
 --> (https://issues.apache.org/SpamAssassin/attachment.cgi?id=4505)
Another FUZZY_XPILL false positive, bulk non-spam, somewhat munged

But how many of the spam hits for FUZZY_XPILL are actually for the right
reason?  One third of the hits I see are on ham (which is why I had reduced
the score to 1.2), and the rest are on Cyrillic spam about Moscow removals
firms, tyre rebalancing etc., with no pharmaceutical connotations.  I don't
have samples, but it looks like there would be a high rate of FPs on Cyrillic
ham, in addition to the cases already discussed.  Here's another false
positive (English, bulk, ham, anonymised).

The modified rule still hits all fuzzy references to the drug name I can
devise, although it could use (?:_|\b) instead of \b.
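
To illustrate the (?:_|\b) point: an underscore counts as a word character,
so \b does not match between "_" and a letter.  The following is only a rough
sketch in Python; TOKEN is a made-up stand-in pattern and not the actual
FUZZY_XPILL rule, and the sample strings are invented for illustration.

```python
import re

# Hypothetical stand-in for the drug-name pattern; NOT the real FUZZY_XPILL rule.
TOKEN = r"p[i1!]ll"

# Anchored with plain word boundaries.
strict = re.compile(r"\b" + TOKEN + r"\b", re.IGNORECASE)
# Anchored with "underscore or word boundary" on each side.
relaxed = re.compile(r"(?:_|\b)" + TOKEN + r"(?:_|\b)", re.IGNORECASE)

def hits(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return pattern.search(text) is not None

# Invented sample strings: a plain hit, an underscore-delimited hit,
# and an innocent word that merely contains the token.
samples = ["buy p1ll now", "cheap_pill_here", "spillway repairs"]
for s in samples:
    print(f"{s!r}: strict={hits(strict, s)} relaxed={hits(relaxed, s)}")
```

In this sketch only the relaxed form matches the underscore-delimited sample,
and neither form matches the token embedded inside a longer word.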

I guess this rule will get zeroed by the score generation, but it would be good
to see the fix in update channels etc.  Could it please be reprioritised?
