Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2008/04/19 12:05:05 UTC

[Bug 5887] New: FP FB_CIALIS_LEO3 matches "financial is" "financially sound"

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5887

           Summary: FP FB_CIALIS_LEO3 matches "financial is" "financially
                    sound"
           Product: Spamassassin
           Version: 3.2.4
          Platform: Other
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: Rules
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: sidney@sidney.com


I got an FP on this rule in some email from a company named something Financial
with a line like "Soandso Financial is committed to ..."

I would just go ahead and change the rule to not match anything beginning with
financial, but I'm not enough of a Perl regexp expert to be confident of what
the cleanest way to do it is. The rule already begins with /(?!CIALIS)C to not
match a non-fuzzy CIALIS. What would be the right way to also exclude
FINANCIAL?
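
One possible answer, sketched in Python rather than Perl (the lookaround syntax used here is the same in both): a negative lookbehind placed in front of the existing negative lookahead. The pattern below is a hypothetical, much-simplified stand-in and is not the real FB_CIALIS_LEO3 regex; the names FUZZY_TAIL, ORIGINAL and NO_FINANCIAL are made up for illustration.

```python
import re

# Hypothetical stand-in for the rule, NOT the real FB_CIALIS_LEO3 pattern:
# an upper-case C, then I-A-L-I-S in any case, with up to three
# non-word characters (or underscores) allowed before each letter.
FUZZY_TAIL = r"(?i:[\W_]{0,3}i[\W_]{0,3}a[\W_]{0,3}l[\W_]{0,3}i[\W_]{0,3}s)"

# As described in the report: the lookahead skips a plain, non-fuzzy CIALIS.
ORIGINAL = re.compile(r"(?!CIALIS)C" + FUZZY_TAIL)

# Assumed fix: also refuse to start at a C that is preceded by "FINAN".
NO_FINANCIAL = re.compile(r"(?<!FINAN)(?!CIALIS)C" + FUZZY_TAIL)

ham = "E*TRADE FINANCIAL is committed to"
spam = "buy C.I.A.L.I.S now"

print(bool(ORIGINAL.search(ham)))       # the false positive
print(bool(NO_FINANCIAL.search(ham)))   # suppressed by the lookbehind
print(bool(NO_FINANCIAL.search(spam)))  # fuzzy spelling still hits
```

A lookbehind only covers the one word it names, so every new FP word would need its own exclusion.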


-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5887] FP FB_CIALIS_LEO3 matches "FINANCIAL is"






--- Comment #2 from Sidney Markowitz <si...@sidney.com>  2008-04-23 14:24:51 PST ---
I tested what would happen if I put a \b before and after the existing rule,
which would eliminate these two FPs. I would like opinions on whether the result
is worth putting in. On last night's mass check the test rule eliminates 7 out of
the 16 FPs while hitting 158 fewer spams (out of 19708). That gives it a better
S/O of 0.991 instead of 0.9845.
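
To illustrate the trade-off, here is a rough Python sketch of wrapping a pattern in \b. The regex is a hypothetical, simplified stand-in and not the real FB_CIALIS_LEO3 rule; FUZZY_TAIL, UNBOUNDED and BOUNDED are made-up names. It shows why word boundaries drop the "FINANCIAL is" hit and also why some spam that embeds the word inside other text stops matching.

```python
import re

# Hypothetical stand-in, NOT the real rule: upper-case C, then I-A-L-I-S
# in any case, with up to three non-word characters before each letter.
FUZZY_TAIL = r"(?i:[\W_]{0,3}i[\W_]{0,3}a[\W_]{0,3}l[\W_]{0,3}i[\W_]{0,3}s)"

UNBOUNDED = re.compile(r"(?!CIALIS)C" + FUZZY_TAIL)
# Same pattern, but it must start and end at a word boundary.
BOUNDED = re.compile(r"\b(?!CIALIS)C" + FUZZY_TAIL + r"\b")

samples = [
    "E*TRADE FINANCIAL is committed to",  # ham: C sits inside a word
    "buy C.I.A.L.I.S now",                # spam: stands alone as a word
    "cheapC-I-A-L-I-Spills",              # spam: glued to other letters
]

for text in samples:
    print(text, bool(UNBOUNDED.search(text)), bool(BOUNDED.search(text)))
```

In this sketch the third sample is the kind of spam the bounded version no longer catches.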

I added a rule to allow looking at the hits that are eliminated by the rule
change. Looking at its details, it appears that most of the FNs for the new
rule are high scoring spam anyway.

So any opinions as to whether I should change the rule to require word
boundaries?

SPAM% (of 1165303)  HAM% (of 61395)  S/O%   RANK  NAME
1.6776 (19549)      0.0147 (9)       0.991  0.90  T_SIDNEY_FB_CIALIS_LEO3
1.6912 (19708)      0.0261 (16)      0.985  0.89  FB_CIALIS_LEO3
0.0136 (158)        0.0114 (7)       0.545  0.56  T_SIDNEY_FB_CIAL_NONWORD
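
As a sanity check on the S/O column, here is a small Python sketch. It assumes S/O is simply SPAM% divided by (SPAM% + HAM%); that formula is an assumption, not something stated in this thread, and s_over_o is a made-up helper name. Under that assumption the first two rows reproduce to three decimals, while the third comes out near 0.544 rather than the listed 0.545, so the real tool may round or weight slightly differently.

```python
def s_over_o(spam_pct: float, ham_pct: float) -> float:
    """Assumed definition: share of the hit percentage that is spam."""
    return spam_pct / (spam_pct + ham_pct)

# (SPAM%, HAM%) pairs copied from the table above.
rows = {
    "T_SIDNEY_FB_CIALIS_LEO3": (1.6776, 0.0147),
    "FB_CIALIS_LEO3": (1.6912, 0.0261),
    "T_SIDNEY_FB_CIAL_NONWORD": (0.0136, 0.0114),
}

for name, (spam_pct, ham_pct) in rows.items():
    print(f"{name}: {s_over_o(spam_pct, ham_pct):.4f}")
```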



[Bug 5887] FP FB_CIALIS_LEO3 matches "FINANCIAL is"



Sidney Markowitz <si...@sidney.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED




--- Comment #4 from Sidney Markowitz <si...@sidney.com>  2008-04-24 13:42:36 PST ---
Committed to update channel (branches/3.2/72_active.cf) as revision 651411.

Is that the only place I need to change it? Does the version in
rules/trunk/sandbox/emailed/00_FVGT_File001.cf get automatically promoted
somehow?



[Bug 5887] FP FB_CIALIS_LEO3 matches "FINANCIAL is"






--- Comment #5 from Sidney Markowitz <si...@sidney.com>  2008-04-24 13:58:27 PST ---
I see that it has to be done in the sandbox too in order to get into the trunk:

rules/trunk/sandbox/emailed/00_FVGT_File001.cf
Committed revision 651418.



[Bug 5887] FP FB_CIALIS_LEO3 matches "FINANCIAL is"






--- Comment #3 from Justin Mason <jm...@jmason.org>  2008-04-24 01:13:42 PST ---
+1, go ahead and replace it.



[Bug 5887] FP FB_CIALIS_LEO3 matches "FINANCIAL is"



Sidney Markowitz <si...@sidney.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|FP FB_CIALIS_LEO3 matches   |FP FB_CIALIS_LEO3 matches
                   |"financial is" "financially |"FINANCIAL is"
                   |sound"                      |




--- Comment #1 from Sidney Markowitz <si...@sidney.com>  2008-04-21 04:00:14 PST ---
I was sloppy in reporting this bug. The rule requires the initial C to be upper
case, and indeed the FP was on a sentence that began "E*TRADE FINANCIAL is".

I changed the summary to be more accurate.

Someone sent a letter to sa-dev complaining about an FP on this rule matching
the word Catalise, which a Google search reveals is the name of some companies
and a commonly appearing spelling variant of the Portuguese word for catalyst.

