Posted to users@spamassassin.apache.org by Robert Menschel <Ro...@Menschel.net> on 2004/12/01 18:56:12 UTC

Re[2]: Japanese False Positives with Spam Assassin 3.01 and RH WS 3.0

Hello Robert,

Tuesday, November 30, 2004, 9:25:52 PM, Daniel wrote:

DQ> The problem doesn't sound like it's SpamAssassin despite the subject
DQ> line of this email, rather it's third-party rulesets.

I agree.

DQ> "Johnson, Robert F" <ro...@intel.com> writes:

>> Based on spot checking of a couple of dozen examples, I didn't see any
>> significant pattern of out-of-the-box rules being involved, mostly SARE
>> or WIKI rules.  The most heavily implicated were the following
>> (MANGLED and SARE_SUB_CASH_CHAR probably had the biggest impact):
>> 
>> SARE Rules
>> SARE_SUB_CASH_CHAR
>> SARE_RAND_2

Can you email me a couple of examples that hit these rules,
preferably in a zip or gz file? I maintain the Subject rules file for
SARE, and would like to refine/rescore SARE_SUB_CASH_CHAR to help
avoid your FPs. I'll also forward the info to the SARE ninja that
maintains our Random rules file.

>> WIKI Rules
>> MANGLED_LIST
>> MANGLED_LIPS
>> J_CHICKENPOX_12
>> J_CHICKENPOX_22

All of these are language-related rules, which work well in English,
might be subject to an occasional misfire in a non-English Western
European language, and can readily misfire in any
non-Latin/non-Romance language. If you regularly get non-spam in
Japanese, you should probably drop the entire MANGLED and CHICKENPOX
families. If you're using Tripwire, you should drop that also since it
too can misfire on Japanese non-spam.
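
As a rough sketch of the above, assuming you would rather neutralize
individual rules than remove the ruleset files: setting a rule's score
to 0 in local.cf disables it. Only the four rule names reported in this
thread are listed here. The remaining members of each family would need
the same treatment, or the corresponding rules files could be removed
from your configuration directory instead.

```
# local.cf: a score of 0 disables the rule
score MANGLED_LIST     0
score MANGLED_LIPS     0
score J_CHICKENPOX_12  0
score J_CHICKENPOX_22  0
```

Running "spamassassin --lint" afterwards should confirm the file still
parses cleanly.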

Bob Menschel