Posted to users@spamassassin.apache.org by David Hobley <da...@mionegroup.com> on 2008/02/06 11:00:39 UTC

Problems with CHARSET_FARAWAY_HEADER & UNWANTED_MESSAGE_BODY (was Re: Japanese emails being triggered as Spam incorrectly...)

All, 

I have been trying to work out what is the core issue here, but I am still stumped. Can anyone offer any suggestions? 

Cheers, 
David 
----- Original Message ----- 
From: "David Hobley" <da...@mionegroup.com> 
To: users@spamassassin.apache.org 
Sent: Thursday, 31 January 2008 02:30:45 PM (GMT+1000) Australia/Sydney 
Subject: Japanese emails being triggered as Spam incorrectly... 



All, 

I have a very bizarre issue here - we use Zimbra and its built-in SpamAssassin to manage our spam - we get a lot of Japanese emails in, so I have configured 

ok_languages en jp 
ok_locales en jp 

in local.cf. I have also edited v310pre.in to enable TextCat. SpamAssassin has then been restarted. 

However, our Japanese emails are still triggering UNWANTED_LANGUAGE_BODY and CHARSET_FARAWAY_HEADER. Here is an example which, given the tagging of iso-2022-jp, should not be triggering these rules (as I understand it): 

Received: from localhost (localhost.localdomain [127.0.0.1]) 
by mail.onegrp.com (Postfix) with ESMTP id 831EF1289AF7 
for <us...@domain>; Tue, 22 Jan 2008 16:45:44 +1000 (EST) 
X-Virus-Scanned: amavisd-new at 
X-Spam-Flag: YES 
X-Spam-Score: 5.832 
X-Spam-Level: ***** 
X-Spam-Status: Yes, score=5.832 tagged_above=-10 required=3 tests=[AWL=-1.189, 
BAYES_50=0.001, CHARSET_FARAWAY_HEADER=3.2, GAPPY_SUBJECT=1.02, 
HTML_MESSAGE=0.001, SPF_PASS=-0.001, UNWANTED_LANGUAGE_BODY=2.8] 
Received: (qmail 34635 invoked by uid 60001); 22 Jan 2008 06:59:45 -0000 
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; 
s=yj20050223; d=yahoo.co.jp; 
h=Message-ID:Received:Date:From:Subject:To:MIME-Version:Content-Type; 
b=ig41AiIB0bZcGPlzggDID3xDLYQ5eYgSqF3yieHoP8SrApfr AbdWM4zGTNhNHMSmuAyuluzrKiLBpDH/hFfTsfZiTvMPgTDP6wg7iQq+5lNjn1eWjR93CAR8DLFp+hbf ; 
Message-ID: <blah blah blah> 
Tue, 22 Jan 2008 15:59:45 JST 
Date: Tue, 22 Jan 2008 15:59:45 +0900 (JST) 
From: =?ISO-2022-JP?B?GyRCQHVMbhsoQiAbJEIkIiRmO1IbKEI=?= <us...@sourcedomain> 
Subject: [*** SPAM ***]=?ISO-2022-JP?B?GyRCJTUlcyVXJWslUSVDJS8kTjdvGyhC?= 
To: user@domain 
MIME-Version: 1.0 
Content-Type: multipart/alternative; boundary="0-2144660122-1200985185=:33655" 

--0-2144660122-1200985185=:33655 
Content-Type: text/plain; charset=iso-2022-jp 

あゆみさま 

。。。 





--------------------------------- 
Easy + Joy + Powerful = Yahoo! Bookmarks x Toolbar 

--0-2144660122-1200985185=:33655 
Content-Type: text/html; charset=iso-2022-jp 

<div>あゆみさま</div> 


--0-2144660122-1200985185=:33655-- 

I assume I have stuffed something up. Can anyone point me in the right direction, please? 

Cheers, 
David 


Re: Problems with CHARSET_FARAWAY_HEADER & UNWANTED_MESSAGE_BODY (was Re: Japanese emails being triggered as Spam incorrectly...)

Posted by Matt Kettler <mk...@verizon.net>.
David Hobley wrote:
> All,
>
> I have been trying to work out what is the core issue here, but I am 
> still stumped. Can anyone offer any suggestions?
Yes, the code for Japanese in ok_locales and ok_languages is ja, not jp. 
Your current setting essentially boils down to English only.
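
As a sketch of that correction, assuming the rest of local.cf stays as
David posted it and TextCat remains enabled, the two lines would become:

ok_languages en ja
ok_locales en ja

SpamAssassin would need restarting again for the change to take effect.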



Re: Problems with CHARSET_FARAWAY_HEADER & UNWANTED_MESSAGE_BODY (was Re: Japanese emails being triggered as Spam incorrectly...)

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Wed, 2008-02-06 at 20:00 +1000, David Hobley wrote:
> I have been trying to work out what is the core issue here, but I am
> still stumped. Can anyone offer any suggestions?

What don't you like about my reply, sent within a couple of hours of
your OP a week ago?
  http://mail-archives.apache.org/mod_mbox/spamassassin-users/200801.mbox/%3c1201782786.9551.3.camel@monkey.loc%3e

I told you how to fix your problem *and* pointed to the relevant
documentation -- which you apparently did not bother to read carefully
while "struggling" with this issue for days...

  guenther


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}