Posted to users@spamassassin.apache.org by LuKreme <kr...@kreme.com> on 2014/09/01 01:39:38 UTC

Re: Give a penalty to messages with non latin UTF-8 characters?

On 31 Aug 2014, at 14:38 , Ian Zimmerman <it...@buug.org> wrote:

> Don't ok_languages and ok_locales do the job?  They do for me.

Not with UTF-8 encoding; that setting only seems to apply to old-style character declarations.

-- 
showing snuffy is when Sesame Street jumped the shark


Re: Give a penalty to messages with non latin UTF-8 characters?

Posted by Reindl Harald <h....@thelounge.net>.

Am 20.10.2014 um 20:09 schrieb Philip Prindeville:
> I don’t understand why Apple’s Mail.app, for instance,
> defaults to Win-1252 here in the US. That’s braindead

well, ask the Firefox developers why recent releases
report the charset as "windows-1252" when in fact
the HTTP headers as well as the meta tags
clearly say "ISO-8859-1"

https://bugzilla.mozilla.org/show_bug.cgi?id=288904

the whole IT world has been going crazy about this for years now


Re: Give a penalty to messages with non latin UTF-8 characters?

Posted by Philip Prindeville <ph...@redfish-solutions.com>.
On Oct 17, 2014, at 9:53 AM, Michael Opdenacker <mi...@free-electrons.com> wrote:

> On 09/01/2014 01:39 AM, LuKreme wrote:
>> On 31 Aug 2014, at 14:38 , Ian Zimmerman <it...@buug.org> wrote:
>> 
>>> Don't ok_languages and ok_locales do the job?  They do for me.
>> Not with UTF-8 encoding; that setting only seems to apply to old-style character declarations.
>> 
> 
> This was exactly my point. As long as characters are in utf-8,
> ok_locales doesn't trigger. And ok_languages needs a sufficient number
> of characters to trigger. A subject with only Chinese characters in
> UTF-8 isn't enough.
> 
> Michael.

I explicitly add 10.0 to messages with a charset of GB2312.  Unfortunately, a lot of Chinese engineers use clients that still default to this charset, and they post to English-language mailing lists.
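
A rough Python illustration of that kind of charset-based penalty follows. It is only a sketch of the idea, not SpamAssassin's rule engine: the CHARSET_PENALTIES table and the charset_penalty() helper are made-up names, and only the 10.0 score for GB2312 comes from the message above.

```python
from email import message_from_string

# Hypothetical penalty table; the 10.0 for GB2312 mirrors the score
# mentioned in the message, everything else here is an assumption.
CHARSET_PENALTIES = {"gb2312": 10.0}


def charset_penalty(raw_message: str) -> float:
    """Sum the penalties for every distinct charset declared in the message."""
    msg = message_from_string(raw_message)
    # get_charsets() yields one entry per MIME part, None where a part
    # declares no charset; declared names come back lowercased.
    declared = {cs.lower() for cs in msg.get_charsets() if cs}
    return sum(CHARSET_PENALTIES.get(cs, 0.0) for cs in declared)
```

Note that this only looks at the declared charset, so it shares the limitation discussed in this thread: a message that carries Chinese text as UTF-8 gets no penalty.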

There used to be a recommendation that MUAs fit messages into the “smallest” encoding possible (smallest in terms of how many characters it can represent), i.e. US-ASCII, then Latin-1, then UTF-8.  Period.

Thus Chinese posters of legitimate messages to English-language mailing lists would use US-ASCII or Latin-1 (if replying to someone named André).
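
The "smallest encoding that fits" rule can be sketched in a few lines of Python. This is an illustration of the recommendation as described here, not code taken from any MUA; the function name smallest_charset() and the exact candidate order are assumptions.

```python
def smallest_charset(text: str,
                     candidates=("us-ascii", "iso-8859-1", "utf-8")) -> str:
    """Return the first candidate charset that can encode the whole text.

    Candidates are tried from the smallest repertoire to the largest,
    so plain English text never gets labelled as anything but US-ASCII.
    """
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # some character does not fit; try the next charset
        return name
    raise ValueError("text cannot be encoded in any candidate charset")
```

Under this rule an all-English reply would be sent as US-ASCII, a reply quoting "André" as Latin-1, and only text that really contains Chinese characters as UTF-8.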

I don’t understand why Apple’s Mail.app, for instance, defaults to Win-1252 here in the US. That’s braindead.

Apple won’t bundle Flash with MacOS because it’s not an Open Standard, but they’ll embrace a vendor-specific character code when a superior Open Standard encoding exists.  Go figure.

-Philip


Re: Give a penalty to messages with non latin UTF-8 characters?

Posted by Michael Opdenacker <mi...@free-electrons.com>.
On 09/01/2014 01:39 AM, LuKreme wrote:
> On 31 Aug 2014, at 14:38 , Ian Zimmerman <it...@buug.org> wrote:
>
>> Don't ok_languages and ok_locales do the job?  They do for me.
> Not with UTF-8 encoding; that setting only seems to apply to old-style character declarations.
>

This was exactly my point. As long as characters are in utf-8,
ok_locales doesn't trigger. And ok_languages needs a sufficient number
of characters to trigger. A subject with only Chinese characters in
UTF-8 isn't enough.
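
To make concrete what such a check would have to do, here is a minimal Python sketch that decodes an encoded Subject header and measures how much of it is non-Latin. It is a standalone illustration, not how SpamAssassin implements ok_locales or ok_languages; decode_subject() and non_latin_ratio() are made-up helper names, and treating "Unicode name starts with LATIN" as the definition of a Latin letter is a simplifying assumption.

```python
import unicodedata
from email.header import decode_header, make_header


def decode_subject(raw_subject: str) -> str:
    """Decode RFC 2047 encoded-words (e.g. =?UTF-8?B?...?=) into text."""
    return str(make_header(decode_header(raw_subject)))


def non_latin_ratio(text: str) -> float:
    """Return the fraction of letters that are not Latin-script letters.

    Digits, spaces and punctuation are ignored. A letter counts as Latin
    when its Unicode character name starts with "LATIN" (an assumption
    made for this sketch).
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    non_latin = [ch for ch in letters
                 if not unicodedata.name(ch, "").startswith("LATIN")]
    return len(non_latin) / len(letters)
```

Because the ratio is computed per character after decoding, it works the same for a two-character subject as for a long body, and it does not depend on which charset the message declares.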

Michael.

-- 
Michael Opdenacker, CEO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com
+33 484 258 098