Posted to users@spamassassin.apache.org by "Johnson, Robert F" <ro...@intel.com> on 2005/04/13 00:11:19 UTC
Rules to identify simplified and traditional Chinese character sets
I have a requirement for a rule that will identify emails using either
traditional or simplified Chinese character sets.
I was able to create a rule that finds these charsets in the Internet
headers, but I have noticed that some emails identify the charset in a
MIME part header rather than in the Internet header.
This code fragment illustrates how I do this for Internet headers:
header CHINESE_WL_1 Content-Type =~ /gb2312/i
describe CHINESE_WL_1 White list Simplified Chinese
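For the traditional Chinese half of the requirement, the same pattern presumably applies with the Big5 charset name. This is a sketch under assumptions rather than a tested rule: the name CHINESE_WL_2 is made up for illustration, and mail labeled with other charsets (GBK, GB18030, UTF-8) would not be caught by either rule.

```
# Hypothetical companion rule; Big5 is the usual Traditional Chinese charset label
header CHINESE_WL_2 Content-Type =~ /big5/i
describe CHINESE_WL_2 White list Traditional Chinese
```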
Does anyone know how to create a rule to detect these codes in a MIME
header?
Re: Rules to identify simplified and traditional Chinese character sets
Posted by Loren Wilton <lw...@earthlink.net>.
> This code fragment illustrates how I do this for Internet headers:
>
> header CHINESE_WL_1 Content-Type =~ /gb2312/i
> describe CHINESE_WL_1 White list Simplified Chinese
>
> Does anyone know how to create a rule to detect these codes in a mime
> header?
There was talk on the dev list a while back of being able to test the items
in MIME headers. I'm not clear on whether anything ever came of that.
In any case, you can use a 'full' rule, which runs against the raw message
including the MIME part headers, to find them.
Perhaps something like (untested):
full CHINESE_xxx /^Content-Type:.*\bgb2312\b/im
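The charset normally appears as a parameter after the media type (for example `Content-Type: text/plain; charset=gb2312`), so the pattern has to allow other text between `Content-Type:` and the charset name. A slightly fuller sketch covering both character sets might look like the following. It is equally untested; the rule names are placeholders, and it assumes the charset parameter sits on the same line as `Content-Type:` rather than on a folded continuation line. The last rule assumes the MIMEHeader plugin (Mail::SpamAssassin::Plugin::MIMEHeader) is loaded in your installation.

```
# Placeholder names; assumes charset= is on the same line as Content-Type:
full     CHINESE_WL_GB_FULL  /^Content-Type:.*\bcharset="?gb2312\b/im
describe CHINESE_WL_GB_FULL  Simplified Chinese charset in MIME part header

full     CHINESE_WL_B5_FULL  /^Content-Type:.*\bcharset="?big5\b/im
describe CHINESE_WL_B5_FULL  Traditional Chinese charset in MIME part header

# Alternative, only if the MIMEHeader plugin is available and loaded
mimeheader CHINESE_WL_GB_MIME Content-Type =~ /\bcharset="?gb2312\b/i
describe   CHINESE_WL_GB_MIME Simplified Chinese charset in MIME part header
```

No scores are set here, so the rules would take the default score until you assign one. Since 'full' rules scan the entire raw message, they cost more than header rules, which is one reason to prefer the mimeheader form where it is available.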
Loren