Posted to users@spamassassin.apache.org by Arvid Ephraim Picciani <ae...@ibcsolutions.de> on 2008/04/10 12:38:29 UTC

foreign languages

Greetings.
Any ideas for dealing with spam in Russian and Chinese? (Some of it even has a broken charset.)
XBL and Bayes are very effective, but not enough. :/
I'd like to have some kind of language matcher. We don't have anyone who speaks
Russian in the company, so it would be nice to add 1 or 2 points based on the
language alone.
-- 
best regards
Arvid Ephraim Picciani

Re: foreign languages

Posted by Arvid Ephraim Picciani <ae...@ibcsolutions.de>.
Thanks, Matt and Matus. That helps.

-- 
best regards/Mit freundlichen Grüßen
Arvid Ephraim Picciani

Re: foreign languages

Posted by Matt Kettler <mk...@verizon.net>.
Arvid Ephraim Picciani wrote:
> greetings.
> any ideas for spam in Russian and Chinese? (some even with broken charset)
> XBL and Bayes are very effective but not enough :/
> I'd like to have some kind of language matcher. We don't have people speaking 
> Russian in the company so it would be nice to give 1 or 2 points on just the 
> language.
>   
Well, SpamAssassin has two tools to help here.

ok_locales will check character sets. By default it allows everything, 
but you can change it to only allow character sets that are appropriate 
for your locale.
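
As a rough sketch of what that could look like in local.cf (the choice of
"en" is only an assumption here; list whichever locales you actually expect
legitimate mail in):

```
# local.cf
# Assumption: only Western character sets are expected at this site.
# Mail in other character sets (Cyrillic, Chinese, ...) can then hit
# the CHARSET_FARAWAY and CHARSET_FARAWAY_HEADER rules.
ok_locales en
```

As far as I know the accepted locale codes are en, ja, ko, ru, th and zh,
plus "all", but check the Conf documentation for your version before relying
on that list.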

Also, there's the TextCat plugin, which you'd have to un-comment in 
v310.pre. Once that's enabled, you can start using ok_languages, which 
tries to guess at the language of a message based on character combinations.
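
A minimal sketch of the two pieces involved, assuming you want English and
German treated as expected languages (the language list and the score value
are illustrative assumptions, not recommendations):

```
# v310.pre: remove the leading '#' from this line to load the plugin
loadplugin Mail::SpamAssassin::Plugin::TextCat

# local.cf: languages you expect to receive (assumed: English and German)
ok_languages en de

# Optional: adjust the score of the rule that fires on other languages.
# 2.0 is just an example value in the "1 or 2 points" range.
score UNWANTED_LANGUAGE_BODY 2.0
```

Running "spamassassin --lint" afterwards should tell you whether the
configuration still parses cleanly.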

Please read the docs closely, as there are a lot more languages than 
locales, so what's valid for one isn't valid for the other. (There are 
lots of languages that all use the same character sets.)

http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Conf.html#language_options

http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Plugin_TextCat.html

Re: foreign languages

Posted by Matus UHLAR - fantomas <uh...@fantomas.sk>.
On 10.04.08 12:38, Arvid Ephraim Picciani wrote:
> any ideas for spam in Russian and Chinese? (some even with broken charset)
> XBL and Bayes are very effective but not enough :/
> I'd like to have some kind of language matcher. We don't have people speaking 
> Russian in the company so it would be nice to give 1 or 2 points on just the 
> language.

Look at the TextCat plugin and the ok_languages setting.

There's also the ok_locales setting, which matches on the character set and
does not require any plugin.
-- 
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Honk if you love peace and quiet.