Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2014/10/15 17:43:12 UTC

[Bug 7091] New: UTF-8 characters don't work in rules

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7091

            Bug ID: 7091
           Summary: UTF-8 characters don't work in rules
           Product: Spamassassin
           Version: unspecified
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: spamassassin
          Assignee: dev@spamassassin.apache.org
          Reporter: Darxus@ChaosReigns.com

They need to be escaped.  For example:

/ą/ needs to be written as /\x{105}/  (or /[\x{104}\x{105}]/ if case
insensitive; U+0104 is the uppercase Ą, U+0105 the lowercase ą).
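
A minimal Perl sketch of one way a literal character in a pattern can fail while the escaped form matches. It assumes the text being scanned has been decoded to characters and the pattern was read as raw, undecoded UTF-8 bytes. That is an assumption for illustration only; the report does not establish that this is what happens inside SpamAssassin. All variable names are made up for the example. U+0105 is the lowercase ą and U+0104 the uppercase Ą.

```perl
use strict;
use warnings;
use Encode qw(decode);

# UTF-8 encoding of U+0105 (lowercase a with ogonek), spelled as explicit
# bytes so this sketch stays pure ASCII.
my $raw_bytes = "\xC4\x85";

# Assumption: the text being scanned has been decoded to characters,
# so it holds the single character U+0105.
my $decoded = decode('UTF-8', $raw_bytes);

# A pattern built from the undecoded bytes is two characters long:
# U+00C4 followed by U+0085.
my $byte_pattern = qr/$raw_bytes/;

# Escaped patterns name the code points directly.
my $escaped_pattern = qr/\x{105}/;
my $either_case     = qr/[\x{104}\x{105}]/;

my $byte_match    = ($decoded  =~ $byte_pattern)    ? 1 : 0;
my $escaped_match = ($decoded  =~ $escaped_pattern) ? 1 : 0;
my $upper_match   = ("\x{104}" =~ $either_case)     ? 1 : 0;

print "byte pattern vs decoded text:    $byte_match\n";
print "escaped pattern vs decoded text: $escaped_match\n";
print "character class vs uppercase:    $upper_match\n";
```

Under these assumptions the byte pattern does not match the decoded text, while both escaped patterns do.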

I'm using perl v5.10.1 and spamassassin 3.4.0-rsvnunknown.  This problem seems
to exist for everyone currently, based on discussion on the users list: 
http://spamassassin.1065346.n5.nabble.com/UTF-8-rules-what-am-I-missing-td111934.html

http://spamassassin.1065346.n5.nabble.com/UTF-8-Spam-rules-td106485.html


Karsten Bräckelmann posted in September 2013 that it worked with SA 3.2 on Perl
5.8.x: 
http://spamassassin.1065346.n5.nabble.com/UTF-8-Spam-rules-td106485.html

That seems to have been when perl had more flexible UTF-8 handling: "-C on its
own ... follows the implicit (and problematic) UTF-8 behaviour of Perl 5.8.0."
- http://perldoc.perl.org/perlrun.html

I made some attempts to run spamassassin with this perl -C, unsuccessfully. 
Not sure what I did wrong.


Re: [Bug 7091] New: UTF-8 characters don't work in rules

Posted by Linda Walsh <sa...@tlinx.org>.

bugzilla-daemon@issues.apache.org wrote:

> I made some attempts to run spamassassin with this perl -C, unsuccessfully. 
> Not sure what I did wrong.

What I use, and it _usually_ works, is putting "-Mutf8 -CSA" in my PERL5OPT
env var. Perl then treats stdin/out/err and arguments as utf8, but doesn't do
translations on files opened with "open". Usually it's the stream I/O where I
need it most often.
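
To check what a given -C setting actually turned on, a script can inspect Perl's ${^UNICODE} variable, which holds the -C flags as a bitmask. The sketch below decodes it using the bit values listed in perlrun; the helper name describe_unicode_flags is hypothetical. With -CSA the value should be 39 (S = 1+2+4 for stdin/out/err, A = 32 for @ARGV), and the bits that cover files opened with "open" (8 and 16) stay off.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a -C bitmask into named settings.
# Bit values are the ones perlrun lists for the -C letters.
sub describe_unicode_flags {
    my ($flags) = @_;
    return {
        stdin       => ($flags & 1)  ? 1 : 0,   # I
        stdout      => ($flags & 2)  ? 1 : 0,   # O
        stderr      => ($flags & 4)  ? 1 : 0,   # E
        open_input  => ($flags & 8)  ? 1 : 0,   # i: default layer for input streams
        open_output => ($flags & 16) ? 1 : 0,   # o: default layer for output streams
        argv        => ($flags & 32) ? 1 : 0,   # A
    };
}

# ${^UNICODE} reflects the -C switch (or PERL_UNICODE) of the running perl.
my $current = describe_unicode_flags(${^UNICODE});
for my $name (sort keys %$current) {
    printf "%-12s %s\n", $name, $current->{$name} ? 'utf8' : 'off';
}
```

Running it once plainly and once as `perl -CSA script.pl` (script.pl being whatever the file is saved as) shows the difference between the two setups.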