Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2016/06/15 23:35:31 UTC

[Bug 7314] Bayes.pm, DECOMPOSE_BODY_TOKENS and Unicode

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7314

Mark Martinec <Ma...@ijs.si> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|Undefined                   |4.0.0

--- Comment #1 from Mark Martinec <Ma...@ijs.si> ---
> Maybe it is better to work with Unicode characters not just in
> DECOMPOSE_BODY_TOKENS section, but everywhere in _tokenize_line sub ...

Thanks. Yes, there are still several problems associated with the
historical assumption of single-byte characters. Some have been
addressed in the current trunk code, but there are more, like the one
reported here. To be addressed in the next major version (4.0) ...
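
For illustration only: the following is a minimal Python sketch of the general
byte-versus-character mismatch, not SpamAssassin code and not the actual
behaviour of _tokenize_line. The sample token and the variable names are
assumptions made up for this example.

```python
# Hypothetical token; "\u00e9" is a single character (e with acute accent)
# that UTF-8 encodes as two bytes.
token = "caf\u00e9"
raw = token.encode("utf-8")

# The character count and the byte count disagree for non-ASCII text.
print(len(token), len(raw))

# Cutting at a byte offset can split a multi-byte character in half,
# leaving an invalid sequence that decodes to a replacement character.
cut = raw[:4]
print(cut.decode("utf-8", errors="replace"))

# Cutting at a character offset keeps every character intact.
print(token[:4])
```

Any code that measures, truncates, or splits tokens while assuming one byte
per character can run into the same kind of mismatch on UTF-8 input.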
