Posted to users@spamassassin.apache.org by "Kevin A. McGrail" <km...@apache.org> on 2018/09/04 15:04:16 UTC

use bytes was Re: Non-ascii subjects with images

On 9/4/2018 10:57 AM, RW wrote:
> My understanding is that for historical reasons there is heavy use of
> 'use bytes', so SpamAssassin sees text as a series of bytes in whatever
> character set it's written in. normalize_charset allows text to be
> converted to UTF-8, which makes it easier to match byte sequences, but a
> byte can't be an emoticon etc.
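
To make the byte-versus-character distinction concrete, here is a minimal
sketch. It is written in Python rather than Perl, so bytes and str only
stand in for Perl's byte and character semantics; the subject string and
the patterns are made up for illustration and are not SpamAssassin rules.

```python
import re

# Hypothetical subject line containing one emoji (U+1F600).
subject = "Sale \U0001F600 today"
raw = subject.encode("utf-8")  # what a byte-oriented matcher sees

char_len = len(subject)  # the emoji counts as one character
byte_len = len(raw)      # the same emoji occupies four UTF-8 bytes

# Character view: "." consumes the whole emoji, so the pattern matches.
dot_matches_chars = re.match(r"Sale . today", subject) is not None

# Byte view: "." consumes a single byte, so the same pattern fails.
dot_matches_bytes = re.match(rb"Sale . today", raw) is not None

# A byte-level rule has to spell out the UTF-8 sequence instead.
# F0 9F 98 80..BF covers U+1F600 through U+1F63F.
seq_matches_bytes = re.search(rb"\xf0\x9f\x98[\x80-\xbf]", raw) is not None

print(char_len, byte_len, dot_matches_chars, dot_matches_bytes, seq_matches_bytes)
```

Once the text is normalized to UTF-8, a byte-level pattern can still find
the emoji, but only by matching its multi-byte sequence, never a single byte.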

3.4 has 'use bytes' removed. See
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7232

Mark had concerns about backporting it, though, so I was just considering
reopening this issue.

Any comment on that?

-- 
Kevin A. McGrail
VP Fundraising, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171