Posted to mime4j-dev@james.apache.org by "Benoit Tellier (Jira)" <mi...@james.apache.org> on 2022/05/19 08:32:00 UTC

[jira] [Commented] (MIME4J-216) 8bit character broken when parsing multi-line encoded subject

    [ https://issues.apache.org/jira/browse/MIME4J-216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539405#comment-17539405 ] 

Benoit Tellier commented on MIME4J-216:
---------------------------------------

Reproduced the issue. I bet the Q-encoded words can be normalized after unfolding.

I will have a look today; it looks fun!


{code:java}
    @Test
    public void test() throws Exception {
        DefaultMessageBuilder messageBuilder = new DefaultMessageBuilder();
        messageBuilder.setMimeEntityConfig(MimeConfig.PERMISSIVE);
        messageBuilder.setDecodeMonitor(DecodeMonitor.SILENT);

        final Message message = messageBuilder.parseMessage(new ByteArrayInputStream(("Subject: Re: =?UTF-8?Q?=D8=AA=D8=B2_=D8=A2=D9=82=D8=A7=DB=8C_=DA=A9=D8=B1=D8=A7=D9=85=D8=AA=DB=8C?=\r\n").getBytes()));

        System.out.println(message.getSubject());
    }
{code}


Returns


{code:java}
Re: تز آقای کرامتی
{code}


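But when the same subject is split across two folded encoded words, in the middle of a multi-byte character: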


{code:java}
    @Test
    public void test() throws Exception {
        DefaultMessageBuilder messageBuilder = new DefaultMessageBuilder();
        messageBuilder.setMimeEntityConfig(MimeConfig.PERMISSIVE);
        messageBuilder.setDecodeMonitor(DecodeMonitor.SILENT);

        final Message message = messageBuilder.parseMessage(new ByteArrayInputStream(("Subject: Re: =?UTF-8?Q?=D8=AA=D8=B2_=D8=A2=D9=82=D8=A7=DB=8C_=DA=A9=D8=B1=D8=A7=D9?=\r\n" +
            " =?UTF-8?Q?=85=D8=AA=DB=8C?=\r\n").getBytes()));

        System.out.println(message.getSubject());
    }
{code}


Returns

{code:java}

Re: تز آقای کرا��تی
{code}


I can also confirm that Thunderbird handles the split encoding correctly... (I manually changed the subject of an email to check.)


{code:java}
Re:تز آقای کرامتی
{code}
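
For illustration, here is a minimal JDK-only sketch of why the split subject breaks. It does not use Mime4j and is not the proposed fix; the class and method names (SplitEncodedWordSketch, qDecode, decodeSeparately, decodeJoined) are hypothetical. It assumes both encoded words declare the same charset (UTF-8) and the Q encoding, and it only handles the payload between "?Q?" and "?=". Decoding each word to a String on its own turns the dangling 0xD9 and the orphaned 0x85 into replacement characters, while joining the raw bytes first keeps the two-byte character intact.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class SplitEncodedWordSketch {

    // Turn a Q-encoded payload into raw bytes: "_" is a space, "=XX" is one hex-encoded byte.
    static byte[] qDecode(String payload) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < payload.length(); i++) {
            char c = payload.charAt(i);
            if (c == '_') {
                out.write(' ');
            } else if (c == '=' && i + 2 < payload.length()) {
                out.write(Integer.parseInt(payload.substring(i + 1, i + 3), 16));
                i += 2; // skip the two hex digits just consumed
            } else {
                out.write(c);
            }
        }
        return out.toByteArray();
    }

    // Problematic approach: each payload becomes a String on its own,
    // so a multi-byte character split across payloads cannot be decoded.
    static String decodeSeparately(String... payloads) {
        StringBuilder sb = new StringBuilder();
        for (String payload : payloads) {
            sb.append(new String(qDecode(payload), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // Alternative: concatenate the raw bytes of adjacent payloads, then decode once.
    static String decodeJoined(String... payloads) {
        ByteArrayOutputStream all = new ByteArrayOutputStream();
        for (String payload : payloads) {
            byte[] bytes = qDecode(payload);
            all.write(bytes, 0, bytes.length);
        }
        return new String(all.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Payloads of the two encoded words from the failing test above.
        String first = "=D8=AA=D8=B2_=D8=A2=D9=82=D8=A7=DB=8C_=DA=A9=D8=B1=D8=A7=D9";
        String second = "=85=D8=AA=DB=8C";

        System.out.println(decodeSeparately(first, second)); // contains U+FFFD
        System.out.println(decodeJoined(first, second));     // intact text
    }
}
```

Note that RFC 2047 requires each encoded word to contain a whole number of characters, so the split input is technically non-compliant; being lenient here would simply match what Thunderbird does.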




> 8bit character broken when parsing multi-line encoded subject
> -------------------------------------------------------------
>
>                 Key: MIME4J-216
>                 URL: https://issues.apache.org/jira/browse/MIME4J-216
>             Project: James Mime4j
>          Issue Type: Test
>          Components: parser (core)
>    Affects Versions: 0.7.1
>            Reporter: changwan lim
>            Priority: Major
>
> Parsing a multi-line encoded subject using org.apache.james.mime4j.codec.DecoderUtil.decodeEncodedWords() produces a broken decoded 8-bit character.
> Specifically, the broken character is the one that spans the last character of the first line and the first character of the second line.


