Posted to mime4j-dev@james.apache.org by Aron Wieck <aw...@cnt.net> on 2009/08/17 13:02:01 UTC

Bug in DecoderUtil

Hello Folks,

I happened to stumble upon a bug in
org.apache.james.mime4j.codec.DecoderUtil (mime4j version 0.6.0).

Proposed Test Case:

assertEquals("Test ü  and more",
        DecoderUtil.decodeEncodedWords("Test =?ISO-8859-1?Q?=FC_?= =?ISO-8859-1?Q?and_more?="));

Proposed Quick and dirty Fix:

Change this line in DecoderUtil.decodeEncodedWords :
             int end = begin == -1 ? -1 : body.indexOf("?=", begin + 2);
to
             int end = begin == -1 ? -1 : body.indexOf("?=", body.indexOf("?", begin + 2) + 3);

After this fix there is only one space between "ü" and "and", which I  
think is not correct (but I'm not sure).
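
To see why the original search stops too early: with Q encoding the encoded text may itself begin with "=", so the "?" that closes the encoding field plus that "=" already looks like the terminating "?=". The following is only a minimal sketch using String.indexOf; the class and method names are made up for illustration and are not part of mime4j.

```java
final class EncodedWordBounds {
    static final String BODY = "Test =?ISO-8859-1?Q?=FC_?= =?ISO-8859-1?Q?and_more?=";

    /** Original search: the first "?=" after the opening "=?". */
    static int naiveEnd(String body, int begin) {
        return begin == -1 ? -1 : body.indexOf("?=", begin + 2);
    }

    /** Quick fix: find the "?" after the charset, skip "?<encoding>?", then search. */
    static int fixedEnd(String body, int begin) {
        return begin == -1 ? -1 : body.indexOf("?=", body.indexOf("?", begin + 2) + 3);
    }

    public static void main(String[] args) {
        int begin = BODY.indexOf("=?");
        // prints "=?ISO-8859-1?Q?=" (cut off at the start of the encoded text)
        System.out.println(BODY.substring(begin, naiveEnd(BODY, begin) + 2));
        // prints "=?ISO-8859-1?Q?=FC_?=" (the complete first encoded word)
        System.out.println(BODY.substring(begin, fixedEnd(BODY, begin) + 2));
    }
}
```

The "+ 3" assumes a single-character encoding field ("Q" or "B"), which is why this remains a quick fix rather than a proper parser.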

Proposed Solution:

Replace "indexOf" by Regex matching, like so:

    final static Pattern regex = Pattern.compile("(=\\?.*?\\?.*?\\?.*?\\?=)");

     public static String decodeEncodedWords(String body) {
         StringBuffer sb = new StringBuffer();

         final Matcher matcher = regex.matcher(body);
         while (matcher.find()) {
             matcher.appendReplacement(sb, decodeEncodedWord(body, matcher.start(), matcher.end()));
         }

         matcher.appendTail(sb);
         return sb.toString();
     }
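
One caveat with this approach: Matcher.appendReplacement interprets "$" and "\" in the replacement string as group references and escapes, so decoded text containing those characters would be mangled or cause an exception. A hedged variant of the loop, wrapping the decoded text in Matcher.quoteReplacement, might look like this. The decoder parameter is a hypothetical stand-in for DecoderUtil's per-word decoding, and the class name is invented for the example.

```java
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexReplaceSketch {
    // Same permissive pattern as proposed above, minus the unused outer group.
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?.*?\\?.*?\\?.*?\\?=");

    /** Replaces every match with whatever the (hypothetical) decoder returns. */
    static String replaceEncodedWords(String body, Function<String, String> decoder) {
        StringBuffer sb = new StringBuffer();
        Matcher matcher = ENCODED_WORD.matcher(body);
        while (matcher.find()) {
            String decoded = decoder.apply(matcher.group());
            // quoteReplacement makes '$' and '\' in the decoded text literal
            matcher.appendReplacement(sb, Matcher.quoteReplacement(decoded));
        }
        matcher.appendTail(sb);
        return sb.toString();
    }
}
```

Note that the pattern itself is still quite loose: "." also matches spaces and further "?" characters, so a malformed word could make a match span more text than intended.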


Keep up the good work!
Aron

Re: Bug in DecoderUtil

Posted by Markus Wiederkehr <ma...@gmail.com>.

On Mon, Aug 17, 2009 at 1:02 PM, Aron Wieck<aw...@cnt.net> wrote:
> Hello Folks,
>
> I happened to stumble upon a Bug in
> org.apache.james.mime4j.codec.DecoderUtil  (mime4j version 0.6.0).
>
> Proposed Test Case:
>
> assertEquals("Test ü  and more",
>         DecoderUtil.decodeEncodedWords("Test =?ISO-8859-1?Q?=FC_?= =?ISO-8859-1?Q?and_more?="));

Hi Aron,

Coincidentally, the same problem was reported yesterday by Wim
Jongman. Funny how bugs like this can remain undetected for
years and then show up all of a sudden.

> Proposed Quick and dirty Fix:
>
> Change this line in DecoderUtil.decodeEncodedWords :
>            int end = begin == -1 ? -1 : body.indexOf("?=", begin + 2);
> to
>            int end = begin == -1 ? -1 : body.indexOf("?=", body.indexOf("?", begin + 2) + 3);
>
> After this fix there is only one space between "ü" and "and", which I think
> is not correct (but I'm not sure).

No, I think one space would be correct; see MIME4J-104.
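
One space is also what RFC 2047 (section 6.2) prescribes: whitespace that only separates two adjacent encoded words is ignored, so the single space comes from the "_" in the first word. The following is a rough, self-contained sketch of that rule, not mime4j's implementation. The class name and pattern are assumptions made for illustration, and error handling (unknown charsets, malformed hex or Base64) is deliberately left out.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EncodedWordSketch {
    // =?charset?encoding?encoded-text?= where no field contains '?' or whitespace
    private static final Pattern WORD =
            Pattern.compile("=\\?([^?\\s]+)\\?([bBqQ])\\?([^?\\s]*)\\?=");

    static String decode(String body) {
        StringBuilder out = new StringBuilder();
        Matcher m = WORD.matcher(body);
        int last = 0;                 // end of the previous encoded word
        boolean prevWasWord = false;  // true once at least one word was seen
        while (m.find()) {
            String gap = body.substring(last, m.start());
            // Drop the gap only if it is pure whitespace between two encoded words.
            if (!(prevWasWord && gap.trim().isEmpty())) {
                out.append(gap);
            }
            out.append(decodeWord(m.group(1), m.group(2), m.group(3)));
            last = m.end();
            prevWasWord = true;
        }
        return out.append(body.substring(last)).toString();
    }

    private static String decodeWord(String charset, String encoding, String text) {
        byte[] bytes;
        if (encoding.equalsIgnoreCase("B")) {
            bytes = Base64.getDecoder().decode(text);
        } else {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            for (int i = 0; i < text.length(); i++) {
                char c = text.charAt(i);
                if (c == '_') {
                    buf.write(' ');   // Q encoding: underscore stands for a space
                } else if (c == '=' && i + 2 < text.length()) {
                    buf.write(Integer.parseInt(text.substring(i + 1, i + 3), 16));
                    i += 2;           // skip the two hex digits just consumed
                } else {
                    buf.write(c);
                }
            }
            bytes = buf.toByteArray();
        }
        return new String(bytes, Charset.forName(charset));
    }
}
```

With this sketch, decode("Test =?ISO-8859-1?Q?=FC_?= =?ISO-8859-1?Q?and_more?=") yields "Test ü and more", with a single space between "ü" and "and".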

> Proposed Solution:
>
> Replace "indexOf" by Regex matching, like so:
>
>   final static Pattern regex = Pattern.compile("(=\\?.*?\\?.*?\\?.*?\\?=)");
>
>    public static String decodeEncodedWords(String body) {
>        StringBuffer sb = new StringBuffer();
>
>        final Matcher matcher = regex.matcher(body);
>        while (matcher.find()) {
>            matcher.appendReplacement(sb, decodeEncodedWord(body, matcher.start(), matcher.end()));
>        }
>
>        matcher.appendTail(sb);
>        return sb.toString();
>    }

I'm afraid that would reintroduce MIME4J-104.

Thanks for reporting the problem,

Markus

>
>
> Keep up the good work!
> Aron