Posted to mime4j-dev@james.apache.org by Aron Wieck <aw...@cnt.net> on 2009/08/17 13:02:01 UTC
Bug in DecoderUtil
Hello Folks,
I happened to stumble upon a Bug in
org.apache.james.mime4j.codec.DecoderUtil (mime4j version 0.6.0).
Proposed Test Case:
assertEquals("Test ü and more", DecoderUtil.decodeEncodedWords(
    "Test =?ISO-8859-1?Q?=FC_?= =?ISO-8859-1?Q?and_more?="));
Proposed Quick and dirty Fix:
Change this line in DecoderUtil.decodeEncodedWords :
int end = begin == -1 ? -1 : body.indexOf("?=", begin + 2);
to
int end = begin == -1 ? -1 : body.indexOf("?=",
body.indexOf("?", begin + 2) + 3);
After this fix there is only one space between "ü" and "and", which I
think is not correct (but I'm not sure).
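
To illustrate why the original search stops too early, here is a minimal, self-contained sketch that does not use mime4j. The class and method names (EncodedWordEnd, naiveEnd, fixedEnd) are made up for this illustration; only the two indexOf expressions are taken from the lines above. With Q encoding the encoded text may itself start with "=", so the "?" that ends the encoding marker followed by that "=" looks like a terminator.

```java
class EncodedWordEnd {
    // Original expression: the first "?=" after the opening "=?".
    static int naiveEnd(String body, int begin) {
        return body.indexOf("?=", begin + 2);
    }

    // Quick fix from this post: find the "?" that ends the charset, then skip
    // it, the one-letter encoding and the following "?" before searching.
    static int fixedEnd(String body, int begin) {
        return body.indexOf("?=", body.indexOf("?", begin + 2) + 3);
    }

    public static void main(String[] args) {
        String body = "Test =?ISO-8859-1?Q?=FC_?= =?ISO-8859-1?Q?and_more?=";
        int begin = body.indexOf("=?");
        // Naive search cuts the word off right after the encoding marker.
        System.out.println(body.substring(begin, naiveEnd(body, begin) + 2));
        // Fixed search reaches the real terminator.
        System.out.println(body.substring(begin, fixedEnd(body, begin) + 2));
    }
}
```

For the test input, the naive search yields the fragment =?ISO-8859-1?Q?= while the adjusted search yields the complete word =?ISO-8859-1?Q?=FC_?=.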
Proposed Solution:
Replace "indexOf" by Regex matching, like so:
    final static Pattern regex = Pattern.compile("(=\\?.*?\\?.*?\\?.*?\\?=)");

    public static String decodeEncodedWords(String body) {
        StringBuffer sb = new StringBuffer();
        final Matcher matcher = regex.matcher(body);
        while (matcher.find()) {
            matcher.appendReplacement(sb, decodeEncodedWord(body, matcher.start(), matcher.end()));
        }
        matcher.appendTail(sb);
        return sb.toString();
    }
Keep up the good work!
Aron
Re: Bug in DecoderUtil
Posted by Markus Wiederkehr <ma...@gmail.com>.
On Mon, Aug 17, 2009 at 1:02 PM, Aron Wieck<aw...@cnt.net> wrote:
> Hello Folks,
>
> I happened to stumble upon a Bug in
> org.apache.james.mime4j.codec.DecoderUtil (mime4j version 0.6.0).
>
> Proposed Test Case:
>
> assertEquals("Test ü and more", DecoderUtil.decodeEncodedWords(
>     "Test =?ISO-8859-1?Q?=FC_?= =?ISO-8859-1?Q?and_more?="));
Hi Aron,
Coincidentally, the same problem was reported yesterday by Wim
Jongman. Funny how bugs like this can remain undetected for
years and then show up all of a sudden.
> Proposed Quick and dirty Fix:
>
> Change this line in DecoderUtil.decodeEncodedWords :
> int end = begin == -1 ? -1 : body.indexOf("?=", begin + 2);
> to
> int end = begin == -1 ? -1 : body.indexOf("?=", body.indexOf("?",
> begin + 2) + 3);
>
> After this fix there is only one space between "ü" and "and", which I think
> is not correct (but I'm not sure).
No, I think one space would be correct; see MIME4J-104.
> Proposed Solution:
>
> Replace "indexOf" by Regex matching, like so:
>
> final static Pattern regex = Pattern.compile("(=\\?.*?\\?.*?\\?.*?\\?=)");
>
> public static String decodeEncodedWords(String body) {
> StringBuffer sb = new StringBuffer();
>
> final Matcher matcher = regex.matcher(body);
> while (matcher.find()) {
>
> matcher.appendReplacement(sb,decodeEncodedWord(body,matcher.start(),matcher.end()));
> }
>
> matcher.appendTail(sb);
> return sb.toString();
> }
I'm afraid that would reintroduce MIME4J-104.
Thanks for reporting the problem,
Markus
>
>
> Keep up the good work!
> Aron
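
The two points raised in this thread (finding the real end of an encoded word, and getting exactly one space in the test case) can be combined in a regex-based sketch. This is not the mime4j implementation and not the fix that was committed. It assumes the concern behind MIME4J-104 is whitespace between adjacent encoded words, which RFC 2047 (section 6.2) says is ignored when the words are displayed. The class name EncodedWordSketch and the helpers decodeWord and decodeQ are hypothetical, and the code uses java.util.Base64, which requires a newer Java than mime4j 0.6 targeted.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EncodedWordSketch {
    // =?charset?encoding?text?= where charset and text contain no '?' or whitespace.
    private static final Pattern WORD =
            Pattern.compile("=\\?([^?\\s]+)\\?([bBqQ])\\?([^?\\s]+)\\?=");

    static String decodeEncodedWords(String body) {
        StringBuilder sb = new StringBuilder();
        Matcher m = WORD.matcher(body);
        int previousEnd = 0;
        boolean previousWasEncoded = false;
        while (m.find()) {
            String sep = body.substring(previousEnd, m.start());
            String decoded = decodeWord(m.group(1), m.group(2), m.group(3));
            // Drop a whitespace-only separator between two decoded words.
            boolean skipSep = previousWasEncoded && decoded != null && sep.trim().isEmpty();
            if (!skipSep) {
                sb.append(sep);
            }
            // Leave undecodable words untouched.
            sb.append(decoded != null ? decoded : m.group());
            previousEnd = m.end();
            previousWasEncoded = decoded != null;
        }
        return sb.append(body.substring(previousEnd)).toString();
    }

    // Returns null if the charset is unknown or the text is malformed.
    static String decodeWord(String charset, String encoding, String text) {
        try {
            byte[] bytes = encoding.equalsIgnoreCase("B")
                    ? Base64.getDecoder().decode(text)
                    : decodeQ(text);
            return bytes == null ? null : new String(bytes, Charset.forName(charset));
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    // Q encoding: '_' is a space, "=XX" is a hex-encoded byte, the rest is literal.
    static byte[] decodeQ(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '_') {
                out.write(' ');
            } else if (c == '=') {
                if (i + 2 >= text.length()) return null;
                int hi = Character.digit(text.charAt(i + 1), 16);
                int lo = Character.digit(text.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) return null;
                out.write(hi * 16 + lo);
                i += 2;
            } else {
                out.write(c);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(decodeEncodedWords(
                "Test =?ISO-8859-1?Q?=FC_?= =?ISO-8859-1?Q?and_more?="));
    }
}
```

For the test case from the original report this produces "Test ü and more" with a single space: the space after "ü" comes from the "_" inside the first encoded word, and the literal space between the two encoded words is dropped. Because the pattern requires the charset and encoding fields before it looks for the closing "?=", a Q-encoded text that starts with "=" no longer ends the match early.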