Posted to mime4j-dev@james.apache.org by Günther Schmidt <gu...@kmmd.de> on 2012/05/30 23:07:41 UTC

Re: Automatic Decoding of field contents

Hi everyone,

sorry for the confusion, the mimeparser set to do decoding *does* also 
decode content fields. It just has trouble recognizing some more exotic 
ones correctly, like this one:

Content-Type: application/zip; 
name*0*=iso-8859-15''Redcom%20Erl%F6santeilsplit;
  name*1*=ter.zip

Does anybody here know a neat trick to get around this hiccup?

Günther

On 30.05.12 00:37, Günther Schmidt wrote:
> Hi everyone,
>
> I've figured out how to get the parser to automatically decode input 
> streams, i.e. make mime4j return a Base64InputStream or 
> QuotedPrintableInputStream, by setting 
> MimeStreamParser.setContentDecoding(true).
>
> However, this does not seem to affect the contents of any of the 
> header fields. Their bodies are still quoted-printable encoded, even 
> when parsed, i.e. turned into a ParsedField, which causes some errors in 
> my code.
>
> I haven't found the switch for automatically decoding the field 
> contents as well. What do I have to do?
>
> Günther


Re: Automatic Decoding of field contents

Posted by Günther Schmidt <gu...@kmmd.de>.
Hello Ioan,

I guess the (Content-Type) FieldParser would have worked fine and 
recognized this correctly if the preceding QuotedPrintableDecoder had 
turned this into "name=Redcom Erlösanteilsplitter.zip". Of the 7,790 
emails I test against, only 20 contain a case like this. I don't know 
what is causing this odd encoding, or whether the original mail client 
or the MTA is responsible for it.

At the moment I'm not even sure whether remedying this should be the 
QuotedPrintableDecoder's responsibility or the FieldParser's, or whether 
I should just ignore the problem and create a workaround.

My Thunderbird email client, however, was able to parse it.

Günther

On 30.05.12 23:44, Ioan Eugen Stan wrote:

name*0*=iso-8859-15''Redcom%20Erl%F6santeilsplit;   name*1*=ter.zip


Re: Automatic Decoding of field contents

Posted by Ioan Eugen Stan <st...@gmail.com>.
Hi Gunther,

Not sure if I can help, but please have a look at section 5,
"Content-Type Header Field", of RFC 2045: http://www.ietf.org/rfc/rfc2045.txt

I have reproduced the relevant part:
"""
   The Content-Type header field specifies the nature of the data in the
   body of an entity by giving media type and subtype identifiers, and
   by providing auxiliary information that may be required for certain
   media types.  After the media type and subtype names, the remainder
   of the header field is simply a set of parameters, specified in an
   attribute=value notation.  The ordering of parameters is not
   significant.
"""

You'll have to check what the parameters for application/zip are and how
they should be encoded. It seems to me that they are indexed file
names with some optional encoding in between. Furthermore, I don't
think that MimeStreamParser.setContentDecoding(true) refers to how
mime4j decodes field parameters.
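
To make the attribute=value structure concrete, here is a minimal Java
sketch (plain JDK, not mime4j code) that splits a Content-Type value into
its raw parameters without decoding them. The class and method names are
made up for illustration, and the naive split on ';' assumes that no
parameter value is a quoted string containing a semicolon.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class ContentTypeParams {

    /** Splits a Content-Type value into raw attribute=value pairs (no decoding). */
    static Map<String, String> rawParameters(String headerValue) {
        Map<String, String> params = new LinkedHashMap<>();
        String[] parts = headerValue.split(";");
        // parts[0] is the media type, e.g. "application/zip"; the rest are parameters.
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i].trim();
            int eq = part.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty or malformed segments
            }
            // Attribute names are case-insensitive, so normalize them.
            String attribute = part.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            String value = part.substring(eq + 1).trim();
            params.put(attribute, value);
        }
        return params;
    }

    public static void main(String[] args) {
        String value = "application/zip; "
                + "name*0*=iso-8859-15''Redcom%20Erl%F6santeilsplit;"
                + "  name*1*=ter.zip";
        rawParameters(value).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

For the header from this thread it yields two separate attributes,
name*0* and name*1*, which is why a parser without support for this
syntax does not see a plain name parameter.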

Hope this helps,

2012/5/31 Günther Schmidt <gu...@kmmd.de>:
> Hi everyone,
>
> sorry for the confusion, the mimeparser set to do decoding *does* also
> decode content fields. It just has trouble recognizing some more exotic ones
> correctly, like this one:
>
> Content-Type: application/zip;
> name*0*=iso-8859-15''Redcom%20Erl%F6santeilsplit;
>  name*1*=ter.zip
>
> Does anybody here know a neat trick to get around this hiccup?
>
> Günther
>
> On 30.05.12 00:37, Günther Schmidt wrote:
>>
>> Hi everyone,
>>
>> I've figured out how to get the parser to automatically decode input
>> streams, i.e. make mime4j return a Base64InputStream or
>> QuotedPrintableInputStream, by setting
>> MimeStreamParser.setContentDecoding(true).
>>
>> However, this does not seem to affect the contents of any of the header
>> fields. Their bodies are still quoted-printable encoded, even when parsed,
>> i.e. turned into a ParsedField, which causes some errors in my code.
>>
>> I haven't found the switch for automatically decoding the field contents
>> as well. What do I have to do?
>>
>> Günther
>
>



-- 
Ioan Eugen Stan
http://ieugen.blogspot.com/  *** http://bucharest-jug.github.com/ ***

Re: Automatic Decoding of field contents

Posted by Markus Wiederkehr <ma...@gmail.com>.
Sorry to disappoint you but "Fix Version" means it is scheduled to be fixed
in 0.8, it is not fixed yet. If nobody contributes a patch it will simply
be postponed to 0.9 once 0.8 is ready.

A patch would be welcome but we cannot accept a derivative work from
JavaMail or other sources if the license is not compatible.

Markus

On Thu, May 31, 2012 at 12:37 AM, Günther Schmidt
<gu...@kmmd.de> wrote:

> Hi Markus,
>
> thank you very much, you seem to have nailed it. And even better, if I
> read this correctly, it's "fixed" in 0.8. So I can even give it a spin.
>
> Günther
>
> On 31.05.12 00:28, Markus Wiederkehr wrote:
>
>> This is a MIME extension that is not implemented in Mime4J, see
>> https://issues.apache.org/jira/browse/MIME4J-109
>>
>> Markus
>>
>> On Wed, May 30, 2012 at 11:07 PM, Günther Schmidt
>> <gu...@kmmd.de> wrote:
>>
>>> Hi everyone,
>>>
>>> sorry for the confusion, the mimeparser set to do decoding *does* also
>>> decode content fields. It just has trouble recognizing some more exotic
>>> ones correctly, like this one:
>>>
>>> Content-Type: application/zip; name*0*=iso-8859-15''Redcom%20Erl%F6santeilsplit;
>>>  name*1*=ter.zip
>>>
>>> Does anybody here know a neat trick to get around this hiccup?
>>>
>>> Günther
>>>
>>> On 30.05.12 00:37, Günther Schmidt wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I've figured out how to get the parser to automatically decode input
>>>> streams, i.e. make mime4j return a Base64InputStream or
>>>> QuotedPrintableInputStream, by setting
>>>> MimeStreamParser.setContentDecoding(true).
>>>>
>>>> However, this does not seem to affect the contents of any of the header
>>>> fields. Their bodies are still quoted-printable encoded, even when
>>>> parsed, i.e. turned into a ParsedField, which causes some errors in my
>>>> code.
>>>>
>>>> I haven't found the switch for automatically decoding the field contents
>>>> as well. What do I have to do?
>>>>
>>>> Günther
>>>>
>>>>
>>>
>
> --
> KMMD IT-Consulting UG (haftungsbeschränkt)
> Offenburger Str. 45
> 68239 Mannheim
> Tel: +49-621-4393887
> HRB 712101 Amtsgericht Mannheim
>
>

Re: Automatic Decoding of field contents

Posted by Günther Schmidt <gu...@kmmd.de>.
Hi Markus,

thank you very much, you seem to have nailed it. And even better, if I 
read this correctly, it's "fixed" in 0.8. So I can even give it a spin.

Günther

On 31.05.12 00:28, Markus Wiederkehr wrote:
> This is a MIME extension that is not implemented in Mime4J, see
> https://issues.apache.org/jira/browse/MIME4J-109
>
> Markus
>
> On Wed, May 30, 2012 at 11:07 PM, Günther Schmidt
> <gu...@kmmd.de> wrote:
>
>> Hi everyone,
>>
>> sorry for the confusion, the mimeparser set to do decoding *does* also
>> decode content fields. It just has trouble recognizing some more exotic
>> ones correctly, like this one:
>>
>> Content-Type: application/zip; name*0*=iso-8859-15''Redcom%20Erl%F6santeilsplit;
>>   name*1*=ter.zip
>>
>> Does anybody here know a neat trick to get around this hiccup?
>>
>> Günther
>>
>> On 30.05.12 00:37, Günther Schmidt wrote:
>>
>>> Hi everyone,
>>>
>>> I've figured out how to get the parser to automatically decode input
>>> streams, i.e. make mime4j return a Base64InputStream or
>>> QuotedPrintableInputStream, by setting
>>> MimeStreamParser.setContentDecoding(true).
>>>
>>> However, this does not seem to affect the contents of any of the header
>>> fields. Their bodies are still quoted-printable encoded, even when parsed,
>>> i.e. turned into a ParsedField, which causes some errors in my code.
>>>
>>> I haven't found the switch for automatically decoding the field contents
>>> as well. What do I have to do?
>>>
>>> Günther
>>>
>>


-- 
KMMD IT-Consulting UG (haftungsbeschränkt)
Offenburger Str. 45
68239 Mannheim
Tel: +49-621-4393887
HRB 712101 Amtsgericht Mannheim


Re: Automatic Decoding of field contents

Posted by Markus Wiederkehr <ma...@gmail.com>.
This is a MIME extension that is not implemented in Mime4J, see
https://issues.apache.org/jira/browse/MIME4J-109
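
The syntax in the example header appears to be the parameter value
continuation and character set notation of RFC 2231. As a possible
workaround until the library handles it, a minimal plain-JDK sketch of
reassembling and decoding such a parameter could look like this. It is
not mime4j code: the class and method names are hypothetical, the input
map is assumed to hold the raw, lower-cased attribute=value pairs, and
error handling (bad hex digits, unknown charsets, missing segments) is
left out.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Rfc2231Sketch {

    /** Turns "%F6"-style escapes into raw bytes; other characters are copied as-is. */
    static byte[] percentDecode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                out.write(Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2; // skip the two hex digits just consumed
            } else {
                out.write(c);
            }
        }
        return out.toByteArray();
    }

    /** Reassembles name*0, name*1*, ... (or name*) into one decoded value. */
    static String decodeParameter(Map<String, String> rawParams, String name) {
        // Matches "name*", "name*N" and "name*N*"; a trailing '*' marks an encoded segment.
        Pattern p = Pattern.compile(Pattern.quote(name) + "\\*(?:(\\d+)(\\*)?)?");
        TreeMap<Integer, String> segments = new TreeMap<>(); // ordered by index
        Set<Integer> encoded = new HashSet<>();
        for (Map.Entry<String, String> e : rawParams.entrySet()) {
            Matcher m = p.matcher(e.getKey());
            if (!m.matches()) {
                continue;
            }
            int index = m.group(1) == null ? 0 : Integer.parseInt(m.group(1));
            segments.put(index, e.getValue());
            if (m.group(1) == null || m.group(2) != null) {
                encoded.add(index);
            }
        }
        if (segments.isEmpty()) {
            return rawParams.get(name); // ordinary parameter, or null if absent
        }
        Charset charset = StandardCharsets.US_ASCII;
        StringBuilder result = new StringBuilder();
        for (Map.Entry<Integer, String> seg : segments.entrySet()) {
            String value = seg.getValue();
            if (!encoded.contains(seg.getKey())) {
                // Unencoded segment: literal text, possibly a quoted string.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                result.append(value);
                continue;
            }
            if (seg.getKey() == 0) {
                // The first encoded segment carries charset'language'value.
                int first = value.indexOf('\'');
                int second = value.indexOf('\'', first + 1);
                if (first >= 0 && second > first) {
                    if (first > 0) {
                        charset = Charset.forName(value.substring(0, first));
                    }
                    value = value.substring(second + 1);
                }
            }
            result.append(new String(percentDecode(value), charset));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        Map<String, String> raw = new LinkedHashMap<>();
        raw.put("name*0*", "iso-8859-15''Redcom%20Erl%F6santeilsplit");
        raw.put("name*1*", "ter.zip");
        System.out.println(decodeParameter(raw, "name"));
    }
}
```

For the two segments from this thread, decodeParameter(raw, "name")
should return "Redcom Erlösanteilsplitter.zip".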

Markus

On Wed, May 30, 2012 at 11:07 PM, Günther Schmidt
<gu...@kmmd.de> wrote:

> Hi everyone,
>
> sorry for the confusion, the mimeparser set to do decoding *does* also
> decode content fields. It just has trouble recognizing some more exotic
> ones correctly, like this one:
>
> Content-Type: application/zip; name*0*=iso-8859-15''Redcom%20Erl%F6santeilsplit;
>  name*1*=ter.zip
>
> Does anybody here know a neat trick to get around this hiccup?
>
> Günther
>
> On 30.05.12 00:37, Günther Schmidt wrote:
>
>> Hi everyone,
>>
>> I've figured out how to get the parser to automatically decode input
>> streams, i.e. make mime4j return a Base64InputStream or
>> QuotedPrintableInputStream, by setting
>> MimeStreamParser.setContentDecoding(true).
>>
>> However, this does not seem to affect the contents of any of the header
>> fields. Their bodies are still quoted-printable encoded, even when parsed,
>> i.e. turned into a ParsedField, which causes some errors in my code.
>>
>> I haven't found the switch for automatically decoding the field contents
>> as well. What do I have to do?
>>
>> Günther
>>
>
>