Posted to server-dev@james.apache.org by Robert Burrell Donkin <ro...@gmail.com> on 2008/05/26 21:35:41 UTC

[mime4j] Extra methods For ContentDescriptor...?

ATM this interface supports

   String getMimeType();
   String getCharset();
   String getTransferEncoding();

so it has Content-Type and Content-Transfer-Encoding but is missing
calls for information on the MIME-Version, Content-ID and
Content-Description headers defined in RFC2045. this information would
be useful for IMAP so i was wondering about the design of the API.
should ContentDescriptor aim to supply the standard headers found in
RFC2045?
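
For illustration only, a "maximal" descriptor covering the RFC 2045
headers might look roughly like the sketch below. The three added getter
names and the SimpleDescriptor class are assumptions made up for this
example; they are not existing mime4j API.

```java
// Hypothetical sketch: the three existing getters plus the remaining
// RFC 2045 headers. The added method names are assumptions.
interface Rfc2045Descriptor {
    String getMimeType();
    String getCharset();
    String getTransferEncoding();

    // Assumed additions (not in the interface discussed above).
    String getMimeVersion();
    String getContentId();
    String getContentDescription();
}

// Trivial immutable implementation, present only to make the sketch runnable.
final class SimpleDescriptor implements Rfc2045Descriptor {
    private final String mimeType;
    private final String charset;
    private final String transferEncoding;
    private final String mimeVersion;
    private final String contentId;
    private final String contentDescription;

    SimpleDescriptor(String mimeType, String charset, String transferEncoding,
                     String mimeVersion, String contentId, String contentDescription) {
        this.mimeType = mimeType;
        this.charset = charset;
        this.transferEncoding = transferEncoding;
        this.mimeVersion = mimeVersion;
        this.contentId = contentId;
        this.contentDescription = contentDescription;
    }

    public String getMimeType() { return mimeType; }
    public String getCharset() { return charset; }
    public String getTransferEncoding() { return transferEncoding; }
    public String getMimeVersion() { return mimeVersion; }
    public String getContentId() { return contentId; }
    public String getContentDescription() { return contentDescription; }
}
```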

of course, this leads to the slippery slope: what about RFC 2183
(Content-Disposition),  RFC 3066 (LANGUAGE-TAGS), RFC 2557 (LOCATION)
and RFC 1864 (MD5)...?

where should the API draw the line?

opinions?

- robert

---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
For additional commands, e-mail: server-dev-help@james.apache.org


Re: [mime4j] Extra methods For ContentDescriptor...?

Posted by Stefano Bagnara <ap...@bago.org>.
Robert Burrell Donkin wrote:
> On Thu, May 29, 2008 at 8:49 AM, Stefano Bagnara <ap...@bago.org> wrote:
>> Robert Burrell Donkin wrote:
>>> Maybe only two modes need to be supported: minimal and maximal. The
>>> only downside of parsing every standard MIME header is efficiency. So
>>> offer either everything or minimum as part of the library. More
>>> precise tuning would require subclassing.
>> Can parsing be delayed until the moment a specific getter is used (lazy
>> parsing)? Or do you want a parse error to be raised earlier in case of
>> invalid data?
> 
> the pull parser discards any data which isn't used. retrieving and
> storing the data would have a cost even if the actual parsing were
> left until required. i don't think that the cost would be great but
> some applications with extreme needs may want to avoid these costs
> entirely.
> 
> - robert

Good point. Thank you for the explanation :-)

Stefano




Re: [mime4j] Extra methods For ContentDescriptor...?

Posted by Robert Burrell Donkin <ro...@gmail.com>.
On Thu, May 29, 2008 at 8:49 AM, Stefano Bagnara <ap...@bago.org> wrote:
> Robert Burrell Donkin wrote:
>>
>> Maybe only two modes need to be supported: minimal and maximal. The
>> only downside of parsing every standard MIME header is efficiency. So
>> offer either everything or minimum as part of the library. More
>> precise tuning would require subclassing.
>
> Can parsing be delayed until the moment a specific getter is used (lazy
> parsing)? Or do you want a parse error to be raised earlier in case of
> invalid data?

the pull parser discards any data which isn't used. retrieving and
storing the data would have a cost even if the actual parsing were
left until required. i don't think that the cost would be great but
some applications with extreme needs may want to avoid these costs
entirely.
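
To make the trade-off concrete, here is a minimal sketch of a lazily
parsed field. The class name and the simplified parsing rule are
assumptions for illustration, not mime4j code. Note that the raw value
still has to be stored up front, which is the cost described above.

```java
import java.util.Objects;

// Hypothetical sketch of lazy parsing: the raw header value is stored
// eagerly and only parsed the first time the getter is called.
final class LazyContentId {
    private final String raw;   // must be retained until someone asks for it
    private String parsed;      // cached result, null until first access

    LazyContentId(String raw) {
        this.raw = Objects.requireNonNull(raw, "raw");
    }

    // Simplified parse: trims whitespace and strips one pair of angle
    // brackets. A real msg-id parser would be stricter.
    String get() {
        if (parsed == null) {
            String s = raw.trim();
            if (s.length() >= 2 && s.startsWith("<") && s.endsWith(">")) {
                s = s.substring(1, s.length() - 1);
            }
            parsed = s;
        }
        return parsed;
    }
}
```

A parse error, if any, would surface only when get() is first called,
which is the other half of the question above.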

- robert



Re: [mime4j] Extra methods For ContentDescriptor...?

Posted by Stefano Bagnara <ap...@bago.org>.
Robert Burrell Donkin wrote:
> Maybe only two modes need to be supported: minimal and maximal. The
> only downside of parsing every standard MIME header is efficiency. So
> offer either everything or minimum as part of the library. More
> precise tuning would require subclassing.

Can parsing be delayed until the moment a specific getter is used (lazy
parsing)? Or do you want a parse error to be raised earlier in case of
invalid data?

Stefano




Re: [mime4j] Extra methods For ContentDescriptor...?

Posted by Robert Burrell Donkin <ro...@gmail.com>.
On 5/27/08, Robert Burrell Donkin <ro...@gmail.com> wrote:
> On Tue, May 27, 2008 at 8:54 AM, Stefano Bagnara <ap...@bago.org> wrote:
>> Robert Burrell Donkin wrote:
>>>
>>> On 5/26/08, Stefano Bagnara <ap...@bago.org> wrote:
>>>>
>>>> Robert Burrell Donkin wrote:
>>>>>
>>>>> ATM this interface supports
>>>>>
>>>>>   String getMimeType();
>>>>>   String getCharset();
>>>>>   String getTransferEncoding();
>>>>>
>>>>> so it has Content-Type and Content-Transfer-Encoding but is missing
>>>>> calls for information on the MIME-Version, Content-ID and
>>>>> Content-Description headers defined in RFC2045. this information would
>>>>> be useful for IMAP so i was wondering about the design of the API.
>>>>> should ContentDescriptor aim to supply the standard headers found in
>>>>> RFC2045?
>>>>>
>>>>> of course, this leads to the slippery slope: what about RFC 2183
>>>>> (Content-Disposition),  RFC 3066 (LANGUAGE-TAGS), RFC 2557 (LOCATION)
>>>>> and RFC 1864 (MD5)...?
>>>>>
>>>>> where should the API draw the line?
>>>>
>>>> This is a hard issue...
>>>> IMO anything that is not required to correctly interpret the MIME
>>>> source should be left to a getHeader(String headerName) and a bunch of
>>>> constants for the most used headers, so no, I would not add the Content-ID,
>>>> Content-Description and other headers defined in other RFCs.
>>>
>>> In theory ID (and some of the other examples) are required for correct
>>> interpretation. They are just not commonly used. Several have internal
>>> structures which require correct parsing. This is
>>> inconvenient for the user especially when using the pull parser.
>>
>> In this case this is a "rule for inclusion" ;-).
>> The ID should be included.
>>
>> In my understanding most of them don't have a structure, but if they
>> do have structure and it differs so much from header to header, then
>> maybe it is worth adding everything to mime4j.
>
> the structures are related but it would be more convenient if a
> library took care of the parsing

Perhaps pluggable micro-parsers could be introduced per RFC.

>>> So the question is whether ContentDescriptor should be a maximal or
>>> minimal description of the standard MIME content.
>>
>> Maybe we should cover the minimal with the standard APIs and then provide
>> utility classes for the maximal descriptions?
>>
>>>> On the other hand I would say that mime4j should be easy to be extended
>>>> to support a more complete/specific interface when parsing messages.
>>>
>>> Given several fields require parsing, if they aren't included in the
>>> ContentDescriptor interface then I think they need to be supported in
>>> some other way. Not sure quite how, though.
>>
>> Not sure either... what about a wrapper:
>>
>> new Rfc1864ContentDescriptor(ContentDescriptor).getMD5()
>>
>> or a static:
>>
>> Rfc1864Helper.getContentMD5(ContentDescriptor) ?
>
> ContentDescriptor doesn't expose headers (just results). a different
> BodyDescriptor implementation would be a possibility.
>
> MimeStreamParser has a body descriptor creation hook for subclasses.
> being able to substitute a factory could allow more possibilities. but
> probably best to wait until jochen has his changes in...

Maybe only two modes need to be supported: minimal and maximal. The
only downside of parsing every standard MIME header is efficiency. So
offer either everything or minimum as part of the library. More
precise tuning would require subclassing.

I dislike the mutator on BodyDescriptor: it's there only to allow
results to be fed in by the parser. I see no reason why users should
call it. Might be better to introduce a MutatableBodyDescriptor
extension for use in the parser.
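
A rough sketch of that split, assuming a single getter and a made-up
addField mutator (BodyDescriptorView, addField and DefaultDescriptor are
hypothetical names, not the current mime4j types):

```java
import java.util.Locale;

// Read-only view handed to users. Only one getter is shown for brevity.
interface BodyDescriptorView {
    String getMimeType();
}

// Parser-side extension carrying the mutator. The method name is an assumption.
interface MutatableBodyDescriptor extends BodyDescriptorView {
    void addField(String name, String value);
}

final class DefaultDescriptor implements MutatableBodyDescriptor {
    // RFC 2045 default when no Content-Type header is present.
    private String mimeType = "text/plain";

    public String getMimeType() {
        return mimeType;
    }

    public void addField(String name, String value) {
        if ("Content-Type".equalsIgnoreCase(name)) {
            // Simplified: keep only the part before any ';' parameters.
            int semi = value.indexOf(';');
            String type = semi < 0 ? value : value.substring(0, semi);
            mimeType = type.trim().toLowerCase(Locale.ROOT);
        }
    }
}
```

The parser would hold the MutatableBodyDescriptor reference while user
code only ever sees the BodyDescriptorView type.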

Robert

>
> - robert
>



Re: [mime4j] Extra methods For ContentDescriptor...?

Posted by Robert Burrell Donkin <ro...@gmail.com>.
On Tue, May 27, 2008 at 8:54 AM, Stefano Bagnara <ap...@bago.org> wrote:
> Robert Burrell Donkin wrote:
>>
>> On 5/26/08, Stefano Bagnara <ap...@bago.org> wrote:
>>>
>>> Robert Burrell Donkin wrote:
>>>>
>>>> ATM this interface supports
>>>>
>>>>   String getMimeType();
>>>>   String getCharset();
>>>>   String getTransferEncoding();
>>>>
>>>> so it has Content-Type and Content-Transfer-Encoding but is missing
>>>> calls for information on the MIME-Version, Content-ID and
>>>> Content-Description headers defined in RFC2045. this information would
>>>> be useful for IMAP so i was wondering about the design of the API.
>>>> should ContentDescriptor aim to supply the standard headers found in
>>>> RFC2045?
>>>>
>>>> of course, this leads to the slippery slope: what about RFC 2183
>>>> (Content-Disposition),  RFC 3066 (LANGUAGE-TAGS), RFC 2557 (LOCATION)
>>>> and RFC 1864 (MD5)...?
>>>>
>>>> where should the API draw the line?
>>>
>>> This is a hard issue...
>>> IMO anything that is not required to correctly interpret the MIME
>>> source should be left to a getHeader(String headerName) and a bunch of
>>> constants for the most used headers, so no, I would not add the Content-ID,
>>> Content-Description and other headers defined in other RFCs.
>>
>> In theory ID (and some of the other examples) are required for correct
>> interpretation. They are just not commonly used. Several have internal
>> structures which require correct parsing. This is
>> inconvenient for the user especially when using the pull parser.
>
> In this case this is a "rule for inclusion" ;-).
> The ID should be included.
>
> In my understanding most of them don't have a structure, but if they
> do have structure and it differs so much from header to header, then
> maybe it is worth adding everything to mime4j.

the structures are related but it would be more convenient if a
library took care of the parsing

>> So the question is whether ContentDescriptor should be a maximal or
>> minimal description of the standard MIME content.
>
> Maybe we should cover the minimal with the standard APIs and then provide
> utility classes for the maximal descriptions?
>
>>> On the other hand I would say that mime4j should be easy to be extended
>>> to support a more complete/specific interface when parsing messages.
>>
>> Given several fields require parsing, if they aren't included in the
>> ContentDescriptor interface then I think they need to be supported in
>> some other way. Not sure quite how, though.
>
> Not sure either... what about a wrapper:
>
> new Rfc1864ContentDescriptor(ContentDescriptor).getMD5()
>
> or a static:
>
> Rfc1864Helper.getContentMD5(ContentDescriptor) ?

ContentDescriptor doesn't expose headers (just results). a different
BodyDescriptor implementation would be a possibility.

MimeStreamParser has a body descriptor creation hook for subclasses.
being able to substitute a factory could allow more possibilities. but
probably best to wait until jochen has his changes in...
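
As a sketch of the factory idea (every name here is hypothetical and
this is not how MimeStreamParser is currently written), the parser could
accept a factory instead of relying on a subclass hook:

```java
// Hypothetical names throughout; this is not the mime4j API.
interface Descriptor {
    String describe();
}

// The substitutable factory: one call per body encountered.
interface DescriptorFactory {
    Descriptor newDescriptor();
}

final class MinimalDescriptor implements Descriptor {
    public String describe() { return "minimal"; }
}

final class MaximalDescriptor implements Descriptor {
    public String describe() { return "maximal"; }
}

// Stand-in for a parser: it is handed the factory rather than being subclassed.
final class ParserSketch {
    private final DescriptorFactory factory;

    ParserSketch(DescriptorFactory factory) {
        this.factory = factory;
    }

    // Called whenever a new body starts; delegates creation to the factory.
    Descriptor startBody() {
        return factory.newDescriptor();
    }
}
```

Choosing a mode would then be a matter of passing MinimalDescriptor::new
or MaximalDescriptor::new to the constructor.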

- robert



Re: [mime4j] Extra methods For ContentDescriptor...?

Posted by Stefano Bagnara <ap...@bago.org>.
Robert Burrell Donkin wrote:
> On 5/26/08, Stefano Bagnara <ap...@bago.org> wrote:
>> Robert Burrell Donkin wrote:
>>> ATM this interface supports
>>>
>>>    String getMimeType();
>>>    String getCharset();
>>>    String getTransferEncoding();
>>>
>>> so it has Content-Type and Content-Transfer-Encoding but is missing
>>> calls for information on the MIME-Version, Content-ID and
>>> Content-Description headers defined in RFC2045. this information would
>>> be useful for IMAP so i was wondering about the design of the API.
>>> should ContentDescriptor aim to supply the standard headers found in
>>> RFC2045?
>>>
>>> of course, this leads to the slippery slope: what about RFC 2183
>>> (Content-Disposition),  RFC 3066 (LANGUAGE-TAGS), RFC 2557 (LOCATION)
>>> and RFC 1864 (MD5)...?
>>>
>>> where should the API draw the line?
>> This is a hard issue...
>> IMO anything that is not required to correctly interpret the MIME
>> source should be left to a getHeader(String headerName) and a bunch of
>> constants for the most used headers, so no, I would not add the Content-ID,
>> Content-Description and other headers defined in other RFCs.
> 
> In theory ID (and some of the other examples) are required for correct
> interpretation. They are just not commonly used. Several have internal
> structures which require correct parsing. This is
> inconvenient for the user especially when using the pull parser.

In this case this is a "rule for inclusion" ;-).
The ID should be included.

In my understanding most of them don't have a structure, but if they
do have structure and it differs so much from header to header, then
maybe it is worth adding everything to mime4j.

> So the question is whether ContentDescriptor should be a maximal or
> minimal description of the standard MIME content.

Maybe we should cover the minimal with the standard APIs and then provide
utility classes for the maximal descriptions?

>> On the other hand I would say that mime4j should be easy to be extended
>> to support a more complete/specific interface when parsing messages.
> 
> Given several fields require parsing, if they aren't included in the
> ContentDescriptor interface then I think they need to be supported in
> some other way. Not sure quite how, though.

Not sure either... what about a wrapper:

new Rfc1864ContentDescriptor(ContentDescriptor).getMD5()

or a static:

Rfc1864Helper.getContentMD5(ContentDescriptor) ?
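
A minimal sketch of the static variant follows. It assumes the raw
Content-MD5 header value can be obtained from somewhere, so the helper
takes a String rather than a ContentDescriptor; the decodeContentMD5
method name is made up for this example.

```java
import java.util.Base64;

// Hypothetical static helper in the style suggested above.
final class Rfc1864Helper {
    private Rfc1864Helper() {
    }

    // RFC 1864: the header value is the base64 encoding of a 128-bit MD5
    // digest. Returns the 16 digest bytes, or throws
    // IllegalArgumentException if the value is not valid base64 or has
    // the wrong length.
    static byte[] decodeContentMD5(String rawValue) {
        byte[] digest = Base64.getDecoder().decode(rawValue.trim());
        if (digest.length != 16) {
            throw new IllegalArgumentException(
                    "Content-MD5 must decode to 16 bytes, got " + digest.length);
        }
        return digest;
    }
}
```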

Stefano




Re: [mime4j] Extra methods For ContentDescriptor...?

Posted by Robert Burrell Donkin <ro...@gmail.com>.
On 5/26/08, Stefano Bagnara <ap...@bago.org> wrote:
> Robert Burrell Donkin wrote:
>> ATM this interface supports
>>
>>    String getMimeType();
>>    String getCharset();
>>    String getTransferEncoding();
>>
>> so it has Content-Type and Content-Transfer-Encoding but is missing
>> calls for information on the MIME-Version, Content-ID and
>> Content-Description headers defined in RFC2045. this information would
>> be useful for IMAP so i was wondering about the design of the API.
>> should ContentDescriptor aim to supply the standard headers found in
>> RFC2045?
>>
>> of course, this leads to the slippery slope: what about RFC 2183
>> (Content-Disposition),  RFC 3066 (LANGUAGE-TAGS), RFC 2557 (LOCATION)
>> and RFC 1864 (MD5)...?
>>
>> where should the API draw the line?
>
> This is a hard issue...
> IMO anything that is not required to correctly interpret the MIME
> source should be left to a getHeader(String headerName) and a bunch of
> constants for the most used headers, so no, I would not add the Content-ID,
> Content-Description and other headers defined in other RFCs.

In theory ID (and some of the other examples) are required for correct
interpretation. They are just not commonly used. Several have internal
structures which require correct parsing. This is
inconvenient for the user especially when using the pull parser.

So the question is whether ContentDescriptor should be a maximal or
minimal description of the standard MIME content.

> On the other hand I would say that mime4j should be easy to be extended
> to support a more complete/specific interface when parsing messages.

Given several fields require parsing, if they aren't included in the
ContentDescriptor interface then I think they need to be supported in
some other way. Not sure quite how, though.

Robert

>
> Stefano



Re: [mime4j] Extra methods For ContentDescriptor...?

Posted by Stefano Bagnara <ap...@bago.org>.
Robert Burrell Donkin wrote:
> ATM this interface supports
> 
>    String getMimeType();
>    String getCharset();
>    String getTransferEncoding();
> 
> so it has Content-Type and Content-Transfer-Encoding but is missing
> calls for information on the MIME-Version, Content-ID and
> Content-Description headers defined in RFC2045. this information would
> be useful for IMAP so i was wondering about the design of the API.
> should ContentDescriptor aim to supply the standard headers found in
> RFC2045?
> 
> of course, this leads to the slippery slope: what about RFC 2183
> (Content-Disposition),  RFC 3066 (LANGUAGE-TAGS), RFC 2557 (LOCATION)
> and RFC 1864 (MD5)...?
> 
> where should the API draw the line?

This is a hard issue...
IMO anything that is not required to correctly interpret the MIME
source should be left to a getHeader(String headerName) and a bunch of
constants for the most used headers, so no, I would not add the Content-ID,
Content-Description and other headers defined in other RFCs.

On the other hand I would say that mime4j should be easy to be extended 
to support a more complete/specific interface when parsing messages.
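
The getHeader(String headerName) plus constants idea could be sketched
roughly as follows. HeaderNames and RawHeaders are hypothetical names for
this example, and the values are returned raw and unparsed.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical constants for commonly used header names.
final class HeaderNames {
    static final String CONTENT_ID = "Content-ID";
    static final String CONTENT_DESCRIPTION = "Content-Description";
    static final String CONTENT_MD5 = "Content-MD5";

    private HeaderNames() {
    }
}

// Hypothetical holder exposing raw, unparsed header values.
final class RawHeaders {
    // Header names are case-insensitive, hence the comparator.
    private final Map<String, String> values =
            new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    void put(String name, String value) {
        values.put(name, value);
    }

    // Returns the raw value, or null if the header is absent.
    String getHeader(String headerName) {
        return values.get(headerName);
    }
}
```

A caller would write something like
headers.getHeader(HeaderNames.CONTENT_ID) and do any further parsing
itself.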

Stefano

