Posted to users@cxf.apache.org by Brice Dutheil <br...@gmail.com> on 2012/11/05 20:27:46 UTC

Multipart values are not trimed

Hi,

I'm crafting a resource that should accept a multipart POST request.

Here's the method :

================================================
  @POST
  @Produces({MediaType.APPLICATION_JSON})
  @Consumes(MediaType.MULTIPART_FORM_DATA)
  public MetaData archive(@FormParam("title") String title,
                          @FormParam("revision") String revision,
                          @Multipart("archive") TemporaryBinaryFile temporaryBinaryFile) {
================================================

I also tried with @Multipart instead of @FormParam:

================================================
  @POST
  @Produces({MediaType.APPLICATION_JSON})
  @Consumes(MediaType.MULTIPART_FORM_DATA)
  public DocumentMetaData archive(@Multipart(value = "title", required = false) @FormParam("title") String title,
                                  @Multipart(value = "revision", required = false) String revision,
                                  @Multipart("archive") TemporaryBinaryFile temporaryBinaryFile) {
================================================

And here is the raw request :
================================================
Address: http://localhost:8080/api/v1.0/document/archive
Encoding: ISO-8859-1
Http-Method: POST
Content-Type: multipart/form-data;boundary=partie
Headers: {Accept=[*/*], accept-charset=[ISO-8859-1,utf-8;q=0.7,*;q=0.3],
accept-encoding=[gzip,deflate,sdch], Content-Length=[301],
content-type=[multipart/form-data;boundary=partie]}
Payload:
--partie
Content-Disposition: form-data; name="title"
Content-ID: title

the.title
--partie
Content-Disposition: form-data; name="revision"
Content-ID: revision

some.revision
--partie
Content-Disposition: form-data; name="archive"; filename="file.txt"
Content-Type: text/plain

I've got a woman, way over town...
--partie
================================================

However, the title and revision values are incorrect because they end with
a newline character '\n'. As a result, these parameters are rejected by my
validator (which uses Message.getContent).

I don't think this is normal behavior, but I might be wrong, either about
the specs or about my request. Note that I had to add the Content-ID header
when using the @Multipart annotation. Maybe there is something I should do?
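
For reference, RFC 2046 requires CRLF line breaks between the lines of a
multipart body and a closing delimiter of the form --partie--, whereas the
payload as logged above ends with a plain --partie. Whether bare LF line
endings explain the trailing '\n' here is an assumption; nothing in this
thread confirms it. A minimal Java sketch of a well-formed body, where
MultipartBodySketch and build are hypothetical names and not CXF APIs:

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of CXF: builds a multipart/form-data body
// with CRLF line breaks and a closing "--boundary--" delimiter (RFC 2046).
final class MultipartBodySketch {
    private static final String CRLF = "\r\n";

    static String build(String boundary, Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.append("--").append(boundary).append(CRLF);
            body.append("Content-Disposition: form-data; name=\"")
                .append(field.getKey()).append('"').append(CRLF);
            body.append(CRLF); // blank line ends the part headers
            // The CRLF after the value belongs to the next delimiter,
            // so it is not part of the field value itself.
            body.append(field.getValue()).append(CRLF);
        }
        // Closing delimiter: two extra dashes after the boundary.
        body.append("--").append(boundary).append("--").append(CRLF);
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("title", "the.title");
        fields.put("revision", "some.revision");
        String body = build("partie", fields);
        System.out.println(body.getBytes(StandardCharsets.ISO_8859_1).length + " bytes");
    }
}
```

Sending a body built this way to the resource would be one way to check
whether the trailing newline still shows up in the parameter values.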


I have a workaround for that: an interceptor whose role is to trim
strings. But I find it rather inelegant.

Or am I missing something ?
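
The trimming step of such a workaround could look roughly like the sketch
below. It shows only the string handling, not the CXF interceptor wiring.
TrailingNewlineSketch and stripTrailingLineBreaks are hypothetical names, and
stripping only trailing CR/LF (rather than all whitespace) is an assumption
about what the validator needs:

```java
// Hypothetical helper (not a CXF API): removes only trailing CR/LF characters,
// so that leading or inner whitespace in a form value is preserved.
final class TrailingNewlineSketch {

    static String stripTrailingLineBreaks(String value) {
        if (value == null) {
            return null;
        }
        int end = value.length();
        // Walk backwards over any run of '\n' or '\r' at the end of the value.
        while (end > 0 && (value.charAt(end - 1) == '\n' || value.charAt(end - 1) == '\r')) {
            end--;
        }
        return value.substring(0, end);
    }

    public static void main(String[] args) {
        System.out.println("[" + stripTrailingLineBreaks("the.title\n") + "]");
    }
}
```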

Cheers
-- Brice

Re: Multipart values are not trimed

Posted by Brice Dutheil <br...@gmail.com>.
Got it working properly :)

Now thinking about the next thing, or specification change :(

Do you think it would be possible to adapt the method

@POST
@Produces(MediaType.APPLICATION_JSON)
@Consumes({MediaType.MULTIPART_FORM_DATA})
public DocumentMetaData archive(
    @Multipart(value = "title", required = false) @DocumentTitle(optional = true) String title,
    @Multipart(value = "tag", required = false) @Revision(optional = true) String tag,
    @Multipart(value = "archive") @NotNull Attachment attachment) {

to allow the following request :


================================================
Address: http://localhost:8080/api/v1.0/document/archive
Encoding: UTF-8
Http-Method: POST
Content-Type: multipart/form-data;boundary=partie
Headers: {Accept=[*/*], content-type=[multipart/form-data;boundary=partie]}
Payload:
--partie
Content-Disposition: form-data; name="title_and_revision"
Content-Type: application/json
Content-ID: title_and_revision

{ "title" : "the.title", "revision" : "some.revision" }
--partie
Content-Disposition: form-data; name="archive"; filename="file.txt"
Content-Type: text/plain

I've got a woman, way over town...
--partie
================================================

I haven't had time to explore many approaches, but I don't think it is
possible while keeping title and revision as two separate parameters in the
method signature.

I don't think CXF is "splitting the JSON's values" into different parameters
that can be matched in the method. Or should I proceed the same way as for
the file attachment, i.e. have two method parameters of the Attachment type,
annotated with @Multipart?


Cheers,
-- Brice



On Tue, Nov 6, 2012 at 4:38 PM, Sergey Beryozkin <sb...@gmail.com> wrote:

> Hi
>
>> Fantastic :)
>> I would have preferred to avoid dealing with technical code directly,
>> so I will probably keep a reference to the InputStream in a
>> renamed StreamableBinaryFile.
>>
>> Is it possible to get the size of the attachment in a safer way than this
>> (if the Content-Length isn't present)?
>>
>> ((AttachmentDataSource)
>> attachment.getDataHandler().getDataSource()).cache.size()
>>
>> Note that the cache field would be accessed via reflection.
>>
>>
> I think the better option, assuming you'd like to enforce a certain limit,
> is to use attachment-max-size property:
>
> http://cxf.apache.org/docs/security.html#Security-Multiparts
>
>
>>>>>> About Content-Disposition name, it is checked only if there is no
>>>>>> Content-ID, however it seems at some point the default Content-ID
>>>>>> "root.message@cxf.apache.org" is added, which defeats the purpose of
>>>>>> the following code.
>>>>>>
>>>>>>     private static boolean matchAttachmentId(Attachment at, Multipart mid,
>>>>>>                                              MediaType multipartType) {
>>>>>>         if (at.getContentId().equals(mid.value())) {
>>>>>>             return true;
>>>>>>         }
>>>>>>         ContentDisposition cd = at.getContentDisposition();
>>>>>>         if (cd != null && mid.value().equals(cd.getParameter("name"))) {
>>>>>>             return true;
>>>>>>         }
>>>>>>         return false;
>>>>>>     }
>>>>>>
>>>>> default Content-ID is added on the output, it is not added during the
>>>>> read...
>>>>>
>>>>  What exactly you are proposing to fix though ?
>>>
>>>
>> Damn, forgive me, I stayed too long at work last night and missed
>> things, which affected my mail this morning as well, it seems! I was
>> misled by the fact that the first letter of the first header in the
>> second and following attachments is missing, hence in my case
>> Content-Disposition isn't parsed by CXF.
>>
>> Anyway, the above code works correctly... shame on me!
>>
>>
>> Again, thank you very much, I owe you a beer or two!
>>
>
> No problems at all :-), thanks for stressing the code :-)
>
> Cheers, Sergey

Re: Multipart values are not trimed

Posted by Sergey Beryozkin <sb...@gmail.com>.
Hi
>>>>
>>>>   Hi,
>>>>>
>>>>> To give the bigger picture, let me explain what I would actually like
>>>>> to build:
>>>>>
>>>>> In a multipart POST request, I'd like to have form params and a file
>>>>> attachment (like the example above), and I would like to handle the
>>>>> InputStream of the file myself, in order to do things like:
>>>>>     - checking some headers, for example Content-Length on one of the
>>>>>       attachments, Content-Disposition, etc.
>>>>>     - consuming the content of this part's InputStream to store it
>>>>>       in a temporary file
>>>>>
>>>>> However, in the MessageBodyReader, the entityStream looks like it has
>>>>> already been consumed and closed. Debugging reveals that an
>>>>> AttachmentDeserializer already consumed the stream and created an
>>>>> Attachment collection, but my provider wasn't called at that time.
>>>>> If the opportunity is available, I would like to copy these bytes to
>>>>> another OutputStream.
>>>>>
>>>> The provider for TemporaryBinaryFile is called later, when individual
>>>> parts are deserialized.
>>>>
>>>>> Is it possible, or should I use attachments? I'd like as much as
>>>>> possible to avoid technical code in the resource, and have a reference
>>>>> to a TemporaryBinaryFile.
>>>>>
>>>> You can use org.apache.cxf.jaxrs.ext.multipart.Attachment instead of
>>>> TemporaryBinaryFile, check Content-Type and Content-Disposition, and then
>>>> do 'attachment.getObject(TemporaryBinaryFile.class)':
>>>>
>>>> post(@Multipart("someid") Attachment attachment) {
>>>>      attachment.getContentType();
>>>>      attachment.getContentDisposition();
>>>>      attachment.getObject(TemporaryBinaryFile.class)
>>>> }
>>>>
>>>> Actually, you can optimize it slightly by adding a 'type' parameter to
>>>> @Multipart(value = "someid", type = "text/plain")
>>>>
>>>>
>>> Ok, thx for that :)
>>> Do you think it will be possible to stream directly the content of the
>>> attachment to another outputstream ? The attachment can have a large size
>>> like 20 MB maybe more, I'd like to keep memory consumption as low as
>>> possible.
>>>
>> CXF will internally manage saving the stream to the temp folder if the
>> part is large.
>>
>> You can do
>>
>> attachment.getObject(InputStream.class),
>>
>> in which case you will have to deal with InputStream directly or you can
>> do it within your own TemporaryBinaryFile MBR when you do
>>
>> attachment.getObject(TemporaryBinaryFile.class)
>>
>
> Fantastic :)
> I would have preferred to avoid dealing with technical code directly,
> so I will probably keep a reference to the InputStream in a
> renamed StreamableBinaryFile.
>
> Is it possible to get the size of the attachment in a safer way than this
> (if the Content-Length isn't present)?
>
> ((AttachmentDataSource)
> attachment.getDataHandler().getDataSource()).cache.size()
>
> Note that the cache field would be accessed via reflection.
>

I think the better option, assuming you'd like to enforce a certain 
limit, is to use attachment-max-size property:

http://cxf.apache.org/docs/security.html#Security-Multiparts
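
If the size is needed when Content-Length is absent, one option that avoids
reflection is to count the bytes while spooling the part to a temporary file.
A minimal sketch in plain Java, assuming the InputStream would come from
attachment.getObject(InputStream.class) as suggested above.
StreamToTempFileSketch, Spooled and spool are hypothetical names, and a
ByteArrayInputStream stands in for the attachment stream:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not a CXF API.
final class StreamToTempFileSketch {

    // Result holder: where the bytes went and how many were copied.
    static final class Spooled {
        final Path path;
        final long size;

        Spooled(Path path, long size) {
            this.path = path;
            this.size = size;
        }
    }

    // Copies the stream to a new temp file without buffering it all in memory.
    // Files.copy returns the number of bytes it wrote, which gives the size.
    static Spooled spool(InputStream in) {
        try (InputStream source = in) {
            Path tmp = Files.createTempFile("attachment-", ".tmp");
            long copied = Files.copy(source, tmp, StandardCopyOption.REPLACE_EXISTING);
            return new Spooled(tmp, copied);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "I've got a woman, way over town...".getBytes(StandardCharsets.UTF_8);
        Spooled spooled = spool(new ByteArrayInputStream(data));
        System.out.println(spooled.size + " bytes spooled to " + spooled.path);
    }
}
```

This only reports the size after the fact; to reject oversized parts up
front, the attachment-max-size property mentioned above remains the simpler
route.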

>>>>>>
>>>>>>>
>>>>>>> I'm crafting a resource that should accept multipart POST request.
>>>>>>>
>>>>>>> Here's the method :
>>>>>>>
>>>>>>> ================================================
>>>>>>>   @POST
>>>>>>>   @Produces({MediaType.APPLICATION_JSON})
>>>>>>>   @Consumes(MediaType.MULTIPART_FORM_DATA)
>>>>>>>   public MetaData archive(@FormParam("title") String title,
>>>>>>>                           @FormParam("revision") String revision,
>>>>>>>                           @Multipart("archive") TemporaryBinaryFile temporaryBinaryFile) {
>>>>>>> ================================================
>>>>>>>
>>>>>>> Also I tried with @Multipart instead of @FormParam
>>>>>>>
>>>>>>> ================================================
>>>>>>>   @POST
>>>>>>>   @Produces({MediaType.APPLICATION_JSON})
>>>>>>>   @Consumes(MediaType.MULTIPART_FORM_DATA)
>>>>>>>   public DocumentMetaData archive(@Multipart(value = "title", required = false) @FormParam("title") String title,
>>>>>>>                                   @Multipart(value = "revision", required = false) String revision,
>>>>>>>                                   @Multipart("archive") TemporaryBinaryFile temporaryBinaryFile) {
>>>>>>>
>>>>>>>
>>>>>> You have @FormParam and @Multipart attached to 'title', drop @FormParam,
>>>>>> I think it only works because 'title' is a simple parameter.
>>>>>>
>>>>> Yes, I wrongly copied/modified the code in the mail, however I tested
>>>>> both setups separately.
>>>>> Anyway, as you advised me, I will only use @Multipart now.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>> ================================================
>>>>>>>
>>>>>>> And here is the raw request :
>>>>>>> ================================================
>>>>>>> Address: http://localhost:8080/api/v1.0/document/archive
>>>>>>> Encoding: ISO-8859-1
>>>>>>> Http-Method: POST
>>>>>>> Content-Type: multipart/form-data;boundary=partie
>>>>>>> Headers: {Accept=[*/*], accept-charset=[ISO-8859-1,utf-8;q=0.7,*;q=0.3],
>>>>>>> accept-encoding=[gzip,deflate,sdch], Content-Length=[301],
>>>>>>> content-type=[multipart/form-data;boundary=partie]}
>>>>>>>
>>>>>>>
>>>>>>> Payload:
>>>>>>> --partie
>>>>>>> Content-Disposition: form-data; name="title"
>>>>>>> Content-ID: title
>>>>>>>
>>>>>>> the.title
>>>>>>> --partie
>>>>>>> Content-Disposition: form-data; name="revision"
>>>>>>> Content-ID: revision
>>>>>>>
>>>>>>> some.revision
>>>>>>> --partie
>>>>>>> Content-Disposition: form-data; name="archive"; filename="file.txt"
>>>>>>> Content-Type: text/plain
>>>>>>>
>>>>>>> I've got a woman, way over town...
>>>>>>> --partie
>>>>>>> ================================================
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> However, the title and revision values are incorrect because they end
>>>>>>> with a newline character '\n'. As a result, these parameters are
>>>>>>> rejected by my validator (which uses Message.getContent).
>>>>>>>
>>>>>>> I don't think this is normal behavior, but I might be wrong, either
>>>>>>> about the specs or about my request. Note that I had to add the
>>>>>>> Content-ID header when using the @Multipart annotation.
>>>>>>>
>>>>>>>
>>>>>> What CXF version is it? Content-Disposition 'name' is definitely
>>>>>> checked too.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> Also, I found the part of the code that should check the
>>>>> Content-Disposition. However, I found that the first letter 'C'
>>>>> disappeared and the key in the attachment header is now
>>>>> 'ontent-Disposition', which complicates things further and probably
>>>>> explains why I needed a Content-ID header in each part. The first part,
>>>>> though, always gets its Content-Disposition header correctly decoded.
>>>>> Adding another new line after the boundary looks like a workaround,
>>>>> but I'd rather not impose this on the API users :/
>>>>>
>>>>> I couldn't figure out yet where the code is consuming the additional
>>>>> char. I just know that at some point, the LazyAttachmentCollection has
>>>>> the remaining attachment (AttachmentImpl), and the first header is wrong.
>>>>>
>>>>>
>>>> I think it is a bug in the code that posts the multipart; I recall
>>>> exactly the same issue being reported when RESTClient was used
>>>>
>>>>
>>> Isn't it this issue? https://issues.apache.org/jira/browse/CXF-2704
>>>
>>
>> Looks like so, but I also do recall the same issue with RESTClient payloads
>>
>>
>>>
>>>
>>>>> About Content-Disposition name, it is checked only if there is no
>>>>> Content-ID, however it seems at some point the default Content-ID
>>>>> "root.message@cxf.apache.org" is added, which defeats the purpose of
>>>>> the following code.
>>>>>
>>>>>     private static boolean matchAttachmentId(Attachment at, Multipart mid,
>>>>>                                              MediaType multipartType) {
>>>>>         if (at.getContentId().equals(mid.value())) {
>>>>>             return true;
>>>>>         }
>>>>>         ContentDisposition cd = at.getContentDisposition();
>>>>>         if (cd != null && mid.value().equals(cd.getParameter("name"))) {
>>>>>             return true;
>>>>>         }
>>>>>         return false;
>>>>>     }
>>>>>
>>>> default Content-ID is added on the output, it is not added during the
>>>> read...
>>>>
>>>>
>>> I'm not 100% sure how everything works, but at some point
>>> MultipartProvider.readFrom is called from
>>> JAXRSUtils.readFromMessageBodyReader, which will indirectly call the
>>> above code:
>>>
>>>     public Object readFrom(Class<Object> c, Type t, Annotation[] anns, MediaType mt,
>>>                            MultivaluedMap<String, String> headers,
>>>                            InputStream is) throws IOException, WebApplicationException {
>>>
>>>         // ...
>>>
>>>         Multipart id = AnnotationUtils.getAnnotation(anns, Multipart.class);
>>>         Attachment multipart = AttachmentUtils.getMultipart(c, id, mt, infos);
>>>
>>>         if (multipart != null) {
>>>             return fromAttachment(multipart, c, t, anns);
>>>         } else if (id != null && !id.required()) {
>>>
>>>         // ...
>>>
>>>     }
>>>
>>>     public static Attachment getMultipart(Class<Object> c,
>>>                                           Multipart id,
>>>                                           MediaType mt,
>>>                                           List<Attachment> infos) throws IOException {
>>>
>>>         if (id != null) {
>>>             for (Attachment a : infos) {
>>>                 if (matchAttachmentId(a, id, mt)) {
>>>                     checkMediaTypes(a.getContentType(), id.type());
>>>                     return a;
>>>                 }
>>>             }
>>>         // ...
>>>     }
>>>
>>> I'm not sure of the implications, but it might be possible to fix this
>>> with the following code:
>>>
>>>     private static boolean matchAttachmentId(Attachment at, Multipart mid,
>>>                                              MediaType multipartType) {
>>>         ContentDisposition cd = at.getContentDisposition();
>>>         boolean matchContentDispositionName = cd != null
>>>             && mid.value().equals(cd.getParameter("name"));
>>>         boolean matchContentId = at.getContentId().equals(mid.value());
>>>
>>>         return matchContentId || matchContentDispositionName;
>>>     }
>>>
>>>
>> What exactly are you proposing to fix, though?
>>
>
> Damn, forgive me, I stayed too long at work last night and missed
> things, which affected my mail this morning as well, it seems! I was
> misled by the fact that the first letter of the first header in the
> second and following attachments is missing, hence in my case
> Content-Disposition isn't parsed by CXF.
>
> Anyway, the above code works correctly... shame on me!
>
>
> Again, thank you very much, I owe you a beer or two!

No problems at all :-), thanks for stressing the code :-)
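
For readers comparing the two matchAttachmentId variants quoted above: they
are logically equivalent, since both return true when either the Content-ID
or the Content-Disposition name matches. A self-contained sketch, using a
hypothetical Part class in place of CXF's Attachment and adding a null-safe
Content-ID comparison that the quoted code does not have:

```java
import java.util.Objects;

// Hypothetical stand-in types; none of this is CXF code.
final class AttachmentMatchSketch {

    // Minimal view of a multipart part: its Content-ID and the "name"
    // parameter of its Content-Disposition header (either may be null).
    static final class Part {
        final String contentId;
        final String dispositionName;

        Part(String contentId, String dispositionName) {
            this.contentId = contentId;
            this.dispositionName = dispositionName;
        }
    }

    // Shape of the existing code: return early on a Content-ID match,
    // then fall back to the Content-Disposition name.
    static boolean matchesEarlyReturn(Part part, String expected) {
        if (Objects.equals(part.contentId, expected)) {
            return true;
        }
        return part.dispositionName != null && expected.equals(part.dispositionName);
    }

    // Shape of the proposed rewrite: compute both checks, then OR them.
    static boolean matchesCombined(Part part, String expected) {
        boolean byName = part.dispositionName != null && expected.equals(part.dispositionName);
        boolean byId = Objects.equals(part.contentId, expected);
        return byId || byName;
    }

    public static void main(String[] args) {
        Part part = new Part("root.message@cxf.apache.org", "title");
        System.out.println(matchesEarlyReturn(part, "title") == matchesCombined(part, "title"));
    }
}
```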

Cheers, Sergey

>
>   Cheers
> -- Brice
>


-- 
Sergey Beryozkin

Talend Community Coders
http://coders.talend.com/

Blog: http://sberyozkin.blogspot.com

Re: Multipart values are not trimed

Posted by Brice Dutheil <br...@gmail.com>.
On Tue, Nov 6, 2012 at 12:48 PM, Sergey Beryozkin <sb...@gmail.com> wrote:

> Hi,
>
> On 06/11/12 10:50, Brice Dutheil wrote:
>
>> -- Brice
>>
>>
>>
>> On Tue, Nov 6, 2012 at 11:09 AM, Sergey Beryozkin <sberyozkin@gmail.com> wrote:
>>
>>  Hi
>>>
>>> On 05/11/12 23:00, Brice Dutheil wrote:
>>>
>>>  Hi,
>>>>
>>>> To give the bigger picture, let me explain what I would actually like
>>>> to build:
>>>>
>>>> In a multipart POST request, I'd like to have form params and a file
>>>> attachment (like the example above), and I would like to handle the
>>>> InputStream of the file myself, in order to do things like:
>>>>    - checking some headers, for example Content-Length on one of the
>>>>      attachments, Content-Disposition, etc.
>>>>    - consuming the content of this part's InputStream to store it
>>>>      in a temporary file
>>>>
>>>> However, in the MessageBodyReader, the entityStream looks like it has
>>>> already been consumed and closed. Debugging reveals that an
>>>> AttachmentDeserializer already consumed the stream and created an
>>>> Attachment collection, but my provider wasn't called at that time.
>>>> If the opportunity is available, I would like to copy these bytes to
>>>> another OutputStream.
>>>>
>>> The provider for TemporaryBinaryFile is called later, when individual
>>> parts are deserialized.
>>>
>>>> Is it possible, or should I use attachments? I'd like as much as
>>>> possible to avoid technical code in the resource, and have a reference
>>>> to a TemporaryBinaryFile.
>>>>
>>> You can use org.apache.cxf.jaxrs.ext.multipart.Attachment instead of
>>> TemporaryBinaryFile, check Content-Type and Content-Disposition, and then
>>> do 'attachment.getObject(TemporaryBinaryFile.class)':
>>>
>>> post(@Multipart("someid") Attachment attachment) {
>>>     attachment.getContentType();
>>>     attachment.getContentDisposition();
>>>     attachment.getObject(TemporaryBinaryFile.class)
>>> }
>>>
>>> Actually, you can optimize it slightly by adding a 'type' parameter to
>>> @Multipart(value = "someid", type = "text/plain")
>>>
>>>
>> Ok, thx for that :)
>> Do you think it will be possible to stream directly the content of the
>> attachment to another outputstream ? The attachment can have a large size
>> like 20 MB maybe more, I'd like to keep memory consumption as low as
>> possible.
>>
> CXF will internally manage saving the stream to the temp folder if the
> part is large.
>
> You can do
>
> attachment.getObject(InputStream.class),
>
> in which case you will have to deal with InputStream directly or you can
> do it within your own TemporaryBinaryFile MBR when you do
>
> attachment.getObject(TemporaryBinaryFile.class)
>

Fantastic :)
I would have preferred to avoid dealing with technical code directly, so I
will probably keep a reference to the InputStream in a renamed
StreamableBinaryFile.

Is it possible to get the size of the attachment in a safer way than this
(if the Content-Length isn't present)?

((AttachmentDataSource)
attachment.getDataHandler().getDataSource()).cache.size()

Note that the cache field would have to be accessed via reflection.
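
If reflection on the cache field feels too fragile, one alternative is to count the bytes while spooling the part to disk. This is only a minimal sketch, assuming the part's InputStream has already been obtained (for example with attachment.getObject(InputStream.class)). SpooledPart is a hypothetical helper, not a CXF class, and it uses JDK calls only:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper (not part of CXF): spools a part to a temp file
// and records how many bytes were actually read from the stream.
final class SpooledPart {
    final Path file;
    final long size;

    private SpooledPart(Path file, long size) {
        this.file = file;
        this.size = size;
    }

    static SpooledPart spool(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("part-", ".bin");
        // createTempFile already created the file, so REPLACE_EXISTING is required.
        // Files.copy streams to disk and returns the number of bytes copied.
        long copied = Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return new SpooledPart(tmp, copied);
    }
}
```

The size is then known without relying on Content-Length, at the cost of reading the part once.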



>
>
>
>
>>
>>
>>  More comments below
>>>
>>>
>>>
>>>>
>>>> Here's my comment on Content-Disposition :
>>>>
>>>>
>>>>
>>>> On Mon, Nov 5, 2012 at 11:17 PM, Sergey Beryozkin <sberyozkin@gmail.com> wrote:
>>>>>
>>>>
>>>>   Hi
>>>>
>>>>>
>>>>>
>>>>> On 05/11/12 19:27, Brice Dutheil wrote:
>>>>>
>>>>>   Hi,
>>>>>
>>>>>>
>>>>>> I'm crafting a resource that should accept multipart POST request.
>>>>>>
>>>>>> Here's the method :
>>>>>>
>>>>>> ================================================
>>>>>>      @POST
>>>>>>      @Produces({MediaType.APPLICATION_JSON})
>>>>>>      @Consumes(MediaType.MULTIPART_FORM_DATA)
>>>>>>
>>>>>>
>>>>>>      public MetaData archive(@FormParam("title") String title,
>>>>>>                                      @FormParam("revision") String
>>>>>> revision,
>>>>>>                                      @Multipart("archive")
>>>>>> TemporaryBinaryFile
>>>>>> temporaryBinaryFile) {
>>>>>> ================================================
>>>>>>
>>>>>>
>>>>>>
>>>>>> Also I tried with @Multipart instead of @FormParam
>>>>>>
>>>>>> ================================================
>>>>>>      @POST
>>>>>>      @Produces({MediaType.APPLICATION_JSON})
>>>>>>      @Consumes(MediaType.MULTIPART_FORM_DATA)
>>>>>>
>>>>>>
>>>>>>      public DocumentMetaData archive(@Multipart(value = "title",
>>>>>> required =
>>>>>> false) @FormParam("title") String title,
>>>>>>                                      @Multipart(value = "revision",
>>>>>> required =
>>>>>> false) String revision,
>>>>>>                                      @Multipart("archive")
>>>>>> TemporaryBinaryFile
>>>>>> temporaryBinaryFile) {
>>>>>>
>>>>>>
>>>>>>  You have @FormParam and @Multipart attached to 'title', drop
>>>>> @FormParam,
>>>>> I
>>>>> think it only works because 'title' is a simple parameter.
>>>>>
>>>>>
>>>>>
>>>>>  Yes I wrongly copied/ modified the code in the mail, however I tested
>>>> both
>>>> setup separately.
>>>> Anyway, as you advised me I will inly use Multipart now.
>>>>
>>>>
>>>>
>>>>
>>>>>> ================================================
>>>>>>
>>>>>> And here is the raw request :
>>>>>> ================================================
>>>>>> Address: http://localhost:8080/api/v1.0/document/archive
>>>>>> Encoding: ISO-8859-1
>>>>>> Http-Method: POST
>>>>>> Content-Type: multipart/form-data;boundary=partie
>>>>>> Headers: {Accept=[*/*], accept-charset=[ISO-8859-1,utf-8;q=0.7,*;q=0.3],
>>>>>> accept-encoding=[gzip,deflate,sdch], Content-Length=[301],
>>>>>> content-type=[multipart/form-data;boundary=partie]}
>>>>>>
>>>>>>
>>>>>> Payload:
>>>>>> --partie
>>>>>> Content-Disposition: form-data; name="title"
>>>>>> Content-ID: title
>>>>>>
>>>>>> the.title
>>>>>> --partie
>>>>>> Content-Disposition: form-data; name="revision"
>>>>>> Content-ID: revision
>>>>>>
>>>>>> some.revision
>>>>>> --partie
>>>>>> Content-Disposition: form-data; name="archive"; filename="file.txt"
>>>>>> Content-Type: text/plain
>>>>>>
>>>>>> I've got a woman, way over town...
>>>>>> --partie
>>>>>> ================================================
>>>>>>
>>>>>>
>>>>>>
>>>>>> However the title and revision values are incorrect because they are
>>>>>> ended
>>>>>> by a new line char '\n'. Hence these parameters are not validated by
>>>>>> my
>>>>>> validator (which is using Message.getContent),
>>>>>>
>>>>>> I don't think this is a normal behavior, but I might be wrong, maybe
>>>>>> about
>>>>>> the specs, or my request. Note that I had to add the Content-ID when
>>>>>> using
>>>>>> the Multipart annotation.
>>>>>>
>>>>>>
>>>>>>  What CXF version is it ? Content-Disposition 'name' is definitely
>>>>> checked
>>>>> too.
>>>>>
>>>>>
>>>>>
>>>>>
>>>> Also I found part of the code that should check the Content-Disposition,
>>>> however I have found that the first letter 'C' disappeared and the key
>>>> in
>>>> the attachment header is now 'ontent-Disposition' which can complicate
>>>> things further, and probably explains why, I needed a Content-ID header
>>>> in
>>>> each part. Although the first part got his header Content-Disposition
>>>> always correctly decoded. Adding another new line after the boundary
>>>> fixes
>>>> looks like a workaround though, but i'd rather not impose this on the
>>>> API
>>>> users :/
>>>>
>>>> I couldn't figure out yet where the code could is consuming the
>>>> additional
>>>> char. I just know that at some point, the LazyAttachmentCollection has
>>>> the
>>>> remaining attachment (AttachmentImpl), and the first header is wrong.
>>>>
>>>>
>>>>  I think it is the bug of the code the posts the multipart, I recall
>>> exactly the same issue reported when RESTClient was used
>>>
>>>
>> Isn't it this issue ? https://issues.apache.org/jira/browse/CXF-2704
>>
>
> Looks like so, but I also do recall the same issue with RESTClient payloads
>
>
>>
>>
>>>  About Content-Disposition name, it is checked only if there is no
>>>> Content-ID, however it seems at some point the default Content-ID is
>>>> added "
>>>> root.message@cxf.apache.org", which defeats the purpose of the
>>>> following
>>>> code.
>>>>
>>>>       private static boolean matchAttachmentId(Attachment at, Multipart mid,
>>>>                                                MediaType multipartType) {
>>>>           if (at.getContentId().equals(mid.value())) {
>>>>               return true;
>>>>           }
>>>>           ContentDisposition cd = at.getContentDisposition();
>>>>           if (cd != null && mid.value().equals(cd.getParameter("name"))) {
>>>>               return true;
>>>>           }
>>>>           return false;
>>>>       }
>>>>
>>>>   default Content-ID is added on the output, it is not added during the
>>>>
>>> read...
>>>
>>>
>> I'm not 100% sure how everything worked, but at some point the
>> MultipartProvider.readFrom is called from the
>> JAXRSUtils.readFromMessageBodyReader, which will indirectly call the
>> above
>> code :
>>
>>      public Object readFrom(Class<Object> c, Type t, Annotation[] anns,
>>                             MediaType mt,
>>                             MultivaluedMap<String, String> headers,
>>                             InputStream is) throws IOException, WebApplicationException {
>>
>> // ...
>>
>>          Multipart id = AnnotationUtils.getAnnotation(anns, Multipart.class);
>>          Attachment multipart = AttachmentUtils.getMultipart(c, id, mt, infos);
>>
>>          if (multipart != null) {
>>              return fromAttachment(multipart, c, t, anns);
>>          } else if (id != null && !id.required()) {
>>
>>
>> // ...
>>
>>      }
>>
>>
>>
>>      public static Attachment getMultipart(Class<Object>  c,
>>                                            Multipart id,
>>                                            MediaType mt,
>>                                            List<Attachment>  infos) throws
>> IOException {
>>
>>          if (id != null) {
>>              for (Attachment a : infos) {
>>                  if (matchAttachmentId(a, id, mt)) {
>>                      checkMediaTypes(a.getContentType(), id.type());
>>                      return a;
>>                  }
>>              }
>> // ...
>>      }
>>
>> I'm not sure of the implications, but it might be possible to fix this
>> with
>> the following code :
>>
>>      private static boolean matchAttachmentId(Attachment at, Multipart mid,
>>                                               MediaType multipartType) {
>>          ContentDisposition cd = at.getContentDisposition();
>>          boolean matchContentDispositionName = cd != null &&
>>                  mid.value().equals(cd.getParameter("name"));
>>          boolean matchContentId = at.getContentId().equals(mid.value());
>>
>>          return matchContentId || matchContentDispositionName;
>>      }
>>
>>
> What exactly you are proposing to fix though ?
>

Damn, forgive me, I stayed too long at work last night and missed
things, and it seems that affected my mail this morning as well! I was
misled by the fact that the first letter of the first header in the second
and following attachments is missing, hence in my case Content-Disposition
isn't parsed by CXF.

Anyway, the above code works correctly... shame on me!
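
For the record, the early-return shape of matchAttachmentId and the combined shape proposed above give the same answer for any input, which is why the rewrite changes nothing. A minimal sketch with stand-in types: Part and MatchSketch are hypothetical names made up for illustration, not CXF classes, and a null dispositionName stands in for a missing Content-Disposition.

```java
// Hypothetical stand-in for an attachment: only the two values the match looks at.
final class Part {
    final String contentId;        // value of the Content-ID header
    final String dispositionName;  // 'name' parameter of Content-Disposition, or null

    Part(String contentId, String dispositionName) {
        this.contentId = contentId;
        this.dispositionName = dispositionName;
    }
}

final class MatchSketch {
    private MatchSketch() { }

    // Shape of the existing method: two checks with early returns.
    static boolean matchEarlyReturn(Part part, String wanted) {
        if (part.contentId.equals(wanted)) {
            return true;
        }
        if (part.dispositionName != null && wanted.equals(part.dispositionName)) {
            return true;
        }
        return false;
    }

    // Shape of the proposed rewrite: the same two checks joined with ||.
    static boolean matchCombined(Part part, String wanted) {
        boolean byDisposition = part.dispositionName != null
                && wanted.equals(part.dispositionName);
        boolean byContentId = part.contentId.equals(wanted);
        return byContentId || byDisposition;
    }
}
```

In both versions a part whose Content-ID does not match is still found through its Content-Disposition name, provided that header was parsed correctly in the first place.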


Again, thank you very much, I owe you a beer or two!

 Cheers
-- Brice

Re: Multipart values are not trimed

Posted by Sergey Beryozkin <sb...@gmail.com>.
Hi,
On 06/11/12 10:50, Brice Dutheil wrote:
> -- Brice
>
>
>
> On Tue, Nov 6, 2012 at 11:09 AM, Sergey Beryozkin<sb...@gmail.com>wrote:
>
>> Hi
>>
>> On 05/11/12 23:00, Brice Dutheil wrote:
>>
>>> Hi,
>>>
>>> To get a bigger picture let me explain what I would like to actually
>>> craft :
>>>
>>> In a multipart POST request, I'd like to have form params and a file
>>> attachement (like the example above). And I would like to handle myself
>>> the
>>> inputstream of the file. In order do stuff like
>>>    - checking some headers, for example Content-Length on one of the
>>> Attachement, Content-Disposition etc
>>>    - consuming the content of the given inputstream of this part to store
>>> it
>>> in a temporary file
>>>
>>> However in the MessageBodyReader, the entityStream looks like it's been
>>> closed and already consumed. Debugging reveals that an
>>> AttatchmentDeserializer already consumed the stream, and created an
>>> Attachement collection, however my provider wasn't called at that time. If
>>> the opportunity is available I would like to copy these bytes to another
>>> outputstream.
>>>
>>>   The provider for TemporaryBinaryFile is called later, when individual
>> parts are deserialized.
>>
>>
>>   Is it possible or should I use attachments ? I'd like as much as possible
>>> avoid technical code in the resource, and have a reference to a
>>>    TemporaryBinaryFile.
>>>
>>>
>> You can use org.apache.cxf.jaxrs.ext.multipart.Attachment instead of
>> TemporaryBinaryFile, check Content-Type and Content-Disposition, and then
>> do 'attachment.getObject(TemporaryBinaryFile.class)':
>>
>> post(@Multipart("someid") Attachment attachment) {
>>     attachment.getContentType();
>>     attachment.getContentDisposition();
>>     attachment.getObject(TemporaryBinaryFile.class)
>> }
>>
>> Actually, you can optimize it slightly by adding a 'type' parameter to
>> @Multipart(value = "someid", type = "text/plain")
>>
>
> Ok, thx for that :)
> Do you think it will be possible to stream directly the content of the
> attachment to another outputstream ? The attachment can have a large size
> like 20 MB maybe more, I'd like to keep memory consumption as low as
> possible.
>
CXF will internally manage saving the stream to the temp folder if the 
part is large.

You can do

attachment.getObject(InputStream.class),

in which case you will have to deal with InputStream directly or you can 
do it within your own TemporaryBinaryFile MBR when you do

attachment.getObject(TemporaryBinaryFile.class)
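
When dealing with the InputStream directly, memory use can stay flat regardless of the part size by copying through a fixed buffer. A minimal sketch using JDK classes only; StreamCopy is a hypothetical helper name and the 8 KiB buffer size is an arbitrary assumption:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper (not part of CXF): bounded-memory stream copy.
final class StreamCopy {
    private StreamCopy() { }

    // Copies everything from in to out through a fixed 8 KiB buffer
    // and returns the number of bytes transferred.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }
}
```

The same loop could live inside a custom MessageBodyReader for TemporaryBinaryFile, so the resource method never touches the stream.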



>
>
>
>> More comments below
>>
>>
>>>
>>>
>>> Here's my comment on Content-Disposition :
>>>
>>>
>>>
>>> On Mon, Nov 5, 2012 at 11:17 PM, Sergey Beryozkin <sberyozkin@gmail.com> wrote:
>>>
>>>   Hi
>>>>
>>>>
>>>> On 05/11/12 19:27, Brice Dutheil wrote:
>>>>
>>>>   Hi,
>>>>>
>>>>> I'm crafting a resource that should accept multipart POST request.
>>>>>
>>>>> Here's the method :
>>>>>
>>>>> ================================================
>>>>>      @POST
>>>>>      @Produces({MediaType.APPLICATION_JSON})
>>>>>      @Consumes(MediaType.MULTIPART_FORM_DATA)
>>>>>
>>>>>      public MetaData archive(@FormParam("title") String title,
>>>>>                                      @FormParam("revision") String
>>>>> revision,
>>>>>                                      @Multipart("archive")
>>>>> TemporaryBinaryFile
>>>>> temporaryBinaryFile) {
>>>>> ================================================
>>>>>
>>>>>
>>>>> Also I tried with @Multipart instead of @FormParam
>>>>>
>>>>> ================================================
>>>>>      @POST
>>>>>      @Produces({MediaType.APPLICATION_JSON})
>>>>>      @Consumes(MediaType.MULTIPART_FORM_DATA)
>>>>>
>>>>>      public DocumentMetaData archive(@Multipart(value = "title",
>>>>> required =
>>>>> false) @FormParam("title") String title,
>>>>>                                      @Multipart(value = "revision",
>>>>> required =
>>>>> false) String revision,
>>>>>                                      @Multipart("archive")
>>>>> TemporaryBinaryFile
>>>>> temporaryBinaryFile) {
>>>>>
>>>>>
>>>> You have @FormParam and @Multipart attached to 'title', drop @FormParam,
>>>> I
>>>> think it only works because 'title' is a simple parameter.
>>>>
>>>>
>>>>
>>> Yes I wrongly copied/ modified the code in the mail, however I tested both
>>> setup separately.
>>> Anyway, as you advised me I will inly use Multipart now.
>>>
>>>
>>>
>>>
>>>>> ================================================
>>>>>
>>>>> And here is the raw request :
>>>>> ================================================
>>>>> Address: http://localhost:8080/api/v1.0/document/archive
>>>>> Encoding: ISO-8859-1
>>>>> Http-Method: POST
>>>>> Content-Type: multipart/form-data;boundary=partie
>>>>> Headers: {Accept=[*/*], accept-charset=[ISO-8859-1,utf-8;q=0.7,*;q=0.3],
>>>>> accept-encoding=[gzip,deflate,sdch], Content-Length=[301],
>>>>> content-type=[multipart/form-data;boundary=partie]}
>>>>>
>>>>> Payload:
>>>>> --partie
>>>>> Content-Disposition: form-data; name="title"
>>>>> Content-ID: title
>>>>>
>>>>> the.title
>>>>> --partie
>>>>> Content-Disposition: form-data; name="revision"
>>>>> Content-ID: revision
>>>>>
>>>>> some.revision
>>>>> --partie
>>>>> Content-Disposition: form-data; name="archive"; filename="file.txt"
>>>>> Content-Type: text/plain
>>>>>
>>>>> I've got a woman, way over town...
>>>>> --partie
>>>>> ================================================
>>>>>
>>>>>
>>>>> However the title and revision values are incorrect because they are
>>>>> ended
>>>>> by a new line char '\n'. Hence these parameters are not validated by my
>>>>> validator (which is using Message.getContent),
>>>>>
>>>>> I don't think this is a normal behavior, but I might be wrong, maybe
>>>>> about
>>>>> the specs, or my request. Note that I had to add the Content-ID when
>>>>> using
>>>>> the Multipart annotation.
>>>>>
>>>>>
>>>> What CXF version is it ? Content-Disposition 'name' is definitely checked
>>>> too.
>>>>
>>>>
>>>>
>>>
>>> Also I found part of the code that should check the Content-Disposition,
>>> however I have found that the first letter 'C' disappeared and the key in
>>> the attachment header is now 'ontent-Disposition' which can complicate
>>> things further, and probably explains why, I needed a Content-ID header in
>>> each part. Although the first part got his header Content-Disposition
>>> always correctly decoded. Adding another new line after the boundary fixes
>>> looks like a workaround though, but i'd rather not impose this on the API
>>> users :/
>>>
>>> I couldn't figure out yet where the code could is consuming the additional
>>> char. I just know that at some point, the LazyAttachmentCollection has the
>>> remaining attachment (AttachmentImpl), and the first header is wrong.
>>>
>>>
>> I think it is the bug of the code the posts the multipart, I recall
>> exactly the same issue reported when RESTClient was used
>>
>
> Isn't it this issue ? https://issues.apache.org/jira/browse/CXF-2704

Looks like so, but I also do recall the same issue with RESTClient payloads
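
One way to rule out the client side is to build the body exactly as RFC 2046 lays it out: every line break is CRLF, the CRLF before a delimiter belongs to the delimiter rather than to the preceding value, and the final delimiter is followed by two hyphens. Nothing in this thread establishes that this is what went wrong with the request quoted here; the sketch only shows the expected wire format. MultipartBodySketch is a hypothetical helper, not a CXF or RESTClient API:

```java
import java.util.Map;

// Hypothetical helper: builds a multipart/form-data body for plain text fields.
final class MultipartBodySketch {
    private static final String CRLF = "\r\n";

    private MultipartBodySketch() { }

    static String build(String boundary, Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.append("--").append(boundary).append(CRLF);
            body.append("Content-Disposition: form-data; name=\"")
                .append(field.getKey()).append('"').append(CRLF);
            body.append(CRLF);                           // blank line ends the part headers
            body.append(field.getValue()).append(CRLF);  // this CRLF belongs to the next delimiter
        }
        // Close delimiter: the boundary followed by two hyphens.
        body.append("--").append(boundary).append("--").append(CRLF);
        return body.toString();
    }
}
```

With this layout a parser that follows the RFC should hand back "the.title" without a trailing newline, since the line break before each delimiter is not part of the value.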

>
>
>>
>>> About Content-Disposition name, it is checked only if there is no
>>> Content-ID, however it seems at some point the default Content-ID is
>>> added "
>>> root.message@cxf.apache.org", which defeats the purpose of the following
>>> code.
>>>
>>>       private static boolean matchAttachmentId(Attachment at, Multipart mid,
>>>                                                MediaType multipartType) {
>>>           if (at.getContentId().equals(mid.value())) {
>>>               return true;
>>>           }
>>>           ContentDisposition cd = at.getContentDisposition();
>>>           if (cd != null && mid.value().equals(cd.getParameter("name"))) {
>>>               return true;
>>>           }
>>>           return false;
>>>       }
>>>
>>>   default Content-ID is added on the output, it is not added during the
>> read...
>>
>
> I'm not 100% sure how everything worked, but at some point the
> MultipartProvider.readFrom is called from the
> JAXRSUtils.readFromMessageBodyReader, which will indirectly call the above
> code :
>
>      public Object readFrom(Class<Object> c, Type t, Annotation[] anns,
>                             MediaType mt,
>                             MultivaluedMap<String, String> headers,
>                             InputStream is) throws IOException, WebApplicationException {
>
> // ...
>
>          Multipart id = AnnotationUtils.getAnnotation(anns, Multipart.class);
>          Attachment multipart = AttachmentUtils.getMultipart(c, id, mt, infos);
>          if (multipart != null) {
>              return fromAttachment(multipart, c, t, anns);
>          } else if (id != null && !id.required()) {
>
> // ...
>
>      }
>
>
>
>      public static Attachment getMultipart(Class<Object>  c,
>                                            Multipart id,
>                                            MediaType mt,
>                                            List<Attachment>  infos) throws
> IOException {
>
>          if (id != null) {
>              for (Attachment a : infos) {
>                  if (matchAttachmentId(a, id, mt)) {
>                      checkMediaTypes(a.getContentType(), id.type());
>                      return a;
>                  }
>              }
> // ...
>      }
>
> I'm not sure of the implications, but it might be possible to fix this with
> the following code :
>
>      private static boolean matchAttachmentId(Attachment at, Multipart mid,
> MediaType multipartType) {
>          ContentDisposition cd = at.getContentDisposition();
>          boolean matchContentDispositionName = cd != null&&
> mid.value().equals(cd.getParameter("name"));
>          boolean matchContentId = at.getContentId().equals(mid.value());
>
>          return matchContentId || matchContentDispositionName;
>      }
>

What exactly are you proposing to fix, though?

Cheers, Sergey

>
> Many thanx for your support !
>
>
> Cheers
> --Brice
>

Re: Multipart values are not trimed

Posted by Brice Dutheil <br...@gmail.com>.
-- Brice



On Tue, Nov 6, 2012 at 11:09 AM, Sergey Beryozkin <sb...@gmail.com>wrote:

> Hi
>
> On 05/11/12 23:00, Brice Dutheil wrote:
>
>> Hi,
>>
>> To get a bigger picture let me explain what I would like to actually
>> craft :
>>
>> In a multipart POST request, I'd like to have form params and a file
>> attachement (like the example above). And I would like to handle myself
>> the
>> inputstream of the file. In order do stuff like
>>   - checking some headers, for example Content-Length on one of the
>> Attachement, Content-Disposition etc
>>   - consuming the content of the given inputstream of this part to store
>> it
>> in a temporary file
>>
>> However in the MessageBodyReader, the entityStream looks like it's been
>> closed and already consumed. Debugging reveals that an
>> AttatchmentDeserializer already consumed the stream, and created an
>> Attachement collection, however my provider wasn't called at that time. If
>> the opportunity is available I would like to copy these bytes to another
>> outputstream.
>>
>>  The provider for TemporaryBinaryFile is called later, when individual
> parts are deserialized.
>
>
>  Is it possible or should I use attachments ? I'd like as much as possible
>> avoid technical code in the resource, and have a reference to a
>>   TemporaryBinaryFile.
>>
>>
> You can use org.apache.cxf.jaxrs.ext.multipart.Attachment instead of
> TemporaryBinaryFile, check Content-Type and Content-Disposition, and then
> do 'attachment.getObject(TemporaryBinaryFile.class)':
>
> post(@Multipart("someid") Attachment attachment) {
>    attachment.getContentType();
>    attachment.getContentDisposition();
>    attachment.getObject(TemporaryBinaryFile.class)
> }
>
> Actually, you can optimize it slightly by adding a 'type' parameter to
> @Multipart(value = "someid", type = "text/plain")
>

Ok, thanks for that :)
Do you think it would be possible to stream the content of the attachment
directly to another OutputStream? The attachment can be large, 20 MB or
maybe more, and I'd like to keep memory consumption as low as possible.




> More comments below
>
>
>>
>>
>> Here's my comment on Content-Disposition :
>>
>>
>>
>> On Mon, Nov 5, 2012 at 11:17 PM, Sergey Beryozkin <sberyozkin@gmail.com> wrote:
>>
>>  Hi
>>>
>>>
>>> On 05/11/12 19:27, Brice Dutheil wrote:
>>>
>>>  Hi,
>>>>
>>>> I'm crafting a resource that should accept multipart POST request.
>>>>
>>>> Here's the method :
>>>>
>>>> ================================================
>>>>     @POST
>>>>     @Produces({MediaType.APPLICATION_JSON})
>>>>     @Consumes(MediaType.MULTIPART_FORM_DATA)
>>>>
>>>>     public MetaData archive(@FormParam("title") String title,
>>>>                                     @FormParam("revision") String
>>>> revision,
>>>>                                     @Multipart("archive")
>>>> TemporaryBinaryFile
>>>> temporaryBinaryFile) {
>>>> ================================================
>>>>
>>>>
>>>> Also I tried with @Multipart instead of @FormParam
>>>>
>>>> ================================================
>>>>     @POST
>>>>     @Produces({MediaType.APPLICATION_JSON})
>>>>     @Consumes(MediaType.MULTIPART_FORM_DATA)
>>>>
>>>>     public DocumentMetaData archive(@Multipart(value = "title",
>>>> required =
>>>> false) @FormParam("title") String title,
>>>>                                     @Multipart(value = "revision",
>>>> required =
>>>> false) String revision,
>>>>                                     @Multipart("archive")
>>>> TemporaryBinaryFile
>>>> temporaryBinaryFile) {
>>>>
>>>>
>>> You have @FormParam and @Multipart attached to 'title', drop @FormParam,
>>> I
>>> think it only works because 'title' is a simple parameter.
>>>
>>>
>>>
>> Yes I wrongly copied/ modified the code in the mail, however I tested both
>> setup separately.
>> Anyway, as you advised me I will inly use Multipart now.
>>
>>
>>
>>
>>>> ================================================
>>>>
>>>> And here is the raw request :
>>>> ================================================
>>>> Address: http://localhost:8080/api/v1.0/document/archive
>>>> Encoding: ISO-8859-1
>>>> Http-Method: POST
>>>> Content-Type: multipart/form-data;boundary=partie
>>>> Headers: {Accept=[*/*], accept-charset=[ISO-8859-1,utf-8;q=0.7,*;q=0.3],
>>>> accept-encoding=[gzip,deflate,sdch], Content-Length=[301],
>>>> content-type=[multipart/form-data;boundary=partie]}
>>>>
>>>> Payload:
>>>> --partie
>>>> Content-Disposition: form-data; name="title"
>>>> Content-ID: title
>>>>
>>>> the.title
>>>> --partie
>>>> Content-Disposition: form-data; name="revision"
>>>> Content-ID: revision
>>>>
>>>> some.revision
>>>> --partie
>>>> Content-Disposition: form-data; name="archive"; filename="file.txt"
>>>> Content-Type: text/plain
>>>>
>>>> I've got a woman, way over town...
>>>> --partie
>>>> ================================================
>>>>
>>>>
>>>> However the title and revision values are incorrect because they are
>>>> ended
>>>> by a new line char '\n'. Hence these parameters are not validated by my
>>>> validator (which is using Message.getContent),
>>>>
>>>> I don't think this is a normal behavior, but I might be wrong, maybe
>>>> about
>>>> the specs, or my request. Note that I had to add the Content-ID when
>>>> using
>>>> the Multipart annotation.
>>>>
>>>>
>>> What CXF version is it ? Content-Disposition 'name' is definitely checked
>>> too.
>>>
>>>
>>>
>>
>> Also I found part of the code that should check the Content-Disposition,
>> however I have found that the first letter 'C' disappeared and the key in
>> the attachment header is now 'ontent-Disposition' which can complicate
>> things further, and probably explains why, I needed a Content-ID header in
>> each part. Although the first part got his header Content-Disposition
>> always correctly decoded. Adding another new line after the boundary fixes
>> looks like a workaround though, but i'd rather not impose this on the API
>> users :/
>>
>> I couldn't figure out yet where the code could is consuming the additional
>> char. I just know that at some point, the LazyAttachmentCollection has the
>> remaining attachment (AttachmentImpl), and the first header is wrong.
>>
>>
> I think it is the bug of the code the posts the multipart, I recall
> exactly the same issue reported when RESTClient was used
>

Isn't it this issue ? https://issues.apache.org/jira/browse/CXF-2704


>
>> About Content-Disposition name, it is checked only if there is no
>> Content-ID, however it seems at some point the default Content-ID is
>> added "
>> root.message@cxf.apache.org", which defeats the purpose of the following
>> code.
>>
>>      private static boolean matchAttachmentId(Attachment at, Multipart mid,
>>                                               MediaType multipartType) {
>>          if (at.getContentId().equals(mid.value())) {
>>              return true;
>>          }
>>          ContentDisposition cd = at.getContentDisposition();
>>          if (cd != null && mid.value().equals(cd.getParameter("name"))) {
>>              return true;
>>          }
>>          return false;
>>      }
>>
>>  default Content-ID is added on the output, it is not added during the
> read...
>

I'm not 100% sure how everything works, but at some point
MultipartProvider.readFrom is called from
JAXRSUtils.readFromMessageBodyReader, which indirectly calls the above
code:

    public Object readFrom(Class<Object> c, Type t, Annotation[] anns,
                           MediaType mt,
                           MultivaluedMap<String, String> headers,
                           InputStream is) throws IOException, WebApplicationException {

        // ...

        Multipart id = AnnotationUtils.getAnnotation(anns, Multipart.class);
        Attachment multipart = AttachmentUtils.getMultipart(c, id, mt, infos);
        if (multipart != null) {
            return fromAttachment(multipart, c, t, anns);
        } else if (id != null && !id.required()) {

        // ...

    }



    public static Attachment getMultipart(Class<Object> c,
                                          Multipart id,
                                          MediaType mt,
                                          List<Attachment> infos) throws IOException {

        if (id != null) {
            for (Attachment a : infos) {
                if (matchAttachmentId(a, id, mt)) {
                    checkMediaTypes(a.getContentType(), id.type());
                    return a;
                }
            }
        // ...
    }

I'm not sure of the implications, but it might be possible to fix this with
the following code:

    private static boolean matchAttachmentId(Attachment at, Multipart mid,
                                             MediaType multipartType) {
        ContentDisposition cd = at.getContentDisposition();
        boolean matchContentDispositionName = cd != null
                && mid.value().equals(cd.getParameter("name"));
        boolean matchContentId = at.getContentId().equals(mid.value());

        return matchContentId || matchContentDispositionName;
    }


Many thanks for your support!


Cheers
--Brice

Re: Multipart values are not trimed

Posted by Sergey Beryozkin <sb...@gmail.com>.
Hi
On 05/11/12 23:00, Brice Dutheil wrote:
> Hi,
>
> To get a bigger picture let me explain what I would like to actually craft :
>
> In a multipart POST request, I'd like to have form params and a file
> attachment (like the example above). And I would like to handle the
> InputStream of the file myself, in order to do things like
>   - checking some headers, for example Content-Length on one of the
> attachments, Content-Disposition, etc.
>   - consuming the content of this part's InputStream to store it
> in a temporary file
>
> However in the MessageBodyReader, the entityStream looks like it's been
> closed and already consumed. Debugging reveals that an
> AttachmentDeserializer already consumed the stream, and created an
> Attachment collection, however my provider wasn't called at that time. If
> the opportunity is available I would like to copy these bytes to another
> OutputStream.
>
The provider for TemporaryBinaryFile is called later, when individual 
parts are deserialized.

> Is it possible or should I use attachments ? I'd like as much as possible
> avoid technical code in the resource, and have a reference to a
>   TemporaryBinaryFile.
>

You can use org.apache.cxf.jaxrs.ext.multipart.Attachment instead of
TemporaryBinaryFile, check Content-Type and Content-Disposition, and
then do 'attachment.getObject(TemporaryBinaryFile.class)':

post(@Multipart("someid") Attachment attachment) {
    // inspect the part headers first
    attachment.getContentType();
    attachment.getContentDisposition();
    // then deserialize the part with the TemporaryBinaryFile provider
    attachment.getObject(TemporaryBinaryFile.class);
}

Actually, you can optimize it slightly by adding a 'type' parameter to
@Multipart:

@Multipart(value = "someid", type = "text/plain")

More comments below

>
>
>
> Here's my comment on Content-Disposition :
>
>
>
> On Mon, Nov 5, 2012 at 11:17 PM, Sergey Beryozkin<sb...@gmail.com>wrote:
>
>> Hi
>>
>>
>> On 05/11/12 19:27, Brice Dutheil wrote:
>>
>>> Hi,
>>>
>>> I'm crafting a resource that should accept multipart POST request.
>>>
>>> Here's the method :
>>>
>>> ================================================
>>>     @POST
>>>     @Produces({MediaType.APPLICATION_JSON})
>>>     @Consumes(MediaType.MULTIPART_FORM_DATA)
>>>     public MetaData archive(@FormParam("title") String title,
>>>                                     @FormParam("revision") String revision,
>>>                                     @Multipart("archive")
>>> TemporaryBinaryFile
>>> temporaryBinaryFile) {
>>> ==============================**==================
>>>
>>> Also I tried with @Multipart instead of @FormParam
>>>
>>> ==============================**==================
>>>     @POST
>>>     @Produces({MediaType.**APPLICATION_JSON})
>>>     @Consumes(MediaType.MULTIPART_**FORM_DATA)
>>>     public DocumentMetaData archive(@Multipart(value = "title", required =
>>> false) @FormParam("title") String title,
>>>                                     @Multipart(value = "revision",
>>> required =
>>> false) String revision,
>>>                                     @Multipart("archive")
>>> TemporaryBinaryFile
>>> temporaryBinaryFile) {
>>>
>>
>> You have @FormParam and @Multipart attached to 'title', drop @FormParam, I
>> think it only works because 'title' is a simple parameter.
>>
>>
>
> Yes, I wrongly copied/modified the code in the mail, however I tested both
> setups separately.
> Anyway, as you advised me I will only use Multipart now.
>
>
>
>>
>>> ================================================
>>>
>>> And here is the raw request :
>>> ================================================
>>> Address: http://localhost:8080/api/v1.0/document/archive
>>> Encoding: ISO-8859-1
>>> Http-Method: POST
>>> Content-Type: multipart/form-data;boundary=partie
>>> Headers: {Accept=[*/*], accept-charset=[ISO-8859-1,utf-8;q=0.7,*;q=0.3],
>>> accept-encoding=[gzip,deflate,sdch], Content-Length=[301],
>>> content-type=[multipart/form-data;boundary=partie]}
>>> Payload:
>>> --partie
>>> Content-Disposition: form-data; name="title"
>>> Content-ID: title
>>>
>>> the.title
>>> --partie
>>> Content-Disposition: form-data; name="revision"
>>> Content-ID: revision
>>>
>>> some.revision
>>> --partie
>>> Content-Disposition: form-data; name="archive"; filename="file.txt"
>>> Content-Type: text/plain
>>>
>>> I've got a woman, way over town...
>>> --partie
>>> ================================================
>>>
>>> However the title and revision values are incorrect because they are ended
>>> by a new line char '\n'. Hence these parameters are not validated by my
>>> validator (which is using Message.getContent),
>>>
>>> I don't think this is a normal behavior, but I might be wrong, maybe about
>>> the specs, or my request. Note that I had to add the Content-ID when using
>>> the Multipart annotation.
>>>
>>
>> What CXF version is it ? Content-Disposition 'name' is definitely checked
>> too.
>>
>>
>
>
> Also I found the part of the code that should check the Content-Disposition,
> however I found that the first letter 'C' disappeared and the key in
> the attachment header is now 'ontent-Disposition', which complicates
> things further and probably explains why I needed a Content-ID header in
> each part. The first part, however, always had its Content-Disposition
> header decoded correctly. Adding another new line after the boundary fixes
> it, but that looks like a workaround, and I'd rather not impose this on the
> API users :/
>
> I couldn't figure out yet where the code is consuming the additional
> char. I just know that at some point, the LazyAttachmentCollection has the
> remaining attachment (AttachmentImpl), and the first header is wrong.
>

I think it is a bug in the code that posts the multipart; I recall
exactly the same issue being reported when RESTClient was used.
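
One thing that may be worth checking on the client side, offered as a guess rather than a diagnosis: RFC 2046 expects every line of a multipart body to end with CRLF and the last boundary to be written as '--partie--', while the payload quoted above ends with a plain '--partie'. Below is a minimal sketch, using only the JDK, of a body laid out that way. The names MultipartBodySketch and build are made up for illustration and are not part of CXF.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: builds a multipart/form-data body with CRLF line endings (RFC 2046). */
final class MultipartBodySketch {
    private static final String CRLF = "\r\n";

    /** Serializes simple text fields into a multipart/form-data body. */
    static String build(String boundary, Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            // each part starts with "--" + boundary on its own line
            body.append("--").append(boundary).append(CRLF);
            body.append("Content-Disposition: form-data; name=\"")
                .append(field.getKey()).append("\"").append(CRLF);
            // an empty line separates the part headers from the part content
            body.append(CRLF);
            // the CRLF after the value belongs to the next delimiter, not to the value
            body.append(field.getValue()).append(CRLF);
        }
        // the closing delimiter carries a trailing "--"
        body.append("--").append(boundary).append("--").append(CRLF);
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("title", "the.title");
        fields.put("revision", "some.revision");
        String body = build("partie", fields);
        System.out.println(body.getBytes(StandardCharsets.ISO_8859_1).length + " bytes");
        System.out.print(body);
    }
}
```

Posting such a body with Content-Type: multipart/form-data;boundary=partie would show whether the trailing '\n' and the 'ontent-Disposition' key still appear.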


>
> About Content-Disposition name, it is checked only if there is no
> Content-ID, however it seems that at some point the default Content-ID
> "root.message@cxf.apache.org" is added, which defeats the purpose of the
> following code.
>
>      private static boolean matchAttachmentId(Attachment at, Multipart mid,
>                                               MediaType multipartType) {
>          if (at.getContentId().equals(mid.value())) {
>              return true;
>          }
>          ContentDisposition cd = at.getContentDisposition();
>          if (cd != null && mid.value().equals(cd.getParameter("name"))) {
>              return true;
>          }
>          return false;
>      }
>
The default Content-ID is added on output; it is not added during the
read...

Cheers, Sergey

> I have tested this behavior on CXF 2.6.3 and 2.7.0, and I'm using JDK 7.
>
>
>
>>
>>   Maybe there is something I should do ?
>>>
>>>
>>> I have a workaround for that, I've made an interceptor whose role is to
>>> trim strings. But I find it rather inelegant to do that.
>>>
>>> Or am I missing something ?
>>>
>>>
>> I don't have an immediate answer to it; we have a few tests where simple
>> parts are transmitted and no new line/return characters make it into the
>> representation.
>>
>> Can you please experiment with Content-Transfer-Encoding header ?
>>
>
> Nope, no change in behavior, although I only used the 8bit value, as my
> shooter only uses ascii chars.
>
>
> Cheers,
> -- Brice
>


-- 
Sergey Beryozkin

Talend Community Coders
http://coders.talend.com/

Blog: http://sberyozkin.blogspot.com

Re: Multipart values are not trimed

Posted by Brice Dutheil <br...@gmail.com>.
Hi,

To get a bigger picture let me explain what I would like to actually craft :

In a multipart POST request, I'd like to have form params and a file
attachment (like the example above). And I would like to handle the
inputstream of the file myself, in order to do things like
 - checking some headers, for example Content-Length on one of the
attachments, Content-Disposition etc.
 - consuming the content of the given inputstream of this part to store it
in a temporary file

However in the MessageBodyReader, the entityStream looks like it's been
closed and already consumed. Debugging reveals that an
AttachmentDeserializer already consumed the stream and created an
Attachment collection, but my provider wasn't called at that time. If
possible, I would like to copy these bytes to another
outputstream.

Is it possible or should I use attachments ? I'd like to avoid technical
code in the resource as much as possible, and have a reference to a
TemporaryBinaryFile.
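
For the second point in the list above, storing the content of a part in a temporary file, a minimal JDK-only sketch could look like the following. TempFileSketch and spool are hypothetical names, and how the InputStream of the part is obtained from CXF is deliberately left out.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Sketch: spools the bytes of one part into a temporary file. */
final class TempFileSketch {

    /** Copies the whole stream to a new temporary file, closes the stream, returns the path. */
    static Path spool(InputStream partStream) throws IOException {
        Path tmp = Files.createTempFile("upload-", ".part");
        try (InputStream in = partStream) {
            // createTempFile already created the file, so it has to be replaced
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            // do not leave a half-written file behind
            Files.deleteIfExists(tmp);
            throw e;
        }
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "I've got a woman, way over town..."
                .getBytes(StandardCharsets.ISO_8859_1);
        Path stored = spool(new ByteArrayInputStream(payload));
        System.out.println(stored + " holds " + Files.size(stored) + " bytes");
        Files.delete(stored);
    }
}
```

The stream is consumed exactly once here, so anything that needs the bytes afterwards would have to read them back from the returned path.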




Here's my comment on Content-Disposition :



On Mon, Nov 5, 2012 at 11:17 PM, Sergey Beryozkin <sb...@gmail.com> wrote:

> Hi
>
>
> On 05/11/12 19:27, Brice Dutheil wrote:
>
>> Hi,
>>
>> I'm crafting a resource that should accept multipart POST request.
>>
>> Here's the method :
>>
>> ================================================
>>    @POST
>>    @Produces({MediaType.APPLICATION_JSON})
>>    @Consumes(MediaType.MULTIPART_FORM_DATA)
>>    public MetaData archive(@FormParam("title") String title,
>>                            @FormParam("revision") String revision,
>>                            @Multipart("archive") TemporaryBinaryFile temporaryBinaryFile) {
>> ================================================
>>
>> Also I tried with @Multipart instead of @FormParam
>>
>> ================================================
>>    @POST
>>    @Produces({MediaType.APPLICATION_JSON})
>>    @Consumes(MediaType.MULTIPART_FORM_DATA)
>>    public DocumentMetaData archive(@Multipart(value = "title", required = false) @FormParam("title") String title,
>>                                    @Multipart(value = "revision", required = false) String revision,
>>                                    @Multipart("archive") TemporaryBinaryFile temporaryBinaryFile) {
>>
>
> You have @FormParam and @Multipart attached to 'title', drop @FormParam, I
> think it only works because 'title' is a simple parameter.
>
>

Yes, I wrongly copied/modified the code in the mail, however I tested both
setups separately.
Anyway, as you advised me I will only use Multipart now.



>
>> ================================================
>>
>> And here is the raw request :
>> ================================================
>> Address: http://localhost:8080/api/v1.0/document/archive
>> Encoding: ISO-8859-1
>> Http-Method: POST
>> Content-Type: multipart/form-data;boundary=partie
>> Headers: {Accept=[*/*], accept-charset=[ISO-8859-1,utf-8;q=0.7,*;q=0.3],
>> accept-encoding=[gzip,deflate,sdch], Content-Length=[301],
>> content-type=[multipart/form-data;boundary=partie]}
>> Payload:
>> --partie
>> Content-Disposition: form-data; name="title"
>> Content-ID: title
>>
>> the.title
>> --partie
>> Content-Disposition: form-data; name="revision"
>> Content-ID: revision
>>
>> some.revision
>> --partie
>> Content-Disposition: form-data; name="archive"; filename="file.txt"
>> Content-Type: text/plain
>>
>> I've got a woman, way over town...
>> --partie
>> ================================================
>>
>> However the title and revision values are incorrect because they are ended
>> by a new line char '\n'. Hence these parameters are not validated by my
>> validator (which is using Message.getContent),
>>
>> I don't think this is a normal behavior, but I might be wrong, maybe about
>> the specs, or my request. Note that I had to add the Content-ID when using
>> the Multipart annotation.
>>
>
> What CXF version is it ? Content-Disposition 'name' is definitely checked
> too.
>
>


Also I found the part of the code that should check the Content-Disposition,
however I found that the first letter 'C' disappeared and the key in
the attachment header is now 'ontent-Disposition', which complicates
things further and probably explains why I needed a Content-ID header in
each part. The first part, however, always had its Content-Disposition
header decoded correctly. Adding another new line after the boundary fixes
it, but that looks like a workaround, and I'd rather not impose this on the
API users :/

I couldn't figure out yet where the code is consuming the additional
char. I just know that at some point, the LazyAttachmentCollection has the
remaining attachment (AttachmentImpl), and the first header is wrong.


About Content-Disposition name, it is checked only if there is no
Content-ID, however it seems that at some point the default Content-ID
"root.message@cxf.apache.org" is added, which defeats the purpose of the
following code.

    private static boolean matchAttachmentId(Attachment at, Multipart mid,
                                             MediaType multipartType) {
        if (at.getContentId().equals(mid.value())) {
            return true;
        }
        ContentDisposition cd = at.getContentDisposition();
        if (cd != null && mid.value().equals(cd.getParameter("name"))) {
            return true;
        }
        return false;
    }
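
To make the order of the two checks easier to follow, here is a stripped-down sketch in which plain strings stand in for the CXF Attachment, Multipart and ContentDisposition types. AttachmentMatchSketch and matches are hypothetical names, not CXF code. In this simplified model the 'name' comparison still runs when a Content-ID is present but different from the expected value.

```java
/** Sketch: same order of checks as the snippet above, with strings instead of CXF types. */
final class AttachmentMatchSketch {

    /**
     * @param expectedId      value of the @Multipart annotation
     * @param contentId       Content-ID of the part, may be null in this sketch
     * @param dispositionName "name" parameter of Content-Disposition, may be null
     */
    static boolean matches(String expectedId, String contentId, String dispositionName) {
        // first check: the Content-ID of the part
        if (expectedId.equals(contentId)) {
            return true;
        }
        // fallback: the Content-Disposition "name" parameter
        return expectedId.equals(dispositionName);
    }

    public static void main(String[] args) {
        System.out.println(matches("title", "title", null));
        System.out.println(matches("title", "root.message@cxf.apache.org", "title"));
        System.out.println(matches("title", "root.message@cxf.apache.org", "revision"));
    }
}
```

Unlike the quoted method, the sketch compares from the expected value so that a null Content-ID does not throw; that is a choice made for the example only.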

I have tested this behavior on CXF 2.6.3 and 2.7.0, and I'm using JDK 7.



>
>  Maybe there is something I should do ?
>>
>>
>> I have a workaround for that, I've made an interceptor whose role is to
>> trim strings. But I find it rather inelegant to do that.
>>
>> Or am I missing something ?
>>
>>
> I don't have an immediate answer to it; we have a few tests where simple
> parts are transmitted and no new line/return characters make it into the
> representation.
>
> Can you please experiment with Content-Transfer-Encoding header ?
>

Nope, no change in behavior, although I only used the 8bit value, as my
shooter only uses ascii chars.


Cheers,
-- Brice

Re: Multipart values are not trimed

Posted by Sergey Beryozkin <sb...@gmail.com>.
Hi

On 05/11/12 19:27, Brice Dutheil wrote:
> Hi,
>
> I'm crafting a resource that should accept multipart POST request.
>
> Here's the method :
>
> ================================================
>    @POST
>    @Produces({MediaType.APPLICATION_JSON})
>    @Consumes(MediaType.MULTIPART_FORM_DATA)
>    public MetaData archive(@FormParam("title") String title,
>                                    @FormParam("revision") String revision,
>                                    @Multipart("archive") TemporaryBinaryFile
> temporaryBinaryFile) {
> ================================================
>
> Also I tried with @Multipart instead of @FormParam
>
> ================================================
>    @POST
>    @Produces({MediaType.APPLICATION_JSON})
>    @Consumes(MediaType.MULTIPART_FORM_DATA)
>    public DocumentMetaData archive(@Multipart(value = "title", required =
> false) @FormParam("title") String title,
>                                    @Multipart(value = "revision", required =
> false) String revision,
>                                    @Multipart("archive") TemporaryBinaryFile
> temporaryBinaryFile) {

You have @FormParam and @Multipart attached to 'title', drop @FormParam, 
I think it only works because 'title' is a simple parameter.

> ================================================
>
> And here is the raw request :
> ================================================
> Address: http://localhost:8080/api/v1.0/document/archive
> Encoding: ISO-8859-1
> Http-Method: POST
> Content-Type: multipart/form-data;boundary=partie
> Headers: {Accept=[*/*], accept-charset=[ISO-8859-1,utf-8;q=0.7,*;q=0.3],
> accept-encoding=[gzip,deflate,sdch], Content-Length=[301],
> content-type=[multipart/form-data;boundary=partie]}
> Payload:
> --partie
> Content-Disposition: form-data; name="title"
> Content-ID: title
>
> the.title
> --partie
> Content-Disposition: form-data; name="revision"
> Content-ID: revision
>
> some.revision
> --partie
> Content-Disposition: form-data; name="archive"; filename="file.txt"
> Content-Type: text/plain
>
> I've got a woman, way over town...
> --partie
> ================================================
>
> However the title and revision values are incorrect because they are ended
> by a new line char '\n'. Hence these parameters are not validated by my
> validator (which is using Message.getContent),
>
> I don't think this is a normal behavior, but I might be wrong, maybe about
> the specs, or my request. Note that I had to add the Content-ID when using
> the Multipart annotation.

What CXF version is it ? Content-Disposition 'name' is definitely 
checked too.

> Maybe there is something I should do ?
>
>
> I have a workaround for that, I've made an interceptor whose role is to
> trim strings. But I find it rather inelegant to do that.
>
> Or am I missing something ?
>

I don't have an immediate answer to it; we have a few tests where simple
parts are transmitted and no new line/return characters make it into the
representation.

Can you please experiment with Content-Transfer-Encoding header ?

Cheers, Sergey

> Cheers
> -- Brice
>