Posted to dev@poi.apache.org by bu...@apache.org on 2013/05/29 16:20:49 UTC

[Bug 55026] New: Parse the parameter part in the ContentType definition

https://issues.apache.org/bugzilla/show_bug.cgi?id=55026

            Bug ID: 55026
           Summary: Parse the parameter part in the ContentType definition
           Product: POI
           Version: 3.9
          Hardware: All
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: POI Overall
          Assignee: dev@poi.apache.org
          Reporter: sebastien.schneider@ifpen.fr

Hi POI team,

My enhancement is related to ContentType support in the openxml4j part of the
POI library.
In the current 3.9 version, parsing an OPC document whose ContentType contains
parameters throws a "malformed content type" exception.
Such a ContentType could be of the form "application/xml;key1=value1;key2=value2".

There's already code to support this format in the ContentType class, but it's
commented out!

Is it possible to activate this ContentType format in a future version?

Thank you,
Sebastien.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 55026] Parse the parameter part in the ContentType definition

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55026

Sebastien Schneider <se...@ifpen.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sebastien.schneider@ifpen.fr



[Bug 55026] Parse the parameter part in the ContentType definition

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55026

--- Comment #2 from Sebastien Schneider <se...@ifpen.fr> ---
Created attachment 30341
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=30341&action=edit
OPC file with Content_Types.xml containing parameters

I have attached a very simple OPC file whose "Content_Types.xml" contains
parameters:
ContentType="application/x-resqml+xml;version=2.0;type=obj_global2dCrs"

The only code needed to reproduce the problem is a call to the OPCPackage.open
method, like this:
OPCPackage p = OPCPackage.open("opc_contenttype_test_wparams.opc",
PackageAccess.READ);

This call throws the following exception:
org.apache.poi.openxml4j.exceptions.InvalidFormatException: The specified
content type 'application/x-resqml+xml;version=2.0;type=obj_global1dCrs' is not
compliant with RFC 2616: malformed content type.

I think this is because the code in
/[Apache-SVN]/poi/trunk/src/ooxml/java/org/apache/poi/openxml4j/opc/internal/ContentType.java
doesn't support this ContentType string format.

Thank you,
cheers,
Sebastien.



[Bug 55026] Parse the parameter part in the ContentType definition

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55026

--- Comment #6 from Sebastien Schneider <se...@ifpen.fr> ---
Created attachment 30782
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=30782&action=edit
Patch for both files ContentType.java and TestContentType.java

Here is a proposed fix for this bug. I extended the unit test with hard-coded
parameterized content types, but I did not implement the file-based unit test.

I had an issue with the Java pattern matcher, which does not handle multiple
groups when matching automatically, so I had to add a second matcher
specialized to process the parameters.
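
As an illustration of the matcher behaviour described above (a sketch only,
not the attached patch): in java.util.regex, a capturing group inside a
repeated section keeps only its last match, so a single matches() call cannot
return every parameter. A second pattern driven by find() can. The class name
and both patterns below are hypothetical and much simpler than anything in
ContentType.java.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedGroupSketch {
    // Hypothetical, simplified patterns; not the ones used in POI's ContentType.java.
    static final Pattern WHOLE =
        Pattern.compile("^([^/;]+)/([^/;]+)(?:;([^=;]+)=([^;]+))*$");
    static final Pattern PARAM = Pattern.compile(";([^=;]+)=([^;]+)");

    // Single matches() call: the repeated groups 3 and 4 retain only the
    // last ";key=value" iteration (or null if there were no parameters).
    static String lastParamOnly(String contentType) {
        Matcher m = WHOLE.matcher(contentType);
        if (!m.matches()) {
            throw new IllegalArgumentException("malformed content type: " + contentType);
        }
        return m.group(3) + "=" + m.group(4);
    }

    // Second matcher: find() visits every ";key=value" occurrence in order.
    static Map<String, String> allParams(String contentType) {
        Map<String, String> params = new LinkedHashMap<>();
        Matcher m = PARAM.matcher(contentType);
        while (m.find()) {
            params.put(m.group(1), m.group(2));
        }
        return params;
    }

    public static void main(String[] args) {
        String ct = "application/x-resqml+xml;version=2.0;type=obj_global2dCrs";
        System.out.println(lastParamOnly(ct)); // only the last parameter survives
        System.out.println(allParams(ct));     // both parameters are collected
    }
}
```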

It works well on my files.

Thank you in advance for integrating it, and feel free to modify it as you see fit.



[Bug 55026] Parse the parameter part in the ContentType definition

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55026

Sebastien Schneider <se...@ifpen.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P2                          |P1



[Bug 55026] Parse the parameter part in the ContentType definition

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55026

Nick Burch <ap...@gagravarr.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #1 from Nick Burch <ap...@gagravarr.org> ---
Do you have a sample file that has parameters in it? And if so, could you
please upload it, ideally along with a short unit test that shows you trying to
load + read them?



[Bug 55026] Parse the parameter part in the ContentType definition

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55026

Sebastien Schneider <se...@ifpen.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW



[Bug 55026] Parse the parameter part in the ContentType definition

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55026

--- Comment #3 from Nick Burch <ap...@gagravarr.org> ---
In r1487657 I have added your unit test, and stubbed out the unit tests we'll
need

The next step is to review the ooxml spec, then write the unit tests for valid
parameters based on the stubbed out bits

Finally, we can then try to enable the parameter logic

If you have some time to help on part #2, that'd be great!



[Bug 55026] Parse the parameter part in the ContentType definition

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55026

Philippe Verney <ph...@philippeverney.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe@philippeverney.com



[Bug 55026] Parse the parameter part in the ContentType definition

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55026

--- Comment #4 from Sebastien Schneider <se...@ifpen.fr> ---
Unfortunately, I won't have time to work on that now. I hope to be able to help
a little bit next month.


(In reply to Nick Burch from comment #3)
> In r1487657 I have added your unit test, and stubbed out the unit tests
> we'll need
> 
> The next step is to review the ooxml spec, then write the unit tests for
> valid parameters based on the stubbed out bits
> 
> Finally, we can then try to enable the parameter logic
> 
> If you have some time to have on part #2, that'd be great!



[Bug 55026] Parse the parameter part in the ContentType definition

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55026

--- Comment #5 from Sebastien Schneider <se...@ifpen.fr> ---
I just reviewed the OOXML spec in the document ISO_IEC_29500-2_2012.pdf. The
ContentType format is specified in section 9.1.2 by reference to RFC 2616,
section 3.7. The format of the media-type defined by ContentType is as follows:
media-type = type "/" subtype *( ";" parameter )
where parameter is expressed as
attribute "=" value
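
To make the grammar concrete, here is a minimal sketch of a parser for it.
It is an illustration under simplifying assumptions, not the code in
ContentType.java: the class name MediaTypeSketch is hypothetical, quoted-string
values are not handled, and the RFC 2616 token character rules are not fully
checked.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MediaTypeSketch {
    final String type;
    final String subtype;
    final Map<String, String> parameters = new LinkedHashMap<>();

    // Parses: type "/" subtype *( ";" attribute "=" value )
    // Simplified: no quoted-string values, no full token character validation.
    MediaTypeSketch(String text) {
        // limit -1 keeps trailing empty pieces, so "a/b;" is rejected below
        String[] parts = text.split(";", -1);
        int slash = parts[0].indexOf('/');
        if (slash <= 0 || slash == parts[0].length() - 1
                || parts[0].indexOf('/', slash + 1) >= 0) {
            throw new IllegalArgumentException("malformed content type: " + text);
        }
        type = parts[0].substring(0, slash);
        subtype = parts[0].substring(slash + 1);
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq <= 0 || eq == parts[i].length() - 1) {
                throw new IllegalArgumentException("malformed parameter: " + parts[i]);
            }
            parameters.put(parts[i].substring(0, eq), parts[i].substring(eq + 1));
        }
    }

    public static void main(String[] args) {
        MediaTypeSketch m = new MediaTypeSketch(
            "application/x-resqml+xml;version=2.0;type=obj_global2dCrs");
        System.out.println(m.type + " / " + m.subtype + " " + m.parameters);
    }
}
```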

The remaining work is to complete the unit tests and enable the corresponding
code in the ContentType.java parsing implementation.

Sebastien.


(In reply to Nick Burch from comment #3)
> In r1487657 I have added your unit test, and stubbed out the unit tests
> we'll need
> 
> The next step is to review the ooxml spec, then write the unit tests for
> valid parameters based on the stubbed out bits
> 
> Finally, we can then try to enable the parameter logic
> 
> If you have some time to have on part #2, that'd be great!



[Bug 55026] Parse the parameter part in the ContentType definition

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55026

Nick Burch <ap...@gagravarr.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Nick Burch <ap...@gagravarr.org> ---
Thanks for this patch, and sorry it got forgotten

I've done some work on this myself, and then incorporated much of your logic
and tests too. As of r1569976 we're now able to process these content types
without error, and we have a lot more testing around it all.

Thanks for your help!
