Posted to user@poi.apache.org by Gregg Morris <gr...@gmail.com> on 2012/01/18 00:50:47 UTC

POI XWPF compliance error

Hello--

We have a process that takes a Word document (docx) and merges content into
it, at locations in the document indicated by the document's creator. At
the moment there is a small set of authors using only Word 2008 (for Mac),
but this is likely to soon become a larger group of authors using Word for
Windows (2007+) as well.

We've discovered that Word can produce documents that cause
NullPointerExceptions when loaded into POI. Two cases appear to be bugs
(not handling objects that could be null), but one case clearly is not:
"OPC Compliance error [M4.1]: there is more than one core properties
relationship in the package !". Word can happily read and write files that
cause this error in POI.
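
To find out which files are affected before handing them to POI, a small pre-check along these lines may help. It is only a sketch using the JDK (Java 9+ for readAllBytes), not POI code. It assumes the duplicate relationships sit in the package-level relationships part "_rels/.rels" and use the standard OPC core-properties relationship type; the class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of POI.
final class CorePropertiesCheck {
    // Namespace of the elements inside an OPC relationships part.
    static final String REL_NS =
        "http://schemas.openxmlformats.org/package/2006/relationships";
    // Standard OPC relationship type for the core properties part.
    // Assumption: the offending files use this type, not a vendor variant.
    static final String CORE_PROPS_TYPE =
        "http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties";

    /** Counts core properties relationships declared in _rels/.rels (0 if that part is missing). */
    static int countCorePropertiesRelationships(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"_rels/.rels".equals(entry.getName())) {
                    continue;
                }
                // Read the whole entry first so the XML parser cannot close the zip stream.
                byte[] xml = zip.readAllBytes();
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
                NodeList rels = doc.getElementsByTagNameNS(REL_NS, "Relationship");
                int count = 0;
                for (int i = 0; i < rels.getLength(); i++) {
                    Element rel = (Element) rels.item(i);
                    if (CORE_PROPS_TYPE.equals(rel.getAttribute("Type"))) {
                        count++;
                    }
                }
                return count;
            }
        }
        return 0;
    }
}
```

Calling countCorePropertiesRelationships(new FileInputStream(file)) on each incoming document and flagging any result above 1 would at least show which authors and Word versions produce the problem files.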

My concerns are these:
1) How do I tell my authors to avoid this? In other words, what are they
doing in Word to create multiple core properties?
2) Knowing that Word can cheerfully read and write documents that POI
considers invalid, but needing to support my users, should I modify the POI
code to relax this particular error checking? In this case I could just use
the first core properties document part that is encountered and ignore any
others. I guess this is a letter-of-the-law versus intent-of-the-law
situation. It's not a perfect world out there; how best to handle it?

Thanks for any suggestions!
Gregg

Re: POI XWPF compliance error

Posted by Yegor Kozlov <ye...@dinom.ru>.
It might be a bug in Word 2008, which was the first version of MS Office
for Mac that supported the .docx format. Can the problem be
reproduced with the latest Word 2011 for Mac?
What happens if you open a document that POI cannot read in Word 2010
for Windows and re-save it? Can POI read it then?

Which version of POI are you using? Try with the latest POI
3.8-beta5. A number of bugs related to reading OPC packages have
recently been fixed.
If the problem is still there, please create a new Bugzilla issue and
attach the problematic file.

Cheers,
Yegor


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: POI XWPF compliance error

Posted by Dave Fisher <da...@comcast.net>.
On Jan 17, 2012, at 3:50 PM, Gregg Morris wrote:

> Hello--
> 
> We have a process that takes a Word document (docx) and merges content into
> it, at locations in the document indicated by the document's creator. At
> the moment there is a small set of authors using only Word 2008 (for Mac),
> but this is likely to soon become a larger group of authors using Word for
> Windows (2007+) as well.

Is your Word 2008 fully updated? There were issues with the XML that MSFT produced for the initial PowerPoint 2008 that MSFT had to correct with an update.

> We've discovered that Word can produce documents that cause
> NullPointerExceptions when loaded into POI. Two cases appear to be bugs
> (not handling objects that could be null), but one case clearly is not:
> "OPC Compliance error [M4.1]: there is more than one core properties
> relationship in the package !". Word can happily read and write files that
> cause this error in POI.
> 
> My concerns are these:
> 1) How do I tell my authors to avoid this? In other words, what are they
> doing in Word to create multiple core properties?
> 2) Knowing that Word can cheerfully read and write documents that POI
> considers invalid, but needing to support my users, should I modify the POI
> code to relax this particular error checking? In this case I could just use
> the first core properties document part that is encountered and ignore any
> others. I guess this is a letter-of-the-law versus intent-of-the-law
> situation. It's not a perfect world out there; how best to handle it?

AFAIK our philosophy is "intent-of-the-law" and not "letter-of-the-law", but we need edge cases like yours.

You are free to modify POI and make it work for your case. Patches are always welcome!
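
If patching POI itself is not practical, a different route is to preprocess the file so that only the first core properties relationship remains, which is roughly the "use the first one and ignore the others" idea applied before POI sees the document. The following is an untested-against-Word sketch using only the JDK (Java 9+), not POI's API. It assumes the duplicates are declared in "_rels/.rels" with the standard OPC relationship type, and it leaves the now-unreferenced extra part in the package. All class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of POI.
final class CorePropertiesRepair {
    static final String REL_NS =
        "http://schemas.openxmlformats.org/package/2006/relationships";
    // Assumption: the duplicates use the standard OPC core-properties type.
    static final String CORE_PROPS_TYPE =
        "http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties";

    /** Copies a package entry by entry, rewriting only _rels/.rels. */
    static void copyKeepingFirstCoreProperties(InputStream in, OutputStream out) throws Exception {
        try (ZipInputStream zin = new ZipInputStream(in);
             ZipOutputStream zout = new ZipOutputStream(out)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                byte[] data = zin.readAllBytes(); // reads to the end of the current entry
                if ("_rels/.rels".equals(entry.getName())) {
                    data = dropExtraCoreProperties(data);
                }
                zout.putNextEntry(new ZipEntry(entry.getName()));
                zout.write(data);
                zout.closeEntry();
            }
        }
    }

    /** Removes every core properties relationship after the first one in document order. */
    static byte[] dropExtraCoreProperties(byte[] relsXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(relsXml));
        NodeList rels = doc.getElementsByTagNameNS(REL_NS, "Relationship");

        // The NodeList is live, so collect the extras first and remove them afterwards.
        List<Element> extras = new ArrayList<>();
        boolean seenFirst = false;
        for (int i = 0; i < rels.getLength(); i++) {
            Element rel = (Element) rels.item(i);
            if (CORE_PROPS_TYPE.equals(rel.getAttribute("Type"))) {
                if (seenFirst) {
                    extras.add(rel);
                } else {
                    seenFirst = true;
                }
            }
        }
        for (Element extra : extras) {
            extra.getParentNode().removeChild(extra);
        }

        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        TransformerFactory.newInstance().newTransformer()
            .transform(new DOMSource(doc), new StreamResult(buf));
        return buf.toByteArray();
    }
}
```

The cleaned stream can then be passed to POI in place of the original file. Whether Word and POI both accept the result would need checking against real documents, and a fix inside POI remains the better long-term answer.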

I also think that this should be considered a bug in POI, and if possible you should create a Bugzilla entry that includes a document showing the issue. The attached document will be publicly available, so it should contain no sensitive personal information. Perhaps the developers will solve the trouble once the details are available.

Regards,
Dave

> 
> Thanks for any suggestions!
> Gregg
