Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2021/03/23 14:08:00 UTC

[jira] [Comment Edited] (TIKA-3164) Upgrade to POI 5.0.0 when available

    [ https://issues.apache.org/jira/browse/TIKA-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17307099#comment-17307099 ] 

Tim Allison edited comment on TIKA-3164 at 3/23/21, 2:07 PM:
-------------------------------------------------------------

[~fanningpj], many thanks for your help on this.  I finally had a bit of time to look into this again.  I'm now getting a clean build of POI for 5.0.1-SNAPSHOT.

With the Tika integration, though, I'm still getting the following exception on several unit tests.

When I look inside the {{ooxml-lite}} jar for both 5.0.0 and 5.0.1-SNAPSHOT (even after I add Tika's {{EmbeddedDocument.docx}}), I see {{org/apache/poi/schemas/ooxml/system/oleobjelement.xsb}} but not {{..../oleobjectelement.xsb}}.
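
For anyone who wants to repeat the jar check above, here is a minimal sketch that lists the {{.xsb}} entries in a jar whose names contain a given fragment. The class and method names ({{XsbLookup}}, {{findXsb}}, {{filterXsb}}) are made up for illustration and are not part of POI or XMLBeans, and the jar path is whatever you pass on the command line; only the standard {{java.util.zip}} API is used.

```java
import java.io.IOException;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of POI or XMLBeans.
class XsbLookup {

    // Opens the jar and returns the matching .xsb entry names.
    static List<String> findXsb(String jarPath, String fragment) throws IOException {
        try (ZipFile jar = new ZipFile(jarPath)) {
            List<String> names = jar.stream()
                    .map(ZipEntry::getName)
                    .collect(Collectors.toList());
            return filterXsb(names, fragment);
        }
    }

    // Keeps entries that end in ".xsb" and contain the fragment (case-insensitive), sorted.
    static List<String> filterXsb(List<String> entryNames, String fragment) {
        String needle = fragment.toLowerCase(Locale.ROOT);
        return entryNames.stream()
                .filter(n -> n.endsWith(".xsb"))
                .filter(n -> n.toLowerCase(Locale.ROOT).contains(needle))
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: XsbLookup <path-to-jar> <name-fragment>");
            return;
        }
        // args[0]: path to the jar to inspect; args[1]: fragment such as "oleobj"
        for (String name : findXsb(args[0], args[1])) {
            System.out.println(name);
        }
    }
}
```

Searching for a short fragment such as {{oleobj}} rather than the full resource name should surface both spellings if they are present, which makes the mismatch easy to spot.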

Any idea how to fix this?

{noformat}
Caused by: org.apache.xmlbeans.SchemaTypeLoaderException: XML-BEANS compiled schema: Could not locate compiled schema resource org/apache/poi/schemas/ooxml/system/ooxml/oleobjectelement.xsb (org.apache.poi.schemas.ooxml.system.ooxml.oleobjectelement) - code 0
	at org.apache.xmlbeans.impl.schema.SchemaTypeSystemImpl$XsbReader.<init>(SchemaTypeSystemImpl.java:1315)
	at org.apache.xmlbeans.impl.schema.SchemaTypeSystemImpl.resolveHandle(SchemaTypeSystemImpl.java:3138)
	at org.apache.xmlbeans.SchemaComponent$Ref.getComponent(SchemaComponent.java:113)
	at org.apache.xmlbeans.SchemaGlobalElement$Ref.get(SchemaGlobalElement.java:76)
	at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.findElement(SchemaTypeLoaderBase.java:103)
	at org.apache.xmlbeans.impl.schema.SchemaTypeImpl.createElementType(SchemaTypeImpl.java:988)
	at org.apache.xmlbeans.impl.values.XmlObjectBase.create_element_user(XmlObjectBase.java:913)
	at org.apache.xmlbeans.impl.store.Xobj.getUser(Xobj.java:1597)
	at org.apache.xmlbeans.impl.store.Cur.getUser(Cur.java:2571)
	at org.apache.xmlbeans.impl.store.Cur.getObject(Cur.java:2565)
	at org.apache.xmlbeans.impl.store.Cursor._getObject(Cursor.java:819)
	at org.apache.xmlbeans.impl.store.Cursor.syncWrapHelper(Cursor.java:2522)
	at org.apache.xmlbeans.impl.store.Cursor.syncWrap(Cursor.java:2453)
	at org.apache.xmlbeans.impl.store.Cursor.getObject(Cursor.java:2080)
	at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.extractParagraph(XWPFWordExtractorDecorator.java:236)
	at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.extractIBodyText(XWPFWordExtractorDecorator.java:161)
	at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.buildXHTML(XWPFWordExtractorDecorator.java:124)
	at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:136)
	at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:214)
	at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:113)

{noformat}


was (Author: tallison@mitre.org):
[~fanningpj], many thanks for your help on this.  I'm now getting a clean build on 5.0.1-SNAPSHOT.

With the Tika integration, though, I'm still getting the following exception on several unit tests.

When I look inside the {{ooxml-lite}} jar for both 5.0.0 and 5.0.1-SNAPSHOT (even after I add Tika's {{EmbeddedDocument.docx}}), I see {{org/apache/poi/schemas/ooxml/system/oleobjelement.xsb}} but not {{..../oleobjectelement.xsb}}.

Any idea how to fix this?

{noformat}
Caused by: org.apache.xmlbeans.SchemaTypeLoaderException: XML-BEANS compiled schema: Could not locate compiled schema resource org/apache/poi/schemas/ooxml/system/ooxml/oleobjectelement.xsb (org.apache.poi.schemas.ooxml.system.ooxml.oleobjectelement) - code 0
	at org.apache.xmlbeans.impl.schema.SchemaTypeSystemImpl$XsbReader.<init>(SchemaTypeSystemImpl.java:1315)
	at org.apache.xmlbeans.impl.schema.SchemaTypeSystemImpl.resolveHandle(SchemaTypeSystemImpl.java:3138)
	at org.apache.xmlbeans.SchemaComponent$Ref.getComponent(SchemaComponent.java:113)
	at org.apache.xmlbeans.SchemaGlobalElement$Ref.get(SchemaGlobalElement.java:76)
	at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.findElement(SchemaTypeLoaderBase.java:103)
	at org.apache.xmlbeans.impl.schema.SchemaTypeImpl.createElementType(SchemaTypeImpl.java:988)
	at org.apache.xmlbeans.impl.values.XmlObjectBase.create_element_user(XmlObjectBase.java:913)
	at org.apache.xmlbeans.impl.store.Xobj.getUser(Xobj.java:1597)
	at org.apache.xmlbeans.impl.store.Cur.getUser(Cur.java:2571)
	at org.apache.xmlbeans.impl.store.Cur.getObject(Cur.java:2565)
	at org.apache.xmlbeans.impl.store.Cursor._getObject(Cursor.java:819)
	at org.apache.xmlbeans.impl.store.Cursor.syncWrapHelper(Cursor.java:2522)
	at org.apache.xmlbeans.impl.store.Cursor.syncWrap(Cursor.java:2453)
	at org.apache.xmlbeans.impl.store.Cursor.getObject(Cursor.java:2080)
	at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.extractParagraph(XWPFWordExtractorDecorator.java:236)
	at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.extractIBodyText(XWPFWordExtractorDecorator.java:161)
	at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.buildXHTML(XWPFWordExtractorDecorator.java:124)
	at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:136)
	at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:214)
	at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:113)

{noformat}

> Upgrade to POI 5.0.0 when available
> -----------------------------------
>
>                 Key: TIKA-3164
>                 URL: https://issues.apache.org/jira/browse/TIKA-3164
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Tim Allison
>            Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)