Posted to user@poi.apache.org by Andreas Beeker <ki...@apache.org> on 2020/12/21 19:20:15 UTC

Missing classes in poi-ooxml-lite - was: Plea - test the POI 5.0.0 snapshot

Hi Tim,

The cause of this is an optimization of the .xsbs included in the lite jar, made in revisions 1884139 and 1884142 [1].
The xsbs are selected based on the classes actually used, but a few xsbs are loaded without a factory class.
Once I have finished my junit5-migration-purgatory, I will have a look at whether I can tweak the lite agent by weaving some logging into the call to SchemaTypeLoaderImpl::typeSystemForComponent via ByteBuddy [2].

Until then we can simply add oleobjectelement.xsb to the build.xml and wait for the build to generate a new poi-ooxml-lite.jar.
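
To check whether a given poi-ooxml-lite.jar already contains the resource, something like the following might help. This is only a minimal sketch: the class name XsbCheck and the method isOnClasspath are made up for illustration, and the resource path is copied from the exception message in the stack trace below. Run it with the jar under test on the classpath.

```java
import java.net.URL;

public class XsbCheck {
    // Resource path copied from the SchemaTypeLoaderException message.
    static final String OLE_XSB =
        "org/apache/poi/schemas/ooxml/system/ooxml/oleobjectelement.xsb";

    // Returns true if the class loader can locate the given resource.
    static boolean isOnClasspath(ClassLoader loader, String resourcePath) {
        URL url = loader.getResource(resourcePath);
        return url != null;
    }

    public static void main(String[] args) {
        ClassLoader loader = XsbCheck.class.getClassLoader();
        System.out.println(OLE_XSB + " present: " + isOnClasspath(loader, OLE_XSB));
    }
}
```

If this reports the resource as missing, XmlBeans cannot resolve the schema either, which matches the "Could not locate compiled schema resource" error.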

Andi.


[1] https://svn.apache.org/viewvc?view=revision&revision=1884139
[2] https://www.infoq.com/articles/Easily-Create-Java-Agents-with-ByteBuddy/

On 21.12.20 17:26, Tim Allison wrote:
> Andi,
>    Thank you for all of your work on this!  This is probably user error, but
> I'm getting a failed test when I integrate poi trunk with Tika.  Is this
> something I can fix at the Tika level?
>
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@785a4557
>
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:293)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
> at org.apache.tika.extractor.ParserContainerExtractor.extract(ParserContainerExtractor.java:82)
> at org.apache.tika.parser.microsoft.AbstractPOIContainerExtractionTest.process(AbstractPOIContainerExtractionTest.java:68)
> at org.apache.tika.parser.microsoft.POIContainerExtractionTest.testEmbeddedOfficeFilesXML(POIContainerExtractionTest.java:335)
> ...
> Caused by: org.apache.xmlbeans.SchemaTypeLoaderException: XML-BEANS compiled schema: Could not locate compiled schema resource org/apache/poi/schemas/ooxml/system/ooxml/oleobjectelement.xsb (org.apache.poi.schemas.ooxml.system.ooxml.oleobjectelement) - code 0
> at org.apache.xmlbeans.impl.schema.SchemaTypeSystemImpl$XsbReader.<init>(SchemaTypeSystemImpl.java:1315)
> at org.apache.xmlbeans.impl.schema.SchemaTypeSystemImpl.resolveHandle(SchemaTypeSystemImpl.java:3138)
> at org.apache.xmlbeans.SchemaComponent$Ref.getComponent(SchemaComponent.java:113)
> at org.apache.xmlbeans.SchemaGlobalElement$Ref.get(SchemaGlobalElement.java:76)
> at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.findElement(SchemaTypeLoaderBase.java:103)
> at org.apache.xmlbeans.impl.schema.SchemaTypeImpl.createElementType(SchemaTypeImpl.java:988)
> at org.apache.xmlbeans.impl.values.XmlObjectBase.create_element_user(XmlObjectBase.java:913)
> at org.apache.xmlbeans.impl.store.Xobj.getUser(Xobj.java:1597)
> at org.apache.xmlbeans.impl.store.Cur.getUser(Cur.java:2571)
> at org.apache.xmlbeans.impl.store.Cur.getObject(Cur.java:2565)
> at org.apache.xmlbeans.impl.store.Cursor._getObject(Cursor.java:819)
> at org.apache.xmlbeans.impl.store.Cursor.syncWrapHelper(Cursor.java:2522)
> at org.apache.xmlbeans.impl.store.Cursor.syncWrap(Cursor.java:2453)
> at org.apache.xmlbeans.impl.store.Cursor.getObject(Cursor.java:2080)
> at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.extractParagraph(XWPFWordExtractorDecorator.java:236)
> at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.extractIBodyText(XWPFWordExtractorDecorator.java:161)
> at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.buildXHTML(XWPFWordExtractorDecorator.java:124)
> at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:136)
> at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:213)
> at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:113)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>
> On Sat, Dec 19, 2020 at 8:47 AM Tim Allison <ta...@apache.org> wrote:
>
>> If anyone else on this list has time and an interest, POI 5.0.0 is on the
>> way! Please help test!
>>
>> ---------- Forwarded message ---------
>> From: Tim Allison <ta...@apache.org>
>> Date: Sat, Dec 19, 2020 at 8:45 AM
>> Subject: Re: Plea - test the POI 5.0.0 snapshot
>> To: POI Users List <us...@poi.apache.org>
>>
>>
>> Will integrate w Tika on Monday and test it out. Thank you!!!
>>
>> On Sat, Dec 19, 2020 at 7:52 AM Andreas Beeker <ki...@apache.org>
>> wrote:
>>
>>> Dear POI users,
>>>
>>> We are close to releasing POI 5.0.0 and there have been some
>>> breaking changes [1],
>>> notably the JPMS/Jigsaw migration and the upgrade of the ECMA-376 schemas
>>> to the 5th edition.
>>>
>>> Please download the snapshot [2] and give it a try. Especially with the
>>> new schemas, I'm interested in whether documents created by POI can still
>>> be opened without errors in various office applications.
>>>
>>> Thank you for your support.
>>>
>>> Andi
>>>
>>>
>>> [1] http://poi.apache.org/changes.html
>>>
>>> [2]
>>> https://ci-builds.apache.org/job/POI/job/POI-DSL-1.8/lastSuccessfulBuild/artifact/build/dist/
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>>
>>>

