Posted to dev@tika.apache.org by "Nick Burch (JIRA)" <ji...@apache.org> on 2014/09/01 18:21:21 UTC
[jira] [Commented] (TIKA-1407) Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@5d11346a
[ https://issues.apache.org/jira/browse/TIKA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117534#comment-14117534 ]
Nick Burch commented on TIKA-1407:
----------------------------------
Firstly, can you please post the problematic file? I get a 403 when trying to download it.
Secondly, can you try grabbing a recent nightly build of the Tika App jar and testing with that? (There have been POI updates on trunk since 1.5 was released.)
> Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@5d11346a
> ---------------------------------------------------------------------------------------
>
> Key: TIKA-1407
> URL: https://issues.apache.org/jira/browse/TIKA-1407
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.5
> Environment: Kubuntu 14.04
> Reporter: Matthieu Neamar
>
> I'm trying to parse a document created with PowerPoint for Mac.
> This crashes Tika. Interestingly, however, I can open it with LibreOffice. If I save it from there in the same format, it loses a few kilobytes and then parses fine.
> The failing file is at http://amoki.fr/anyFetch_pitch_deck_Allianz_EN_withoutslide9.ppt
> I get the following error using Tika 1.5:
> {quote}
> Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@5d11346a
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
> at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:142)
> at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:418)
> at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:112)
> Caused by: java.lang.RuntimeException: Couldn't instantiate the class for type with id 5000 on class class org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 5002 on class class org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 5003 on class class org.apache.poi.hslf.record.BinaryTagDataBlob : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 4012 on class class org.apache.poi.hslf.record.StyleTextProp9Atom : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.ArrayIndexOutOfBoundsException: 20
> at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:185)
> at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
> at org.apache.poi.hslf.model.SimpleShape.getClientRecords(SimpleShape.java:347)
> at org.apache.poi.hslf.model.SimpleShape.getClientDataRecord(SimpleShape.java:319)
> at org.apache.poi.hslf.model.TextShape.getPlaceholderAtom(TextShape.java:596)
> at org.apache.poi.hslf.model.Sheet.getPlaceholder(Sheet.java:443)
> at org.apache.poi.hslf.model.HeadersFooters.isVisible(HeadersFooters.java:244)
> at org.apache.poi.hslf.model.HeadersFooters.isHeaderVisible(HeadersFooters.java:148)
> at org.apache.tika.parser.microsoft.HSLFExtractor.parse(HSLFExtractor.java:62)
> at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:202)
> at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:167)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> ... 5 more
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
> ... 16 more
> Caused by: java.lang.RuntimeException: Couldn't instantiate the class for type with id 5002 on class class org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 5003 on class class org.apache.poi.hslf.record.BinaryTagDataBlob : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 4012 on class class org.apache.poi.hslf.record.StyleTextProp9Atom : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.ArrayIndexOutOfBoundsException: 20
> at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:185)
> at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
> at org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren.<init>(DummyPositionSensitiveRecordWithChildren.java:52)
> ... 21 more
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
> ... 23 more
> Caused by: java.lang.RuntimeException: Couldn't instantiate the class for type with id 5003 on class class org.apache.poi.hslf.record.BinaryTagDataBlob : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 4012 on class class org.apache.poi.hslf.record.StyleTextProp9Atom : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.ArrayIndexOutOfBoundsException: 20
> at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:185)
> at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
> at org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren.<init>(DummyPositionSensitiveRecordWithChildren.java:52)
> ... 28 more
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
> ... 30 more
> Caused by: java.lang.RuntimeException: Couldn't instantiate the class for type with id 4012 on class class org.apache.poi.hslf.record.StyleTextProp9Atom : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.ArrayIndexOutOfBoundsException: 20
> at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:185)
> at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
> at org.apache.poi.hslf.record.BinaryTagDataBlob.<init>(BinaryTagDataBlob.java:52)
> ... 35 more
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
> ... 37 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 20
> at org.apache.poi.util.LittleEndian.getInt(LittleEndian.java:161)
> at org.apache.poi.hslf.record.StyleTextProp9Atom.<init>(StyleTextProp9Atom.java:70)
> ... 42 more
> {quote}
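For readers following the trace: the innermost cause is an ArrayIndexOutOfBoundsException raised while a little-endian int is being read from a record's byte array, which suggests the record held fewer bytes than the parser expected. The sketch below only illustrates that general failure mode and one possible guard. It is not the POI source; the class LittleEndianSketch and the helper canReadInt are hypothetical names made up for this example, and the 20-byte array is an assumption chosen to mirror the index in the trace.

```java
// Hypothetical illustration only; this is NOT the Apache POI implementation.
final class LittleEndianSketch {

    // Reads a 32-bit little-endian int starting at offset, with no bounds check.
    // Throws ArrayIndexOutOfBoundsException if fewer than 4 bytes remain.
    static int getInt(byte[] data, int offset) {
        return (data[offset] & 0xFF)
             | (data[offset + 1] & 0xFF) << 8
             | (data[offset + 2] & 0xFF) << 16
             | (data[offset + 3] & 0xFF) << 24;
    }

    // Hypothetical guard: true only if 4 bytes are available at offset.
    static boolean canReadInt(byte[] data, int offset) {
        return offset >= 0 && offset <= data.length - 4;
    }

    public static void main(String[] args) {
        byte[] record = new byte[20]; // assumed record size, for illustration
        record[16] = 0x2A;            // last complete int in the array is 42

        System.out.println(getInt(record, 16));     // prints 42
        System.out.println(canReadInt(record, 20)); // prints false

        try {
            getInt(record, 20); // no bytes left at this offset
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("short record: " + e.getMessage());
        }
    }
}
```

A parser that checks something like canReadInt before each read can skip or report a truncated record instead of aborting the whole document. Whether the POI updates on trunk handle this particular record that way is not something the trace alone can tell us.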