Posted to dev@tika.apache.org by "Nick Burch (JIRA)" <ji...@apache.org> on 2014/09/01 18:21:21 UTC

[jira] [Commented] (TIKA-1407) Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@5d11346a

    [ https://issues.apache.org/jira/browse/TIKA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117534#comment-14117534 ] 

Nick Burch commented on TIKA-1407:
----------------------------------

Firstly, can you please post the problematic file? I get a 403 when trying to download it.

Secondly, can you try grabbing a recent nightly build of the Tika App jar, and trying with that? (There have been POI updates on trunk since 1.5 was released.)
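
When comparing results between 1.5 and a nightly build, it can help to look only at the innermost cause, since the trace quoted below wraps the same failure several times. A minimal sketch of walking the cause chain follows; the class name RootCause and the nested exceptions built in main are illustrative stand-ins, not Tika or POI code.

```java
public class RootCause {
    /** Follows getCause() links until the innermost throwable is reached. */
    public static Throwable find(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the nested wrappers seen in the trace.
        Throwable nested = new Exception("outer wrapper",
                new RuntimeException("Couldn't instantiate the class",
                        new ArrayIndexOutOfBoundsException("20")));
        Throwable root = find(nested);
        // Prints the class name and message of the innermost cause only.
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
    }
}
```

In real use, the throwable passed to find would be the TikaException caught around the parse call.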

> Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@5d11346a
> ---------------------------------------------------------------------------------------
>
>                 Key: TIKA-1407
>                 URL: https://issues.apache.org/jira/browse/TIKA-1407
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.5
>         Environment: Kubuntu 14.04
>            Reporter: Matthieu Neamar
>
> I'm trying to parse a document created with PowerPoint for Mac.
> This crashes Tika. Interestingly, however, I can open it with LibreOffice. If I save it using the same format, it loses a few kilobytes and then works.
> The failing file is at http://amoki.fr/anyFetch_pitch_deck_Allianz_EN_withoutslide9.ppt
> I get the following error using Tika 1.5:
> {quote}
> Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@5d11346a
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> 	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
> 	at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:142)
> 	at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:418)
> 	at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:112)
> Caused by: java.lang.RuntimeException: Couldn't instantiate the class for type with id 5000 on class class org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 5002 on class class org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 5003 on class class org.apache.poi.hslf.record.BinaryTagDataBlob : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 4012 on class class org.apache.poi.hslf.record.StyleTextProp9Atom : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.ArrayIndexOutOfBoundsException: 20
> 	at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:185)
> 	at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
> 	at org.apache.poi.hslf.model.SimpleShape.getClientRecords(SimpleShape.java:347)
> 	at org.apache.poi.hslf.model.SimpleShape.getClientDataRecord(SimpleShape.java:319)
> 	at org.apache.poi.hslf.model.TextShape.getPlaceholderAtom(TextShape.java:596)
> 	at org.apache.poi.hslf.model.Sheet.getPlaceholder(Sheet.java:443)
> 	at org.apache.poi.hslf.model.HeadersFooters.isVisible(HeadersFooters.java:244)
> 	at org.apache.poi.hslf.model.HeadersFooters.isHeaderVisible(HeadersFooters.java:148)
> 	at org.apache.tika.parser.microsoft.HSLFExtractor.parse(HSLFExtractor.java:62)
> 	at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:202)
> 	at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:167)
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> 	... 5 more
> Caused by: java.lang.reflect.InvocationTargetException
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> 	at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
> 	... 16 more
> Caused by: java.lang.RuntimeException: Couldn't instantiate the class for type with id 5002 on class class org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 5003 on class class org.apache.poi.hslf.record.BinaryTagDataBlob : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 4012 on class class org.apache.poi.hslf.record.StyleTextProp9Atom : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.ArrayIndexOutOfBoundsException: 20
> 	at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:185)
> 	at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
> 	at org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren.<init>(DummyPositionSensitiveRecordWithChildren.java:52)
> 	... 21 more
> Caused by: java.lang.reflect.InvocationTargetException
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> 	at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
> 	... 23 more
> Caused by: java.lang.RuntimeException: Couldn't instantiate the class for type with id 5003 on class class org.apache.poi.hslf.record.BinaryTagDataBlob : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 4012 on class class org.apache.poi.hslf.record.StyleTextProp9Atom : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.ArrayIndexOutOfBoundsException: 20
> 	at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:185)
> 	at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
> 	at org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren.<init>(DummyPositionSensitiveRecordWithChildren.java:52)
> 	... 28 more
> Caused by: java.lang.reflect.InvocationTargetException
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> 	at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
> 	... 30 more
> Caused by: java.lang.RuntimeException: Couldn't instantiate the class for type with id 4012 on class class org.apache.poi.hslf.record.StyleTextProp9Atom : java.lang.reflect.InvocationTargetException
> Cause was : java.lang.ArrayIndexOutOfBoundsException: 20
> 	at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:185)
> 	at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
> 	at org.apache.poi.hslf.record.BinaryTagDataBlob.<init>(BinaryTagDataBlob.java:52)
> 	... 35 more
> Caused by: java.lang.reflect.InvocationTargetException
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> 	at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
> 	... 37 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 20
> 	at org.apache.poi.util.LittleEndian.getInt(LittleEndian.java:161)
> 	at org.apache.poi.hslf.record.StyleTextProp9Atom.<init>(StyleTextProp9Atom.java:70)
> 	... 42 more
> {quote}
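
The innermost cause in the trace above is an ArrayIndexOutOfBoundsException raised in LittleEndian.getInt, called from the StyleTextProp9Atom constructor. That would be consistent with a record whose data is shorter than the parser expects, though this is only a guess without the file. The sketch below shows that kind of failure and a length check that reports it more clearly. It is an illustration only, not POI's actual implementation; the class and method names are hypothetical.

```java
public class LittleEndianSketch {
    /**
     * Reads a 32-bit little-endian int starting at offset, and fails with a
     * descriptive message if fewer than 4 bytes are available.
     */
    public static int getIntChecked(byte[] data, int offset) {
        if (offset < 0 || data.length - offset < 4) {
            throw new IllegalArgumentException(
                    "Need 4 bytes at offset " + offset
                    + " but the data has only " + data.length + " bytes");
        }
        // Least significant byte comes first in little-endian order.
        return (data[offset] & 0xFF)
                | (data[offset + 1] & 0xFF) << 8
                | (data[offset + 2] & 0xFF) << 16
                | (data[offset + 3] & 0xFF) << 24;
    }

    public static void main(String[] args) {
        byte[] truncated = new byte[20]; // assumed size, chosen to match index 20 in the trace
        try {
            getIntChecked(truncated, 20); // an unchecked read here would index past the array
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```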


