Posted to dev@tika.apache.org by "Nick Burch (JIRA)" <ji...@apache.org> on 2014/08/04 18:00:12 UTC

[jira] [Resolved] (TIKA-1093) [OfficeParser] NullPointerException

     [ https://issues.apache.org/jira/browse/TIKA-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Burch resolved TIKA-1093.
------------------------------

       Resolution: Fixed
    Fix Version/s: 1.6

Now parses on Tika, both from trunk and from the 1.6 release branch.

> [OfficeParser] NullPointerException 
> ------------------------------------
>
>                 Key: TIKA-1093
>                 URL: https://issues.apache.org/jira/browse/TIKA-1093
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.3
>         Environment: % java -version
> java version "1.7.0_17"
> OpenJDK Runtime Environment (IcedTea7 2.3.8) (ArchLinux build 7.u17_2.3.8-1-x86_64)
> OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)
>            Reporter: Martin Kalcher
>             Fix For: 1.6
>
>
> OfficeParser throws a NullPointerException for a doc file.
> % java -Djava.awt.headless=false -jar tika-app-1.3.jar -t < test.doc 
> Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@29a01add
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> 	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
> 	at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:139)
> 	at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:400)
> 	at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:112)
> Caused by: java.lang.NullPointerException
> 	at org.apache.poi.hwpf.sprm.CharacterSprmUncompressor.uncompressCHP(CharacterSprmUncompressor.java:48)
> 	at org.apache.poi.hwpf.model.StyleSheet.createChp(StyleSheet.java:288)
> 	at org.apache.poi.hwpf.model.StyleSheet.<init>(StyleSheet.java:121)
> 	at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:346)
> 	at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:79)
> 	at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:186)
> 	at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:161)
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> 	... 5 more
> I cannot share the doc file at the moment, but I will ask my clients if you need it.


