Posted to dev@tika.apache.org by "sbathrutheen (JIRA)" <ji...@apache.org> on 2016/10/25 15:59:58 UTC
[jira] [Created] (TIKA-2143) POI deprecated method used in TIKA 1.13
sbathrutheen created TIKA-2143:
----------------------------------
Summary: POI deprecated method used in TIKA 1.13
Key: TIKA-2143
URL: https://issues.apache.org/jira/browse/TIKA-2143
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.13, 1.9
Environment: Windows java application
Reporter: sbathrutheen
Priority: Trivial
Fix For: 1.13
We see that Tika throws a long list of errors when extracting PPT files. When we tested with the standalone Tika application (1.13), we could not reproduce the issue.
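Because the failure appears in the embedding application but not in the standalone one, it may be worth checking which jar each environment actually loads the POI class from. The following is only a minimal diagnostic sketch: the class name `ClassOrigin` and its `locate` helpers are hypothetical, and the idea that the two environments resolve different POI jars is an assumption, not something this report confirms. The default class name is taken from the stack trace in this report.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Tika or POI): reports where a class was loaded from.
final class ClassOrigin {

    /** Returns the jar or directory a class was loaded from, or a note if unknown. */
    static String locate(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // JDK classes loaded by the bootstrap loader usually have no code source.
            return "(no code source; likely a JDK class)";
        }
        return source.getLocation().toString();
    }

    /** Looks the class up by name without initializing it. */
    static String locate(String className) {
        try {
            Class<?> type = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            return locate(type);
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath: " + className + ")";
        }
    }

    public static void main(String[] args) {
        // Default name is the class that appears in the stack trace of this report.
        String name = args.length > 0 ? args[0] : "org.apache.poi.hslf.HSLFSlideShow";
        System.out.println(name + " -> " + locate(name));
    }
}
```

Running this once inside the failing application and once against the standalone Tika jar would show whether both resolve the class from the same location.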
We looked at the POI source code and observed that the class "HSLFSlideShow" defines the deprecated method shown below:
*
/**
- * Get the lookup from slide numbers to their offsets inside
- * _ptrData, used when adding or moving slides.
- *
- * @deprecated since POI 3.11, not supported anymore
- */
- @Deprecated
- public Hashtable<Integer,Integer> getSlideOffsetDataLocationsLookup() {
- throw new UnsupportedOperationException("PersistPtrHolder.getSlideOffsetDataLocationsLookup() is not supported since 3.12-Beta1");
- }
*
We suspect that the Tika library is still calling this deprecated method, which causes the following runtime exception:
Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@204c3b78
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:283)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:281)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
at com.searchtechnologies.aspire.docprocessing.extracttext.ExtractTextStage.process(ExtractTextStage.java:140)
... 14 more
Caused by: java.lang.UnsupportedOperationException
at java.util.AbstractMap$SimpleImmutableEntry.setValue(Unknown Source)
at org.apache.poi.hslf.HSLFSlideShow.read(HSLFSlideShow.java:293)
at org.apache.poi.hslf.HSLFSlideShow.buildRecords(HSLFSlideShow.java:273)
at org.apache.poi.hslf.HSLFSlideShow.<init>(HSLFSlideShow.java:188)
at org.apache.tika.parser.microsoft.HSLFExtractor.parse(HSLFExtractor.java:61)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:149)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:117)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:281)
... 17 more
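The innermost frame in the trace above is `AbstractMap$SimpleImmutableEntry.setValue`, called from `HSLFSlideShow.read`. As a minimal sketch using only the JDK, the snippet below shows that calling `setValue` on a `SimpleImmutableEntry` raises `UnsupportedOperationException`, whereas a `SimpleEntry` accepts the update. The class `ImmutableEntryDemo` and the method `rejectsSetValue` are hypothetical names for illustration; the snippet does not reproduce POI's code and does not by itself establish the root cause.

```java
import java.util.AbstractMap;
import java.util.Map;

// Hypothetical demo class: illustrates the JDK behavior seen in the innermost stack frame.
final class ImmutableEntryDemo {

    /** Returns true if calling setValue on the entry throws UnsupportedOperationException. */
    static boolean rejectsSetValue(Map.Entry<Integer, Integer> entry) {
        try {
            entry.setValue(entry.getValue() + 1);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map.Entry<Integer, Integer> immutable = new AbstractMap.SimpleImmutableEntry<>(1, 100);
        Map.Entry<Integer, Integer> mutable = new AbstractMap.SimpleEntry<>(1, 100);

        System.out.println("immutable entry rejects setValue: " + rejectsSetValue(immutable));
        System.out.println("mutable entry rejects setValue: " + rejectsSetValue(mutable));
    }
}
```

Code that needs to update an entry's value in place has to hold a mutable entry, or write back through the owning map with `put`.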