Posted to dev@tika.apache.org by "Jason Borg (JIRA)" <ji...@apache.org> on 2015/06/29 08:53:05 UTC

[jira] [Updated] (TIKA-530) InvalidFormatException on a PackagePart in OOXML

     [ https://issues.apache.org/jira/browse/TIKA-530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Borg updated TIKA-530:
----------------------------
    Attachment: Presentation1.pptx

File to replicate issue

> InvalidFormatException on a PackagePart in OOXML
> ------------------------------------------------
>
>                 Key: TIKA-530
>                 URL: https://issues.apache.org/jira/browse/TIKA-530
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.8
>            Reporter: Sjoerd Smeets
>         Attachments: Presentation1.pptx
>
>
> Hi,
> I receive the following error when parsing an OOXML file:
> Caused by: org.apache.poi.openxml4j.exceptions.InvalidFormatException: Absolute URI forbidden:  file://///ravn.co.uk/London/Jobs/first%20introduction%20/Welcome%20day/1.avi
>     at org.apache.poi.openxml4j.opc.PackagePartName.throwExceptionIfAbsoluteUri(PackagePartName.java:426) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.openxml4j.opc.PackagePartName.throwExceptionIfInvalidPartUri(PackagePartName.java:175) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.openxml4j.opc.PackagePartName.<init>(PackagePartName.java:83) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.openxml4j.opc.PackagingURIHelper.createPartName(PackagingURIHelper.java:470) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.POIXMLDocument.getTargetPart(POIXMLDocument.java:95) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.POIXMLDocument.getTargetPart(POIXMLDocument.java:84) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.xslf.XSLFSlideShow.<init>(XSLFSlideShow.java:89) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.xslf.extractor.XSLFPowerPointExtractor.<init>(XSLFPowerPointExtractor.java:45) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorFactory.java:183) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorFactory.java:150) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
>     at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:53) ~[tika-parsers-0.8-SNAPSHOT.jar:na]
> I can see that absolute URIs are forbidden as part names. However, shouldn't POI just ignore the offending PackagePartName and move on to the other parts?
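
The "ignore it and move on" behaviour asked for above could look roughly like the sketch below. It is only an illustration of the idea, not POI's or Tika's actual code: the class RelationshipTargetFilter and its methods are hypothetical names, and it uses only java.net.URI from the JDK. Relationship targets that carry a scheme (such as the file: target in the report) are skipped instead of aborting the whole parse.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of POI or Tika.
class RelationshipTargetFilter {

    // A target is usable as a part name here only if it parses as a URI
    // and has no scheme (i.e. it is not an absolute URI such as "file:...").
    static boolean isUsablePartTarget(String target) {
        try {
            return !new URI(target).isAbsolute();
        } catch (URISyntaxException e) {
            // Assumption of this sketch: unparseable targets are skipped too.
            return false;
        }
    }

    // Keep the usable targets and silently drop the rest,
    // rather than throwing on the first bad one.
    static List<String> usablePartTargets(List<String> targets) {
        List<String> kept = new ArrayList<>();
        for (String target : targets) {
            if (isUsablePartTarget(target)) {
                kept.add(target);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> targets = new ArrayList<>();
        targets.add("/ppt/slides/slide1.xml");
        // Target taken from the exception message in the report.
        targets.add("file://///ravn.co.uk/London/Jobs/first%20introduction%20/Welcome%20day/1.avi");
        System.out.println(usablePartTargets(targets));
    }
}
```

Whether skipped targets should also be logged or surfaced as metadata is a separate design question that this sketch leaves open.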



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)