Posted to dev@tika.apache.org by "Nick Burch (JIRA)" <ji...@apache.org> on 2011/05/06 04:52:03 UTC
[jira] [Created] (TIKA-655) IWorkPackageParser / IWorkParser not registering properly
IWorkPackageParser / IWorkParser not registering properly
---------------------------------------------------------
Key: TIKA-655
URL: https://issues.apache.org/jira/browse/TIKA-655
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 0.9
Reporter: Nick Burch
Assignee: Nick Burch
If you try to use AutoDetectParser to handle an iWork document, it'll fail with:
org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:198)
However, IWorkPackageParser works fine when used directly. It seems that IWorkParser needs just the individual zip part, but it is registered as the handler for the individual MIME types, so it is given the whole zip file and breaks.
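The failure mode can be reproduced without Tika. The following is a minimal sketch using only the JDK, not the Tika code itself: it builds a small in-memory zip standing in for an iWork file (the class name, the entry name index.xml, and the XML content are illustrative assumptions), then feeds first the whole zip and then just the zip part to a SAX parser.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ZipVersusXmlPart {

    // Builds a tiny zip with a single XML entry (a stand-in for an iWork file).
    public static byte[] buildPackage() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("index.xml")); // assumed entry name
            zip.write("<document><p>hello</p></document>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Returns true if the stream parses as XML, false if the SAX parser rejects it.
    public static boolean parsesAsXml(InputStream in) throws Exception {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    // What a parser registered for the whole file sees: raw zip bytes.
    public static boolean wholePackageParses() throws Exception {
        return parsesAsXml(new ByteArrayInputStream(buildPackage()));
    }

    // What a package-aware parser hands on: the content of one zip part.
    public static boolean firstEntryParses() throws Exception {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(buildPackage()))) {
            zip.getNextEntry(); // position the stream at the start of the entry
            return parsesAsXml(zip);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("whole zip parses as XML: " + wholePackageParses());
        System.out.println("zip part parses as XML:  " + firstEntryParses());
    }
}
```

The raw zip bytes start with the zip signature rather than XML, which is consistent with the "Content is not allowed in prolog" error at line 1, column 1.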
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (TIKA-655) IWorkPackageParser / IWorkParser not registering properly
Posted by "Nick Burch (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nick Burch resolved TIKA-655.
-----------------------------
Resolution: Fixed
Fix Version/s: 1.0
[jira] [Commented] (TIKA-655) IWorkPackageParser / IWorkParser not registering properly
Posted by "Nick Burch (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029720#comment-13029720 ]
Nick Burch commented on TIKA-655:
---------------------------------
In r1100039, I've moved the iWork detection logic from ZipContainerDetector into IWorkPackageParser, and made it detect the type in a similar way to OfficeParser.
Then, I put the content handler selection logic into IWorkPackageParser and removed IWorkParser (which claimed to be a regular parser but in fact only worked when called from IWorkPackageParser). The result is that the Tika app can now parse iWork files, and the unit tests still pass.
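As a rough illustration of what detection inside a package parser can look like, here is a sketch using only the JDK. It is not the code from r1100039: the class name, the Kind enum, the entry names (index.xml, index.apxl), and the namespace-to-type mapping are all assumptions made for the example. It opens the zip, finds the index entry, and picks a document type from the root XML element.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class PackageKindSketch {

    public enum Kind { KEYNOTE, PAGES, NUMBERS, UNKNOWN }

    // Assumed mapping from root element to document type, for illustration only.
    public static Kind kindOfRoot(String namespace, String localName) {
        if ("http://developer.apple.com/namespaces/keynote2".equals(namespace)
                && "presentation".equals(localName)) {
            return Kind.KEYNOTE;
        }
        if ("http://developer.apple.com/namespaces/sl".equals(namespace)
                && "document".equals(localName)) {
            return Kind.PAGES;
        }
        if ("http://developer.apple.com/namespaces/ls".equals(namespace)
                && "document".equals(localName)) {
            return Kind.NUMBERS;
        }
        return Kind.UNKNOWN;
    }

    // Walks the zip entries and inspects the root element of the index entry.
    public static Kind detect(InputStream packageStream) throws IOException, XMLStreamException {
        ZipInputStream zip = new ZipInputStream(packageStream);
        ZipEntry entry;
        while ((entry = zip.getNextEntry()) != null) {
            String name = entry.getName();
            if (!name.equals("index.xml") && !name.equals("index.apxl")) {
                continue; // not an index entry (entry names are assumptions)
            }
            XMLStreamReader xml = XMLInputFactory.newInstance().createXMLStreamReader(zip);
            while (xml.hasNext()) {
                if (xml.next() == XMLStreamConstants.START_ELEMENT) {
                    // The first start element is the root; decide from it alone.
                    return kindOfRoot(xml.getNamespaceURI(), xml.getLocalName());
                }
            }
        }
        return Kind.UNKNOWN;
    }

    // Test helper: builds an in-memory zip with one entry holding the given XML.
    public static byte[] packageWith(String entryName, String xml) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(xml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }
}
```

Once the type is known, the package parser can choose the matching content handler and pass it only the index entry, so no stand-alone parser has to be registered for the individual MIME types.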