Posted to dev@tika.apache.org by "Nick Burch (Commented) (JIRA)" <ji...@apache.org> on 2011/12/17 11:44:33 UTC

[jira] [Commented] (TIKA-815) Tika parsers should handle failures more gracefully

    [ https://issues.apache.org/jira/browse/TIKA-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171516#comment-13171516 ] 

Nick Burch commented on TIKA-815:
---------------------------------

FYI, Tika does provide the ForkParser for cases where you want to ensure that parsing can't affect the parent application.
                
> Tika parsers should handle failures more gracefully
> ---------------------------------------------------
>
>                 Key: TIKA-815
>                 URL: https://issues.apache.org/jira/browse/TIKA-815
>             Project: Tika
>          Issue Type: Test
>          Components: parser
>    Affects Versions: 1.0
>            Reporter: Jerome Lacoste
>
> We encountered an OOM while parsing a Word document. We will report the failure to POI.
> This raises the question of the general robustness of the parsers.
> We've written a little test tool that reproduces the aforementioned OOM and other potential issues, which will be reported to the individual parsers. It's the responsibility of the parsers to handle those failures gracefully.
> Yet it's easy to write generic tools at the Tika level to run these kinds of tests.
> So we are also submitting this issue here to start a discussion on what role Tika should have when it comes to validating its parsers.
> Code here: https://github.com/lacostej/tika-hardener
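
To make the idea of a generic, parser-agnostic robustness check concrete, here is a rough sketch. It is not the tika-hardener code and does not use Tika's API: the names ParserProbe, Outcome and probe are hypothetical, and the parser call is modelled as a plain Callable so the example has no dependencies. It also assumes, for simplicity, that any Exception counts as a graceful failure and any Error (such as OutOfMemoryError) as a hard one.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical harness: runs one parse attempt and classifies how it ended.
public class ParserProbe {

    public enum Outcome { OK, GRACEFUL_FAILURE, HARD_FAILURE, TIMEOUT }

    // parseCall stands in for "parse this document with this parser".
    public static Outcome probe(Callable<?> parseCall, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<?> result = executor.submit(parseCall);
            try {
                result.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return Outcome.OK;
            } catch (TimeoutException e) {
                // The parse did not finish in time; ask the worker to stop.
                result.cancel(true);
                return Outcome.TIMEOUT;
            } catch (ExecutionException e) {
                // Assumption: an Exception is a failure the caller can handle,
                // while an Error (e.g. OutOfMemoryError) is not graceful.
                return (e.getCause() instanceof Exception)
                        ? Outcome.GRACEFUL_FAILURE
                        : Outcome.HARD_FAILURE;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return Outcome.HARD_FAILURE;
            }
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Simulated parsers; a real tool would call actual parsers on test files.
        System.out.println(probe(() -> "extracted text", 1000));
        System.out.println(probe(() -> { throw new IOException("corrupt file"); }, 1000));
        System.out.println(probe(() -> { throw new OutOfMemoryError("simulated"); }, 1000));
    }
}
```

Note that a real OutOfMemoryError caught in-process can still leave the JVM in a degraded state, so a check like this only detects the problem. Isolating the parent application from it requires running the parse in a separate process.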
