Posted to dev@tika.apache.org by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2018/06/20 15:05:00 UTC

[jira] [Created] (TIKA-2675) OpenDocumentParser should fail on invalid zip files

Sebastian Nagel created TIKA-2675:
-------------------------------------

             Summary: OpenDocumentParser should fail on invalid zip files
                 Key: TIKA-2675
                 URL: https://issues.apache.org/jira/browse/TIKA-2675
             Project: Tika
          Issue Type: Bug
          Components: parser
            Reporter: Sebastian Nagel


The OpenDocumentParser assumes that its input is a zip container. However, if it is called on an invalid zip stream fetched from a remote URL (see NUTCH-2603), the parser signals success and returns a document with empty content. The behavior differs when the parser is called on a local file: the [constructor of ZipFile|https://docs.oracle.com/javase/8/docs/api/java/util/zip/ZipFile.html#ZipFile-java.io.File-] fails on invalid input, whereas the [constructor of ZipInputStream|https://docs.oracle.com/javase/8/docs/api/java/util/zip/ZipInputStream.html#ZipInputStream-java.io.InputStream-] silently accepts it and simply yields no entries.
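
The difference between the two JDK classes can be reproduced without Tika. The following is only a minimal sketch: the class name InvalidZipDemo, its helper methods, and the sample bytes are hypothetical and not part of Tika or of the OpenDocumentParser code. It writes non-zip bytes to a temporary file and opens them with ZipFile, then reads the same bytes through ZipInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipException;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

// Hypothetical demo class; not part of Tika.
class InvalidZipDemo {

    // Arbitrary non-zip bytes standing in for a broken download (assumed sample data).
    static final byte[] NOT_A_ZIP =
            "this is not a zip archive".getBytes(StandardCharsets.UTF_8);

    /** Returns true if opening the bytes as a local ZipFile throws a ZipException. */
    static boolean zipFileRejects(byte[] data) {
        try {
            Path tmp = Files.createTempFile("invalid", ".odt");
            try {
                Files.write(tmp, data);
                try (ZipFile zf = new ZipFile(tmp.toFile())) {
                    return false; // opened successfully, so the input was accepted
                } catch (ZipException e) {
                    return true;  // the constructor rejected the invalid input
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Counts the entries ZipInputStream reports for the given bytes. */
    static int countStreamEntries(byte[] data) {
        int entries = 0;
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(data))) {
            // The constructor does not inspect the data; getNextEntry() returns
            // null as soon as it does not find a local file header.
            while (zis.getNextEntry() != null) {
                entries++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }

    public static void main(String[] args) {
        System.out.println("ZipFile rejects input: " + zipFileRejects(NOT_A_ZIP));
        System.out.println("ZipInputStream entries: " + countStreamEntries(NOT_A_ZIP));
    }
}
```

With streaming input, a parser cannot rely on an exception from ZipInputStream to detect a broken container. One possible approach, stated here as an assumption rather than as the fix adopted for this issue, is to treat a stream that yields zero entries as a parse failure.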




