Posted to dev@tika.apache.org by "Cservenak, Tamas (JIRA)" <ji...@apache.org> on 2014/02/21 16:01:28 UTC

[jira] [Created] (TIKA-1241) Tika does not recognise empty nor spanning ZIP files magic

Cservenak, Tamas created TIKA-1241:
--------------------------------------

             Summary: Tika does not recognise empty nor spanning ZIP files magic
                 Key: TIKA-1241
                 URL: https://issues.apache.org/jira/browse/TIKA-1241
             Project: Tika
          Issue Type: Improvement
            Reporter: Cservenak, Tamas
            Priority: Minor


As it turns out, the magic bytes differ for non-empty, empty, and
spanning (multi-part) ZIP files. Tika recognizes only non-empty ZIP files.
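
To illustrate the three signatures, here is a minimal Python sketch. It is not Tika code: the names ZIP_SIGNATURES and classify_zip_magic are hypothetical and exist only for this example. The byte values are the ones listed in the Wikipedia sidebar referenced in this issue.

```python
import io
import zipfile

# Hypothetical lookup table for illustration only (not part of Tika).
# Keys are the first four bytes of the file.
ZIP_SIGNATURES = {
    b"PK\x03\x04": "non-empty",  # local file header comes first
    b"PK\x05\x06": "empty",      # file starts with the end of central directory record
    b"PK\x07\x08": "spanning",   # spanning marker
}


def classify_zip_magic(head):
    """Return the ZIP variant suggested by the first four bytes, or None."""
    return ZIP_SIGNATURES.get(bytes(head[:4]))


def make_zip(entries):
    """Build a ZIP archive in memory from a dict of name -> bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name, data in entries.items():
            archive.writestr(name, data)
    return buf.getvalue()


empty_zip = make_zip({})
non_empty_zip = make_zip({"a.txt": b"hello"})

print(classify_zip_magic(empty_zip))      # empty
print(classify_zip_magic(non_empty_zip))  # non-empty
```

A detector that checks only the first signature reports nothing for the empty archive, which is the behaviour described in this issue.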

The magic for an empty ZIP file was verified with hexdump:
https://gist.github.com/cstamas/6e90ae73f83c8e4a3f42

The magic numbers are also described on Wikipedia:
http://en.wikipedia.org/wiki/Zip_(file_format)
(see the sidebar entry listing the magic numbers)

Proposed change:
add two more match entries to the ZIP MIME definition:
https://github.com/apache/tika/pull/4



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)