Posted to dev@tika.apache.org by "Nick Burch (JIRA)" <ji...@apache.org> on 2014/02/22 17:19:19 UTC

[jira] [Commented] (TIKA-1241) Tika does not recognise empty nor spanning ZIP files magic

    [ https://issues.apache.org/jira/browse/TIKA-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13909422#comment-13909422 ] 

Nick Burch commented on TIKA-1241:
----------------------------------

Thanks for this. Applied with a minor tweak and a new unit test in r1570860.

> Tika does not recognise empty nor spanning ZIP files magic
> ----------------------------------------------------------
>
>                 Key: TIKA-1241
>                 URL: https://issues.apache.org/jira/browse/TIKA-1241
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Cservenak, Tamas
>            Priority: Minor
>             Fix For: 1.6
>
>
> As it turns out, the magic bytes differ for non-empty, empty, and
> spanning ZIP files, and Tika recognizes only the non-empty variant.
> The magic for an empty ZIP file was verified with hexdump:
> https://gist.github.com/cstamas/6e90ae73f83c8e4a3f42
> It is also described on Wikipedia (see the sidebar with magic numbers):
> http://en.wikipedia.org/wiki/Zip_(file_format)
> Proposed change: add two more match entries to the ZIP MIME definition:
> https://github.com/apache/tika/pull/4
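
For illustration only, the three ZIP signatures mentioned in the issue can be told apart by their first four bytes. The sketch below is not the patch from the pull request (which changes Tika's MIME definition) and does not use any Tika API; the class name ZipMagic, the Kind enum, and the classify method are hypothetical names chosen for this example. The byte values are the ones listed in the magic-number sidebar of the Wikipedia ZIP article.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Tika: classifies data by its first four bytes.
final class ZipMagic {
    enum Kind { NON_EMPTY, EMPTY, SPANNING, NOT_ZIP }

    // "PK\003\004": local file header, the start of a normal (non-empty) archive.
    private static final byte[] LOCAL_FILE_HEADER = {'P', 'K', 0x03, 0x04};
    // "PK\005\006": end-of-central-directory record, the start of an empty archive.
    private static final byte[] END_OF_CENTRAL_DIR = {'P', 'K', 0x05, 0x06};
    // "PK\007\010": marker found at the start of a spanning archive.
    private static final byte[] SPANNING_MARKER = {'P', 'K', 0x07, 0x08};

    static Kind classify(byte[] data) {
        if (data == null || data.length < 4) {
            return Kind.NOT_ZIP;
        }
        byte[] head = Arrays.copyOf(data, 4);
        if (Arrays.equals(head, LOCAL_FILE_HEADER)) {
            return Kind.NON_EMPTY;
        }
        if (Arrays.equals(head, END_OF_CENTRAL_DIR)) {
            return Kind.EMPTY;
        }
        if (Arrays.equals(head, SPANNING_MARKER)) {
            return Kind.SPANNING;
        }
        return Kind.NOT_ZIP;
    }

    public static void main(String[] args) {
        // An empty ZIP is a bare 22-byte end-of-central-directory record.
        byte[] emptyZip = new byte[22];
        System.arraycopy(END_OF_CENTRAL_DIR, 0, emptyZip, 0, 4);
        System.out.println(classify(emptyZip));
    }
}
```

A detector that checks only the first signature would report the last two cases as "not a ZIP", which is the behaviour the issue describes.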



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)