Posted to dev@tika.apache.org by "Nick Burch (JIRA)" <ji...@apache.org> on 2015/05/23 14:23:17 UTC

[jira] [Resolved] (TIKA-1632) ZLIB magic detection support

     [ https://issues.apache.org/jira/browse/TIKA-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Burch resolved TIKA-1632.
------------------------------
       Resolution: Fixed
    Fix Version/s: 1.9

> ZLIB magic detection support
> ----------------------------
>
>                 Key: TIKA-1632
>                 URL: https://issues.apache.org/jira/browse/TIKA-1632
>             Project: Tika
>          Issue Type: Improvement
>          Components: detector
>            Reporter: Pavel Micka
>            Priority: Minor
>             Fix For: 1.9
>
>
> In our environment we encounter many kinds of compressed streams. One of them, ZLIB, is not currently supported by Tika 1.8. According to my sources and experience, the following magic values cover the majority of ZLIB archives:
>     <mime-type type="application/zlib">
>         <_comment>Zlib Compressed Archive</_comment>
>         <magic priority="45">
>             <match value="\x78\x01" type="string" offset="0" />
>             <match value="\x78\x9c" type="string" offset="0" />
>             <match value="\x78\xda" type="string" offset="0" />
>         </magic>
>     </mime-type>
> The header format is well described here:
> http://stackoverflow.com/questions/9050260/what-does-a-zlib-header-look-like
> The original RFC is here:
> http://tools.ietf.org/html/rfc1950
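
For reference, the three magic values above are instances of the general two-byte header rule in RFC 1950: the low nibble of the first byte is 8 (deflate), the high nibble is at most 7 (window size), and the two bytes read as a big-endian 16-bit number are a multiple of 31. The following is only a minimal sketch of that rule in Java, not the code that was committed to Tika; the class name ZlibHeaderCheck and its methods are hypothetical, and the compress helper assumes small inputs.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.Deflater;

// Hypothetical helper for illustration only; not part of Tika.
class ZlibHeaderCheck {

    // Applies the RFC 1950 header rule to the first two bytes of a stream.
    static boolean looksLikeZlib(byte[] data) {
        if (data == null || data.length < 2) {
            return false;
        }
        int cmf = data[0] & 0xFF;   // compression method and flags
        int flg = data[1] & 0xFF;   // flags, including the check bits
        int method = cmf & 0x0F;    // 8 means deflate
        int windowInfo = cmf >>> 4; // log2(window size) - 8, at most 7
        return method == 8 && windowInfo <= 7 && ((cmf << 8) | flg) % 31 == 0;
    }

    // Produces a zlib-wrapped stream with the JDK's Deflater (nowrap = false).
    // The buffer is sized for small demo inputs only.
    static byte[] compress(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[input.length + 64];
        int length = deflater.deflate(buffer);
        deflater.end();
        return Arrays.copyOf(buffer, length);
    }

    public static void main(String[] args) {
        byte[] stream = compress("hello zlib".getBytes(StandardCharsets.UTF_8),
                Deflater.DEFAULT_COMPRESSION);
        System.out.printf("header: %02x %02x, looks like zlib: %b%n",
                stream[0] & 0xFF, stream[1] & 0xFF, looksLikeZlib(stream));
    }
}
```

A two-byte check like this is weak evidence on its own, since unrelated data can satisfy it by chance, which may be one reason to keep the magic priority modest.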



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)