Posted to dev@tika.apache.org by "Pavel Micka (JIRA)" <ji...@apache.org> on 2015/05/20 09:51:59 UTC
[jira] [Created] (TIKA-1632) ZLIB magic detection support
Pavel Micka created TIKA-1632:
---------------------------------
Summary: ZLIB magic detection support
Key: TIKA-1632
URL: https://issues.apache.org/jira/browse/TIKA-1632
Project: Tika
Issue Type: Improvement
Components: detector
Reporter: Pavel Micka
Priority: Minor
In our environment we encounter many compressed streams. One of them, ZLIB, is currently not supported by Tika. According to my sources and experience, the following magics cover the majority of ZLIB archives:
<mime-type type="application/zlib">
  <_comment>Zlib Compressed Archive</_comment>
  <magic priority="45">
    <match value="\x78\x01" type="string" offset="0" />
    <match value="\x78\x9c" type="string" offset="0" />
    <match value="\x78\xda" type="string" offset="0" />
  </magic>
</mime-type>
The ZLIB header is well described here:
http://stackoverflow.com/questions/9050260/what-does-a-zlib-header-look-like
The original RFC is here:
http://tools.ietf.org/html/rfc1950
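For illustration only, here is a minimal Java sketch of why those three two-byte prefixes are plausible magics. It is not Tika code: the class name ZlibHeaderCheck and its methods are hypothetical. It applies the RFC 1950 header rules (CM must be 8 for deflate, CINFO must be at most 7, and the 16-bit value CMF*256 + FLG must be a multiple of 31) and compares the result with the three prefixes proposed above, using java.util.zip.Deflater to produce sample streams.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical helper, not part of Tika: checks the two-byte ZLIB header.
final class ZlibHeaderCheck {

    /** True if the first two bytes form a valid RFC 1950 header for deflate. */
    static boolean looksLikeZlib(byte[] data) {
        if (data == null || data.length < 2) {
            return false;
        }
        int cmf = data[0] & 0xff;   // compression method and flags
        int flg = data[1] & 0xff;   // flags, including the FCHECK bits
        int method = cmf & 0x0f;    // CM: 8 means deflate
        int info = cmf >>> 4;       // CINFO: log2(window size) - 8, at most 7
        return method == 8 && info <= 7 && ((cmf << 8) | flg) % 31 == 0;
    }

    /** True if the data starts with one of the three prefixes from the issue. */
    static boolean matchesProposedMagic(byte[] data) {
        if (data == null || data.length < 2 || (data[0] & 0xff) != 0x78) {
            return false;
        }
        int flg = data[1] & 0xff;
        return flg == 0x01 || flg == 0x9c || flg == 0xda;
    }

    /** Compresses input into a ZLIB-wrapped deflate stream at the given level. */
    static byte[] compress(byte[] input, int level) {
        Deflater deflater = new Deflater(level); // nowrap defaults to false, so the ZLIB header is written
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] sample = "hello zlib".getBytes(StandardCharsets.UTF_8);
        int[] levels = {Deflater.BEST_SPEED, Deflater.DEFAULT_COMPRESSION, Deflater.BEST_COMPRESSION};
        for (int level : levels) {
            byte[] z = compress(sample, level);
            System.out.printf("level %d: %02x %02x rfc1950=%b proposed=%b%n",
                    level, z[0] & 0xff, z[1] & 0xff, looksLikeZlib(z), matchesProposedMagic(z));
        }
    }
}
```

The RFC 1950 check is broader than the three fixed prefixes, so it also accepts less common headers. The fixed prefixes are simpler to express as mime-type magic, at the cost of missing streams written with other window sizes or flag settings.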