Posted to user@tika.apache.org by Nick Burch <ap...@gagravarr.org> on 2015/03/01 08:02:21 UTC

Re: File extension for application/gzip

On Sat, 28 Feb 2015, Adam Lamar wrote:
> I'd appreciate a change of the default!

Best bet would be to open a JIRA issue for this, so that the change can be 
tracked and will have an issue id.

> Every tgz is application/gzip, but not every application/gzip is a tgz. 
> Also, it seems to me that the parsers should be able to decompress the 
> first few bytes and check for the tar magic bytes at offset 257, if it 
> were important to differentiate between a gz and tgz on specific files 
> (if this is not already done).

The compressed file parser already does that!

That won't help for mime magic detection though, as that has to work on the 
raw (and hence compressed) byte patterns.
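
To make the distinction concrete, here is a minimal sketch of the two checks 
discussed above. It is written in Python purely for illustration and is not 
Tika's implementation; the function names are hypothetical. The first check 
looks only at the raw bytes (the gzip magic 1f 8b), which is all that mime 
magic can see. The second decompresses just the start of the stream and looks 
for "ustar" at offset 257, as suggested in the quoted text. It assumes a 
POSIX or GNU tar; very old tar variants carry no magic and would be missed.

```python
import gzip
import io
import zlib

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of a gzip stream (raw bytes)
TAR_MAGIC = b"ustar"       # marker used by POSIX and GNU tar headers
TAR_MAGIC_OFFSET = 257     # position of the marker inside the first tar header


def looks_like_gzip(raw: bytes) -> bool:
    """Raw-byte check: this is the only thing magic detection can see."""
    return raw[:2] == GZIP_MAGIC


def looks_like_tgz(raw: bytes) -> bool:
    """Decompress only the first few hundred bytes and test for the tar magic."""
    if not looks_like_gzip(raw):
        return False
    try:
        with gzip.GzipFile(fileobj=io.BytesIO(raw)) as gz:
            # Read just enough decompressed data to cover the magic field.
            header = gz.read(TAR_MAGIC_OFFSET + len(TAR_MAGIC))
    except (OSError, EOFError, zlib.error):
        # Corrupt or truncated stream: treat it as "not a recognisable tgz".
        return False
    return header[TAR_MAGIC_OFFSET:] == TAR_MAGIC
```

Both a .gz and a .tgz pass looks_like_gzip(), which is why the raw byte 
patterns alone cannot tell them apart; only the second function, which has to 
decompress, can.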

Nick

Re: File extension for application/gzip

Posted by Adam Lamar <ad...@gmail.com>.
> Best bet would be to open a jira for this, then the change can be tracked and will have a jira id

https://issues.apache.org/jira/browse/TIKA-1563

Many thanks,
Adam