Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2021/10/01 17:19:00 UTC
[jira] [Resolved] (TIKA-3563) Add a few more mimes
[ https://issues.apache.org/jira/browse/TIKA-3563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison resolved TIKA-3563.
-------------------------------
Fix Version/s: 2.1.1
Resolution: Fixed
> Add a few more mimes
> --------------------
>
> Key: TIKA-3563
> URL: https://issues.apache.org/jira/browse/TIKA-3563
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Trivial
> Fix For: 2.1.1
>
>
> I recently compared the Linux 'file' command against Tika on a month of Common Crawl data, restricted to files that Tika had initially identified as 'application/octet-stream'. Querying for the types that 'file' reported in those cases, I get the list below for the top 20 most common.
> I think it should be fairly straightforward (easy and precise) to add detection for x-nes-rom, MARC, and ICC profiles.
> If anyone sees other file types that we would want to add, let me know.
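
For illustration only, here is a minimal sketch of why these three formats lend themselves to precise magic-byte detection. It is not the change made for TIKA-3563 and not Tika's own detection mechanism. The function name sniff is hypothetical, and the media type strings other than x-nes-rom are assumptions chosen for the example. The checks rely on the published layouts of the formats: iNES ROM images start with "NES" followed by 0x1A, ICC profiles carry the signature "acsp" at byte offset 36, and a MARC 21 record opens with a 24-byte leader that has a 5-digit record length, "22" at positions 10-11, and "4500" at positions 20-23.

```python
from typing import Optional


def sniff(data: bytes) -> Optional[str]:
    """Hypothetical helper: guess a media type from leading bytes, or None."""
    # iNES ROM images begin with ASCII "NES" followed by the byte 0x1A.
    if data[:4] == b"NES\x1a":
        return "application/x-nes-rom"

    # ICC profile headers carry the signature "acsp" at byte offset 36.
    if data[36:40] == b"acsp":
        return "application/vnd.iccprofile"  # assumed media type string

    # MARC 21 leader: 24 bytes, a 5-digit record length, "22" at
    # positions 10-11 and "4500" at positions 20-23.
    leader = data[:24]
    if (
        len(leader) == 24
        and leader[:5].isdigit()
        and leader[10:12] == b"22"
        and leader[20:24] == b"4500"
    ):
        return "application/marc"  # assumed media type string

    # Nothing matched; a caller would fall back to octet-stream.
    return None
```

Each check is anchored at a fixed offset, which is what keeps false positives low compared with scanning for a pattern anywhere in the file.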
--
This message was sent by Atlassian Jira
(v8.3.4#803005)