Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2021/10/01 17:19:00 UTC

[jira] [Resolved] (TIKA-3563) Add a few more mimes

     [ https://issues.apache.org/jira/browse/TIKA-3563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison resolved TIKA-3563.
-------------------------------
    Fix Version/s: 2.1.1
       Resolution: Fixed

> Add a few more mimes
> --------------------
>
>                 Key: TIKA-3563
>                 URL: https://issues.apache.org/jira/browse/TIKA-3563
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Trivial
>             Fix For: 2.1.1
>
>
> I recently compared the Linux 'file' command against Tika on a month of Common Crawl data, looking at files that Tika had initially identified as 'application/octet-stream'.  Querying for the types that 'file' reported in those cases gives the list below of the 20 most common.
> I think it should be fairly straightforward (easy and precise) to add x-nes-rom, MARC, and ICC profile.
> If anyone sees other file types that we would want to add, let me know.
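
For illustration only, here is a rough sketch of why two of these types look easy to detect precisely: both have a fixed byte signature near the start of the file. This is not Tika's implementation or its mime definitions. The function name and the returned labels are hypothetical placeholders, and the magic values come from the public format descriptions (iNES images begin with "NES" followed by 0x1A; ICC profiles carry "acsp" at byte offset 36). MARC is left out of the sketch.

```python
from typing import Optional

# Assumed signatures, taken from the public format descriptions, not from Tika.
INES_MAGIC = b"NES\x1a"       # first four bytes of an iNES ROM image
ICC_SIGNATURE = b"acsp"       # profile file signature in the ICC header
ICC_SIGNATURE_OFFSET = 36     # byte offset of that signature


def sniff(header: bytes) -> Optional[str]:
    """Return a placeholder label for the leading bytes of a file, or None."""
    if header.startswith(INES_MAGIC):
        return "nes-rom"
    end = ICC_SIGNATURE_OFFSET + len(ICC_SIGNATURE)
    # Slicing past the end of a short header yields fewer bytes, so no match.
    if header[ICC_SIGNATURE_OFFSET:end] == ICC_SIGNATURE:
        return "icc-profile"
    return None
```

Anything that matches neither signature falls through to None, which corresponds to the octet-stream case described above.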



--
This message was sent by Atlassian Jira
(v8.3.4#803005)