Posted to dev@tika.apache.org by "Nick Burch (JIRA)" <ji...@apache.org> on 2015/07/21 22:04:04 UTC

[jira] [Commented] (TIKA-1692) Enable getExtension() for texty file types that include encoding information

    [ https://issues.apache.org/jira/browse/TIKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635718#comment-14635718 ] 

Nick Burch commented on TIKA-1692:
----------------------------------

You'd get something similar with a type of {{application/vnd.ms-excel}} and a hypothetical {{application/vnd.ms-excel; version=10}}, the latter being a child of the former. The former is known in the mime types file, and so has lots of details. The latter is a newly-registered child of it, which was created by the call to {{types.forName}}.

Maybe we want to tweak the {{getRegisteredMimeType}} method, so that it tries the with-parameters type first, the without-parameters type second, and returns null if neither is known? (Some of the mimetypes defined in our file do have parameters, so we can't just ignore them.) That method already doesn't register anything and returns null for unknown types, so we'd just be adding the "try dropping parameters if you don't know them" logic.

You could then do something like:
{code}
// 'types' is the MimeTypes registry instance
String name = "application/xml; charset=UTF-8";
MimeType mimeType = types.getRegisteredMimeType(name);
if (mimeType != null) {
    // getExtension() includes the leading dot
    assertEquals(".xml", mimeType.getExtension());
    assertEquals(name, mimeType.toString());
} else {
    System.err.println("Sorry, this type isn't one we know about: " + name);
}
{code}
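
For anyone who wants to play with the lookup order outside of Tika, here is a rough, self-contained sketch of the "try dropping parameters if you don't know them" idea. It is not the Tika implementation: the class {{RegisteredTypeLookupSketch}}, its methods, the plain map standing in for the registry, and the {{application/x-example; version=2}} entry are all hypothetical, and it matches parameters by exact string only, with no normalisation.

```java
import java.util.HashMap;
import java.util.Map;

public class RegisteredTypeLookupSketch {
    // Hypothetical stand-in for the registry: type name -> preferred extension.
    private static final Map<String, String> REGISTERED = new HashMap<>();
    static {
        REGISTERED.put("application/xml", ".xml");
        REGISTERED.put("text/html", ".html");
        // Illustrative entry for a type that is registered *with* a parameter.
        REGISTERED.put("application/x-example; version=2", ".ex2");
    }

    /** Drops everything from the first ';' onwards, e.g. "; charset=UTF-8". */
    static String stripParameters(String name) {
        int semi = name.indexOf(';');
        return (semi < 0 ? name : name.substring(0, semi)).trim();
    }

    /**
     * Tries the with-parameters name first, then the without-parameters name.
     * Returns the registered name that matched, or null if neither is known.
     * Nothing is added to the registry as a side effect.
     */
    static String findRegistered(String name) {
        String exact = name.trim();
        if (REGISTERED.containsKey(exact)) {
            return exact;
        }
        String base = stripParameters(exact);
        return REGISTERED.containsKey(base) ? base : null;
    }

    /** Returns the extension of the matched registered type, or null if unknown. */
    static String extensionFor(String name) {
        String registered = findRegistered(name);
        return registered == null ? null : REGISTERED.get(registered);
    }

    public static void main(String[] args) {
        // Unknown parameters are dropped, so the base type's details are used.
        System.out.println(extensionFor("application/xml; charset=UTF-8"));
        // A type registered with its parameter wins over the fallback.
        System.out.println(extensionFor("application/x-example; version=2"));
        // Neither form is known, so the caller gets null.
        System.out.println(extensionFor("application/unknown; charset=UTF-8"));
    }
}
```

Trying the exact name first matters because of the types that are defined with parameters: stripping unconditionally would lose them.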

> Enable getExtension() for texty file types that include encoding information
> ----------------------------------------------------------------------------
>
>                 Key: TIKA-1692
>                 URL: https://issues.apache.org/jira/browse/TIKA-1692
>             Project: Tika
>          Issue Type: Improvement
>          Components: core
>            Reporter: Tim Allison
>            Priority: Trivial
>             Fix For: 1.10
>
>         Attachments: MimeUtilTest.java
>
>
> {{getExtension()}} offers a handy way to add a "detected" extension from a {{MimeType}} for a file that didn't come with an extension.  However, this functionality doesn't work with texty files: html, xml, css, csv, etc.  
> Let's add a static helper class (or build it into {{MimeType}}?) that will output an extension for all mime types including texty mime types. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)