Posted to dev@tika.apache.org by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2012/11/29 10:18:57 UTC

[jira] [Resolved] (TIKA-1034) MimeTypes seems to be doing unnecessary work in the detect method

     [ https://issues.apache.org/jira/browse/TIKA-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-1034.
---------------------------------

    Resolution: Won't Fix

See the {{Detector}} javadocs. You can pass {{null}} as the {{InputStream}} in such cases. If you do pass a non-{{null}} stream, it needs to support the mark/reset feature (you'll need to wrap the stream in {{TikaInputStream}} or {{BufferedInputStream}} if necessary).
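
To illustrate the mark/reset requirement, here is a minimal sketch that uses only {{java.io}} and no Tika classes. The {{readOnce}}, {{peek}} and {{readAll}} helpers are hypothetical names made up for this example; {{peek}} only mimics the way a detector might inspect the first bytes of a stream and then rewind it, and is not how Tika itself is implemented.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class MarkResetSketch {

    // Hypothetical stand-in for a read-once stream: it reports no mark/reset support.
    static InputStream readOnce(byte[] data) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override public boolean markSupported() { return false; }
            @Override public synchronized void mark(int readlimit) { }
            @Override public synchronized void reset() throws IOException {
                throw new IOException("mark/reset not supported");
            }
        };
    }

    // Reads up to 'limit' leading bytes, then rewinds the stream to where it started.
    static byte[] peek(InputStream in, int limit) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(limit);
        try {
            byte[] buf = new byte[limit];
            int n = 0;
            while (n < limit) {
                int r = in.read(buf, n, limit - n);
                if (r < 0) break;
                n += r;
            }
            return Arrays.copyOf(buf, n);
        } finally {
            in.reset();
        }
    }

    // Drains whatever is left in the stream.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int r;
        while ((r = in.read(buf)) != -1) {
            out.write(buf, 0, r);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "%PDF-1.4 example".getBytes(StandardCharsets.US_ASCII);
        // Wrapping adds the buffering that makes mark/reset possible.
        InputStream wrapped = new BufferedInputStream(readOnce(data));
        String head = new String(peek(wrapped, 4), StandardCharsets.US_ASCII);
        byte[] all = readAll(wrapped);
        System.out.println(head + " / " + all.length + " bytes still readable");
    }
}
```

Because the wrapper buffers the bytes consumed while peeking, the caller can still read the whole content afterwards, which is what a read-once stream cannot offer on its own.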

The reason why we only check the type hint from the input metadata after trying the other detection methods is that often such type hints (for example coming from a remote web server) are not very accurate. Thus we only use them if a more specific type can't automatically be detected.
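
As a rough, self-contained sketch of that ordering (not the actual {{MimeTypes}} code): the hint replaces the detected type only when it is a more specific variant of it. The {{PARENT}} table, the {{resolve}} method and the string-based {{isSpecializationOf}} below are all hypothetical simplifications; in particular, treating {{application/octet-stream}} as the root of every type is an assumption of this example.

```java
import java.util.HashMap;
import java.util.Map;

class HintOrderSketch {

    // Assumed root type, used when nothing more specific was detected.
    static final String ROOT = "application/octet-stream";

    // Hypothetical child -> parent table standing in for a real media type registry.
    static final Map<String, String> PARENT = new HashMap<>();
    static {
        PARENT.put("application/xhtml+xml", "application/xml");
        PARENT.put("application/xml", ROOT);
        PARENT.put("application/pdf", ROOT);
        PARENT.put("text/html", "text/plain");
        PARENT.put("text/plain", ROOT);
    }

    // True if 'base' is a (transitive) parent of 'candidate' in the toy table.
    static boolean isSpecializationOf(String candidate, String base) {
        for (String p = PARENT.get(candidate); p != null; p = PARENT.get(p)) {
            if (p.equals(base)) {
                return true;
            }
        }
        return false;
    }

    // 'detected' is the result of the earlier strategies (may be null),
    // 'hint' is the content type supplied in the metadata (may be null).
    static String resolve(String detected, String hint) {
        String type = (detected != null) ? detected : ROOT;
        if (hint != null && isSpecializationOf(hint, type)) {
            type = hint; // the hint refines what was detected
        }
        return type;     // otherwise an unrelated hint is ignored
    }

    public static void main(String[] args) {
        System.out.println(resolve("application/pdf", "text/plain"));
        System.out.println(resolve("application/xml", "application/xhtml+xml"));
        System.out.println(resolve(null, "text/html"));
    }
}
```

In this toy model a server that mislabels a PDF as {{text/plain}} does not override the detected type, while a hint that narrows generic XML down to XHTML is accepted.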

Resolving as Won't Fix.
                
> MimeTypes seems to be doing unnecessary work in the detect method
> -----------------------------------------------------------------
>
>                 Key: TIKA-1034
>                 URL: https://issues.apache.org/jira/browse/TIKA-1034
>             Project: Tika
>          Issue Type: Improvement
>          Components: mime
>    Affects Versions: 1.2
>            Reporter: Bice Dibley
>
> The final section of MimeTypes.detect is always used to set the type if provided in the metadata, but does this after using two other resolution strategies. Would it be possible to move the following to the top of the detect method
> {code}
> // Get type based on metadata hint (if available)
> String typeName = metadata.get(Metadata.CONTENT_TYPE);
> if (typeName != null) {
>     try {
>         MediaType hint = forName(typeName).getType();
>         if (registry.isSpecializationOf(hint, type)) {
>             type = hint;
>         }
>     } catch (MimeTypeException e) {
>         // Malformed type name, ignore
>     }
> }
> {code}
> and if the type is successfully set, return at that point rather than continuing with the other resolution strategies?
> The reason I ask is that I'm experiencing a problem with MimeTypes.detect causing the stream to be closed: the stream I'm using is read-once and so doesn't support mark/reset. However, I am passing the content type of the file in as part of the metadata, so the detect method shouldn't need to read from the stream.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira