Posted to dev@tika.apache.org by "Vadim Roizman (JIRA)" <ji...@apache.org> on 2013/12/26 20:32:50 UTC

[jira] [Updated] (TIKA-1110) Incorrectly declared SUPPORTED_TYPES in ChmParser.

     [ https://issues.apache.org/jira/browse/TIKA-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vadim Roizman updated TIKA-1110:
--------------------------------

    Attachment: TIKA-1110.patch

Nick, the patch lists all three types and also sets the content type in the metadata to "application/vnd.ms-htmlhelp".

> Incorrectly declared SUPPORTED_TYPES in ChmParser.
> --------------------------------------------------
>
>                 Key: TIKA-1110
>                 URL: https://issues.apache.org/jira/browse/TIKA-1110
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.3, 1.4
>            Reporter: Andrzej Bialecki 
>             Fix For: 1.5
>
>         Attachments: TIKA-1110.patch
>
>
> The [IANA registry|http://www.iana.org/assignments/media-types/application/vnd.ms-htmlhelp] assigns the official MIME type "application/vnd.ms-htmlhelp" to CHM files. Two other types are also used in the wild:
> * application/chm
> * application/x-chm
> tika-mimetypes.xml uses the correct official MIME type, but ChmParser declares support only for "application/chm". As a result, content labeled with the official MIME type (e.g. by a Detector, when parsed with AutoDetectParser, or simply declared in metadata) fails to parse because the MIME type is not recognized.
> The fix seems simple: ChmParser should also declare all of the above types in its SUPPORTED_TYPES.
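
To illustrate the proposed fix, here is a minimal, self-contained sketch of a supported-types set that covers all three MIME types and maps the two unofficial aliases onto the official one. It is not the attached patch: plain strings stand in for Tika's MediaType so that the example has no dependencies, and the class ChmTypeSketch and its methods supports() and canonicalize() are hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the type declaration in ChmParser.
// Plain strings are used instead of Tika's MediaType (an assumption
// made only to keep the sketch dependency-free).
final class ChmTypeSketch {

    // The IANA-registered type named in the issue.
    static final String OFFICIAL_TYPE = "application/vnd.ms-htmlhelp";

    // All three types from the issue: the official one plus the two
    // aliases seen in the wild.
    static final Set<String> SUPPORTED_TYPES = Collections.unmodifiableSet(
            new HashSet<String>(Arrays.asList(
                    OFFICIAL_TYPE,
                    "application/chm",
                    "application/x-chm")));

    // Returns true if the given MIME type is one of the declared types.
    // Comparison ignores case and surrounding whitespace.
    static boolean supports(String mimeType) {
        if (mimeType == null) {
            return false;
        }
        return SUPPORTED_TYPES.contains(mimeType.trim().toLowerCase(Locale.ROOT));
    }

    // Maps any supported alias to the official type, mirroring the idea of
    // setting the content type in metadata to the official value.
    // Unsupported input is returned unchanged.
    static String canonicalize(String mimeType) {
        return supports(mimeType) ? OFFICIAL_TYPE : mimeType;
    }

    public static void main(String[] args) {
        String[] samples = {
            "application/vnd.ms-htmlhelp",
            "application/chm",
            "application/x-chm",
            "text/plain"
        };
        for (String type : samples) {
            System.out.println(type + " supported=" + supports(type)
                    + " canonical=" + canonicalize(type));
        }
    }
}
```

In a real parser the set would hold MediaType instances and be returned from getSupportedTypes(); the point here is only that a lookup for the official type succeeds once all three types are declared.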


