Posted to dev@tika.apache.org by "Nick Burch (JIRA)" <ji...@apache.org> on 2014/06/06 15:22:02 UTC

[jira] [Commented] (TIKA-1326) MSI file detection

    [ https://issues.apache.org/jira/browse/TIKA-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019834#comment-14019834 ] 

Nick Burch commented on TIKA-1326:
----------------------------------

I was about to say "that can't possibly be right", but it looks like .msi files really are based on the OLE2 document structure. Bizarre!

Any chance you could help us find a really small (few kb?) liberally licensed .msi file, which we can use for tests?
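
As a rough illustration of the OLE2 point: every OLE2 compound document starts with the same 8-byte signature, so magic bytes alone cannot tell an .msi apart from, say, a legacy .doc, and the file name has to refine the result. The sketch below is only an assumption-laden illustration of that idea, not Tika's actual detection code; the class `Ole2MagicCheck` and its methods are hypothetical names, and the type strings are the ones from the quoted proposal.

```java
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical sketch: OLE2 magic check refined by file extension. Not Tika's API. */
public class Ole2MagicCheck {

    // The 8-byte signature found at offset 0 of every OLE2 compound document.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, (byte) 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, (byte) 0x1A, (byte) 0xE1
    };

    /** Returns true if the given header bytes begin with the OLE2 signature. */
    public static boolean looksLikeOle2(byte[] header) {
        if (header == null || header.length < OLE2_MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(header, OLE2_MAGIC.length), OLE2_MAGIC);
    }

    /**
     * Guesses a media type from the header bytes plus the file name.
     * The extension is only consulted once the container is known to be OLE2,
     * mirroring the sub-class-of relationship in the proposed definition.
     */
    public static String guessType(byte[] header, String fileName) {
        if (!looksLikeOle2(header)) {
            return "application/octet-stream";
        }
        String name = fileName == null ? "" : fileName.toLowerCase(Locale.ROOT);
        if (name.endsWith(".msi") || name.endsWith(".msp") || name.endsWith(".mst")) {
            return "application/x-ms-installer";
        }
        // Generic OLE2 container; a real detector would inspect the streams inside.
        return "application/x-tika-msoffice";
    }
}
```

With this sketch, an OLE2 header named `setup.msi` maps to `application/x-ms-installer`, while the same bytes named `report.doc` stay at the generic parent type.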

> MSI file detection
> ------------------
>
>                 Key: TIKA-1326
>                 URL: https://issues.apache.org/jira/browse/TIKA-1326
>             Project: Tika
>          Issue Type: Improvement
>          Components: mime
>    Affects Versions: 1.5
>            Reporter: Luis Filipe Nassif
>            Priority: Minor
>
> Please remove the *.msi extension from the application/x-msdownload mime-type definition, where it is incorrectly listed, and add the following mime-type to tika-mimetypes.xml:
> {code}
> <mime-type type="application/x-ms-installer">
>     <_comment>Microsoft Windows Installer</_comment>
>     <sub-class-of type="application/x-tika-msoffice"/>
>     <glob pattern="*.msi"/>
>     <glob pattern="*.msp"/>
>     <glob pattern="*.mst"/>
> </mime-type>
> {code}


