Posted to dev@tika.apache.org by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/01 16:38:00 UTC

[jira] [Commented] (TIKA-2436) Support for GZIP-compressed EMF files

    [ https://issues.apache.org/jira/browse/TIKA-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109247#comment-16109247 ] 

Tim Allison commented on TIKA-2436:
-----------------------------------

Thank you for explaining this in more detail, [~mcaruanagalizia].  [~gagravarr], given that emf and wmf are nearly always embedded in MSOffice files, should we do the detection there and enable the EMFParser and WMFParser to handle emz and wmz, respectively?  If someone somehow came across a standalone file, they'd get GZ->EMF.

So, that's alternative 1.  Alternatives 2 and 3 I really don't like:
2) create a separate detector for emz/wmz
3) add emf/wmf detection inside the CompressorParser...the HORROR!

Option 4, I guess, is to stop now and leave it as is, but that doesn't meet [~mcaruanagalizia]'s use case.
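
For illustration, here is a rough sketch of the kind of check any of these alternatives would need: confirm the gzip magic bytes, decompress just the first 44 bytes, and look for the EMF header (record type 1 at offset 0, " EMF" signature at offset 40). It uses only the plain JDK; the EmzSniffer class and its method names are hypothetical and are not part of Tika's detector or parser API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not a Tika class: sniffs whether a gzip stream wraps an EMF.
public class EmzSniffer {
    // The EMF signature ends at byte 44 of the header record.
    private static final int EMF_HEADER_LEN = 44;

    // gzip streams start with the magic bytes 0x1F 0x8B.
    static boolean isGzip(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xFF) == 0x1F
                && (data[1] & 0xFF) == 0x8B;
    }

    // EMF: little-endian record type 1 (EMR_HEADER) at offset 0, " EMF" at offset 40.
    static boolean isEmfHeader(byte[] h, int len) {
        return len >= EMF_HEADER_LEN
                && h[0] == 1 && h[1] == 0 && h[2] == 0 && h[3] == 0
                && h[40] == ' ' && h[41] == 'E' && h[42] == 'M' && h[43] == 'F';
    }

    // True if data is gzip and its decompressed prefix looks like an EMF header.
    static boolean looksLikeEmz(byte[] data) {
        if (!isGzip(data)) {
            return false;
        }
        byte[] head = new byte[EMF_HEADER_LEN];
        int n = 0;
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            // Read only as much as the header check needs.
            while (n < head.length) {
                int r = in.read(head, n, head.length - n);
                if (r < 0) {
                    break;
                }
                n += r;
            }
        } catch (IOException e) {
            return false; // corrupt or truncated gzip: treat as "not EMZ"
        }
        return isEmfHeader(head, n);
    }

    // Demo helper: gzip-compress a byte array in memory.
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(raw);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Synthetic stand-in for an EMF header, not a real metafile.
        byte[] fakeEmf = new byte[64];
        fakeEmf[0] = 1;
        fakeEmf[40] = ' ';
        fakeEmf[41] = 'E';
        fakeEmf[42] = 'M';
        fakeEmf[43] = 'F';
        System.out.println(looksLikeEmz(gzip(fakeEmf)));       // prints "true"
        System.out.println(looksLikeEmz(gzip(new byte[64])));  // prints "false"
    }
}
```

Where a check like this would live (the MSOffice embedded-file handling, a separate detector, or the CompressorParser) is exactly the open question above; the sketch takes no position on that.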

> Support for GZIP-compressed EMF files
> -------------------------------------
>
>                 Key: TIKA-2436
>                 URL: https://issues.apache.org/jira/browse/TIKA-2436
>             Project: Tika
>          Issue Type: Improvement
>          Components: mime, parser
>    Affects Versions: 1.15
>            Reporter: Matthew Caruana Galizia
>         Attachments: image004.emz
>
>
> Tika is currently detecting EMZ (compressed EMF) files as simple gzip files. These files should instead be detected as EMF files and the EMFParser should perform decompression transparently.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)