Posted to dev@tika.apache.org by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2009/05/27 20:47:45 UTC

[jira] Resolved: (TIKA-193) PDFParser adds mime-type twice

     [ https://issues.apache.org/jira/browse/TIKA-193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-193.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 0.4
         Assignee: Jukka Zitting

Patch committed in revision 779269, thanks! Resolving as Fixed.

Re: Setting the type only in AutoDetectParser:
There are cases where the specific parser classes are used directly, and even then it is useful to have the content type metadata set. Also, in some cases the specific parser implementation may have more information than AutoDetectParser and can thus provide a more accurate content type.

> PDFParser adds mime-type twice
> ------------------------------
>
>                 Key: TIKA-193
>                 URL: https://issues.apache.org/jira/browse/TIKA-193
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.3
>            Reporter: Jonathan Koren
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 0.4
>
>         Attachments: patch
>
>
> Using AutoDetectParser to call PDFParser causes the mime-type to be added twice.  It should be added exactly once.
> Proposed Fix:
> parser/pdf/PDFParser.java should be changed from:
> metadata.add(Metadata.CONTENT_TYPE, "application/pdf");
> to:
> metadata.set(Metadata.CONTENT_TYPE, "application/pdf");
> as per other Tika bundled parsers.
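
The difference between the two calls in the proposed fix can be illustrated with a minimal, self-contained sketch. SimpleMetadata below is a hypothetical stand-in for a multi-valued metadata store, not Tika's actual Metadata class. The sketch assumes that the content type has already been recorded once before the specific parser runs, and that add() appends to the existing values while set() replaces them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ContentTypeDemo {

    /** Hypothetical multi-valued metadata store (an assumption, not Tika's class). */
    public static class SimpleMetadata {
        private final Map<String, List<String>> values = new HashMap<>();

        /** Appends a value, keeping any values already stored under the name. */
        public void add(String name, String value) {
            values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }

        /** Replaces whatever is stored under the name with a single value. */
        public void set(String name, String value) {
            List<String> single = new ArrayList<>();
            single.add(value);
            values.put(name, single);
        }

        /** Returns all values stored under the name (empty list if none). */
        public List<String> getValues(String name) {
            return values.getOrDefault(name, new ArrayList<>());
        }
    }

    public static void main(String[] args) {
        String key = "Content-Type";

        // Assumed scenario: the type was recorded once before the parser ran.
        SimpleMetadata withAdd = new SimpleMetadata();
        withAdd.set(key, "application/pdf");
        withAdd.add(key, "application/pdf"); // parser appends: duplicate entry
        System.out.println(withAdd.getValues(key)); // [application/pdf, application/pdf]

        SimpleMetadata withSet = new SimpleMetadata();
        withSet.set(key, "application/pdf");
        withSet.set(key, "application/pdf"); // parser overwrites: single entry
        System.out.println(withSet.getValues(key)); // [application/pdf]
    }
}
```

With set(), the parser can still be used on its own and leave exactly one content type behind, which matches the reasoning in the reply above.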

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.