Posted to dev@tika.apache.org by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2007/10/11 22:58:50 UTC

[jira] Commented: (TIKA-56) Mime type detection fails with upper case file extensions such as "PDF".

    [ https://issues.apache.org/jira/browse/TIKA-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534157 ] 

Chris A. Mattmann commented on TIKA-56:
---------------------------------------

Hi Keith:

I'm not necessarily sure that this is a bug, no? If you're doing mime detection with magic turned off, and it has to use the file extension, is it ever the case (no pun intended ;) ) where the "case" of the file extension matters? If so, then I would suggest we not change the mime system to be case insensitive.

Know of any cases where this is true?
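
For reference, here is a rough sketch of what case-insensitive extension matching could look like if we did go that way. Everything in it is made up for illustration: the class name, the detect() method, and the small extension table are assumptions, not Tika's actual mime code.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

public class ExtensionLookupSketch {

    // Hypothetical extension table; a real mime registry would be loaded from configuration.
    private static final Map<String, String> TYPES = Map.of(
            "pdf", "application/pdf",
            "doc", "application/msword",
            "html", "text/html");

    // Returns the mime type for the file name's extension, ignoring the extension's case.
    static Optional<String> detect(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty(); // no extension to look up
        }
        // Locale.ROOT keeps lower-casing independent of the JVM's default locale
        // (for example, the Turkish dotless i).
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(TYPES.get(ext));
    }

    public static void main(String[] args) {
        System.out.println(detect("x.pdf"));
        System.out.println(detect("x.PDF"));
        System.out.println(detect("x.HtMl"));
        System.out.println(detect("x.unknown"));
    }
}
```

With this approach "x.pdf", "x.PDF" and "x.HtMl" all resolve through the same lower-case key, so the lookup table itself only ever needs lower-case entries.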

> Mime type detection fails with upper case file extensions such as "PDF".
> ------------------------------------------------------------------------
>
>                 Key: TIKA-56
>                 URL: https://issues.apache.org/jira/browse/TIKA-56
>             Project: Tika
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 0.1-incubator
>            Reporter: Keith R. Bennett
>            Priority: Critical
>             Fix For: 0.1-incubator
>
>
> Mime type detection only seems to work when the file extension is lower case.  Both PDF and DOC extensions failed.
> To test this, add the following method to TestParsers:
>     public void testGetParsers() throws TikaException, MalformedURLException {
>         assertNotNull(ParseUtils.getParser(new URL("file:x.pdf"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.PDF"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.doc"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.DOC"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.txt"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.TXT"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.html"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.HTML"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.HtMl"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.htm"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.HTM"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.ppt"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.PPT"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.xls"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.XLS"), tc));
>         // more?
>     }

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.