Posted to dev@tika.apache.org by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2007/10/11 22:58:50 UTC
[jira] Commented: (TIKA-56) Mime type detection fails with upper case file extensions such as "PDF".
[ https://issues.apache.org/jira/browse/TIKA-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534157 ]
Chris A. Mattmann commented on TIKA-56:
---------------------------------------
Hi Keith:
I'm not necessarily sure that this is a bug, no? If you're doing mime detection with magic turned off, so that it has to rely on the file extension, is it ever the case (no pun intended ;) ) that the "case" of the file extension matters? If so, then I would suggest we not change the mime system to be case insensitive.
Know of any cases where this is true?
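For reference, a case-insensitive lookup could be as small as normalizing the extension before consulting the table. This is only a sketch under assumptions: the ExtensionLookup class, its detect method, and the table contents are made up for illustration and are not Tika's actual mime API.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class ExtensionLookup {
    // Hypothetical extension-to-type table, keyed by lower-case extension.
    private static final Map<String, String> TYPES = new HashMap<>();
    static {
        TYPES.put("pdf", "application/pdf");
        TYPES.put("doc", "application/msword");
        TYPES.put("html", "text/html");
        TYPES.put("htm", "text/html");
        TYPES.put("txt", "text/plain");
    }

    /** Returns the mapped type for the name's extension, or null if there is none. */
    public static String detect(String name) {
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return null; // no extension to look at
        }
        // Locale.ROOT keeps the lower-casing independent of the default locale.
        String ext = name.substring(dot + 1).toLowerCase(Locale.ROOT);
        return TYPES.get(ext);
    }

    public static void main(String[] args) {
        System.out.println(detect("x.pdf"));
        System.out.println(detect("x.PDF"));
        System.out.println(detect("x.HtMl"));
    }
}
```

With this approach "x.pdf", "x.PDF" and "x.HtMl" all resolve through the same lower-case key, which is only acceptable if no two types are ever distinguished by extension case alone.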
> Mime type detection fails with upper case file extensions such as "PDF".
> ------------------------------------------------------------------------
>
> Key: TIKA-56
> URL: https://issues.apache.org/jira/browse/TIKA-56
> Project: Tika
> Issue Type: Bug
> Components: general
> Affects Versions: 0.1-incubator
> Reporter: Keith R. Bennett
> Priority: Critical
> Fix For: 0.1-incubator
>
>
> Mime type detection only seems to work when the file extension is lower case. Both PDF and DOC extensions failed.
> To test this, add the following method to TestParsers:
>     public void testGetParsers() throws TikaException, MalformedURLException {
>         assertNotNull(ParseUtils.getParser(new URL("file:x.pdf"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.PDF"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.doc"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.DOC"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.txt"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.TXT"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.html"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.HTML"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.HtMl"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.htm"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.HTM"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.ppt"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.PPT"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.xls"), tc));
>         assertNotNull(ParseUtils.getParser(new URL("file:x.XLS"), tc));
>         // more?
>     }