Posted to dev@tika.apache.org by "Nick Burch (Jira)" <ji...@apache.org> on 2020/06/12 05:47:00 UTC
[jira] [Commented] (TIKA-3113) Currently Tika is detecting a .aux file as text/html
[ https://issues.apache.org/jira/browse/TIKA-3113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133944#comment-17133944 ]
Nick Burch commented on TIKA-3113:
----------------------------------
I'm not sure what this is, but I'm fairly sure it isn't LaTeX...
Maybe some kind of scientific format?
> Currently Tika is detecting a .aux file as text/html
> ----------------------------------------------------
>
> Key: TIKA-3113
> URL: https://issues.apache.org/jira/browse/TIKA-3113
> Project: Tika
> Issue Type: Bug
> Components: detector
> Affects Versions: 1.24
> Reporter: Danny McKinney
> Priority: Minor
> Attachments: TES.PC.00010363.1.aux
>
>
> While processing files from an Enron test data set, a file with the extension .aux was detected as MediaType text/html. The file contains <Header> and <Data> elements, but I believe it is a type of LaTeX file. I am attaching a sample file. [^TES.PC.00010363.1.aux]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)