Posted to dev@tika.apache.org by "Ghenadie (JIRA)" <ji...@apache.org> on 2018/09/05 08:51:00 UTC
[jira] [Created] (TIKA-2723) Issue with parsing .mht container
Ghenadie created TIKA-2723:
------------------------------
Summary: Issue with parsing .mht container
Key: TIKA-2723
URL: https://issues.apache.org/jira/browse/TIKA-2723
Project: Tika
Issue Type: Bug
Components: mime
Affects Versions: 1.17
Reporter: Ghenadie
Fix For: 1.17
Hello,
I have a file with the .mht extension. Tika detects this file as an email (Is Email? - true) and uses RFC822Parser to parse it. As a result, the extracted content contains email fields such as From, To, CC, BCC, and Subject.
This is a problem for me, and it appears to be a bug in Tika. Since .mht is a web archive container, it should not be parsed by RFC822Parser (which is an email parser).
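For context, a rough sketch of why the two formats are easy to confuse and one way they might be told apart. An .mht (MHTML) file begins with the same kind of header block as an RFC 822 message, but its top-level Content-Type is normally multipart/related. The class and method names below (MhtSniffer, looksLikeMhtml) are hypothetical and are not part of Tika; the check is only a heuristic, since a real email can also be multipart/related at the top level.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; this is NOT a Tika API.
final class MhtSniffer {

    private static final String CONTENT_TYPE = "content-type:";

    /**
     * Returns true when the top-level header block declares
     * Content-Type: multipart/related, which MHTML archives normally use.
     * Assumption: the input starts with RFC 822 style headers.
     */
    static boolean looksLikeMhtml(String text) {
        StringBuilder contentType = new StringBuilder();
        boolean inContentType = false;

        for (String line : text.split("\r?\n", -1)) {
            if (line.isEmpty()) {
                break; // a blank line ends the header block
            }
            char first = line.charAt(0);
            if (first == ' ' || first == '\t') {
                // Folded header line: belongs to the previous header.
                if (inContentType) {
                    contentType.append(' ').append(line.trim());
                }
                continue;
            }
            inContentType = line.toLowerCase(Locale.ROOT).startsWith(CONTENT_TYPE);
            if (inContentType) {
                contentType.setLength(0);
                contentType.append(line.substring(CONTENT_TYPE.length()).trim());
            }
        }
        return contentType.toString()
                .toLowerCase(Locale.ROOT)
                .startsWith("multipart/related");
    }

    public static void main(String[] args) {
        // Made-up sample headers resembling a browser-saved .mht file.
        String sample = String.join("\r\n",
                "From: <Saved by a browser>",
                "Subject: Example page",
                "MIME-Version: 1.0",
                "Content-Type: multipart/related;",
                "\ttype=\"text/html\";",
                "\tboundary=\"----example\"",
                "",
                "------example");
        System.out.println(looksLikeMhtml(sample));
    }
}
```

Because the From and Subject headers are present in both cases, a detector that only looks for those fields would classify such a file as an email, which matches the behaviour described above.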
Please investigate this issue as soon as possible.
Please let me know if you have any questions.
Thank you,
Ghenadie R.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)