Posted to dev@tika.apache.org by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/07/03 14:03:00 UTC

[jira] [Resolved] (TIKA-2403) Elasticsearch 5.2.2 - Ingest Node - PDF - Parsing Issue

     [ https://issues.apache.org/jira/browse/TIKA-2403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison resolved TIKA-2403.
-------------------------------
    Resolution: Not A Problem

> Elasticsearch 5.2.2 - Ingest Node - PDF - Parsing Issue
> -------------------------------------------------------
>
>                 Key: TIKA-2403
>                 URL: https://issues.apache.org/jira/browse/TIKA-2403
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Boopathi
>         Attachments: SampleDocument.pdf
>
>
> We are using Elasticsearch 5.2.2 for full-text search. With the help of the ingest node, we are able to parse the content of files that Tika supports. We are facing an issue while parsing the content of some PDF files. The content of the file is parsed successfully, but the output also contains additional terms that are not part of the document's content. [sample screen shot|https://www.screencast.com/t/AQWK9Rzvrdo8]. Kindly let me know the reason for this and how it can be fixed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)