Posted to dev@tika.apache.org by "Amit Kumar (JIRA)" <ji...@apache.org> on 2017/02/01 11:23:51 UTC

[jira] [Comment Edited] (TIKA-2249) Tika not able to parse tables from pdf

    [ https://issues.apache.org/jira/browse/TIKA-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15848246#comment-15848246 ] 

Amit Kumar edited comment on TIKA-2249 at 2/1/17 11:22 AM:
-----------------------------------------------------------

[~tallison@mitre.org]: I tried Aspose.Pdf for Java (free trial version), and it works quite well at converting PDF to HTML. To use it seamlessly I would have to buy a license, unfortunately. I just wish we had such solutions in open source as well. The Aspose team has definitely done something that Tika lacks and could build upon.

They provide lots of options, such as saving the HTML file with images embedded, and the converted HTML file keeps the table structure and other structures intact.


was (Author: devilsuse):
[~tallison@mitre.org]: I tried Aspose.Pdf for Java (free trial version), and it works quite well at converting PDF to HTML. To use it seamlessly I would have to buy a license, unfortunately. I just wish we had such solutions in open source as well. The Aspose team has definitely done something that Tika lacks and could build upon.

> Tika not able to parse tables from pdf 
> ---------------------------------------
>
>                 Key: TIKA-2249
>                 URL: https://issues.apache.org/jira/browse/TIKA-2249
>             Project: Tika
>          Issue Type: Bug
>          Components: handler
>            Reporter: Amit Kumar
>         Attachments: Japanese.pdf
>
>
> Tika is not able to parse tables from PDF files. I wanted to attach the sample PDF I tried, but the attachment/browse link is not visible to me.


