Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2014/06/09 21:51:02 UTC

[jira] [Commented] (PDFBOX-18) Possible to Extract Just Table From PDF

    [ https://issues.apache.org/jira/browse/PDFBOX-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025643#comment-14025643 ] 

Tilman Hausherr commented on PDFBOX-18:
---------------------------------------

Indeed, there is no such thing as a "table" in a PDF. See also the similar question and its answer here:
http://stackoverflow.com/q/23828463/535646

Many OCR programs attempt to identify "tables"; sometimes it works, sometimes it doesn't.
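
To illustrate why any "table" has to be derived rather than read out, here is a minimal sketch of the kind of heuristic such tools rely on. It assumes you already have text fragments with page coordinates (for example, collected during text extraction); the class TableRowGrouper, the Fragment type and the tolerance value are hypothetical names for illustration and are not part of PDFBox. Fragments whose y coordinates are close together are treated as one row, and each row is ordered by x.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class TableRowGrouper {

    /** Hypothetical holder for a piece of text and the position where it starts. */
    public static final class Fragment {
        final double x;
        final double y;
        final String text;

        public Fragment(double x, double y, String text) {
            this.x = x;
            this.y = y;
            this.text = text;
        }
    }

    /**
     * Groups fragments into rows: a fragment joins the current row when its y
     * differs from the y of the row's first fragment by at most yTolerance.
     * Assumes y grows downward; cells inside a row are ordered left to right.
     */
    public static List<List<String>> groupIntoRows(List<Fragment> fragments, double yTolerance) {
        List<Fragment> sorted = new ArrayList<>(fragments);
        sorted.sort(Comparator.comparingDouble((Fragment f) -> f.y));

        List<List<Fragment>> rows = new ArrayList<>();
        double rowY = 0;
        for (Fragment f : sorted) {
            if (rows.isEmpty() || Math.abs(f.y - rowY) > yTolerance) {
                rows.add(new ArrayList<>()); // start a new row
                rowY = f.y;
            }
            rows.get(rows.size() - 1).add(f);
        }

        List<List<String>> result = new ArrayList<>();
        for (List<Fragment> row : rows) {
            row.sort(Comparator.comparingDouble((Fragment f) -> f.x));
            List<String> cells = new ArrayList<>();
            for (Fragment f : row) {
                cells.add(f.text);
            }
            result.add(cells);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample positions; real fragments rarely line up exactly.
        List<Fragment> fragments = Arrays.asList(
                new Fragment(200, 100.4, "Price"),
                new Fragment(50, 100.0, "Item"),
                new Fragment(50, 120.0, "Apple"),
                new Fragment(200, 119.8, "1.20"));
        for (List<String> row : groupIntoRows(fragments, 2.0)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

A guess like this breaks as soon as cells wrap onto several lines, columns are merged, or ordinary paragraphs happen to line up, which is why the result is hit-and-miss.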

> Possible to Extract Just Table From PDF
> ---------------------------------------
>
>                 Key: PDFBOX-18
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-18
>             Project: PDFBox
>          Issue Type: New Feature
>          Components: Text extraction
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552835&aid=1020203
> Originally submitted by nobody on 2004-08-31 23:23.
> Sir, I want to know whether it is possible to extract just a table
> from a PDF file. If it is possible, then
> tell me how I can identify in the streams that a given stream
> contains a table.
> Sir, I should also mention that I previously
> extracted the text from a PDF file, and I know the whole
> structure of a PDF file.
> Just tell me the exact way to identify it.
> Sir, I am waiting for your reply.
> [comment on SourceForge]
> Originally sent by benlitchfield.
> Logged In: YES 
> user_id=601708
> This is an RFE for table support, not a bug request, so I 
> am changing the issue type.  In addition, PDF documents do 
> not contain 'tables', so that information would need to be 
> derived and could only be done with little accuracy.  I am 
> changing the priority to 1, as I will probably never 
> implement this myself.  Please feel free to submit a patch 
> though.
> Ben


