Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2015/11/09 17:45:11 UTC
[jira] [Closed] (PDFBOX-495) PDFTextStripperByArea extracts text only from 1 region, despite several regions being defined
[ https://issues.apache.org/jira/browse/PDFBOX-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tilman Hausherr closed PDFBOX-495.
----------------------------------
Resolution: Cannot Reproduce
Closing for lack of details. Please reopen only if you attach
- a PDF
- some code that shows the problem on a current version.
> PDFTextStripperByArea extracts text only from 1 region, despite several regions being defined
> ---------------------------------------------------------------------------------------------
>
> Key: PDFBOX-495
> URL: https://issues.apache.org/jira/browse/PDFBOX-495
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 0.8.0-incubator
> Environment: Debian, java SE 6
> Reporter: Ismael Hasan
>
> When extracting text from several regions defined in a PDFTextStripperByArea, only
> the text from one region is retrieved. The problem can be reproduced with the following steps:
> Divide a page into 4 regions and add the regions to the stripper in
> the following order:
> 1-upper left, 2-upper right, 3-lower left, 4-lower right.
> After calling the "extractRegions" function, only the text for the third
> region is retrieved.
> If the third region is not added (i.e., only regions 1, 2 and 4 are added), only the text for region 2 is retrieved.
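
A reproduction along the lines requested above could start from something like the following. It is only a sketch of the region setup described in the report, not the reporter's code: it builds the four quadrant rectangles in the reported order using only the JDK, so it runs without PDFBox. The class name, the region names, and the 612 x 792 page size are made up for illustration, and the assumption that region coordinates are measured from the top-left corner of the page (y growing downward) should be checked against the PDFBox version under test.

```java
import java.awt.geom.Rectangle2D;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox.
class QuadrantRegions {

    /**
     * Splits a page of the given size into four equal quadrants.
     * Insertion order matches the report: upper left, upper right,
     * lower left, lower right.
     * Assumption: the origin is the top-left corner and y grows downward.
     */
    static Map<String, Rectangle2D> quadrants(double pageWidth, double pageHeight) {
        double w = pageWidth / 2.0;
        double h = pageHeight / 2.0;
        // LinkedHashMap keeps the regions in the order they were added.
        Map<String, Rectangle2D> regions = new LinkedHashMap<>();
        regions.put("upperLeft", new Rectangle2D.Double(0, 0, w, h));
        regions.put("upperRight", new Rectangle2D.Double(w, 0, w, h));
        regions.put("lowerLeft", new Rectangle2D.Double(0, h, w, h));
        regions.put("lowerRight", new Rectangle2D.Double(w, h, w, h));
        return regions;
    }

    public static void main(String[] args) {
        // 612 x 792 points is used here only as a sample page size.
        for (Map.Entry<String, Rectangle2D> e : quadrants(612, 792).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
        // With PDFBox on the classpath, each entry would be passed to
        // PDFTextStripperByArea.addRegion(name, rect), followed by one call to
        // extractRegions(page) and a getTextForRegion(name) call per region.
    }
}
```

If the bug were still present, the getTextForRegion results for three of the four names would come back empty even though each quadrant contains text; attaching the PDF together with such a program is what the closing comment asks for.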