Posted to dev@pdfbox.apache.org by "John Hewson (JIRA)" <ji...@apache.org> on 2014/10/11 00:20:34 UTC

[jira] [Closed] (PDFBOX-123) too many space made in extracted text file

     [ https://issues.apache.org/jira/browse/PDFBOX-123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Hewson closed PDFBOX-123.
------------------------------
       Resolution: Fixed
    Fix Version/s: 2.0.0

Without the PDF I can't confirm that this is fixed, but the relevant code in 2.0 has been fixed for similar issues.
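
For anyone who still has the original test files, one way to check whether the symptom persists is to count the whitespace gaps that sit directly between two Chinese characters in the extracted text. The sketch below uses only the Java standard library; the class and method names (CjkSpacingCheck, countGapsBetweenHan) are hypothetical and not part of PDFBox, and it assumes the extracted text is already available as a String.

```java
import java.lang.Character.UnicodeScript;

// Hypothetical helper for checking extracted text; not part of PDFBox.
final class CjkSpacingCheck {

    // True if the code point belongs to the Han script (Chinese characters).
    static boolean isHan(int codePoint) {
        return UnicodeScript.of(codePoint) == UnicodeScript.HAN;
    }

    // Counts runs of spaces or tabs that sit directly between two Han characters.
    static int countGapsBetweenHan(String text) {
        int[] cps = text.codePoints().toArray();
        int gaps = 0;
        int i = 0;
        while (i < cps.length) {
            if (!isHan(cps[i])) {
                i++;
                continue;
            }
            // Skip over any spaces or tabs that follow this Han character.
            int j = i + 1;
            while (j < cps.length && (cps[j] == ' ' || cps[j] == '\t')) {
                j++;
            }
            // A gap counts only if whitespace was skipped and a Han character follows.
            if (j > i + 1 && j < cps.length && isHan(cps[j])) {
                gaps++;
            }
            i = j;
        }
        return gaps;
    }
}
```

Calling CjkSpacingCheck.countGapsBetweenHan on the contents of testPdf.txt from an old build and from a 2.0 build would show whether the number of such gaps has dropped. A nonzero count is not automatically a bug, since a PDF may contain intentional spacing.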

> too many space made in extracted text file
> ------------------------------------------
>
>                 Key: PDFBOX-123
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-123
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>             Fix For: 2.0.0
>
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552832&aid=1416011
> Originally submitted by nobody on 2006-01-26 21:34.
> Hi,
> I am using FontBox-0.1.0-dev.jar and PDFBox-0.7.3-dev.jar
> from version PDFBox-0.7.3-dev-20060126.
> My test PDF file is test.pdf, and the extracted
> text file is testPdf.txt; both are in the uploaded test.zip file.
> Extraction succeeded, but many spaces were
> inserted between the Chinese characters.
> Please check! Thanks.
> Regards
> NanFei
> [attachment on SourceForge]
> http://sourceforge.net/tracker/download.php?group_id=78314&atid=552832&aid=1416011&file_id=165091
> test.zip (application/x-zip-compressed), 7307 bytes
> One is test.pdf; the other is the extracted text file, testPdf.txt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)