Posted to dev@pdfbox.apache.org by "Ashok Chigullapally (JIRA)" <ji...@apache.org> on 2011/02/08 01:54:57 UTC
[jira] Updated: (PDFBOX-957) Text extraction using ExtractText (pdf
file is input file) generates some weird characters
[ https://issues.apache.org/jira/browse/PDFBOX-957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ashok Chigullapally updated PDFBOX-957:
---------------------------------------
Attachment: Resume1.pdf
Resume file (PDF) whose text cannot be extracted correctly.
> Text extraction using ExtractText (pdf file is input file) generates some weird characters
> ------------------------------------------------------------------------------------------
>
> Key: PDFBOX-957
> URL: https://issues.apache.org/jira/browse/PDFBOX-957
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 1.4.0
> Environment: Windows 7
> Reporter: Ashok Chigullapally
> Priority: Critical
> Labels: pdfbox, textExtraction
> Attachments: Resume1.pdf, Resume2.pdf
>
>
> When I try to extract text from a PDF document, the output contains gibberish text.
> ExtractText.exe "\Jobvite\Resumes\Resume-Boston.pdf" Resume-Boston.txt
> I will provide the PDF documents on request; I could not find a way to include attachments.
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira