Posted to users@pdfbox.apache.org by Slava G <sl...@gmail.com> on 2020/01/22 09:27:48 UTC
Incorrect text extraction of the PDF
Hi,
I have a PDF that looks fine in readers, but when I try to extract
text I get garbage.
What am I doing wrong?
PDF is attached.
Thanks
Re: Incorrect text extraction of the PDF
Posted by Slava G <sl...@gmail.com>.
Thanks Maruan,
I got the explanation.
Slava
On Wed, Jan 22, 2020 at 12:18 PM Maruan Sahyoun <sa...@fileaffairs.de>
wrote:
> Hi,
>
> please take a look at the FAQ at
>
> https://pdfbox.apache.org/2.0/faq.html#how-come-i-am-getting-gibberishg38g43g36g51g5-when-extracting-text
>
> BR
> Maruan
Re: Incorrect text extraction of the PDF
Posted by Maruan Sahyoun <sa...@fileaffairs.de>.
Hi,
please take a look at the FAQ at
https://pdfbox.apache.org/2.0/faq.html#how-come-i-am-getting-gibberishg38g43g36g51g5-when-extracting-text
BR
Maruan
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org
Re: Incorrect text extraction of the PDF
Posted by Gilad Denneboom <gi...@gmail.com>.
You can't attach files here directly. Upload the file to a file-sharing website
(Dropbox, Google Drive, etc.) and then post a link to it.
On Wed, Jan 22, 2020 at 10:28 AM Slava G <sl...@gmail.com> wrote:
> Hi,
> I have a PDF that looks fine in readers, but when I try to extract
> text I get garbage.
> What am I doing wrong?
> PDF is attached.
> Thanks