Posted to users@pdfbox.apache.org by Thamizh Thomas <th...@gmail.com> on 2015/01/30 10:19:00 UTC

Tamil PDF issues

Hi All,

I need to read the content of Tamil PDF files and persist it in a database.
When I read them using PDFBox, the characters come out garbled and unreadable.
Please help me resolve this issue. If you have worked on something similar,
please post a sample code snippet.

-----

*Thanks*

*   Thamizh Thomas A*

Re: Tamil PDF issues

Posted by Maruan Sahyoun <sa...@fileaffairs.de>.
Hi,

When you write 'read pdf content', do you mean extracting the text? If so, can the text be extracted using Adobe Reader? What output do you get when you save the file as text with Adobe Reader?

BR

Maruan

On 30.01.2015 at 10:19, Thamizh Thomas <th...@gmail.com> wrote:

> Hi All,
> 
> I need to read the content of Tamil PDF files and persist it in a database.
> When I read them using PDFBox, the characters come out garbled and unreadable.
> Please help me resolve this issue. If you have worked on something similar,
> please post a sample code snippet.
> 
> -----
> 
> *Thanks*
> 
> *   Thamizh Thomas A*