
Posted to users@pdfbox.apache.org by Krishnan Kishore <sk...@hovservices.in> on 2010/11/26 12:54:04 UTC

RE: Identity-H (solution may help )

Hi.

I ran into the same issue while using a different PDF library.

Once I identified the font used for the PDF text and installed it,
I got the correct characters.

Reason:
I think that if you are getting "??" characters, the Windows
application (Notepad etc.) cannot find a glyph for the Unicode
character, so it displays it as "?". That is why the font needs to
be installed.

While debugging, if the Unicode value you get matches the character
in the text, then your conversion is correct.
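
The debugging check described above can be sketched in Java, the language PDFBox is written in. This is only a minimal sketch and not PDFBox code: the class CodePointDump and the method describe are hypothetical names, and the sample string stands in for whatever text your extraction step returns. If the dump shows U+003F, the question mark really is in the extracted text; if it shows other code points, the extraction probably worked and the "?" comes from the font used to display the text.

```java
// Hypothetical helper, not part of PDFBox: lists the code points of a string.
final class CodePointDump {

    // Returns e.g. "U+0041 U+003F" for the input "A?".
    static String describe(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
            // Advance by one or two chars so surrogate pairs stay together.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in for text returned by the extraction step (assumed sample data).
        String extracted = "A\u2714?";
        System.out.println(describe(extracted));
    }
}
```

Comparing the printed values with the characters you expect from the PDF tells you whether the problem is in the conversion or only in the display.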

Hope this helps 
Cheers
Krishna Kishore

-----Original Message-----
From: arun segar [mailto:arunsegar@gmail.com] 
Sent: Friday, November 26, 2010 4:42 PM
To: users@pdfbox.apache.org; users-help@pdfbox.apache.org
Subject: Re: Identity-H

Hi Guys,

Any update on my request below?

Thanks,
Arun Segar

On Mon, Nov 22, 2010 at 12:48 PM, arun segar <ar...@gmail.com> wrote:

> Hi,
>
> Can anyone help me solve an issue with Identity-H fonts while
> extracting text from a PDF?
>
> When I extract the text, characters in Identity-H fonts come out as
> question marks (*?*). Can anybody help me with this?
>
> Thanks,
> Arun Segar

Confidentiality Notice:  This transmittal is a confidential communication.  If you are not the intended recipient, you are hereby notified that you have received this transmittal in error and that any review, dissemination, distribution or copying of this transmittal is strictly prohibited.  If you have received this communication in error, please notify this office immediately by reply and immediately delete this message and all of its attachments, if any.

RE: Identity-H (solution may help )

Posted by Krishnan Kishore <sk...@hovservices.in>.

Hi Arun,

You need to identify the font and install the corresponding font
family on the machine where the conversion runs.

Cheers,
Krishna Kishore


-----Original Message-----
From: arun segar [mailto:arunsegar@gmail.com] 
Sent: Saturday, November 27, 2010 11:23 AM
To: users@pdfbox.apache.org
Subject: Re: Identity-H (solution may help )

Hi Krishna,

Thanks for your email. I don't have fonts like the one below available
in my workflow.

Font: ZapfDingbatsITC (Embedded Subset)
Type: Type 1 (CID)
Encoding: Identity-H

Is there any way to change the Identity-H encoding in an already built
PDF so that I can use it?

Your help in this regard is greatly appreciated.

Thanks,
Arun Segar





Re: Identity-H (solution may help )

Posted by arun segar <ar...@gmail.com>.

Hi Krishna,

Thanks for your email. I don't have fonts like the one below available
in my workflow.

Font: ZapfDingbatsITC (Embedded Subset)
Type: Type 1 (CID)
Encoding: Identity-H

Is there any way to change the Identity-H encoding in an already built
PDF so that I can use it?
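
As background to the question above: Identity-H is a predefined CMap in which each two-byte code in a text string is used directly as the CID, so the codes by themselves say nothing about Unicode. An extractor typically needs a separate mapping, such as a ToUnicode map in the font, to turn them into text. The following Java sketch illustrates that idea under stated assumptions. It is not PDFBox code: the class IdentityHSketch, its methods, the byte values, and the mapping table are all made up for illustration, and the "?" fallback is only an assumption about how an extractor might behave when no mapping exists.

```java
// Hypothetical sketch, not PDFBox code.
final class IdentityHSketch {

    // Identity-H: every two bytes form one code, and the CID equals that code.
    // A trailing odd byte, if any, is ignored in this sketch.
    static int[] toCids(byte[] data) {
        int[] cids = new int[data.length / 2];
        for (int i = 0; i < cids.length; i++) {
            cids[i] = ((data[2 * i] & 0xFF) << 8) | (data[2 * i + 1] & 0xFF);
        }
        return cids;
    }

    // Maps CIDs to text through a ToUnicode-style table.
    // Assumption: CIDs without an entry are replaced by "?".
    static String toText(int[] cids, java.util.Map<Integer, String> toUnicode) {
        StringBuilder out = new StringBuilder();
        for (int cid : cids) {
            out.append(toUnicode.getOrDefault(cid, "?"));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up string bytes: two codes, 0x0003 and 0x0011.
        byte[] shown = {0x00, 0x03, 0x00, 0x11};
        // Made-up table that knows only the first CID.
        java.util.Map<Integer, String> table = new java.util.HashMap<>();
        table.put(3, "\u2714");
        System.out.println(toText(toCids(shown), table));
    }
}
```

In this sketch the second code has no entry in the table, so it falls back to "?", which is one way question marks can appear even though the page itself displays correctly.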

Your help in this regard is greatly appreciated.

Thanks,
Arun Segar

