Posted to users@pdfbox.apache.org by John Hewson <jo...@jahewson.com> on 2016/08/01 18:23:53 UTC

Re: Character widths

> On 31 Jul 2016, at 14:01, Aaron Mulder <am...@gmail.com> wrote:
> 
> I was poking around the code for string widths in PDFont.getStringWidth etc.
> 
> It looks like to compute the width of a string it just adds up the
> widths of each individual character in the string:
> 
> https://svn.apache.org/viewvc/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/font/PDFont.java?revision=1750576&view=markup#l339
> 
> But the AFM files for the standard 14 fonts define pairwise kerning
> adjustments, so e.g. "AV" takes less width than the normal width of
> "A" plus the normal width of "V":
> 
> https://wwwimages2.adobe.com/content/dam/Adobe/en/devnet/font/pdfs/Core14_AFMs.zip
> 
> Is there something I'm missing?  Or does this just mean PDFont
> occasionally overestimates the width of text in those fonts?

That’s correct. getStringWidth is for use when creating new PDFs only, and we don’t
implement kerning, so the width you get will be “correct” with respect to the text
in the newly created PDF.
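
To make the difference concrete, here is a minimal Java sketch (not PDFBox
code) that compares a plain sum of advance widths with a kerned sum for
"AVAVAV". The class and method names are hypothetical, and the width and
kerning values are assumptions chosen for illustration; check them against
the Helvetica AFM file before relying on them. Widths are in 1/1000 of a
text space unit, so they are scaled by the font size at the end.

```java
import java.util.HashMap;
import java.util.Map;

class StringWidthSketch {
    // Advance widths in 1/1000 text space units (assumed, illustrative values).
    static final Map<Character, Integer> WIDTHS = new HashMap<>();
    // Pairwise kerning adjustments, keyed by the two-character pair (assumed values).
    static final Map<String, Integer> KERNS = new HashMap<>();

    static {
        WIDTHS.put('A', 667);
        WIDTHS.put('V', 667);
        KERNS.put("AV", -70);
        KERNS.put("VA", -80);
    }

    // Sum of the individual glyph widths, with no kerning applied.
    static int unkernedWidth(String text) {
        int total = 0;
        for (char c : text.toCharArray()) {
            Integer w = WIDTHS.get(c);
            if (w == null) {
                throw new IllegalArgumentException("no width for '" + c + "'");
            }
            total += w;
        }
        return total;
    }

    // Same sum, plus the kerning adjustment for every adjacent pair that has one.
    static int kernedWidth(String text) {
        int total = unkernedWidth(text);
        for (int i = 0; i + 1 < text.length(); i++) {
            total += KERNS.getOrDefault(text.substring(i, i + 2), 0);
        }
        return total;
    }

    // Convert 1/1000 text space units to user space units at a given font size.
    static double toUserSpace(int glyphUnits, double fontSize) {
        return glyphUnits / 1000.0 * fontSize;
    }

    public static void main(String[] args) {
        int plain = unkernedWidth("AVAVAV");  // 6 * 667 = 4002
        int kerned = kernedWidth("AVAVAV");   // 4002 - 3*70 - 2*80 = 3632
        System.out.printf("unkerned: %d units, %.3f at 12 pt%n", plain, toUserSpace(plain, 12));
        System.out.printf("kerned:   %d units, %.3f at 12 pt%n", kerned, toUserSpace(kerned, 12));
    }
}
```

With these assumed values the unkerned sum is what a plain per-character
width calculation would report; the kerned figure only matters if the
kerning is actually written into the content stream.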

The same goes for the “kern” and “GPOS” tables in TTF fonts. We don’t even parse them.

— John

> Thanks,
>      Aaron
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> For additional commands, e-mail: users-help@pdfbox.apache.org
> 




Re: Character widths

Posted by John Hewson <jo...@jahewson.com>.
> On 1 Aug 2016, at 14:12, Aaron Mulder <am...@gmail.com> wrote:
> 
> On Mon, Aug 1, 2016 at 2:23 PM, John Hewson <jo...@jahewson.com> wrote:
>> That’s correct. getStringWidth is for use when creating new PDFs only, and we don’t
>> implement kerning, so the width you get will be “correct” with respect to the text
>> in the newly created PDF.
> 
> So, I guess I don't understand this part.
> 
> My impression was that if you wrote something like "/Helvetica 12 Tf
> (AVAVAV) Tj" to the PDF document, a reader displaying the document
> would use the standard-14 Helvetica font to draw the string "AVAVAV"
> to the screen, and because that font implements kerning for those
> character combinations, the reader would display the text as more
> compressed horizontally than in the absence of kerning.

No, PDF is rather unusual in this respect. Any kerns have to be specified
manually in the array passed to the TJ operator. The idea is that the PDF
renderer is a “dumb” client:

e.g. “AWAY again” with A-W-A-Y kerned:

[ (A) 120 (W) 120 (A) 95 (Y again) ] TJ

See p251 of ISO 32000 for more details.
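
As a rough illustration of what a PDF producer has to do, the following
Java sketch builds such a TJ array from a string and a table of pair kerns.
It is not PDFBox code: the class and method names are hypothetical, and the
kern values are simply taken from the example above rather than from any
font file. Each number in a TJ array is subtracted from the current
position, in thousandths of a text space unit, so a negative AFM-style kern
(glyphs moved closer together) is written as a positive number.

```java
import java.util.HashMap;
import java.util.Map;

class TjArraySketch {
    // Pair kerns in AFM style (negative = tighter); values assumed from the example.
    static Map<String, Integer> exampleKerns() {
        Map<String, Integer> kerns = new HashMap<>();
        kerns.put("AW", -120);
        kerns.put("WA", -120);
        kerns.put("AY", -95);
        return kerns;
    }

    // Emit "[ (run) n (run) ... ] TJ", splitting the string wherever a kern applies.
    static String buildTjArray(String text, Map<String, Integer> kerns) {
        StringBuilder out = new StringBuilder("[ ");
        StringBuilder run = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // Parentheses and backslashes must be escaped in a PDF literal string.
            if (c == '(' || c == ')' || c == '\\') {
                run.append('\\');
            }
            run.append(c);
            if (i + 1 < text.length()) {
                Integer kern = kerns.get(text.substring(i, i + 2));
                if (kern != null && kern != 0) {
                    // Close the current run and write the negated kern after it.
                    out.append('(').append(run).append(") ").append(-kern).append(' ');
                    run.setLength(0);
                }
            }
        }
        if (run.length() > 0) {
            out.append('(').append(run).append(") ");
        }
        return out.append("] TJ").toString();
    }

    public static void main(String[] args) {
        // prints: [ (A) 120 (W) 120 (A) 95 (Y again) ] TJ
        System.out.println(buildTjArray("AWAY again", exampleKerns()));
    }
}
```

The sketch assumes a simple one-byte-per-character encoding; a real
producer would encode the text for the font in use before writing it.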

— John

> So my expectation is that if you did the commands above and then drew
> a box around it using the horizontal dimension provided by
> getStringWidth("AVAVAV"), your box would display as slightly larger
> than the visible text width due to the absence of kerning in
> calculating the box size based on the text width at the time you wrote
> the document, but the presence of kerning in rendering the text in the
> reader.
> 
> Where am I going wrong there?
> 
> Thanks,
>       Aaron


Re: Character widths

Posted by Aaron Mulder <am...@gmail.com>.
On Mon, Aug 1, 2016 at 2:23 PM, John Hewson <jo...@jahewson.com> wrote:
> That’s correct. getStringWidth is for use when creating new PDFs only, and we don’t
> implement kerning, so the width you get will be “correct” with respect to the text
> in the newly created PDF.

So, I guess I don't understand this part.

My impression was that if you wrote something like "/Helvetica 12 Tf
(AVAVAV) Tj" to the PDF document, a reader displaying the document
would use the standard-14 Helvetica font to draw the string "AVAVAV"
to the screen, and because that font implements kerning for those
character combinations, the reader would display the text as more
compressed horizontally than in the absence of kerning.

So my expectation is that if you did the commands above and then drew
a box around it using the horizontal dimension provided by
getStringWidth("AVAVAV"), your box would display as slightly larger
than the visible text width due to the absence of kerning in
calculating the box size based on the text width at the time you wrote
the document, but the presence of kerning in rendering the text in the
reader.

Where am I going wrong there?

Thanks,
       Aaron



Re: Character widths

Posted by John Hewson <jo...@jahewson.com>.
> On 1 Aug 2016, at 11:27, Tilman Hausherr <TH...@t-online.de> wrote:
> 
> Am 01.08.2016 um 20:23 schrieb John Hewson:
>> The same goes for the “kern” and “GPOS” tables in TTF fonts. We don’t even parse them.
> 
> 
> We do parse the kern table.

So we do, I see that it was added last year. That’s good.
GPOS is the important one though; “kern” is a legacy table.

> 
> Tilman


Re: Character widths

Posted by Tilman Hausherr <TH...@t-online.de>.
Am 01.08.2016 um 20:23 schrieb John Hewson:
> The same goes for the “kern” and “GPOS” tables in TTF fonts. We don’t even parse them.


We do parse the kern table.


Tilman

