Posted to users@pdfbox.apache.org by Pradhan C N <pr...@gmail.com> on 2013/07/19 09:34:28 UTC

PDFBox TextPosition.getHeightDir() gives 4 times the actual height only for font Cambria

Hi all,
I am using the PDFTextStripper from PDFBox 1.8.2. I noticed that if a PDF file
uses the Cambria font, the height reported for each text position is wrong:
it is exactly 4 times the actual height.
I have attached the test PDF file I used; below is the output from my
program for each character in the PDF. It's a simple PDF that contains the
word "cambria" in different font sizes.
I have tested with all other fonts and they give me the correct height, so
why does Cambria fail?

Font size 2

C getHeightDir() 5.3568 getYScale()  1.92
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 5.3568 getYScale()  1.92
getFont().getFontBoundingBox().getHeight() 5580.0
m getHeightDir() 5.3568 getYScale()  1.92
getFont().getFontBoundingBox().getHeight() 5580.0
b getHeightDir() 5.3568 getYScale()  1.92
getFont().getFontBoundingBox().getHeight() 5580.0
r getHeightDir() 5.3568 getYScale()  1.92
getFont().getFontBoundingBox().getHeight() 5580.0
i getHeightDir() 5.3568 getYScale()  1.92
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 5.3568 getYScale()  1.92
getFont().getFontBoundingBox().getHeight() 5580.0

Font size 4

c getHeightDir() 10.7136 getYScale()  3.84
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 10.7136 getYScale()  3.84
getFont().getFontBoundingBox().getHeight() 5580.0
m getHeightDir() 10.7136 getYScale()  3.84
getFont().getFontBoundingBox().getHeight() 5580.0
b getHeightDir() 10.7136 getYScale()  3.84
getFont().getFontBoundingBox().getHeight() 5580.0
r getHeightDir() 10.7136 getYScale()  3.84
getFont().getFontBoundingBox().getHeight() 5580.0
i getHeightDir() 10.7136 getYScale()  3.84
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 10.7136 getYScale()  3.84
getFont().getFontBoundingBox().getHeight() 5580.0

Font size 6

c getHeightDir() 16.740002 getYScale()  6.0
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 16.740002 getYScale()  6.0
getFont().getFontBoundingBox().getHeight() 5580.0
m getHeightDir() 16.740002 getYScale()  6.0
getFont().getFontBoundingBox().getHeight() 5580.0
b getHeightDir() 16.740002 getYScale()  6.0
getFont().getFontBoundingBox().getHeight() 5580.0
r getHeightDir() 16.740002 getYScale()  6.0
getFont().getFontBoundingBox().getHeight() 5580.0
i getHeightDir() 16.740002 getYScale()  6.0
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 16.740002 getYScale()  6.0
getFont().getFontBoundingBox().getHeight() 5580.0

Font size 8

c getHeightDir() 22.0968 getYScale()  7.9199996
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 22.0968 getYScale()  7.9199996
getFont().getFontBoundingBox().getHeight() 5580.0
m getHeightDir() 22.0968 getYScale()  7.9199996
getFont().getFontBoundingBox().getHeight() 5580.0
b getHeightDir() 22.0968 getYScale()  7.9199996
getFont().getFontBoundingBox().getHeight() 5580.0
r getHeightDir() 22.0968 getYScale()  7.9199996
getFont().getFontBoundingBox().getHeight() 5580.0
i getHeightDir() 22.0968 getYScale()  7.9199996
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 22.0968 getYScale()  7.9199996
getFont().getFontBoundingBox().getHeight() 5580.0

Font size 10

c getHeightDir() 27.453602 getYScale()  9.84
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 27.453602 getYScale()  9.84
getFont().getFontBoundingBox().getHeight() 5580.0
m getHeightDir() 27.453602 getYScale()  9.84
getFont().getFontBoundingBox().getHeight() 5580.0
b getHeightDir() 27.453602 getYScale()  9.84
getFont().getFontBoundingBox().getHeight() 5580.0
r getHeightDir() 27.453602 getYScale()  9.84
getFont().getFontBoundingBox().getHeight() 5580.0
i getHeightDir() 27.453602 getYScale()  9.84
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 27.453602 getYScale()  9.84
getFont().getFontBoundingBox().getHeight() 5580.0

Font size 12

c getHeightDir() 33.480003 getYScale()  12.0
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 33.480003 getYScale()  12.0
getFont().getFontBoundingBox().getHeight() 5580.0
m getHeightDir() 33.480003 getYScale()  12.0
getFont().getFontBoundingBox().getHeight() 5580.0
b getHeightDir() 33.480003 getYScale()  12.0
getFont().getFontBoundingBox().getHeight() 5580.0
r getHeightDir() 33.480003 getYScale()  12.0
getFont().getFontBoundingBox().getHeight() 5580.0
i getHeightDir() 33.480003 getYScale()  12.0
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 33.480003 getYScale()  12.0
getFont().getFontBoundingBox().getHeight() 5580.0

Font size 14

c getHeightDir() 38.836803 getYScale()  13.92
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 38.836803 getYScale()  13.92
getFont().getFontBoundingBox().getHeight() 5580.0
m getHeightDir() 38.836803 getYScale()  13.92
getFont().getFontBoundingBox().getHeight() 5580.0
b getHeightDir() 38.836803 getYScale()  13.92
getFont().getFontBoundingBox().getHeight() 5580.0
r getHeightDir() 38.836803 getYScale()  13.92
getFont().getFontBoundingBox().getHeight() 5580.0
i getHeightDir() 38.836803 getYScale()  13.92
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 38.836803 getYScale()  13.92
getFont().getFontBoundingBox().getHeight() 5580.0

Font size 16

c getHeightDir() 44.1936 getYScale()  15.839999
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 44.1936 getYScale()  15.839999
getFont().getFontBoundingBox().getHeight() 5580.0
m getHeightDir() 44.1936 getYScale()  15.839999
getFont().getFontBoundingBox().getHeight() 5580.0
b getHeightDir() 44.1936 getYScale()  15.839999
getFont().getFontBoundingBox().getHeight() 5580.0
r getHeightDir() 44.1936 getYScale()  15.839999
getFont().getFontBoundingBox().getHeight() 5580.0
i getHeightDir() 44.1936 getYScale()  15.839999
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 44.1936 getYScale()  15.839999
getFont().getFontBoundingBox().getHeight() 5580.0

Font size 18

c getHeightDir() 50.220005 getYScale()  18.0
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 50.220005 getYScale()  18.0
getFont().getFontBoundingBox().getHeight() 5580.0
m getHeightDir() 50.220005 getYScale()  18.0
getFont().getFontBoundingBox().getHeight() 5580.0
b getHeightDir() 50.220005 getYScale()  18.0
getFont().getFontBoundingBox().getHeight() 5580.0
r getHeightDir() 50.220005 getYScale()  18.0
getFont().getFontBoundingBox().getHeight() 5580.0
i getHeightDir() 50.220005 getYScale()  18.0
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 50.220005 getYScale()  18.0
getFont().getFontBoundingBox().getHeight() 5580.0

Font size 72

c getHeightDir() 200.88002 getYScale()  72.0
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 200.88002 getYScale()  72.0
getFont().getFontBoundingBox().getHeight() 5580.0
m getHeightDir() 200.88002 getYScale()  72.0
getFont().getFontBoundingBox().getHeight() 5580.0
b getHeightDir() 200.88002 getYScale()  72.0
getFont().getFontBoundingBox().getHeight() 5580.0
r getHeightDir() 200.88002 getYScale()  72.0
getFont().getFontBoundingBox().getHeight() 5580.0
i getHeightDir() 200.88002 getYScale()  72.0
getFont().getFontBoundingBox().getHeight() 5580.0
a getHeightDir() 200.88002 getYScale()  72.0
getFont().getFontBoundingBox().getHeight() 5580.0

As you can see, for font size 72 the reported height is 200.88, but the
actual height is 50.22, which is 200.88 / 4.
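
As a cross-check on the numbers above, here is a minimal sketch in plain Java
(no PDFBox calls) that divides each logged getHeightDir() value by its
getYScale(). The class and method names are made up for illustration. For
every font size the ratio comes out at about 2.79, which matches the logged
bounding box height divided by 2000 (5580 / 2000). This is only an
observation from the logged values, not a statement about how PDFBox
computes the height internally.

```java
final class HeightRatioCheck {
    // Values copied from the program output above: {getHeightDir(), getYScale()}.
    static final float[][] SAMPLES = {
        {5.3568f, 1.92f},
        {10.7136f, 3.84f},
        {16.740002f, 6.0f},
        {22.0968f, 7.9199996f},
        {27.453602f, 9.84f},
        {33.480003f, 12.0f},
        {38.836803f, 13.92f},
        {44.1936f, 15.839999f},
        {50.220005f, 18.0f},
        {200.88002f, 72.0f},
    };

    // getFont().getFontBoundingBox().getHeight() as logged for every character.
    static final float BBOX_HEIGHT = 5580.0f;

    // Hypothetical helper: reported height per unit of vertical scale.
    static float ratio(float heightDir, float yScale) {
        return heightDir / yScale;
    }

    public static void main(String[] args) {
        for (float[] s : SAMPLES) {
            System.out.printf("yScale=%.2f heightDir=%.4f ratio=%.4f%n",
                    s[1], s[0], ratio(s[0], s[1]));
        }
        // Printed for comparison with the ratios above.
        System.out.printf("bbox height / 2000 = %.4f%n", BBOX_HEIGHT / 2000f);
    }
}
```

Since the ratio is the same at every size, the factor does not depend on the
font size; it seems to follow the font's bounding box height, which may be
why only this font is affected.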

Thanks,
Pradhan

Re: PDFBox TextPosition.getHeightDir() gives 4 times the actual height only for font Cambria

Posted by Andreas Lehmkuehler <an...@lehmi.de>.
Hi,

On 19.07.2013 09:34, Pradhan C N wrote:
> Hi all,
> I am using PDFBox 1.8.2 PDFTextStripper. I noticed that if the pdf file has
> Cambria font the height of the text position is completely wrong. It turns out
> that the height is exactly 4 times the actual height.
> I have attached the test pdf file I used and also find below the output from my
Due to some restrictions, your PDF didn't make it to the list. Please either
send a link to a download location (a file hoster or something similar) or
create an issue in JIRA [1] and attach the PDF in question.

> program for each char in the pdf. Its a simple pdf which has the word cambria in
> different font sizes.
> I have tested with all other fonts and it gives me perfect height but why
> cambria fails ?
>
> Font size 2
>
> C getHeightDir() 5.3568 getYScale()  1.92
> getFont().getFontBoundingBox().getHeight() 5580.0
> a getHeightDir() 5.3568 getYScale()  1.92
> getFont().getFontBoundingBox().getHeight() 5580.0
> m getHeightDir() 5.3568 getYScale()  1.92
> getFont().getFontBoundingBox().getHeight() 5580.0
> b getHeightDir() 5.3568 getYScale()  1.92
> getFont().getFontBoundingBox().getHeight() 5580.0
> r getHeightDir() 5.3568 getYScale()  1.92
> getFont().getFontBoundingBox().getHeight() 5580.0
> i getHeightDir() 5.3568 getYScale()  1.92
> getFont().getFontBoundingBox().getHeight() 5580.0
> a getHeightDir() 5.3568 getYScale()  1.92
> getFont().getFontBoundingBox().getHeight() 5580.0
> SNIP

> Thanks,
> Pradhan

BR
Andreas Lehmkühler

[1] https://issues.apache.org/jira/browse/PDFBOX