Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2016/04/18 18:26:25 UTC

[jira] [Commented] (PDFBOX-3319) Chinese character overlap other chinese character

    [ https://issues.apache.org/jira/browse/PDFBOX-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15245944#comment-15245944 ] 

Tilman Hausherr commented on PDFBOX-3319:
-----------------------------------------

When removing the "!" the widths are OK. There is a problem with fonts in which some glyphs share identical hmetrics, i.e. where the last advanceWidth value applies to all subsequent glyphs (I'd call this a "partial monotype font"). I'm wondering whether the "catchall" entry ("The last entry applies to all subsequent glyphs") is copied in TTFSubsetter.buildHmtxTable.

In the font numHMetrics is 855. Some debug output:
{code}
Subsetter.add unicode 20320, gid 1430
Subsetter.add unicode 33, gid 4
Subsetter.add unicode 20013, gid 1123
Subsetter.add unicode 22269, gid 3379
Subsetter.add unicode 22909, gid 4019
{code}
The "!" (gid 4) is the only gid smaller than 855, so it is the only one with its own specific width. All the others share one identical width, the one from the last hmtx entry.
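To make the lookup rule concrete, here is a minimal sketch of how an advance width is resolved from the hmtx entries. It is not PDFBox code: the class name HmtxLookup, the method advanceWidth and the width values are made up for illustration; only the rule itself ("the last entry applies to all subsequent glyphs") is taken from the comment above.

```java
// Illustrative sketch only, not PDFBox code. All names and values are hypothetical.
final class HmtxLookup {
    private HmtxLookup() {
    }

    /**
     * Resolves the advance width of a glyph.
     *
     * @param advanceWidths the numHMetrics advance widths stored in hmtx
     * @param gid the glyph id to look up
     */
    static int advanceWidth(int[] advanceWidths, int gid) {
        if (advanceWidths.length == 0) {
            throw new IllegalArgumentException("hmtx needs at least one entry");
        }
        if (gid < 0) {
            throw new IllegalArgumentException("gid must not be negative");
        }
        // gids at or beyond numHMetrics all reuse the last stored width
        int index = Math.min(gid, advanceWidths.length - 1);
        return advanceWidths[index];
    }

    public static void main(String[] args) {
        // hypothetical font with numHMetrics = 3
        int[] widths = {500, 250, 1000};
        System.out.println(advanceWidth(widths, 1));    // own entry: 250
        System.out.println(advanceWidth(widths, 1430)); // catchall entry: 1000
    }
}
```

With numHMetrics at 855, gid 4 falls into the first branch and gids such as 1430 or 4019 fall into the catchall branch.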

numHMetrics for the subset is written in buildHheaTable:

{code}
writeUint16(out, glyphIds.subSet(0, h.getNumberOfHMetrics()).size());
{code}
So it just counts the used gids in the range 0 to 854. If any used gid is 855 or larger, the subset should get one extra entry that covers all of them, i.e. the one from gid 854. Currently the entry from gid 4 would be the last one, and the "!" is narrower than the Chinese glyphs.
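The effect of that count can be reproduced outside PDFBox with a plain sorted set. This is only a sketch: the class SubsetMetricsCount and its methods are hypothetical, and the set holds just the five gids from the debug output (a real subset may contain further gids).

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

// Illustrative sketch only, not PDFBox code. Class and method names are hypothetical.
final class SubsetMetricsCount {
    private SubsetMetricsCount() {
    }

    /** Same expression as in buildHheaTable: counts the used gids in [0, numHMetrics). */
    static int countLongMetrics(SortedSet<Integer> glyphIds, int numHMetrics) {
        return glyphIds.subSet(0, numHMetrics).size();
    }

    /**
     * Original gid whose width ends up as the last entry of the subset.
     * Throws NoSuchElementException if no used gid is below numHMetrics.
     */
    static int lastMetricSource(SortedSet<Integer> glyphIds, int numHMetrics) {
        return glyphIds.subSet(0, numHMetrics).last();
    }

    public static void main(String[] args) {
        // gids taken from the debug output above
        SortedSet<Integer> glyphIds = new TreeSet<>(Arrays.asList(1430, 4, 1123, 3379, 4019));
        System.out.println(countLongMetrics(glyphIds, 855)); // 1
        System.out.println(lastMetricSource(glyphIds, 855)); // 4
    }
}
```

Only gid 4 is counted, so its width becomes the last entry of the subset and is applied to every glyph after it.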

One quick solution would be to add this line at the end of the second TTFSubsetter constructor:
{code}
glyphIds.add(ttf.getHorizontalHeader().getNumberOfHMetrics()-1);
{code}
However, this would also mean that the subsetted font gets bigger than necessary (for this example, the difference is 263 bytes), because we don't check whether the extra glyph is needed, and because we would also write the path of that glyph instead of just writing one single hmtx entry (4 bytes).
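A possible alternative that avoids writing the extra glyph is to append the catchall width only when it is needed. The following is a rough sketch under several assumptions: it is not the actual PDFBox fix, the names SubsetHmtxSketch and subsetAdvanceWidths are hypothetical, the widths are invented, and left side bearings are ignored completely.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

// Illustrative sketch only, not the PDFBox implementation. Names and values are hypothetical.
final class SubsetHmtxSketch {
    private SubsetHmtxSketch() {
    }

    /**
     * Builds the advance widths for the subset font.
     *
     * @param originalWidths the numHMetrics advance widths of the original font
     * @param glyphIds the original gids kept in the subset (assumed non-negative)
     */
    static int[] subsetAdvanceWidths(int[] originalWidths, SortedSet<Integer> glyphIds) {
        int numHMetrics = originalWidths.length;
        if (numHMetrics == 0) {
            throw new IllegalArgumentException("hmtx needs at least one entry");
        }
        // gids below numHMetrics keep their own width
        SortedSet<Integer> withOwnWidth = glyphIds.headSet(numHMetrics);
        // one extra entry is needed if some gid relies on the catchall width
        // and the glyph carrying that width (numHMetrics - 1) is not in the subset
        boolean needsCatchAll = !glyphIds.isEmpty()
                && glyphIds.last() >= numHMetrics
                && !withOwnWidth.contains(numHMetrics - 1);
        int[] result = new int[withOwnWidth.size() + (needsCatchAll ? 1 : 0)];
        int i = 0;
        for (int gid : withOwnWidth) {
            result[i++] = originalWidths[gid];
        }
        if (needsCatchAll) {
            result[i] = originalWidths[numHMetrics - 1];
        }
        return result;
    }

    public static void main(String[] args) {
        // hypothetical font with numHMetrics = 6; the last width is the catchall
        int[] widths = {500, 250, 600, 300, 250, 1000};
        SortedSet<Integer> glyphIds = new TreeSet<>(Arrays.asList(4, 10, 20));
        System.out.println(Arrays.toString(subsetAdvanceWidths(widths, glyphIds))); // [250, 1000]
    }
}
```

In this sketch the extra entry lines up with the first subset glyph whose original gid is beyond the original numHMetrics, so every glyph after it falls back to the catchall width. The numHMetrics value written to hhea for the subset would have to match the length of the returned array.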

> Chinese character overlap other chinese character
> -------------------------------------------------
>
>                 Key: PDFBOX-3319
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3319
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 2.0.0
>            Reporter: huazhong
>            Priority: Critical
>         Attachments: SimHei.ttf, TestPDF.java, china-word.pdf, testChinese.pdf
>
>
> I'm using SimHei.ttf, copied from my Windows fonts folder.
> I found that when I use this font to show Chinese, a character will overlap the other Chinese characters. This happens only for Chinese characters; English is OK.
> It looks like the second character starts at half the width of the first character?


