Posted to dev@pdfbox.apache.org by "Petras (JIRA)" <ji...@apache.org> on 2015/08/07 17:31:45 UTC

[jira] [Commented] (PDFBOX-2923) CFFParser parser treats CIDFont's charset data as SID

    [ https://issues.apache.org/jira/browse/PDFBOX-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661991#comment-14661991 ] 

Petras commented on PDFBOX-2923:
--------------------------------

Here is a test case that passes once the patch is applied.
{code:java}
    @Test
    public void testParseCIDFont() throws Exception {
        final CFFParser parser = new CFFParser();

        byte[] sourceBytes = ... ; // bytes from MyriadPro-Regular.cff

        final List<CFFFont> fontList = parser.parse(sourceBytes);
        Assert.assertEquals(1, fontList.size());
        final CFFFont cffFont = fontList.get(0);

        // correct number of charstrings
        Assert.assertEquals("expected 3 charstrings", 3, cffFont.getCharStringsDict().size());

        List<CFFCharset.Entry> charsetEntries = cffFont.getCharset().getEntries();

        // the charset entries should carry the CIDs (exposed through getSID())
        Assert.assertEquals("expected 2 entries in charset data", 2, charsetEntries.size());
        Assert.assertEquals("expected CID 469 in charset data", 469, charsetEntries.get(0).getSID());
        Assert.assertEquals("expected name \"469\" for CID 469", "469", charsetEntries.get(0).getName());
        Assert.assertEquals("expected CID 508 in charset data", 508, charsetEntries.get(1).getSID());
        Assert.assertEquals("expected name \"508\" for CID 508", "508", charsetEntries.get(1).getName());

        // correct width expected
        Assert.assertEquals(500, cffFont.getWidth(0));
        Assert.assertEquals(501, cffFont.getWidth(469));
        Assert.assertEquals(551, cffFont.getWidth(508));
    }
{code}

> CFFParser parser treats CIDFont's charset data as SID
> -----------------------------------------------------
>
>                 Key: PDFBOX-2923
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2923
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 1.8.10
>            Reporter: Petras
>         Attachments: MyriadPro-Regular.cff, Patch_to_fix_PDFBOX-2923.patch
>
>
> As stated in the Compact Font Format specification:
> {quote}
> The charset data, although in the same format as non-CIDFonts, will represent CIDs rather than SIDs, i.e. charstrings are “named” by CIDs in a CIDFont.
> {quote}
> Unfortunately, {{CFFParser}} does not take this special case into account and always treats charset data as SIDs: it looks up the text referenced by each SID in the _String INDEX_ structure. Since no such SID-indexed string exists there, it sets the name of the glyph to "{{.notdef}}".
> Consequently, {{CFFParser}} fails to register the charstrings correctly, because it stores them in a map keyed by glyph name. When several charstrings end up with the same name, only the last charstring entry is retained.
> As a result, the {{CFFFont.getWidth()}} method also fails to return the correct width for a given CID, because the link between CID and charstring is lost.
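
The collision described in the quoted report can be illustrated with a small, self-contained sketch. It does not use the FontBox API: the class {{CharsetNamingSketch}}, its methods, and the empty String INDEX stand-in are all hypothetical, and the sketch assumes that glyph 0 is named ".notdef" and that an unresolved SID falls back to the same name.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class CharsetNamingSketch {

    // Hypothetical stand-in for a String INDEX lookup: an SID with no
    // matching string falls back to ".notdef" (assumption based on the report).
    static String nameFromSid(int sid, Map<Integer, String> stringIndex) {
        String name = stringIndex.get(sid);
        return name != null ? name : ".notdef";
    }

    // Reported behaviour: charset values are resolved as SIDs, so every
    // glyph of a CIDFont gets the same name and overwrites the previous entry.
    static Map<String, byte[]> registerBySid(int[] charset, byte[][] charStrings,
                                             Map<Integer, String> stringIndex) {
        Map<String, byte[]> result = new LinkedHashMap<String, byte[]>();
        result.put(".notdef", charStrings[0]); // glyph 0 is implicit in the charset
        for (int i = 0; i < charset.length; i++) {
            result.put(nameFromSid(charset[i], stringIndex), charStrings[i + 1]);
        }
        return result;
    }

    // Behaviour the spec quote asks for: each charstring is "named" by its CID.
    static Map<String, byte[]> registerByCid(int[] charset, byte[][] charStrings) {
        Map<String, byte[]> result = new LinkedHashMap<String, byte[]>();
        result.put(".notdef", charStrings[0]);
        for (int i = 0; i < charset.length; i++) {
            result.put(String.valueOf(charset[i]), charStrings[i + 1]);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] charset = {469, 508};               // CIDs used in the test case
        byte[][] charStrings = {{0}, {1}, {2}};   // dummy charstring data
        Map<Integer, String> emptyStringIndex = new HashMap<Integer, String>();

        // prints 1: all three charstrings collapse onto ".notdef"
        System.out.println(registerBySid(charset, charStrings, emptyStringIndex).size());
        // prints 3: ".notdef", "469" and "508" stay distinct
        System.out.println(registerByCid(charset, charStrings).size());
    }
}
```

With names derived from CIDs, the map keeps one entry per glyph, which matches the three charstrings the test case expects.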


