Posted to dev@pdfbox.apache.org by "John Hewson (JIRA)" <ji...@apache.org> on 2018/04/15 00:12:00 UTC

[jira] [Comment Edited] (PDFBOX-4189) Enable rendering of Indian languages, by reading and utilizing the GSUB table

    [ https://issues.apache.org/jira/browse/PDFBOX-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16438540#comment-16438540 ] 

John Hewson edited comment on PDFBOX-4189 at 4/15/18 12:11 AM:
---------------------------------------------------------------

Hi guys, this is a really welcome contribution, thank you. With regard to PDFont#encode(String text) being non-final, I can add some insight, as I was the original designer of our current PDFont#encode mechanism.

Basically, the PDFont classes are designed to represent fonts identically to how they are represented when embedded in PDF files. So there's no support for OpenType, by design. A Type0 font knows nothing about OpenType.

So how can we use OpenType in PDFBox? The answer is that we do it one layer of abstraction up, during text _layout_ instead of text _encoding_. So you want to put your glyph substitution code inside PDPageContentStream#showText, or more precisely inside [PDPageContentStream#showTextInternal|https://github.com/apache/pdfbox/blob/7e721643c0b1fca9fdc349f78431f36e68abc097/pdfbox/src/main/java/org/apache/pdfbox/contentstream/PDAbstractContentStream.java#L256].

That way PDFont#encode(String text) can stay final :)
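
A minimal sketch of the layering described above, in plain Java with no PDFBox dependency. Everything in it is hypothetical and for illustration only: the class name LayoutLayerSketch, the LIGATURES map (a stand-in for lookups that would really come from a font's GSUB table), the glyph IDs, and the assumption that each glyph ID is written as a two-byte code, as with an Identity-H encoded Type0 font. It is not how PDFBox implements showTextInternal; it only shows substitution happening in a layout step, with the encoding step left unaware of it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LayoutLayerSketch {

    // Hypothetical stand-in for GSUB ligature data:
    // a sequence of glyph IDs maps to one replacement glyph ID.
    static final Map<List<Integer>, Integer> LIGATURES = new LinkedHashMap<>();
    static {
        LIGATURES.put(Arrays.asList(10, 11, 12), 100);
        LIGATURES.put(Arrays.asList(10, 11), 101);
    }

    // Layout step: at each position, apply the longest matching substitution.
    static List<Integer> substitute(List<Integer> glyphs) {
        List<Integer> out = new ArrayList<>();
        int i = 0;
        while (i < glyphs.size()) {
            int bestLen = 0;
            int replacement = -1;
            for (Map.Entry<List<Integer>, Integer> e : LIGATURES.entrySet()) {
                List<Integer> key = e.getKey();
                int end = i + key.size();
                if (key.size() > bestLen && end <= glyphs.size()
                        && glyphs.subList(i, end).equals(key)) {
                    bestLen = key.size();
                    replacement = e.getValue();
                }
            }
            if (bestLen > 0) {
                out.add(replacement);
                i += bestLen;
            } else {
                out.add(glyphs.get(i));
                i++;
            }
        }
        return out;
    }

    // Encoding step: knows nothing about substitution.
    // Assumes a two-byte, big-endian code per glyph ID.
    static byte[] encode(List<Integer> glyphs) {
        byte[] bytes = new byte[glyphs.size() * 2];
        for (int i = 0; i < glyphs.size(); i++) {
            int gid = glyphs.get(i);
            bytes[2 * i] = (byte) (gid >> 8);
            bytes[2 * i + 1] = (byte) gid;
        }
        return bytes;
    }

    // Analogue of the content-stream layer: substitute first, then encode.
    static byte[] showText(List<Integer> glyphs) {
        return encode(substitute(glyphs));
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(10, 11, 12, 10, 11, 5);
        System.out.println(substitute(input));
        System.out.println(Arrays.toString(showText(input)));
    }
}
```

The point of the split is that encode never needs to be overridden: all script-specific logic lives in substitute, which the caller runs before handing glyphs to the encoder.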


was (Author: jahewson):
Hi guys, this is a really welcome contribution, thank you. With regards to PDFont#encode(String text) being non-final I can add some insight as I was the original designer of our current PDFont#encode mechanism.

Basically, the PDFont classes are designed to represent fonts identically to how they are represented when embedded in PDF files. So there's no support for OpenType, by design. A Type0 font knows nothing about OpenType (by design).

So how can we use OpenType in PDFBox? The answer is that we do it one layer of abstraction up, during text _layout_ instead of text _encoding_. So you want to put your glyph substitution code inside PDPageContentStream#showText, actually you want [PDPageContentStream#showTextInternal|https://github.com/apache/pdfbox/blob/7e721643c0b1fca9fdc349f78431f36e68abc097/pdfbox/src/main/java/org/apache/pdfbox/contentstream/PDAbstractContentStream.java#L256].

That way PDFont#encode(String text) can stay non-final :)

> Enable rendering of Indian languages, by reading and utilizing the GSUB table
> -----------------------------------------------------------------------------
>
>                 Key: PDFBOX-4189
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4189
>             Project: PDFBox
>          Issue Type: New Feature
>          Components: FontBox, PDModel
>            Reporter: Palash Ray
>            Priority: Major
>         Attachments: Bengali-text-after.pdf, Bengali-text-before.pdf
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Implemented proper rendering of Indian languages, which need extensive glyph substitution. The GSUB table is read and used to replace some compound words with their respective glyphs. All tests are passing. I have tested this with a Bengali font. Please review these changes and let me know if it makes sense to incorporate them.
