Posted to dev@pdfbox.apache.org by Maruan Sahyoun <sa...@fileaffairs.de> on 2016/05/18 16:55:01 UTC

PDFBox and word forming for certain languages

Hi,

Longer term, I'd like to support word forming for languages such as Arabic in form filling and annotations.

For that, I'd like to understand whether we already have access to the GSUB table in OpenType fonts, and to discuss where the forming would best fit:

- font.encode()
- showText()
- a separate step prior to these

There is already some code at Apache FOP [1] where we could potentially work with the colleagues.

BR
Maruan  


[1] http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/fop-core/src/main/java/org/apache/fop/complexscripts/
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org


Re: PDFBox and word forming for certain languages

Posted by John Hewson <jo...@jahewson.com>.
> On 22 May 2016, at 15:14, John Hewson <jo...@jahewson.com> wrote:
> 
> 
>> On 18 May 2016, at 09:55, Maruan Sahyoun <sa...@fileaffairs.de> wrote:
>> 
>> Hi,
>> 
>> Longer term, I'd like to support word forming for languages such as Arabic in form filling and annotations.
>> 
>> For that, I'd like to understand whether we already have access to the GSUB table in OpenType fonts, and to discuss where the forming would best fit:

Oh, I should add that we don’t parse the GSUB table currently. It’s quite a complex table, too. Again, FOP has code to do this. Note that fonts embedded in PDFs likely have this table stripped.
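
As a rough illustration of how one might check whether a given font file still carries the table, here is a minimal sketch that scans the sfnt table directory for a 4-character tag. It is not PDFBox or FontBox code: the class name SfntTables and the method hasTable are made up for this example, and it assumes a single-font sfnt file (TrueType or OpenType), not a TrueType Collection.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of PDFBox or FontBox. */
final class SfntTables {
    private SfntTables() {}

    /** Returns true if the sfnt table directory lists a table with the given 4-character tag. */
    static boolean hasTable(byte[] font, String tag) {
        ByteBuffer buf = ByteBuffer.wrap(font);   // ByteBuffer is big-endian by default, like sfnt data
        if (buf.remaining() < 12) {
            return false;                         // too short to hold the 12-byte header
        }
        buf.getInt();                             // sfntVersion (not checked in this sketch)
        int numTables = buf.getShort() & 0xFFFF;  // uint16 number of table records
        buf.position(12);                         // skip searchRange, entrySelector, rangeShift

        byte[] raw = new byte[4];
        // Each table record is 16 bytes: tag, checksum, offset, length.
        for (int i = 0; i < numTables && buf.remaining() >= 16; i++) {
            buf.get(raw);                         // 4-byte table tag
            buf.position(buf.position() + 12);    // skip checksum, offset, length
            if (tag.equals(new String(raw, StandardCharsets.US_ASCII))) {
                return true;
            }
        }
        return false;
    }
}
```

With the bytes of a font file loaded into an array, SfntTables.hasTable(bytes, "GSUB") would report whether the directory lists the table. Parsing the table contents is the hard part and is not attempted here.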

— John

>> - font.encode()
>> - showText()
>> - a separate step prior to these
> 
> showText() is a good choice; you could keep the existing behaviour as showUnshapedText(). What you
> want to implement is “shaping”, which maps Unicode characters to glyph indexes plus position deltas.
> AWT represents the shaped glyphs as a GlyphVector; you might want to take inspiration from that API.
> 
> — John
> 
>> There is already some code at Apache FOP [1] where we could potentially work with the colleagues.
>> 
>> BR
>> Maruan  
>> 
>> 
>> [1] http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/fop-core/src/main/java/org/apache/fop/complexscripts/
>> 
> 




Re: PDFBox and word forming for certain languages

Posted by John Hewson <jo...@jahewson.com>.
> On 18 May 2016, at 09:55, Maruan Sahyoun <sa...@fileaffairs.de> wrote:
> 
> Hi,
> 
> Longer term, I'd like to support word forming for languages such as Arabic in form filling and annotations.
> 
> For that, I'd like to understand whether we already have access to the GSUB table in OpenType fonts, and to discuss where the forming would best fit:
> 
> - font.encode()
> - showText()
> - a separate step prior to these

showText() is a good choice; you could keep the existing behaviour as showUnshapedText(). What you
want to implement is “shaping”, which maps Unicode characters to glyph indexes plus position deltas.
AWT represents the shaped glyphs as a GlyphVector; you might want to take inspiration from that API.
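
To make the idea of mapping Unicode characters to glyph indexes and position deltas more concrete, here is a toy sketch of what a shaper's output could look like. It is not FOP's, AWT's or PDFBox's implementation: the class ToyShaper, its methods, and every glyph id and table entry are hypothetical, and the two hash maps only stand in for what a real shaper would read from a font's GSUB and GPOS tables.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy shaper for illustration; all names, glyph ids and table entries are hypothetical. */
final class ToyShaper {

    /** One shaped glyph: a glyph index plus a horizontal position delta in font units. */
    static final class ShapedGlyph {
        final int glyphId;
        final int xDelta;

        ShapedGlyph(int glyphId, int xDelta) {
            this.glyphId = glyphId;
            this.xDelta = xDelta;
        }
    }

    private final Map<Integer, Integer> cmap = new HashMap<>();    // code point -> nominal glyph id
    private final Map<Long, Integer> ligatures = new HashMap<>();  // glyph pair -> ligature glyph (stand-in for GSUB)
    private final Map<Long, Integer> kerning = new HashMap<>();    // glyph pair -> x delta (stand-in for GPOS)

    private static long pair(int left, int right) {
        return ((long) left << 32) | (right & 0xFFFFFFFFL);
    }

    void mapChar(int codePoint, int glyphId) { cmap.put(codePoint, glyphId); }
    void addLigature(int left, int right, int ligature) { ligatures.put(pair(left, right), ligature); }
    void addKerning(int left, int right, int xDelta) { kerning.put(pair(left, right), xDelta); }

    List<ShapedGlyph> shape(String text) {
        // 1. Map each code point to its nominal glyph id (0 for unmapped characters).
        List<Integer> glyphs = new ArrayList<>();
        text.codePoints().forEach(cp -> glyphs.add(cmap.getOrDefault(cp, 0)));

        // 2. Substitution: replace an adjacent pair with its ligature glyph, if one is registered.
        for (int i = 0; i + 1 < glyphs.size(); i++) {
            Integer ligature = ligatures.get(pair(glyphs.get(i), glyphs.get(i + 1)));
            if (ligature != null) {
                glyphs.set(i, ligature);
                glyphs.remove(i + 1);   // int argument, so this removes by index
            }
        }

        // 3. Positioning: give each glyph a delta that depends on its left neighbour.
        List<ShapedGlyph> shaped = new ArrayList<>();
        for (int i = 0; i < glyphs.size(); i++) {
            int delta = i == 0 ? 0 : kerning.getOrDefault(pair(glyphs.get(i - 1), glyphs.get(i)), 0);
            shaped.add(new ShapedGlyph(glyphs.get(i), delta));
        }
        return shaped;
    }
}
```

For Arabic, the substitution step would instead pick initial, medial, final or isolated forms depending on the neighbouring letters, which is the part the font's GSUB table drives. A shaped showText() could then write the resulting glyph ids and adjustments, while showUnshapedText() would stop after step 1.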

— John

> There is already some code at Apache FOP [1] where we could potentially work with the colleagues.
> 
> BR
> Maruan  
> 
> 
> [1] http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/fop-core/src/main/java/org/apache/fop/complexscripts/
> 

