Posted to dev@pdfbox.apache.org by "John Hewson (JIRA)" <ji...@apache.org> on 2016/11/01 17:25:58 UTC

[jira] [Comment Edited] (PDFBOX-3550) Unicode Letters fail to join

    [ https://issues.apache.org/jira/browse/PDFBOX-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15626063#comment-15626063 ] 

John Hewson edited comment on PDFBOX-3550 at 11/1/16 5:25 PM:
--------------------------------------------------------------

Parsing the OpenType tables is only the tip of the iceberg. Many complex scripts (such as Arabic) require a shaping engine, which in turn needs deep knowledge of each script in order to apply the rules in the OpenType tables.

I looked into FOP's shaping engine: it doesn't support many scripts and in general doesn't look particularly great. Right now HarfBuzz is the only real option, but it's written in C++.
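
To make the point about shaping concrete, here is a toy sketch of the one step that matters for the word in this issue: picking an isolated, initial, medial, or final form for each Arabic letter based on its neighbours. It is not PDFBox, FOP, or HarfBuzz code. The class ToyArabicShaper and its four-letter table are hypothetical, and it maps to Unicode Arabic Presentation Forms-B code points, whereas a real shaping engine substitutes glyphs through the font's GSUB features and also handles ligatures, marks, and many other scripts.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy illustration only: covers just the four letters of the sample word. */
class ToyArabicShaper {

    // Presentation forms per letter, in the order {isolated, final, initial, medial}.
    // A 0 entry means the letter has no such form, i.e. it never joins to the next letter.
    private static final Map<Character, char[]> FORMS = new HashMap<>();
    static {
        FORMS.put('\u0633', new char[] {'\uFEB1', '\uFEB2', '\uFEB3', '\uFEB4'}); // seen
        FORMS.put('\u0644', new char[] {'\uFEDD', '\uFEDE', '\uFEDF', '\uFEE0'}); // lam
        FORMS.put('\u0645', new char[] {'\uFEE1', '\uFEE2', '\uFEE3', '\uFEE4'}); // meem
        FORMS.put('\u0627', new char[] {'\uFE8D', '\uFE8E', 0, 0});               // alef
    }

    /** True if the letter is known and can connect to the letter after it. */
    private static boolean joinsToNext(char c) {
        char[] forms = FORMS.get(c);
        return forms != null && forms[2] != 0;
    }

    /** Replaces each known letter (in logical order) with its contextual form. */
    public static String shape(String logical) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < logical.length(); i++) {
            char c = logical.charAt(i);
            char[] forms = FORMS.get(c);
            if (forms == null) {
                out.append(c); // unknown character: pass through unchanged
                continue;
            }
            boolean joinedToPrev = i > 0 && joinsToNext(logical.charAt(i - 1));
            boolean joinedToNext = i + 1 < logical.length()
                    && joinsToNext(c)
                    && FORMS.containsKey(logical.charAt(i + 1));
            int index = joinedToPrev ? (joinedToNext ? 3 : 1) : (joinedToNext ? 2 : 0);
            out.append(forms[index]);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The word from the issue: seen, lam, alef, meem.
        String shaped = shape("\u0633\u0644\u0627\u0645");
        StringBuilder hex = new StringBuilder();
        for (char ch : shaped.toCharArray()) {
            hex.append(String.format("U+%04X ", (int) ch));
        }
        System.out.println(hex.toString().trim()); // prints "U+FEB3 U+FEE0 U+FE8E U+FEE1"
    }
}
```

Even this tiny case needs per-letter knowledge (alef never joins to the following letter, so the meem ends up isolated), and a real engine would normally also form the lam-alef ligature here. That is the kind of script-specific rule the tables alone do not give you.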


was (Author: jahewson):
Parsing the OpenType tables is only the tip of the iceberg. Many complex scripts (such as Arabic) require a shaping engine, which in turn needs deep knowledge of each script in order to apply the rules in the OpenType tables.

I looked into FOP's shaping engine: it doesn't support many scripts and in general doesn't look particularly great.

> Unicode Letters fail to join
> ----------------------------
>
>                 Key: PDFBOX-3550
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3550
>             Project: PDFBox
>          Issue Type: New Feature
>          Components: FontBox, PDModel
>         Environment: All
>            Reporter: Omid Pourhadi
>              Labels: unicode
>         Attachments: BYekan.ttf
>
>
> The problem is that in some languages letters need to be joined together. For example, consider this word:
> {color:red}
> سلام 
> {color}
> but after creating a PDF it is rendered as
> {color:red}
> س‌ل‌ام
> {color}
> with extra semi-spaces between the letters. I think this is a bug in PDFBox and is definitely not related to the font.
> {code:title=SampleCode.java|borderStyle=solid}
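> import java.io.File;
> import java.io.IOException;
>
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.pdmodel.PDPage;
> import org.apache.pdfbox.pdmodel.PDPageContentStream;
> import org.apache.pdfbox.pdmodel.common.PDRectangle;
> import org.apache.pdfbox.pdmodel.font.PDFont;
> import org.apache.pdfbox.pdmodel.font.PDType0Font;
>
> public class SampleCode
> {
>     public static void main(String[] args) throws IOException
>     {
>         PDDocument document = new PDDocument();
>         // This font works perfectly in iText and JasperReports with the same text.
>         PDFont titleFont = PDType0Font.load(document, SampleCode.class.getResourceAsStream("/BYekan.ttf"));
>         PDPage page = new PDPage(PDRectangle.A4);
>         document.addPage(page);
>         PDPageContentStream contentStream = new PDPageContentStream(document, page);
>         contentStream.beginText();
>         contentStream.setFont(titleFont, 12);
>         contentStream.newLineAtOffset(0, 100);
>         // The Persian word whose letters fail to join in the generated PDF.
>         contentStream.showText("سلام");
>         contentStream.endText();
>         contentStream.close();
>
>         document.save(new File("/home/omidp/temp/htmltopdf/output.pdf"));
>         document.close();
>     }
> }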
> {code}
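
One quick way to tell whether the gaps come from the input string or from rendering is to dump the code points of the text passed to showText(). The sketch below is hypothetical helper code (CodePointDump is not part of PDFBox), and it assumes that the "semi-space" in the report means the zero-width non-joiner, U+200C. If the string contains only the four letters, the disconnected output comes from missing shaping rather than from extra characters in the input.

```java
import java.util.stream.Collectors;

/** Hypothetical diagnostic helper; uses only the Java standard library. */
class CodePointDump {

    /** Formats every code point of the text as U+XXXX, separated by spaces. */
    public static String describe(String text) {
        return text.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    /** True if the text contains a zero-width non-joiner (assumed "semi-space"). */
    public static boolean containsZwnj(String text) {
        return text.indexOf('\u200C') >= 0;
    }

    public static void main(String[] args) {
        // The sample word written with escapes so no invisible characters can hide in it.
        String word = "\u0633\u0644\u0627\u0645";
        System.out.println(describe(word));     // prints "U+0633 U+0644 U+0627 U+0645"
        System.out.println(containsZwnj(word)); // prints "false"
    }
}
```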


