Posted to dev@pdfbox.apache.org by "Volker Kunert (Jira)" <ji...@apache.org> on 2020/10/03 15:00:00 UTC

[jira] [Commented] (PDFBOX-4951) Sequences with combining letters are rendered incorrectly

    [ https://issues.apache.org/jira/browse/PDFBOX-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17206724#comment-17206724 ] 

Volker Kunert commented on PDFBOX-4951:
---------------------------------------

I uploaded the following files, which make the positioning features of
FOP usable from PDFBox:

{{patch-2020-10-02.txt}}
	Patch containing changes to {{PDAbstractContentStream}}
	and four new classes that integrate FOP positioning, in the new package
	{{org.apache.fontbox.fop}}

{{ExamplePdfboxFopPos.java}}
{{ExamplePdfboxFopPos.pdf}}
	Example of how to create a new PDF document with correct positioning

{{ExamplePdfboxFopPosForm.java}}
{{ExamplePdfboxFopPosForm.pdf}}
	Example of how to fill text into a PDF form with correct positioning

{{DefaultScriptProcessor.java}}
	A patched version of {{org.apache.fop.complexscripts.scripts.DefaultScriptProcessor}}
	that works around FOP-2969


To use FOP positioning, the user has to add FOP to the class path,
then create the positioning class and register it
with {{PDAppearanceContentStream}} or {{PDPageContentStream}}, e.g.:

{{FopTextPositioner fopTextPositioner = FopTextPositioner.create(new FileInputStream(fontFile), font.getName());
if (fopTextPositioner != null) {
    PDAppearanceContentStream.put(font, fopTextPositioner);
}
}}

> Sequences with combining letters are rendered incorrectly
> ---------------------------------------------------------
>
>                 Key: PDFBOX-4951
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4951
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Rendering
>    Affects Versions: 2.0.21
>            Reporter: Volker Kunert
>            Priority: Major
>         Attachments: DIN_SPEC_91379_Sequences-aa.pdf, DIN_SPEC_91379_Sequences-ab.pdf, DIN_SPEC_91379_Sequences-ac.pdf, DIN_SPEC_91379_Sequences.txt, DefaultScriptProcessor.java, ExamplePdfboxFopPos.java, ExamplePdfboxFopPos.pdf, ExamplePdfboxFopPosForm.java, ExamplePdfboxFopPosForm.pdf, TestPdfbox.java, TestPdfboxFop2.java, TestPdfboxFop2.pdf, TestPdfboxJava2D.java, TestPdfboxJava2D.pdf, patch-2020-10-02.txt, pdfbox.pdf, screenshot-1.png
>
>
> Accented letters composed of a Unicode base letter and a combining accent are rendered incorrectly. For example, for the sequence 0041 030B (LATIN CAPITAL LETTER A WITH COMBINING DOUBLE ACUTE ACCENT), the accent appears to the right of the letter A instead of above it.
> The position is wrong for most of the sequences defined in the following spec:
> DIN SPEC 91379: Characters in Unicode for the electronic processing of names and data exchange in Europe; with digital attachment
>  [https://www.xoev.de/downloads-2316#StringLatin]
>  [https://www.din.de/de/wdc-beuth:din21:301228458]
>  
> The correct rendering should look like the output of hb-view 2.6.8; see the files DIN_SPEC_91379_Sequences*.pdf.
> The output of PDFBox is attached as pdfbox.pdf, which is created by running TestPdfbox.java. The sequences are read from the file DIN_SPEC_91379_Sequences.txt.
>  
> Font used for testing: NotoSansMono-Regular.ttf, see [https://www.google.com/get/noto/] 
> download: [https://noto-website-2.storage.googleapis.com/pkgs/NotoSansMono-hinted.zip]
>  See also FOP-2969
>  
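As a small illustration of why the sequence 0041 030B from the issue description cannot be fixed by Unicode normalization alone, the following sketch checks whether a combining mark survives NFC normalization. The class and method names ({{CombiningSequenceCheck}}, {{isCombiningMark}}, {{needsMarkPositioning}}) are hypothetical and are not part of PDFBox, FOP, or the attached patch; only JDK classes are used. When a mark survives, the renderer presumably has to place the accent glyph itself, for example from the font's mark positioning data.

```java
import java.text.Normalizer;

public class CombiningSequenceCheck {

    /** Returns true if the code point is a combining mark (Unicode category Mn, Mc or Me). */
    static boolean isCombiningMark(int codePoint) {
        int type = Character.getType(codePoint);
        return type == Character.NON_SPACING_MARK
                || type == Character.COMBINING_SPACING_MARK
                || type == Character.ENCLOSING_MARK;
    }

    /** Returns true if at least one combining mark is still present after NFC normalization. */
    static boolean needsMarkPositioning(String text) {
        String nfc = Normalizer.normalize(text, Normalizer.Form.NFC);
        return nfc.codePoints().anyMatch(CombiningSequenceCheck::isCombiningMark);
    }

    public static void main(String[] args) {
        // Sequence from the issue description: A followed by COMBINING DOUBLE ACUTE ACCENT.
        // Unicode has no precomposed character for it, so the mark remains after NFC.
        String aDoubleAcute = "A\u030B";

        // For comparison: O followed by the same accent composes to U+0150 under NFC.
        String oDoubleAcute = "O\u030B";

        System.out.println(needsMarkPositioning(aDoubleAcute)); // true
        System.out.println(needsMarkPositioning(oDoubleAcute)); // false
    }
}
```

A check like this could be used to decide whether a string contains sequences that depend on glyph positioning at all; it says nothing about whether a particular font supplies the positioning data.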



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
