Posted to dev@pdfbox.apache.org by "John Hewson (JIRA)" <ji...@apache.org> on 2015/12/01 18:34:10 UTC

[jira] [Commented] (PDFBOX-3138) PDTextField doesn't accept any Hebrew characters as new value

    [ https://issues.apache.org/jira/browse/PDFBOX-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034142#comment-15034142 ] 

John Hewson commented on PDFBOX-3138:
-------------------------------------

The embedded font used by the field does indeed contain Hebrew glyphs, and a valid "cmap" table which can be used to look up those glyphs. The mentioned character, U+05D7, is indeed present in the font.

The embedded font file is in OpenType format; however, the PDF Font dictionary is Type1 and specifies WinAnsiEncoding, which does not include Hebrew characters. So, strictly speaking, the field cannot be filled with any character outside WinAnsiEncoding, and PDFBox's behaviour is correct.
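
As a rough illustration of why the encoding, rather than the font file, is the limiting factor, the sketch below checks whether a string can be represented in Windows-1252. This is an assumption made for the example: WinAnsiEncoding is closely modelled on Windows-1252 but is not identical to it, and this is not how PDFBox itself performs the lookup. The class and method names (WinAnsiCheck, isLikelyEncodable) are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class WinAnsiCheck {
    // Assumption: windows-1252 is used as a stand-in for WinAnsiEncoding.
    // The two are closely related but not guaranteed to match for every code point.
    private static final Charset CP1252 = Charset.forName("windows-1252");

    // Returns true if every character of the text has a mapping in windows-1252.
    public static boolean isLikelyEncodable(String text) {
        // Encoders are not thread-safe, so create a fresh one per call.
        CharsetEncoder encoder = CP1252.newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        // Plain ASCII text is covered by the encoding.
        System.out.println(isLikelyEncodable("Hello"));   // prints true
        // U+05D7 (Hebrew letter het), the character from the stack trace, is not.
        System.out.println(isLikelyEncodable("\u05D7"));  // prints false
    }
}
```

A pre-check along these lines could be run before calling setValue to report an unsupported value up front, but it only approximates the encoding declared in the font dictionary.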

It would seem that PDFBox could do something more helpful in this instance. Filling the form with Acrobat results in the font from the form's DR being overridden in the Field itself with a new CIDFontType0 which has been created from the DR font. Ideally we would do that.

Do you have any control over the software producing these fields? I might be able to offer a workaround.

> PDTextField doesn't accept any Hebrew characters as new value
> -------------------------------------------------------------
>
>                 Key: PDFBOX-3138
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3138
>             Project: PDFBox
>          Issue Type: Bug
>          Components: AcroForm, FontBox
>    Affects Versions: 2.0.0
>         Environment: Eclipse 4.2.2, Windows 7 Pro, JRE 1.8.0_05
>            Reporter: Gilad Denneboom
>            Priority: Minor
>             Fix For: 2.1.0
>
>         Attachments: SetHebrewFieldValueTest.java, Test.pdf, Test.txt
>
>
> Trying to set a UTF-8 encoded Hebrew string as the value of a PDTextField fails with the following exception:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: No glyph for U+05D7 in font AdobeHebrew-Regular
> 	at org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:300)
> 	at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
> 	at org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
> 	at org.apache.pdfbox.pdmodel.interactive.form.PlainTextFormatter.format(PlainTextFormatter.java:213)
> 	at org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.insertGeneratedAppearance(AppearanceGeneratorHelper.java:373)
> 	at org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceContent(AppearanceGeneratorHelper.java:237)
> 	at org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceValue(AppearanceGeneratorHelper.java:144)
> 	at org.apache.pdfbox.pdmodel.interactive.form.PDTextField.constructAppearances(PDTextField.java:263)
> 	at org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField.applyChange(PDTerminalField.java:221)
> 	at org.apache.pdfbox.pdmodel.interactive.form.PDTextField.setValue(PDTextField.java:218)
> 	at SetHebrewFieldValueTest.main(SetHebrewFieldValueTest.java:22)
> {code}
> I've tried using multiple fonts for the field, all of which can handle Hebrew characters just fine, and got the same results in all of them.
> See attached files for a demonstration of the issue.


