Posted to dev@pdfbox.apache.org by "Johanneke Lamberink (JIRA)" <ji...@apache.org> on 2018/11/15 18:19:00 UTC
[jira] [Commented] (PDFBOX-3646) Annotations parsed from XFDF
containing ampersand characters are not properly imported
[ https://issues.apache.org/jira/browse/PDFBOX-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16688466#comment-16688466 ]
Johanneke Lamberink commented on PDFBOX-3646:
---------------------------------------------
[~tilman] I switched jobs last year and no longer work with PDFBox. Kudos on fixing the issue though :)
> Annotations parsed from XFDF containing ampersand characters are not properly imported
> --------------------------------------------------------------------------------------
>
> Key: PDFBOX-3646
> URL: https://issues.apache.org/jira/browse/PDFBOX-3646
> Project: PDFBox
> Issue Type: Bug
> Components: AcroForm, PDModel
> Affects Versions: 2.0.3, 2.0.4, 2.0.5, 2.0.6
> Environment: java 1.8.0_112
> Reporter: Kai Keggenhoff
> Assignee: Tilman Hausherr
> Priority: Major
> Labels: xfdf
> Fix For: 2.0.13, 3.0.0 PDFBox
>
> Attachments: MergeTest.java, output1.pdf, output2.pdf, sample.xfdf
>
>
> Annotations containing "&" in their text are displayed incorrectly when parsed unmodified from XFDF (the ampersands are encoded as "&amp;" there) and added to a PDF document.
> This occurs for both "text comment" and "text box" type annotations.
> However, if the XFDF is modified by replacing "&amp;" with "&amp;amp;" prior to parsing, the imported annotations are then displayed correctly.
> The attached code produces two PDF files: the first is the PDF with the unmodified XFDF imported, the second is the PDF with the modified XFDF imported.
> An XFDF containing both a text box and a text comment annotation is embedded in the source and attached as a separate file.
> Update 23.03.2017 : This problem persists in 2.0.5, and we noticed that the same corruption of merged annotations occurs if the annotation text contains a "<" (encoded as the "&lt;" entity).
> Update 17.10.2018 : This corruption is caused by FDFAnnotation.richContentsToString. This method reads "<" and "&" from the parsed values in the document and puts them as such into the markup, but these characters must be replaced with their entities.
> I'll add this substitution to my proposed bugfix of 4345, please refer to https://issues.apache.org/jira/projects/PDFBOX/issues/PDFBOX-4345
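
The substitution described in the 17.10.2018 update can be illustrated with a minimal Java sketch. It is not the PDFBox implementation of FDFAnnotation.richContentsToString and not the patch attached to PDFBOX-4345; the class name RichTextEscapeSketch and the method escapeMarkup are hypothetical, chosen only for illustration. It shows the general idea: when parsed text is written back into markup, "&" and "<" have to be emitted as entities again.

```java
class RichTextEscapeSketch {

    // Hypothetical helper, not part of PDFBox: re-encodes the two characters
    // named in the issue so they are not interpreted as markup when the
    // parsed text is written back into an XML/XHTML string.
    static String escapeMarkup(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    sb.append("&amp;");
                    break;
                case '<':
                    sb.append("&lt;");
                    break;
                default:
                    sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // An XML parser hands back the decoded text, e.g. "Q&A: is x < y?".
        // Writing it into markup unescaped would produce invalid XML.
        System.out.println(escapeMarkup("Q&A: is x < y?"));
        // prints: Q&amp;A: is x &lt; y?
    }
}
```

Each input character is examined exactly once, so an ampersand introduced by the escaping itself is never escaped a second time.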