Posted to dev@pdfbox.apache.org by "Johanneke Lamberink (JIRA)" <ji...@apache.org> on 2018/11/15 18:19:00 UTC

[jira] [Commented] (PDFBOX-3646) Annotations parsed from XFDF containing ampersand characters are not properly imported

    [ https://issues.apache.org/jira/browse/PDFBOX-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16688466#comment-16688466 ] 

Johanneke Lamberink commented on PDFBOX-3646:
---------------------------------------------

[~tilman] I switched jobs last year and no longer work with PDFBox. Kudos on fixing the issue though :)

> Annotations parsed from XFDF containing ampersand characters are not properly imported
> --------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-3646
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3646
>             Project: PDFBox
>          Issue Type: Bug
>          Components: AcroForm, PDModel
>    Affects Versions: 2.0.3, 2.0.4, 2.0.5, 2.0.6
>         Environment: java 1.8.0_112
>            Reporter: Kai Keggenhoff
>            Assignee: Tilman Hausherr
>            Priority: Major
>              Labels: xfdf
>             Fix For: 2.0.13, 3.0.0 PDFBox
>
>         Attachments: MergeTest.java, output1.pdf, output2.pdf, sample.xfdf
>
>
> Annotations containing "&" in their text are displayed incorrectly when parsed unmodified from XFDF (the ampersands are encoded as "&amp;" there) and added to a PDF document.
>  This occurs for both "text comment" and "text box" type annotations.
>  However, if the XFDF is modified by replacing "&amp;" with "&amp;amp;" prior to parsing, the imported annotations are then displayed correctly.
> The attached code produces two PDF files: the first with the unmodified XFDF imported, the second with the modified XFDF imported.
> An XFDF containing both a text box and a text comment annotation is embedded in the source and attached as a separate file.
> Update 23.03.2017: This problem persists in 2.0.5, and we noticed that the same corruption of merged annotations occurs if the annotation text contains a "<" (encoded as the "&lt;" entity).
> Update 17.10.2018: This corruption is caused by FDFAnnotation.richContentsToString. This method reads "<" and "&" from the parsed values in the document and puts them as such into the markup, but these characters must be replaced with their entities.
> I'll add this substitution to my proposed bugfix for PDFBOX-4345; please refer to https://issues.apache.org/jira/projects/PDFBOX/issues/PDFBOX-4345
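
For readers following the description above, here is a minimal sketch of the kind of substitution it refers to: once the XML parser has decoded the entities, "&" and "<" have to be turned back into entities before the text is placed into markup. The class and method names (RichTextEscaper, escapeForMarkup) are hypothetical and only illustrate the idea; this is not the actual PDFBox code or the committed fix.

```java
// Illustrative sketch only: RichTextEscaper and escapeForMarkup are
// hypothetical names, not part of the PDFBox API.
final class RichTextEscaper {
    private RichTextEscaper() {
    }

    // Re-encodes the two characters named in the issue ("&" and "<")
    // so that already-parsed text can be embedded in markup again.
    static String escapeForMarkup(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    sb.append("&amp;");
                    break;
                case '<':
                    sb.append("&lt;");
                    break;
                default:
                    sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A parser has already decoded "R&amp;D &lt; 5" to this value.
        String parsed = "R&D < 5";
        // prints <p>R&amp;D &lt; 5</p>
        System.out.println("<p>" + escapeForMarkup(parsed) + "</p>");
    }
}
```

Processing the text one character at a time avoids the ordering pitfall of chained replacements, where escaping "<" first and "&" second would double-encode the "&" of the freshly inserted "&lt;".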



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
