Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2017/07/11 14:57:00 UTC
[jira] [Resolved] (PDFBOX-3864) UTF16 encoded string to PDFDocEncoding
[ https://issues.apache.org/jira/browse/PDFBOX-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tilman Hausherr resolved PDFBOX-3864.
-------------------------------------
Resolution: Fixed
> UTF16 encoded string to PDFDocEncoding
> --------------------------------------
>
> Key: PDFBOX-3864
> URL: https://issues.apache.org/jira/browse/PDFBOX-3864
> Project: PDFBox
> Issue Type: Bug
> Components: PDModel
> Affects Versions: 2.0.6
> Reporter: Tilman Hausherr
> Assignee: Tilman Hausherr
> Fix For: 2.0.7, 3.0.0
>
>
> From [~torakiki] in the mailing list:
> {quote}
> Hi, we came across this case where we are basically cloning outline items
> where the original outline title is a UTF16BE encoded text string
> containing the value 00A0 (non-breaking space). We later use the string to
> assign the title in a new outline item and the A0 is recognised as a € sign.
> Here is a simple test:
> {code}
> COSString victim = COSString.parseHex("FEFF004300680061007000740065007200A0");
> PDOutlineItem node = new PDOutlineItem();
> node.setTitle(victim.getString());
> {code}
> If you look at the node dictionary you'll see that the title value is
> Chapter€
> {quote}
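To make the input in the test above easier to follow, here is a minimal sketch that decodes the same hex string using only the Java standard library. The class `Utf16HexSketch` and its `decode` method are hypothetical helpers for illustration and are not part of PDFBox. It shows that the last character of the title is U+00A0, not a euro sign.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of PDFBox.
class Utf16HexSketch {

    // Decodes a hex string that starts with the UTF-16BE byte order mark FEFF.
    static String decode(String hex) {
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        // Skip the two BOM bytes and decode the remainder as UTF-16BE.
        return new String(bytes, 2, bytes.length - 2, StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        String title = decode("FEFF004300680061007000740065007200A0");
        char last = title.charAt(title.length() - 1);
        // Print the code point of the last character of the decoded title.
        System.out.printf("length=%d, last=U+%04X%n", title.length(), (int) last);
    }
}
```

The decoded title is "Chapter" followed by a non-breaking space, so a correct round trip through the outline item has to preserve U+00A0.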
> The cause is that the initialization of PDFDocEncoding did not account for the "holes" in the 0..255 sequence. I'll add handling for them and a test.
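The effect of such a "hole" can be sketched with a simplified, hypothetical model of a single-byte encoding table. This is not the PDFBox class: the name `DocEncodingSketch`, the `skipHoles` flag and the helper methods are assumptions made for illustration, and only code 0xA0 (which PDFDocEncoding assigns to the euro sign) is modelled as a hole. The sketch assumes the table is first filled with an identity mapping and then overridden with deviations, which is one plausible way for the reported symptom to arise.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of a single-byte encoding table; not the PDFBox class.
class DocEncodingSketch {
    private final char[] codeToUni = new char[256];
    private final Map<Character, Integer> uniToCode = new HashMap<>();

    DocEncodingSketch(boolean skipHoles) {
        // Identity pass: code i maps to the Unicode character with the same value.
        for (int i = 0; i < 256; i++) {
            // Assumption: only 0xA0 is treated as a hole in this sketch.
            if (skipHoles && i == 0xA0) {
                continue;
            }
            set(i, (char) i);
        }
        // Deviation: code 0xA0 is the euro sign in this model.
        set(0xA0, '\u20AC');
    }

    private void set(int code, char unicode) {
        codeToUni[code] = unicode;
        uniToCode.put(unicode, code);
    }

    // True if the character has a single-byte code in this table.
    boolean canEncode(char c) {
        return uniToCode.containsKey(c);
    }

    // Encodes a character to its code and decodes that code again.
    char roundTrip(char c) {
        return codeToUni[uniToCode.get(c)];
    }

    public static void main(String[] args) {
        DocEncodingSketch buggy = new DocEncodingSketch(false);
        DocEncodingSketch fixed = new DocEncodingSketch(true);
        // Without the hole, U+00A0 keeps a stale reverse entry and decodes as the euro sign.
        System.out.printf("buggy: U+00A0 round-trips to U+%04X%n", (int) buggy.roundTrip('\u00A0'));
        // With the hole, U+00A0 is not encodable, so a caller would fall back to UTF-16.
        System.out.println("fixed: canEncode(U+00A0) = " + fixed.canEncode('\u00A0'));
    }
}
```

In the variant without the hole, the identity pass leaves a reverse entry for U+00A0 that the later euro override never removes, which matches the "Chapter€" symptom described in the report.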