Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2015/02/05 13:05:35 UTC
[jira] [Commented] (PDFBOX-904) Potential issue with COSString and
UTF-16-encoded Strings.
[ https://issues.apache.org/jira/browse/PDFBOX-904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307115#comment-14307115 ]
Tilman Hausherr commented on PDFBOX-904:
----------------------------------------
[~jahewson] Does this issue still exist? Neither the "ISO-8859-1" literal nor the 255 check is in COSString any longer after PDFBOX-1242.
> Potential issue with COSString and UTF-16-encoded Strings.
> ----------------------------------------------------------
>
> Key: PDFBOX-904
> URL: https://issues.apache.org/jira/browse/PDFBOX-904
> Project: PDFBox
> Issue Type: Bug
> Components: PDModel
> Affects Versions: 1.4.0, 2.0.0
> Reporter: Neil McErlean
> Priority: Critical
> Fix For: 2.0.0
>
> Attachments: PDFBOX-904.patch
>
>
> I've been looking into PDFBOX-903 and I came across a potential issue with the COSString class.
> The issue occurs when you construct an instance of COSString and pass a UTF-16-encoded String.
> The current code (trunk) checks the passed String parameter in the constructor to see if it is UTF-16. It does this by looking for char values above 255.
> Whilst a String that contains char values greater than 255 is likely to be UTF-16, it is possible to have UTF-16-encoded Strings whose characters do not exceed this limit.
> These Strings would be incorrectly marked as not being unicode16. An example (from the upcoming patch):
> /** してく */
> String textHighBits = "\u3057\u3066\u304f";
> Furthermore, if you construct a COSString using the COSString(byte[]) constructor, then the COSString class cannot know what the encoding is.
> I will attach a patch in a moment which includes a test case to reproduce the issue and a fix for the product code.
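
To make the two points in the description concrete, here is a minimal Java sketch. It is not the PDFBox code: the class name Utf16HeuristicSketch and the helper hasCharAbove255 are hypothetical stand-ins for the "char value above 255" check described above, and the sample text "caf\u00e9" is an assumption chosen so that every char stays at or below 255. The sketch shows that such text is classified as non-UTF-16 even when it was decoded from UTF-16 bytes, and that a bare byte[] can be decoded in more than one way.

```java
import java.nio.charset.StandardCharsets;

public class Utf16HeuristicSketch {

    // Hypothetical stand-in for the check described in the issue:
    // treat the text as UTF-16 only if some char value exceeds 255.
    static boolean hasCharAbove255(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) > 255) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Example from the issue: every char is above 255.
        String textHighBits = "\u3057\u3066\u304f";
        // Assumed example: every char is at or below 255.
        String textLowBits = "caf\u00e9";

        System.out.println(hasCharAbove255(textHighBits)); // true
        System.out.println(hasCharAbove255(textLowBits));  // false

        // Encode the low-valued text as UTF-16BE, then decode the same
        // bytes two ways. Without extra information, a byte[] does not
        // say which decoding is the intended one.
        byte[] utf16Bytes = textLowBits.getBytes(StandardCharsets.UTF_16BE);
        String asUtf16 = new String(utf16Bytes, StandardCharsets.UTF_16BE);
        String asLatin1 = new String(utf16Bytes, StandardCharsets.ISO_8859_1);

        System.out.println(asUtf16.equals(textLowBits)); // true
        System.out.println(asLatin1.length());           // 8 (one char per byte)
        // The round-tripped UTF-16 text still fails the heuristic.
        System.out.println(hasCharAbove255(asUtf16));    // false
    }
}
```

A caller that knows the encoding would have to pass it along explicitly (or rely on a byte order mark), since neither the char-value check nor the raw bytes carry that information in this sketch.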