Posted to dev@pdfbox.apache.org by "John Hewson (JIRA)" <ji...@apache.org> on 2014/12/09 23:21:14 UTC

[jira] [Comment Edited] (PDFBOX-1242) Handle non ISO-8859-1 chars with drawString

    [ https://issues.apache.org/jira/browse/PDFBOX-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240161#comment-14240161 ] 

John Hewson edited comment on PDFBOX-1242 at 12/9/14 10:21 PM:
---------------------------------------------------------------

Yes, this is a bug in PDFBox, but it's one we know about already. What does the code you've posted do?

Please use file attachments to post code; JIRA uses markup, so your code is unusable when posted in this manner. You can delete the previous comment and attach it as a file with More > Attach Files.


was (Author: jahewson):
Yes, this is a bug in PDFBox, but it's one we know about already. What does the code you've posted do?

Please use file attachments to post code, JIRA uses markup so your code is unusable when posted in this manner. You can delete the previous comment and attach it as a file with More > Attach Files.

> Handle non ISO-8859-1 chars with drawString
> -------------------------------------------
>
>                 Key: PDFBOX-1242
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1242
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Writing
>    Affects Versions: 1.5.0, 1.6.0
>            Reporter: Peter Andersen
>            Assignee: John Hewson
>             Fix For: 2.0.0
>
>
> The PDPageContentStream.drawString method takes a String as its argument and constructs a COSString from the input.
> If the input contains chars above 255, the COSString is prefixed with 0xFE, 0xFF and the bytes are taken from the
> input encoded as "UTF-16BE".
> Back in the drawString method, this UTF-16 encoded COSString is appended as "ISO-8859-1":
> 	appendRawCommands( new String( buffer.toByteArray(), "ISO-8859-1"));
>  
> The result is that a line with UTF-16 chars is shown prefixed with þÿ, and with double spacing between the other chars.
> The chars above 255 are shown as the two corresponding ISO-8859-1 characters.
> As a side question to this observation, is there an alternative way to use PDFBox that supports UTF-16?
>  
>  
>  
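The symptom described in the issue can be reproduced without PDFBox. The following is a minimal sketch using only the Java standard library; the class Utf16AsLatin1Demo and its misdecode helper are hypothetical names made up for this illustration and are not part of PDFBox. It assumes the byte sequence is exactly what the description states: 0xFE, 0xFF followed by the UTF-16BE bytes of the input, then read back as ISO-8859-1.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of PDFBox.
final class Utf16AsLatin1Demo {

    // Builds the bytes described in the issue (0xFE 0xFF + UTF-16BE),
    // then decodes them as ISO-8859-1, one char per byte.
    static String misdecode(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        buffer.write(0xFE);
        buffer.write(0xFF);
        byte[] utf16 = text.getBytes(StandardCharsets.UTF_16BE);
        buffer.write(utf16, 0, utf16.length);
        return new String(buffer.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // 'A' is below 256, the euro sign (U+20AC) is above 255.
        String garbled = misdecode("A\u20AC");
        for (char c : garbled.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
        // Prints U+00FE, U+00FF (the visible "þÿ" prefix), then
        // U+0000, U+0041 (a NUL before 'A', the apparent extra spacing),
        // then U+0020, U+00AC (the euro sign split into two Latin-1 chars).
    }
}
```

Each 16-bit code unit turns into two 8-bit characters, which accounts for both the þÿ prefix and the extra gap in front of every character below 256.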


