Posted to dev@pdfbox.apache.org by "Philip Helger (JIRA)" <ji...@apache.org> on 2014/04/02 19:33:28 UTC

[jira] [Created] (PDFBOX-2009) PDFStreamEngine.processEncodedText incorrectly handling UTF-16 text with BOM FEFF

Philip Helger created PDFBOX-2009:
-------------------------------------

             Summary: PDFStreamEngine.processEncodedText incorrectly handling UTF-16 text with BOM FEFF
                 Key: PDFBOX-2009
                 URL: https://issues.apache.org/jira/browse/PDFBOX-2009
             Project: PDFBox
          Issue Type: Bug
          Components: Text extraction
    Affects Versions: 2.0.0
            Reporter: Philip Helger
             Fix For: 2.0.0


Given a text-showing operation such as
<FEFF21222193219103B103A003A6> Tj
PDFStreamEngine.processEncodedText does not handle the string correctly.
Am I correct that once a BOM (FEFF) has been detected, the code length should be set to 2 and not changed afterwards? Or, alternatively, should the BOM simply be skipped?
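
To make the question concrete, here is a minimal standalone sketch of the "skip the BOM and read fixed 2-byte codes" option. It does not use or reproduce any PDFBox code; the class name BomSketch and its helper methods are hypothetical and exist only for illustration, and the single-byte fallback (ISO-8859-1) is an assumption chosen to keep the example short. Whether processEncodedText should behave this way is exactly what this issue asks.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of PDFBox.
public class BomSketch {

    // Convert the body of a hex string operand (without the angle brackets)
    // to bytes. Assumes an even number of hex digits and no whitespace.
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // True if the byte sequence starts with the UTF-16 big-endian BOM FE FF.
    static boolean hasUtf16BeBom(byte[] b) {
        return b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF;
    }

    // If a BOM is present, skip it and decode the rest as 2-byte UTF-16BE
    // code units. Otherwise fall back to one byte per character
    // (ISO-8859-1 here, purely as an assumption for this sketch).
    static String decode(byte[] b) {
        if (hasUtf16BeBom(b)) {
            return new String(b, 2, b.length - 2, StandardCharsets.UTF_16BE);
        }
        return new String(b, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = decode(hexToBytes("FEFF21222193219103B103A003A6"));
        // Print each decoded UTF-16 code unit as U+XXXX.
        for (char c : text.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

With the string from this report, the sketch yields six code units after the BOM: U+2122, U+2193, U+2191, U+03B1, U+03A0, U+03A6.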



--
This message was sent by Atlassian JIRA
(v6.2#6252)