Posted to dev@pdfbox.apache.org by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2008/10/09 23:56:44 UTC

[jira] Resolved: (PDFBOX-373) (null) printed when characters cannot be decoded during text extraction

     [ https://issues.apache.org/jira/browse/PDFBOX-373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved PDFBOX-373.
----------------------------------

    Resolution: Fixed
      Assignee: Jukka Zitting

Good point, thanks!

I committed a slightly modified version of your patch (I merged the if statement with the preceding one) in revision 703273.

> (null) printed when characters cannot be decoded during text extraction
> -----------------------------------------------------------------------
>
>                 Key: PDFBOX-373
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-373
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 0.8.0-incubator
>            Reporter: Brian Carrier
>            Assignee: Jukka Zitting
>             Fix For: 0.8.0-incubator
>
>
> We have some PDF files where the TO_UNICODE map is corrupt and PDFBox cannot extract the text. font.encode() returns null, and PDFStreamEngine.showString() appends that null to the result, which is then printed as "(null)".
> Here is a patch (against the trunk) that replaces the null with "?":
> --- PDFStreamEngine.java	2008-09-17 16:09:13.529318500 -0400
> +++ PDFStreamEngine-new.java	2008-09-17 16:12:51.617318500 -0400
> @@ -422,6 +422,11 @@
>                  }
>              }
>  
> +            // Replace a null entry with "?" so it is not printed as "(null)"
> +            if (c == null)
> +            {
> +                c = "?";
> +            }
>              totalStringWidth += width;
>              stringResult.append( c );
>          }
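
The guard in the patch above can be illustrated outside PDFBox with a minimal, self-contained sketch. The class name NullSafeAppend, the method join, and the constant PLACEHOLDER are hypothetical names chosen for this example. They are not part of PDFStreamEngine, and the list of strings merely stands in for the per-character decoding results, where a null entry represents a character that could not be decoded.

```java
import java.util.Arrays;
import java.util.List;

public class NullSafeAppend {
    // Placeholder used for undecodable characters, mirroring the "?" in the patch.
    static final String PLACEHOLDER = "?";

    // Concatenates decoded characters, substituting the placeholder for nulls.
    // Hypothetical helper for illustration only; not PDFBox API.
    static String join(List<String> decoded) {
        StringBuilder result = new StringBuilder();
        for (String c : decoded) {
            if (c == null) {
                // Without this guard, StringBuilder.append would add the
                // literal text "null" to the output.
                c = PLACEHOLDER;
            }
            result.append(c);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // The third entry simulates a character that failed to decode.
        System.out.println(join(Arrays.asList("P", "D", null, "F"))); // prints PD?F
    }
}
```

Substituting a visible placeholder keeps the extracted text the same length in characters as the source string, so a reader can see that something was lost rather than silently skipping it.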
