Posted to dev@pdfbox.apache.org by "Chitrang Natu (JIRA)" <ji...@apache.org> on 2014/01/02 16:30:51 UTC

[jira] [Updated] (PDFBOX-1823) Apache PDFBox 1.6.0 TextStripper not able to recognise characters having "Frutiger LT - 45" fonts

     [ https://issues.apache.org/jira/browse/PDFBOX-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chitrang Natu updated PDFBOX-1823:
----------------------------------

    Attachment: Test_Frutiger.java
                pom.xml
                pdfbox-checkstyle.xml
                fontbox-checkstyle.xml
                TC01_output.concat.MD302AE_Part2.doc
                PDF_With_Frutiger_font.pdf

> Apache PDFBox 1.6.0 TextStripper not able to recognise characters having "Frutiger LT - 45" fonts
> -------------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-1823
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1823
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 1.6.0
>         Environment: jdk1.6
>            Reporter: Chitrang Natu
>              Labels: newbie
>         Attachments: PDF_With_Frutiger_font.pdf, TC01_output.concat.MD302AE_Part2.doc, Test_Frutiger.java, fontbox-checkstyle.xml, pdfbox-checkstyle.xml, pom.xml
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> When I extract content from PDFs with the PDFBox API, I can extract all of the text successfully except for text set in 'Frutiger' fonts. For that text I get square boxes in place of the characters.
> It seems that PDFBox's FontBox supports only 14 UTF character sets, and none of them covers Frutiger-style fonts.
> Could anybody please suggest something? That would be a great help. I urgently need a solution.
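
One way to quantify the "square boxes" symptom described above is to scan the extracted string for code points that typically render as a box or a question mark. The following is a minimal sketch using only the Java standard library, not part of PDFBox. The class name GlyphCheck and its methods are hypothetical, and the choice of "suspect" categories (the replacement character, private-use, unassigned, and control code points) is an assumption about what the boxes are. The sketch further assumes the text has already been obtained as a String, for example from the text stripper.

```java
public final class GlyphCheck {
    private GlyphCheck() {}

    /** Returns true if the code point commonly renders as a box or '?'. */
    public static boolean isSuspect(int cp) {
        if (cp == 0xFFFD) {
            return true; // Unicode replacement character
        }
        if (cp == '\n' || cp == '\r' || cp == '\t') {
            return false; // ordinary whitespace in extracted text
        }
        int type = Character.getType(cp);
        return type == Character.PRIVATE_USE   // font-specific code, no standard meaning
            || type == Character.UNASSIGNED    // not a defined Unicode character
            || type == Character.CONTROL;      // stray control code
    }

    /** Counts suspect code points in text, handling surrogate pairs. */
    public static int countSuspect(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isSuspect(cp)) {
                count++;
            }
            i += Character.charCount(cp);
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical sample standing in for text extracted from a PDF.
        String sample = "Frutiger \uE001\uFFFD text";
        System.out.println(countSuspect(sample) + " suspect code point(s)");
    }
}
```

A high count for pages set in the affected font, and a count of zero elsewhere, would suggest that the font's character codes are not being mapped to Unicode, rather than the text being missing altogether.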



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)