Posted to dev@pdfbox.apache.org by "yuedaxia (JIRA)" <ji...@apache.org> on 2010/12/21 09:30:01 UTC

[jira] Created: (PDFBOX-925) ExtractText china pdf ,but pdfbox distinguish Korea,The pdf 1.2 is ok,since 1.3 error

ExtractText china  pdf  ,but pdfbox  distinguish  Korea,The  pdf 1.2 is ok,since 1.3 error
------------------------------------------------------------------------------------------

                 Key: PDFBOX-925
                 URL: https://issues.apache.org/jira/browse/PDFBOX-925
             Project: PDFBox
          Issue Type: Bug
          Components: Text extraction
    Affects Versions: 1.4.0, 1.3.1
         Environment: Windows XP
            Reporter: yuedaxia
            Priority: Critical
             Fix For: 1.5.0




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (PDFBOX-925) ExtractText china pdf ,but pdfbox distinguish Korea,The pdf 1.2 is ok,since 1.3 error

Posted by "yuedaxia (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yuedaxia updated PDFBOX-925:
----------------------------

    Attachment: java_Enter_Best-03.PDF

This file is encoded as Chinese, but PDFBox identifies it as Korean.

> ExtractText china  pdf  ,but pdfbox  distinguish  Korea,The  pdf 1.2 is ok,since 1.3 error
> ------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-925
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-925
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.3.1, 1.4.0
>         Environment: Windows XP
>            Reporter: yuedaxia
>            Priority: Critical
>             Fix For: 1.5.0
>
>         Attachments: java_Enter_Best-03.PDF
>
>




[jira] Resolved: (PDFBOX-925) ExtractText china pdf ,but pdfbox distinguish Korea,The pdf 1.2 is ok,since 1.3 error

Posted by "Andreas Lehmkühler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler resolved PDFBOX-925.
---------------------------------------

       Resolution: Fixed
    Fix Version/s: 1.5.0
         Assignee: Andreas Lehmkühler

I improved the encoding for CID-encoded fonts in revision 1062427. Find attached the result of the text extraction.

> ExtractText china  pdf  ,but pdfbox  distinguish  Korea,The  pdf 1.2 is ok,since 1.3 error
> ------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-925
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-925
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.3.1, 1.4.0
>         Environment: Windows XP
>            Reporter: yuedaxia
>            Assignee: Andreas Lehmkühler
>             Fix For: 1.5.0
>
>         Attachments: java_Enter_Best-03.PDF, PDFBOX925-java_Enter_Best-03.txt
>
>




[jira] Updated: (PDFBOX-925) ExtractText china pdf ,but pdfbox distinguish Korea,The pdf 1.2 is ok,since 1.3 error

Posted by "Andreas Lehmkühler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler updated PDFBOX-925:
--------------------------------------

    Attachment: PDFBOX925-java_Enter_Best-03.txt

> ExtractText china  pdf  ,but pdfbox  distinguish  Korea,The  pdf 1.2 is ok,since 1.3 error
> ------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-925
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-925
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.3.1, 1.4.0
>         Environment: Windows XP
>            Reporter: yuedaxia
>         Attachments: java_Enter_Best-03.PDF, PDFBOX925-java_Enter_Best-03.txt
>
>




[jira] Updated: (PDFBOX-925) ExtractText china pdf ,but pdfbox distinguish Korea,The pdf 1.2 is ok,since 1.3 error

Posted by "Andreas Lehmkühler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler updated PDFBOX-925:
--------------------------------------

         Priority: Major  (was: Critical)
    Fix Version/s:     (was: 1.5.0)

> ExtractText china  pdf  ,but pdfbox  distinguish  Korea,The  pdf 1.2 is ok,since 1.3 error
> ------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-925
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-925
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.3.1, 1.4.0
>         Environment: Windows XP
>            Reporter: yuedaxia
>         Attachments: java_Enter_Best-03.PDF
>
>

