Posted to dev@pdfbox.apache.org by "Christian Czech (JIRA)" <ji...@apache.org> on 2012/07/23 17:57:34 UTC
[jira] [Created] (PDFBOX-1362) Slovakian characters
Christian Czech created PDFBOX-1362:
---------------------------------------
Summary: Slovakian characters
Key: PDFBOX-1362
URL: https://issues.apache.org/jira/browse/PDFBOX-1362
Project: PDFBox
Issue Type: Bug
Components: Text extraction
Affects Versions: 1.7.0
Environment: Windows XP, Java 1.6.0_33
Reporter: Christian Czech
Hello,
I have a PDF document with Slovakian characters:
Hlavní administrátor
My code:
PDDocument document = null;
document = PDDocument.load(pdfFile, true);
PDFTextStripper stripper = null;
stripper = new PDFTextStripper("ISO-8859-2");
stripper.getText(document);
I always get this result: Hlavn\? administr\ ?tor
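A minimal sketch, not part of the original report, for ruling out the target charset as the limiting factor. It uses only the Java standard library and assumes the JRE ships ISO-8859-2 (it is not one of the charsets Java guarantees, though it is commonly available). The class and method names are hypothetical. It checks that both accented characters in the expected text are encodable in ISO-8859-2, and shows how an unmappable character degrades to '?' when a narrower charset such as US-ASCII is used. It says nothing about where PDFBox 1.7.0 loses the characters.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetCheck {
    // True if every character of text can be encoded in the named charset.
    public static boolean canEncode(String text, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    // Round-trips text through a charset; String.getBytes replaces
    // unmappable characters with the charset's replacement byte ('?').
    public static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Expected extraction result, written with Unicode escapes so the
        // example does not depend on the source file encoding.
        String expected = "Hlavn\u00ED administr\u00E1tor";

        // Both accented letters exist in ISO-8859-2.
        System.out.println(canEncode(expected, "ISO-8859-2"));

        // In US-ASCII they do not, so each one becomes '?'.
        System.out.println(roundTrip(expected, StandardCharsets.US_ASCII));
    }
}
```

If the first check holds, the encoding name passed to PDFTextStripper can represent the text, so the loss would have to happen before the output is encoded.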
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PDFBOX-1362) Slovakian characters
Posted by "Christian Czech (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PDFBOX-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Christian Czech updated PDFBOX-1362:
------------------------------------
Attachment: test_7_2_test.pdf
[jira] [Commented] (PDFBOX-1362) Slovakian characters
Posted by "Andreas Lehmkühler (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PDFBOX-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490524#comment-13490524 ]
Andreas Lehmkühler commented on PDFBOX-1362:
--------------------------------------------
The most recent version is 1.7.1. There isn't any plan for a next release yet.
Please don't hijack JIRAs for such questions. Use our mailing lists instead [1].
[1] http://pdfbox.apache.org/mail-lists.html
> Slovakian characters
> --------------------
>
> Key: PDFBOX-1362
> URL: https://issues.apache.org/jira/browse/PDFBOX-1362
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 1.7.0
> Environment: Windows XP, Java 1.6.0_33
> Reporter: Christian Czech
> Assignee: Andreas Lehmkühler
> Fix For: 1.8.0
>
> Attachments: PDFBOX-1362.patch, test_7_2_test.pdf
>
>
> Hello,
> I have a PDF document with Slovakian characters:
> Hlavní administrátor
> My code:
> PDDocument document = null;
> document = PDDocument.load(pdfFile, true);
> PDFTextStripper stripper = null;
> stripper = new PDFTextStripper("ISO-8859-2");
> stripper.getText(document);
> I always get this result: Hlavn\? administr\ ?tor
[jira] [Commented] (PDFBOX-1362) Slovakian characters
Posted by "Joe Lee (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PDFBOX-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489024#comment-13489024 ]
Joe Lee commented on PDFBOX-1362:
---------------------------------
Andreas,
Could you point me to the URL where I can download the latest PDFBox jar file? Version 2.0.0, or at least 1.8.0. Thanks.
Joe
[jira] [Resolved] (PDFBOX-1362) Slovakian characters
Posted by "Andreas Lehmkühler (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PDFBOX-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andreas Lehmkühler resolved PDFBOX-1362.
----------------------------------------
Resolution: Fixed
Fix Version/s: 1.8.0
Assignee: Andreas Lehmkühler
I applied the fix in revision 1404698 as proposed.
Thanks for the contribution!