Posted to dev@tika.apache.org by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2009/09/28 12:48:15 UTC
[jira] Created: (TIKA-292) PDFBox is too verbose
PDFBox is too verbose
---------------------
Key: TIKA-292
URL: https://issues.apache.org/jira/browse/TIKA-292
Project: Tika
Issue Type: Improvement
Components: parser
Reporter: Jukka Zitting
Priority: Minor
PDFBox 0.8 logs INFO messages for all PDF primitives that are not enabled in the respective PDFBox configuration. Many of these primitives are explicitly not needed for text extraction, so there's no point in logging so much about them.
Until this is fixed in PDFBox, we should work around it in Tika.
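One possible shape for such a workaround, as a minimal sketch only: raise the threshold of the logger that emits the noise so INFO records are dropped while warnings still get through. This assumes the messages pass through java.util.logging and that the logger is named "org.apache.pdfbox". Both are assumptions, not something this issue states, and the class and method names are hypothetical. It is not the change made in Tika.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class QuietPdfLogging {
    // ASSUMPTION: the logger name is illustrative. Check which name and
    // which logging framework your PDFBox version actually uses.
    static final String ASSUMED_LOGGER_NAME = "org.apache.pdfbox";

    // Keep a strong reference: java.util.logging holds loggers weakly, so an
    // unreferenced logger can be garbage collected and lose its level.
    private static Logger quieted;

    // Raise the threshold to WARNING so INFO and lower records are dropped.
    static Logger quiet(String loggerName) {
        Logger logger = Logger.getLogger(loggerName);
        logger.setLevel(Level.WARNING);
        quieted = logger;
        return logger;
    }

    public static void main(String[] args) {
        Logger logger = quiet(ASSUMED_LOGGER_NAME);
        logger.info("unsupported PDF primitive");   // suppressed
        logger.warning("something actually wrong"); // still emitted
    }
}
```

Setting the level on the parent logger also covers child loggers that have no level of their own, which is why a package-level name is used here rather than a single class name.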
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Resolved: (TIKA-292) PDFBox is too verbose
Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jukka Zitting resolved TIKA-292.
--------------------------------
Resolution: Fixed
Fix Version/s: 0.5
Assignee: Jukka Zitting
Fixed in revision 819503.