Posted to dev@tika.apache.org by "Egbert (JIRA)" <ji...@apache.org> on 2016/08/01 12:23:20 UTC
[jira] [Commented] (TIKA-2045) TIKA crashes / runs out of memory on simple PDF
[ https://issues.apache.org/jira/browse/TIKA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401938#comment-15401938 ]
Egbert commented on TIKA-2045:
------------------------------
Thanks for investigating and reporting it to PDFBox. I'll subscribe to PDFBOX-3442 to keep track of a possible solution!
> TIKA crashes / runs out of memory on simple PDF
> -----------------------------------------------
>
> Key: TIKA-2045
> URL: https://issues.apache.org/jira/browse/TIKA-2045
> Project: Tika
> Issue Type: Bug
> Components: core
> Affects Versions: 1.13
> Environment: Linux, Java 8
> Reporter: Egbert
>
> We're using Tika embedded in a web crawler, and today I encountered a PDF that causes OutOfMemory errors while being processed by Tika.
> It's a small, one-page PDF file, so I don't think it should consume that much memory.
> I verified the problem with the GUI from the tika-app-1.13.jar file, which produces the same error on the same file. The file can be found at:
> http://www.spesmea.nl/pdf/algemene_voorwaarden_bbztcn_2010_nl.pdf
> If I can help by providing any additional information, please let me know.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)