Posted to dev@tika.apache.org by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/21 12:49:00 UTC

[jira] [Created] (TIKA-2533) Improve embedded image extraction in PDFs

Tim Allison created TIKA-2533:
---------------------------------

             Summary: Improve embedded image extraction in PDFs
                 Key: TIKA-2533
                 URL: https://issues.apache.org/jira/browse/TIKA-2533
             Project: Tika
          Issue Type: Improvement
            Reporter: Tim Allison
            Priority: Minor


In PDFBOX-4043, [~tilman] pinged us to fix a parallel bug in our image extraction. Given that we copy/paste from PDFBox's {{ExtractImages}}, we should fix that bug and consider refactoring our PDFParser a bit to make future copying/pasting from {{ExtractImages}} easier.


