Posted to dev@tika.apache.org by "Damiano (JIRA)" <ji...@apache.org> on 2014/08/14 16:46:12 UTC

[jira] [Created] (TIKA-1396) Embedded images in PDF documents

Damiano created TIKA-1396:
-----------------------------

             Summary: Embedded images in PDF documents
                 Key: TIKA-1396
                 URL: https://issues.apache.org/jira/browse/TIKA-1396
             Project: Tika
          Issue Type: Bug
          Components: cli
    Affects Versions: 1.5
         Environment: OS: 
Ubuntu 14.04.1 LTS

KERNEL:
3.13.0-33-generic 
gcc version 4.8.2

JAVA:
java version "1.8.0_11"
Java(TM) SE Runtime Environment (build 1.8.0_11-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.11-b03, mixed mode)

            Reporter: Damiano
            Priority: Critical


Hello!
I just found a problem with PDF documents that have embedded images.

When I run:

java -jar tika-app-1.5.jar --extract tika.pdf

Tika does not find the image.

Is this a PDF-related problem? If I do the same operation with a DOC document, Tika finds the image correctly.




--
This message was sent by Atlassian JIRA
(v6.2#6252)