Posted to dev@tika.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/06/12 03:27:21 UTC

[jira] [Commented] (TIKA-1993) Image Recognition with Tika

    [ https://issues.apache.org/jira/browse/TIKA-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15326163#comment-15326163 ] 

ASF GitHub Bot commented on TIKA-1993:
--------------------------------------

GitHub user thammegowda opened a pull request:

    https://github.com/apache/tika/pull/125

    TIKA-1993: ObjectRecognitionParser + Tensorflow image recognition with Inception-V3 model as default implementation

    Summary of changes:
    
    - Fixed TIKA-2002: `ExternalParser.check()` now drains the stdout and stderr buffers, so it should no longer hang
    - Added `ObjectRecognitionParser`, `ObjectRecogniser`, and `RecognisedObject`: a parser, an interface, and a model class, respectively
    - Implemented `TensorFlowImageRecParser`, an `ExternalParser` that downloads the TensorFlow `image_classify.py` script if it is missing and then calls it (the script in turn downloads the Inception-v3 model)
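
As a general illustration of why draining the buffers helps with the TIKA-2002 hang: a child process that writes more than the OS pipe buffer can hold blocks until someone reads its output, so a parent that only waits for the exit code can wait forever. The sketch below shows the idea in Python rather than Java. It is not the code in this pull request, and `run_check` is a hypothetical name.

```python
import subprocess
import sys


def run_check(command, timeout=30):
    """Run `command` and return (exit_code, stdout_bytes, stderr_bytes).

    communicate() reads both pipes until EOF while it waits, so the child
    can never block on a full stdout or stderr pipe buffer.
    """
    proc = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    try:
        out, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        out, err = proc.communicate()  # reap the child, collect what is left
    return proc.returncode, out, err


if __name__ == "__main__":
    # Hypothetical noisy child: writes far more than a typical pipe buffer
    # to both streams before exiting.
    noisy = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write('x' * 1000000); "
        "sys.stderr.write('y' * 1000000)",
    ]
    code, out, err = run_check(noisy)
    print(code, len(out), len(err))
```

Calling `proc.wait()` in place of `communicate()` here, with both streams piped and unread, is the pattern that can deadlock.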
    
    
    ---
    ## Quick Setup and Test
    - Install TensorFlow using pip: https://www.tensorflow.org/versions/r0.9/get_started/os_setup.html#pip-installation
    - Check out the test case `tika-parsers/src/test/java/org/apache/tika/parser/recognition/ObjectRecognitionParserTest.java`
    
    ## Demos
    Compile and package: `mvn clean install` (add `-DskipTests` if you don't want to wait for the tests)
    
    Let's check:
    - (For animal lovers) On an image of a cat at https://raw.githubusercontent.com/apache/tika/master/tika-parsers/src/test/resources/test-documents/testJPEG.jpg
    ```
    java -jar tika-app/target/tika-app-1.14-SNAPSHOT.jar \
     --config=tika-parsers/src/test/resources/org/apache/tika/parser/recognition/tika-config-tflow.xml \
      tika-parsers/src/test/resources/test-documents/testJPEG.jpg
    ```
    ```xml
    <meta name="English foxhound" content="0.02759"/>
    <meta name="Egyptian cat" content="0.09168"/>
    <meta name="collie" content="0.02982"/>
    <meta name="bluetick" content="0.06043"/>
    <meta name="Border collie" content="0.07553"/>
    ```
    - (For law-keepers) On a rifle at https://upload.wikimedia.org/wikipedia/commons/thumb/3/38/US_Navy_100714-N-4965F-174_Chief_Mass_Communication_Specialist_Paula_Ludwick%2C_assigned_to_Fleet_Combat_Camera_Group_Pacific%2C_shoots_at_a_target_during_a_Navy_Rifle_Qualification_Course.jpg/220px-thumbnail.jpg
    ```xml
    <meta name="projectile, missile" content="0.00034"/>
    <meta name="military uniform" content="0.00763"/>
    <meta name="bulletproof vest" content="0.00489"/>
    <meta name="assault rifle, assault gun" content="0.92418"/>
    <meta name="rifle" content="0.04343"/>
    ```
    - (For law-keepers) On a revolver at https://upload.wikimedia.org/wikipedia/commons/8/8d/Glock17.jpg
    ```xml
    <meta name="power drill" content="0.00470"/>
    <meta name="revolver, six-gun, six-shooter" content="0.69355"/>
    <meta name="holster" content="0.21180"/>
    <meta name="assault rifle, assault gun" content="0.01513"/>
    <meta name="rifle" content="0.01053"/>
    ```
    - (For car enthusiasts) On a car at http://www.trbimg.com/img-57226a08/turbine/ct-tesla-model-3-unveiling-20160404/650/650x366
    ```xml
    <meta name="car wheel" content="0.02527"/>
    <meta name="convertible" content="0.01338"/>
    <meta name="sports car, sport car" content="0.87855"/>
    <meta name="beach wagon, station wagon, wagon, estate car, beach waggon, station waggon, waggon" content="0.00903"/>
    <meta name="minivan" content="0.01217"/>
    ```
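
The `<meta>` elements in the demo output are not ordered by score. A minimal sketch of how a consumer might rank them, assuming the elements have already been collected into a string. `top_labels` is a hypothetical helper, and the `<root>` wrapper exists only to make the fragment well-formed XML. The sample values are copied from the cat demo.

```python
import xml.etree.ElementTree as ET

# Sample copied from the cat demo output.
SAMPLE = """
<meta name="English foxhound" content="0.02759"/>
<meta name="Egyptian cat" content="0.09168"/>
<meta name="collie" content="0.02982"/>
<meta name="bluetick" content="0.06043"/>
<meta name="Border collie" content="0.07553"/>
"""


def top_labels(meta_fragment, min_confidence=0.0):
    """Return (label, confidence) pairs, most confident first."""
    root = ET.fromstring("<root>" + meta_fragment + "</root>")
    pairs = []
    for meta in root.iter("meta"):
        try:
            score = float(meta.get("content", ""))
        except ValueError:
            continue  # skip meta elements whose content is not a number
        if score >= min_confidence:
            pairs.append((meta.get("name"), score))
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for label, score in top_labels(SAMPLE, min_confidence=0.05):
        print(f"{score:.5f}  {label}")
```

The `min_confidence` threshold is an assumption for illustration. Skipping non-numeric `content` values keeps the helper usable if other metadata elements appear alongside the recognition results.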
    
    
    NOTE:
    1. The most efficient way to use TensorFlow would be to call its C++ API via JNI. I haven't had a chance to learn that yet, so help is needed to make this efficient. Alternatively, we could wait for the TensorFlow folks to offer Java bindings! Right now, the script loads and unloads the image recognition model on every call (200MB of disk reads per parse call, which is very inefficient!).
    2. The very first call takes a long time because the model is downloaded lazily.
    3. Only `image/jpeg` is supported. PNG support is coming later.
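
The cost described in note 1 comes from paying for the model load on every parse call. A minimal sketch of the load-once alternative, assuming a long-lived process serves many images. `load_model`, `classify`, and the `inception-v3.pb` filename are hypothetical stand-ins, and no TensorFlow call is made.

```python
import functools


@functools.lru_cache(maxsize=1)
def load_model(model_path):
    """Stand-in for the expensive model load (hypothetical, no TensorFlow)."""
    # A real implementation would read the model file from model_path here.
    return {"path": model_path}


def classify(image_path, model_path="inception-v3.pb"):
    """Classify one image, reusing the model cached in this process."""
    model = load_model(model_path)
    # Placeholder result: a real implementation would run inference here.
    return {"image": image_path, "model": model["path"]}


if __name__ == "__main__":
    for name in ["a.jpg", "b.jpg", "c.jpg"]:
        classify(name)
    # cache_info() reports how often the load ran versus was reused.
    print(load_model.cache_info())
```

The cache only helps if the process stays alive between parse calls. Launching a fresh script per call, as the current implementation does, starts with an empty cache each time.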
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/thammegowda/tika TIKA-1993

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tika/pull/125.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #125
    
----
commit 9b5dc7fae4456b12b75ec21d050b9439e6527c47
Author: Thamme Gowda <th...@apache.org>
Date:   2016-06-12T02:04:08Z

    External Parser now have consumer for ignored lines,  Fix TIKA-2002

commit eccc15387f0d4a5c62d8d12e6579878dba2f52a8
Author: Thamme Gowda <th...@apache.org>
Date:   2016-06-12T02:04:28Z

    Added an utility to load and insatiate classes

commit 2184e2c2c2a0e507be6be4f9692e0fab5b38a476
Author: Thamme Gowda <th...@apache.org>
Date:   2016-06-12T02:04:49Z

    Object recognition parser, tensorflow based implementation, and test cases for these

commit 0305cfb402f5d5e289533411d5737e1e832888ac
Author: Thamme Gowda <th...@apache.org>
Date:   2016-06-12T02:43:07Z

    Explicit Locale

----


> Image Recognition with Tika 
> ----------------------------
>
>                 Key: TIKA-1993
>                 URL: https://issues.apache.org/jira/browse/TIKA-1993
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Thamme Gowda
>
> Create "ImageRecognitionParser" which can have pluggable implementation for core recognition logic.
> As the name suggests, this parser should detect objects in the images, and support many implementations + models (similar to what NamedEntityParser did for text).
> Supply a default implementation based on Tensorflow with the current state-of-the-art model [1].
> Links:
> [1] https://www.tensorflow.org/versions/r0.8/tutorials/image_recognition/index.html#usage-with-python-api


