Posted to dev@tika.apache.org by "Avtar Singh (JIRA)" <ji...@apache.org> on 2017/03/14 18:37:43 UTC

[jira] [Updated] (TIKA-2298) To improve object recognition parser so that it may work without external RESTful service setup

     [ https://issues.apache.org/jira/browse/TIKA-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avtar Singh updated TIKA-2298:
------------------------------
    Description: 
When ObjectRecognitionParser was built to do image recognition, Java
frameworks offered little support for it.  All the popular neural network
toolkits were written in C++ or Python.  Since nothing ran within the JVM, we
tried several ways to glue them to Tika (CLI, JNI, gRPC, REST).
This is slowly changing.  Deeplearning4j, the best-known neural network
library for the JVM, now supports importing models that were pre-trained in
Python/C++ based kits [5].

*Improvement:*
It would be nice to have an implementation of ObjectRecogniser that
doesn't require any external setup (such as installing native libraries or
starting REST services). Reasons: it is easier to distribute and it cuts the
I/O time.

  was:
When we built ObjectRecognitionParser to do image recognition, Java
frameworks offered little support for it.  All the popular neural network
toolkits were written in C++ or Python.  Since nothing ran within the JVM, we
tried several ways to glue them to Tika (CLI, JNI, gRPC, REST).
This is slowly changing.  Deeplearning4j, the best-known neural network
library for the JVM, now supports importing models that were pre-trained in
Python/C++ based kits [5].

*Improvement:*
It would be nice to have an implementation of ObjectRecogniser that
doesn't require any external setup (such as installing native libraries or
starting REST services). Reasons: it is easier to distribute and it cuts the
I/O time.


> To improve object recognition parser so that it may work without external RESTful service setup
> -----------------------------------------------------------------------------------------------
>
>                 Key: TIKA-2298
>                 URL: https://issues.apache.org/jira/browse/TIKA-2298
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.14
>            Reporter: Avtar Singh
>              Labels: ObjectRecognitionParser
>             Fix For: 1.15
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> When ObjectRecognitionParser was built to do image recognition, Java
> frameworks offered little support for it.  All the popular neural network
> toolkits were written in C++ or Python.  Since nothing ran within the JVM, we
> tried several ways to glue them to Tika (CLI, JNI, gRPC, REST).
> This is slowly changing.  Deeplearning4j, the best-known neural network
> library for the JVM, now supports importing models that were pre-trained in
> Python/C++ based kits [5].
> *Improvement:*
> It would be nice to have an implementation of ObjectRecogniser that
> doesn't require any external setup (such as installing native libraries or
> starting REST services). Reasons: it is easier to distribute and it cuts the
> I/O time.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)