Posted to dev@tika.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/08/24 20:18:00 UTC

[jira] [Commented] (TIKA-2298) To improve object recognition parser so that it may work without external RESTful service setup

    [ https://issues.apache.org/jira/browse/TIKA-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140622#comment-16140622 ] 

ASF GitHub Bot commented on TIKA-2298:
--------------------------------------

boegel commented on issue #182: Creation of TIKA-2298 contributed by asmehra95- Import of vgg16 via Deeplearning4j into tika-dl
URL: https://github.com/apache/tika/pull/182#issuecomment-324745266
 
 
   I'm trying to build and install Tika 1.16 from source, and I'm running into a failing test; it seems like this test was added in this PR.
   
   Any pointers to what is wrong here? How can I debug this further?
   
   ```
   Running org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
   SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
   SLF4J: Defaulting to no-operation (NOP) logger implementation
   SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
   Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 5.559 sec <<< FAILURE! - in org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
   recognise(org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest)  Time elapsed: 5.556 sec  <<< ERROR!
   org.apache.tika.exception.TikaConfigException: Read timed out
           at java.net.SocketInputStream.socketRead0(Native Method)
           at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
           at java.net.SocketInputStream.read(SocketInputStream.java:171)
           at java.net.SocketInputStream.read(SocketInputStream.java:141)
           at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
           at sun.security.ssl.InputRecord.read(InputRecord.java:503)
           at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
           at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:940)
           at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
           at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
           at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
           at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
           at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
           at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
           at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1569)
           at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474)
           at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:254)
           at org.apache.commons.io.FileUtils.copyURLToFile(FileUtils.java:1506)
           at org.apache.tika.dl.imagerec.DL4JInceptionV3Net.cachedDownload(DL4JInceptionV3Net.java:216)
           at org.apache.tika.dl.imagerec.DL4JInceptionV3Net.initialize(DL4JInceptionV3Net.java:232)
           at org.apache.tika.parser.recognition.ObjectRecognitionParser.initialize(ObjectRecognitionParser.java:101)
           at org.apache.tika.config.TikaConfig$XmlLoader.loadOne(TikaConfig.java:638)
           at org.apache.tika.config.TikaConfig$XmlLoader.loadOverall(TikaConfig.java:550)
           at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:187)
           at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:168)
           at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:161)
           at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:157)
           at org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest.recognise(DL4JInceptionV3NetTest.java:33)
   
   Running org.apache.tika.dl.imagerec.DL4JVGG16NetTest
   Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.6 sec - in org.apache.tika.dl.imagerec.DL4JVGG16NetTest
   
   Results :
   
   Tests in error:
     DL4JInceptionV3NetTest.recognise:33 » TikaConfig Read timed out
   
   Tests run: 2, Failures: 0, Errors: 1, Skipped: 0
   ```
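
For context on the trace above: the timeout is raised inside `FileUtils.copyURLToFile`, called from `DL4JInceptionV3Net.cachedDownload` while `TikaConfig` initializes the parser, so the test appears to depend on a network download completing in time. The following is only a minimal sketch of a cached download with explicit timeouts, written against the plain JDK. The class name `CachedDownloadSketch`, the method signature, and the timeout parameters are hypothetical and are not Tika's actual code. (Commons IO also offers a `copyURLToFile(URL, File, int, int)` overload that takes connection and read timeouts.)

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper, not part of Tika: fetch a file once, with explicit timeouts. */
final class CachedDownloadSketch {

    private CachedDownloadSketch() {
    }

    static File cachedDownload(URL source, File cacheFile,
                               int connectTimeoutMs, int readTimeoutMs) throws IOException {
        // Reuse an earlier download so repeated runs do not touch the network.
        if (cacheFile.isFile() && cacheFile.length() > 0) {
            return cacheFile;
        }

        Path target = cacheFile.toPath().toAbsolutePath();
        Path dir = target.getParent();
        if (dir == null) {
            throw new IOException("Cache file has no parent directory: " + cacheFile);
        }
        Files.createDirectories(dir);

        URLConnection connection = source.openConnection();
        // A value of 0 would mean "wait forever"; pass positive values to fail fast.
        connection.setConnectTimeout(connectTimeoutMs);
        connection.setReadTimeout(readTimeoutMs);

        // Write to a temporary file first, so an interrupted transfer
        // never leaves a truncated file at the cache location.
        Path partial = Files.createTempFile(dir, "download", ".part");
        try (InputStream in = connection.getInputStream()) {
            Files.copy(in, partial, StandardCopyOption.REPLACE_EXISTING);
            Files.move(partial, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; cleans up after a failed transfer.
            Files.deleteIfExists(partial);
        }
        return cacheFile;
    }
}
```

A caller would pass the model URL, a cache location, and timeouts in milliseconds, for example `CachedDownloadSketch.cachedDownload(url, new File(cacheDir, "model.bin"), 10_000, 60_000)`, where `url` and `cacheDir` are placeholders. Once the file is cached, later calls return it without opening a connection.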
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> To improve object recognition parser so that it may work without external RESTful service setup
> -----------------------------------------------------------------------------------------------
>
>                 Key: TIKA-2298
>                 URL: https://issues.apache.org/jira/browse/TIKA-2298
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.14
>            Reporter: Avtar Singh
>            Assignee: Chris A. Mattmann
>              Labels: ObjectRecognitionParser, gsoc, memex
>             Fix For: 1.16
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> When ObjectRecognitionParser was built to do image recognition, there wasn't
> good support for Java frameworks.  All the popular neural networks were in
> C++ or Python.  Since there was nothing that ran within the JVM, we tried
> several ways to glue them to Tika (such as CLI, JNI, gRPC, REST).
> However, this is slowly changing. Deeplearning4j, the most famous
> neural network library for the JVM, now supports importing models that are
> pre-trained in Python/C++ based kits [5].
> *Improvement:*
> It would be nice to have an implementation of ObjectRecogniser that
> doesn't require any external setup (like installation of native libraries or
> starting REST services). Reasons: it is easier to distribute and it cuts the IO
> time.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)