Posted to dev@nutch.apache.org by Roannel Fernández Hernández <ro...@uci.cu> on 2015/11/02 22:05:23 UTC

Can't retrieve Tika parser for mime-type image/jpeg

Hi: 

I downloaded and compiled Nutch from trunk, but when I run parsechecker on a JPEG image I get this error: Can't retrieve Tika parser for mime-type image/jpeg

This is the content of my log file:

2015-11-02 10:50:57,421 INFO parse.ParserChecker - fetching: http://www.cubadebate.cu/wp-content/uploads/2015/11/air-china-3-150x125.jpg 
2015-11-02 10:50:57,897 INFO protocol.RobotRulesParser - robots.txt whitelist not configured. 
2015-11-02 10:50:57,897 INFO httpclient.Http - http.proxy.host = null 
2015-11-02 10:50:57,897 INFO httpclient.Http - http.proxy.port = 8080 
2015-11-02 10:50:57,897 INFO httpclient.Http - http.proxy.exception.list = false 
2015-11-02 10:50:57,897 INFO httpclient.Http - http.timeout = 60000 
2015-11-02 10:50:57,897 INFO httpclient.Http - http.content.limit = 1048576000 
2015-11-02 10:50:57,897 INFO httpclient.Http - http.agent = agent/Nutch-1.11 (Agent; admin@do.main) 
2015-11-02 10:50:57,897 INFO httpclient.Http - http.accept.language = en-us,en-gb,en;q=0.7,*;q=0.3 
2015-11-02 10:50:57,897 INFO httpclient.Http - http.accept = text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 
2015-11-02 10:50:58,582 ERROR tika.TikaParser - Can't retrieve Tika parser for mime-type image/jpeg 
2015-11-02 10:50:58,594 INFO crawl.SignatureFactory - Using Signature impl: org.apache.nutch.crawl.MD5Signature 
2015-11-02 10:50:58,602 INFO parse.ParserChecker - parsing: http://www.cubadebate.cu/wp-content/uploads/2015/11/air-china-3-150x125.jpg 
2015-11-02 10:50:58,602 INFO parse.ParserChecker - contentType: image/jpeg 
2015-11-02 10:50:58,602 INFO parse.ParserChecker - signature: bfdbe472ed3e43e686b4619b2c043d50 
2015-11-02 10:50:58,603 INFO parse.ParserChecker - --------- 

Thanks 
