Posted to user@nutch.apache.org by Yusniel Hidalgo Delgado <yh...@uci.cu> on 2011/11/10 22:20:13 UTC

Nutch 1.3 error with solr 3.4

Greetings, 
I am trying to integrate Nutch 1.3 with Solr 3.4. I am using the bin/nutch crawl command with the solr parameter, but before the process finishes completely, I get the following output in my terminal:

SolrIndexer: starting at 2011-11-10 15:58:39 
java.io.IOException: Job failed! 
SolrDeleteDuplicates: starting at 2011-11-10 15:58:44 
SolrDeleteDuplicates: Solr url: http://localhost:8983/solr/ 
SolrDeleteDuplicates: finished at 2011-11-10 15:58:45, elapsed: 00:00:01 
crawl finished: ../data 

I think something is wrong, because the job fails with java.io.IOException. The last lines in my hadoop.log are:

no segments* file found in org.apache.lucene.store.NIOFSDirectory@/opt/apache-solr-3.4.0/example/solr/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@9b1670: files: [write.lock]
org.apache.lucene.index.IndexNotFoundException: no segments* file found in org.apache.lucene.store.NIOFSDirectory@/opt/apache-solr-3.4.0/example/solr/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@9b1670: files: [write.lock]
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:712)
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:593)
at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:359)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1152)
at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:83)
at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:101)
at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:175)
at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:223)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:115)
at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:158)
at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:79)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:67)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.se

request: http://localhost:8983/solr/update?wt=javabin&version=2 
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:436) 
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:245) 
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105) 
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49) 
at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:82) 
at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48) 
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:474) 
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411) 
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216) 
2011-11-10 15:58:44,309 ERROR solr.SolrIndexer - java.io.IOException: Job failed! 
2011-11-10 15:58:44,311 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: starting at 2011-11-10 15:58:44 
2011-11-10 15:58:44,311 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: Solr url: http://localhost:8983/solr/ 
2011-11-10 15:58:45,512 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: finished at 2011-11-10 15:58:45, elapsed: 00:00:01 
2011-11-10 15:58:45,512 INFO crawl.Crawl - crawl finished: ../data 

Any ideas?

Greetings 
-- 

-------------------------------------------------------------------------------------------- 
Yusniel Hidalgo Delgado 
Universidad de las Ciencias Informáticas 
https://twitter.com/#!/yhdelgado 
La Habana, Cuba. 
-------------------------------------------------------------------------------------------- 



End the injustice, FREEDOM NOW FOR OUR FIVE COMPATRIOTS WHO ARE UNJUSTLY HELD IN US PRISONS!
http://www.antiterroristas.cu
http://justiciaparaloscinco.wordpress.com


Re: Nutch 1.3 error with solr 3.4

Posted by Yusniel Hidalgo Delgado <yh...@uci.cu>.
Hi Markus, thanks for your quick reply. 

I restarted my Solr instance, but it still does not work. I am following the Nutch Tutorial from the wiki on the Nutch site; however, I am not sure whether this tutorial works with Nutch 1.3 and Solr 3.4. I am also reading other tutorials on the Internet. Those tutorials say it is necessary to copy schema.xml from the Nutch configuration into the Solr configuration; however, the schema.xml shipped with Solr 3.4 has other options that are not present in the Nutch schema.xml file. Is it possible that my current integration between Nutch 1.3 and Solr 3.4 does not work because of this?
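
For reference, the schema copy step those tutorials describe could be sketched roughly as below. This is only an illustration in Python, not something taken from the tutorials: the function name install_schema and both directory arguments are hypothetical, and the Solr conf path in the usage note is an assumption inferred from the index path in the log.

```python
import os
import shutil


def install_schema(nutch_conf, solr_conf):
    """Copy Nutch's schema.xml into Solr's conf directory.

    If Solr already has a schema.xml, it is first saved as
    schema.xml.orig so the stock file can be restored or compared.
    Returns the backup path, or None if there was nothing to back up.
    """
    src = os.path.join(nutch_conf, "schema.xml")
    dst = os.path.join(solr_conf, "schema.xml")
    backup = None
    if os.path.exists(dst):
        backup = dst + ".orig"
        shutil.copy2(dst, backup)  # keep the schema that came with Solr
    shutil.copy2(src, dst)  # overwrite with the Nutch schema
    return backup
```

It might be called as install_schema("/path/to/nutch/conf", "/opt/apache-solr-3.4.0/example/solr/conf"), where both paths are assumptions. Solr reads schema.xml at startup, so a restart is needed afterwards for the new schema to take effect.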

----- Original message -----

From: "Markus Jelsma" <ma...@openindex.io> 
To: user@nutch.apache.org 
Sent: Thursday, 10 November 2011 22:52:19 
Subject: Re: Nutch 1.3 error with solr 3.4 

Restart Solr. You likely manually removed parts of the index that it cannot 
recreate. A restart usually fixes this. 

> no segments* file found in 
> org.apache.lucene.store.NIOFSDirectory@/opt/apache-solr-3.4.0/example/solr 
> /data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@9b1670: 
> files: [write.lock] org.apache.lucene.index.IndexNotFoundException: no 
> segments* file found in 
> org.apache.lucene.store.NIOFSDirectory@/opt/apache-solr-3.4.0/example/solr 
> /data/index 







Re: Nutch 1.3 error with solr 3.4

Posted by Markus Jelsma <ma...@openindex.io>.
Restart Solr. You likely manually removed parts of the index that it cannot
recreate. A restart usually fixes this.

> no segments* file found in
> org.apache.lucene.store.NIOFSDirectory@/opt/apache-solr-3.4.0/example/solr
> /data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@9b1670:
> files: [write.lock] org.apache.lucene.index.IndexNotFoundException: no
> segments* file found in
> org.apache.lucene.store.NIOFSDirectory@/opt/apache-solr-3.4.0/example/solr
> /data/index
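
The quoted exception lists write.lock as the only file in the index directory, with no segments* file. A minimal sketch in Python for checking whether a directory is in that state before restarting Solr; the function name index_state and its return labels are made up for illustration and are not part of Solr, Lucene, or Nutch.

```python
import os
import tempfile


def index_state(index_dir):
    """Classify a Lucene index directory by the files it contains."""
    if not os.path.isdir(index_dir):
        return "missing"
    names = os.listdir(index_dir)
    # Lucene names its commit files segments_N (plus segments.gen).
    if any(name.startswith("segments") for name in names):
        return "has_segments"
    # Only the writer lock is left: the state shown in the log.
    if names == ["write.lock"]:
        return "lock_only"
    return "no_segments"


if __name__ == "__main__":
    # Demo on a throwaway directory that mimics the state in the log.
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, "write.lock"), "w").close()
        print(index_state(tmp))  # prints "lock_only"
```

Pointing it at the directory named in the trace (/opt/apache-solr-3.4.0/example/solr/data/index) and getting lock_only would match the reported error; restarting Solr is the fix suggested in this thread.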