Posted to user@nutch.apache.org by David Xiao <da...@gmail.com> on 2007/06/23 14:37:55 UTC

Integrate nutch crawler with Solr index server

Hello folks,

As the title says, I am having difficulty integrating Nutch with Solr. I tried to follow the instructions at http://blog.foofactory.fi/2007/02/online-indexing-integrating-nutch-with.html but I don’t really understand the Java code in that article. The article also doesn’t go into detail about configuring Solr. I have downloaded solr-client.zip, but what do I need to do with Nutch?

I would very much appreciate it if you could give an example.

Best Regards,

David

 

 

 


Re: Integrate nutch crawler with Solr index server

Posted by Brian Whitman <br...@variogr.am>.
On Jun 23, 2007, at 8:37 AM, David Xiao wrote:
> As the title says, I am having difficulty integrating Nutch with Solr.
> I tried to follow the instructions at
> http://blog.foofactory.fi/2007/02/online-indexing-integrating-nutch-with.html
> but I don’t really understand the Java code in that article. The
> article also doesn’t go into detail about configuring Solr. I have
> downloaded solr-client.zip, but what do I need to do with Nutch?


It's my understanding that the code Sami posted will no longer work  
with recent versions of Solr / solrj.

However, the Solr client (SOLR-20) was recently added to trunk; see
http://issues.apache.org/jira/browse/SOLR-20#action_12505314 . I sent
Sami a patch for the code he posted, and hopefully we'll see SolrIndexer
get into Nutch trunk sometime soon.

As for configuring Solr, that post does a good job of explaining it.
There's not much to it: just use the schema he posted and start Solr
normally.
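
For anyone who wants to see roughly what ends up being sent to Solr, here is a minimal sketch in plain Java. It is not the code from Sami's post and it does not use the SOLR-20 client; it only builds the XML <add> message that Solr's update handler accepts. The class name, method names, and the field names (id, title) are made up for illustration, and the fields you actually send have to match whatever schema you run Solr with.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Nutch or Solr.
public class SolrUpdateSketch {

    // Escape the characters that are special in XML text and attribute values.
    static String escapeXml(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Build a Solr XML <add> message holding a single document.
    // Each map entry becomes one <field name="...">value</field> element.
    static String toAddXml(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            xml.append("<field name=\"").append(escapeXml(e.getKey())).append("\">")
               .append(escapeXml(e.getValue()))
               .append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        // Assumed field names; replace them with the ones in your schema.
        Map<String, String> page = new LinkedHashMap<>();
        page.put("id", "http://example.com/");
        page.put("title", "Example & test");
        System.out.println(toAddXml(page));
    }
}
```

The resulting string would be POSTed to the Solr update URL, followed by a <commit/> message so the document becomes searchable. In a real setup the client library takes care of this step, so treat the above only as a picture of the data being sent.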