Posted to dev@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2013/03/20 22:35:39 UTC

[Nutch Wiki] Update of "bin/nutch solrindex" by kiranchitturi

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "bin/nutch solrindex" page has been changed by kiranchitturi:
http://wiki.apache.org/nutch/bin/nutch%20solrindex?action=diff&rev1=5&rev2=6

  This class replaces the legacy approach used in Nutch <1.3 of indexing to Apache Lucene for subsequent search. We now pass a SolrURL (amongst other arguments) to post data crawled by Nutch to an Apache Solr core, where it can be searched.
  
  Note: This class currently commits once for all the reducers in one go. This is subject to change in subsequent versions of Nutch, as a commit can take a lot of resources (cache warming) and it is not always necessary to commit after solrindex, solrdedup or solrclean, especially if they are run immediately one after the other.
+ 
+ === Nutch 1.x ===
  
  Usage:
  {{{
@@ -28, +30 @@

  '''[-filter]''': Enable URL filtering.
  
  '''[-normalize]''': Enable URL normalizing.
+ 
+ === Nutch 2.x ===
+ 
+ {{{
+ Usage: SolrIndexerJob <solr url> (<batchId> | -all | -reindex) [-crawlId <id>]
+ }}}
  CommandLineOptions
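
As an illustration of the Nutch 2.x usage line above, here is a minimal sketch of how the arguments might be put together. The Solr URL and crawl id are hypothetical placeholders, not values from the wiki page, and the helper function only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical values (assumptions): adjust for your own installation.
SOLR_URL="http://localhost:8983/solr"
CRAWL_ID="webcrawl"

# Assemble a command line matching the usage string:
#   <solr url> (<batchId> | -all | -reindex) [-crawlId <id>]
# This sketch picks -all and supplies the optional -crawlId.
build_solrindex_cmd() {
  printf 'bin/nutch solrindex %s -all -crawlId %s\n' "$1" "$2"
}

# Print the command so it can be reviewed before being run by hand.
build_solrindex_cmd "$SOLR_URL" "$CRAWL_ID"
```

Replacing -all with a specific batch id, or with -reindex, follows the same pattern.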