Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2010/02/16 23:13:56 UTC

[Solr Wiki] Trivial Update of "HadoopIndexing" by JasonRutherglen

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "HadoopIndexing" page has been changed by JasonRutherglen.
http://wiki.apache.org/solr/HadoopIndexing?action=diff&rev1=2&rev2=3

--------------------------------------------------

  
  CSVIndexer is provided as an example; for your own application you will need to write your own CSVIndexer-like class.  CSVIndexer extends org.apache.hadoop.conf.Configured so that it can be instantiated from the Hadoop command line.  Your CSVIndexer-like class sets your custom mapper and sets the output format to SolrOutputFormat.  Use SolrDocumentConverter.setSolrDocumentConverter to register your custom SolrDocumentConverter.
  
- == SolrDocumentConverter == 
+ == SolrDocumentConverter ==
  
  Implement this class to convert an object from a proprietary format into a SolrInputDocument that may be indexed into Solr.
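
  To make the idea of a converter concrete, here is a minimal, self-contained sketch of the pattern.  It deliberately does not use the real SolrDocumentConverter or SolrInputDocument classes: the DocumentConverter interface, the CsvLineConverter class, and the field names are all hypothetical stand-ins, and a plain Map plays the role of the document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConverterSketch {

    /** Hypothetical converter contract (NOT the real Solr class): one source record in, document fields out. */
    public interface DocumentConverter<T> {
        Map<String, Object> convert(T record);
    }

    /** Hypothetical example: maps the columns of a comma-separated line onto a fixed list of field names. */
    public static class CsvLineConverter implements DocumentConverter<String> {
        private final String[] fieldNames;

        public CsvLineConverter(String... fieldNames) {
            this.fieldNames = fieldNames;
        }

        @Override
        public Map<String, Object> convert(String line) {
            // A limit of -1 keeps trailing empty columns instead of dropping them.
            String[] values = line.split(",", -1);
            Map<String, Object> doc = new LinkedHashMap<>();
            for (int i = 0; i < fieldNames.length && i < values.length; i++) {
                String value = values[i].trim();
                if (!value.isEmpty()) {
                    // Skip empty columns so the document only carries real values.
                    doc.put(fieldNames[i], value);
                }
            }
            return doc;
        }
    }

    public static void main(String[] args) {
        DocumentConverter<String> converter = new CsvLineConverter("id", "title", "author");
        System.out.println(converter.convert("42, Hadoop indexing, jason"));
    }
}
```

  The naive split on commas does not handle quoted fields; a real converter for CSV input would use a proper CSV parser and would populate a SolrInputDocument rather than a Map.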
  
- == Heartbeater == 
+ == Heartbeater ==
  
  Hadoop will try to kill a running task if it doesn't receive a periodic heartbeat.  The HeartBeater class therefore runs in a background thread and continuously notifies Hadoop that the current task is still executing.  This is necessary with Solr and Lucene because some tasks, such as optimizing a large shard, can take anywhere from several minutes to several hours.
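
  The background-thread pattern described above can be sketched with plain JDK classes.  This is an illustration of the technique only, not the HeartBeater source from the contrib: the ProgressReporter interface is a hypothetical stand-in for whatever progress callback Hadoop hands to the task, and the class and method names here are assumptions.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class HeartBeaterSketch {

    /** Hypothetical stand-in for Hadoop's progress callback (not the real Hadoop type). */
    public interface ProgressReporter {
        void progress();
    }

    /** Daemon thread that pings the reporter at a fixed interval until shut down. */
    public static class HeartBeater extends Thread {
        private final ProgressReporter reporter;
        private final long intervalMillis;
        private volatile boolean running = true;

        public HeartBeater(ProgressReporter reporter, long intervalMillis) {
            this.reporter = reporter;
            this.intervalMillis = intervalMillis;
            setDaemon(true); // never keep the JVM alive just for heartbeats
        }

        @Override
        public void run() {
            while (running) {
                reporter.progress(); // tell the framework the task is still alive
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    return; // shutdown() interrupts the sleep so the thread exits promptly
                }
            }
        }

        public void shutdown() {
            running = false;
            interrupt();
        }
    }

    /** Runs a long task while heartbeating and returns the number of heartbeats sent. */
    public static int runWithHeartbeat(Runnable longTask, long intervalMillis) {
        AtomicInteger beats = new AtomicInteger();
        HeartBeater heartBeater = new HeartBeater(beats::incrementAndGet, intervalMillis);
        heartBeater.start();
        try {
            longTask.run(); // e.g. optimizing a large shard
        } finally {
            heartBeater.shutdown();
            try {
                heartBeater.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            }
        }
        return beats.get();
    }

    public static void main(String[] args) {
        int beats = runWithHeartbeat(() -> {
            try {
                Thread.sleep(500); // stands in for a slow indexing step
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, 100);
        System.out.println("heartbeats sent: " + beats);
    }
}
```

  Stopping the heartbeat in a finally block matters: if the long task throws, the thread should not keep reporting progress for a task that has already failed.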