Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2009/06/08 22:19:00 UTC

[Solr Wiki] Trivial Update of "SolrReplication" by JohnBennett

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by JohnBennett:
http://wiki.apache.org/solr/SolrReplication

The comment on the change is:
Made stylistic changes to the description of this feature.

------------------------------------------------------------------------------
  {{{
  <str name="confFiles">solrconfig_slave.xml:solrconfig.xml,x.xml,y.xml</str>
  }}}
- This ensures that 'solrconfig_slave.xml' will be saved as 'solrconfig.xml' in slave. All other files will be saved in their original name.
+ This ensures that 'solrconfig_slave.xml' will be saved as 'solrconfig.xml' on the slave. All other files will be saved in their original names.
  
- The file name can be anything in the master and it will be saved as the name after the colon ':'
+ The file name can be anything in the master, and it will be saved as the name after the colon ':'.
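The renaming rule described above can be sketched in a few lines of Python. This is only an illustration of the `source:target` convention, not Solr's own parsing code; the function name `parse_conf_files` is hypothetical.

```python
def parse_conf_files(value):
    """Map each master-side file name to the name it is saved under on the slave."""
    mapping = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # "source:target" renames the file; a bare name keeps its original name.
        source, sep, target = entry.partition(":")
        mapping[source] = target if sep else source
    return mapping


if __name__ == "__main__":
    conf = "solrconfig_slave.xml:solrconfig.xml,x.xml,y.xml"
    for source, target in parse_conf_files(conf).items():
        print(f"{source} -> {target}")
```

With the confFiles value shown above, only 'solrconfig_slave.xml' maps to a different name; 'x.xml' and 'y.xml' map to themselves.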
   
  === in slave: ===
  {{{
@@ -74, +74 @@

  }}}
  
  '''Note:''' 
- If you are not using cores, then you simply omit the "corename" parameter above in the masterUrl. To ensure that the url is correct just hit the url with a browser. You must get a status OK response.
+ If you are not using cores, then you simply omit the "corename" parameter above in the masterUrl. To ensure that the url is correct, just hit the url with a browser. You must get a status OK response.
  
  === Setting up a Repeater ===
- A master may be able to serve only so many slaves w/o affecting performance. Some times there are multiple data centers were the slaves are deployed. If each slave downloads the the index from a remote data center it may take up too much bandwidth. In that case some of the servers can be configured as a repeater . A repeater is nothing but a node that acts as a master as well as slave .
+ A master may be able to serve only so many slaves without affecting performance. Some organizations have deployed slave servers across multiple data centers. If each slave downloads the index from a remote data center, the resulting download may consume too much network bandwidth. To avoid performance degradation in cases like this, you can configure some of the slaves as repeaters.  A repeater is simply a node that acts as both a master and a slave.
-  * In that case both the master and slave configuration lists need to be present inside the !ReplicationHandler requestHandler in the solrconfig.xml. 
+  * To configure a server as a repeater, both the master and slave configuration lists need to be present inside the !ReplicationHandler requestHandler in the solrconfig.xml. 
  
  
  
@@ -99, +99 @@

  
  This feature relies on the !IndexDeletionPolicy feature of Lucene. Lucene exposes the different !IndexCommits and the files associated with each commit through this API. This enables us to quickly identify the files that need to be downloaded.
  
- True to the tradition of Solr all the operations are performed over a REST API. The !ReplicationHandler exposes a REST API for discovering the current index version and the files (and their metadata) associated with each version. The slave uses this API to find out the new files in master's index. The slave then finds out the files those need to be downloaded from the master. It requests (HTT GET) the master for the content of each file . This uses a custom format (akin to the http chunked encoding) to download full/part of each file. The downloaded files are stored in  a temp directory. After all the required files are downloaded the files from the temp directory are moved to the index directory and a 'commit'  command is issued.
+ True to the tradition of Solr, all operations are performed over a REST API. The !ReplicationHandler exposes a REST API for discovering the current index version and the files (and their metadata) associated with each version. The slave uses this API to discover the new files in the master's index. The slave then determines which of those files need to be downloaded from the master. It sends a request (using HTTP GET) to the master for the content of each file. This uses a custom format (akin to the HTTP chunked encoding) to download the full content or a part of each file. The slave stores downloaded files in a temp directory. Once all the required files are downloaded, the slave moves all the files to the index directory and issues a 'commit'  command.
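The step in which the slave decides what to fetch can be sketched as follows. The page does not say which metadata the slave compares, so this sketch assumes a comparison by file name and size; the function name and the sample file names are hypothetical, and this is not the !ReplicationHandler implementation.

```python
def files_to_download(master_files, local_files):
    """Return the master files that the slave lacks or holds at a different size.

    Both arguments map file name -> size in bytes. Comparing by size is an
    assumption made for this sketch only.
    """
    needed = []
    for name, size in master_files.items():
        if local_files.get(name) != size:
            needed.append(name)
    return sorted(needed)


if __name__ == "__main__":
    master = {"_1.fdt": 100, "_1.tis": 40, "segments_2": 20}
    local = {"_1.fdt": 100, "_1.tis": 35}
    print(files_to_download(master, local))
```

Files that already match are skipped, which is why only the changed and missing files need to travel over the network.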
  
  == HTTP API ==
  These commands can be invoked over HTTP on the !ReplicationHandler:
   * Abort copying snapshot from master to slave command : http://slave_host:port/solr/replication?command=abort
-  * Force a snapshot on master.This is useful to take periodic backups .command : http://master_host:port/solr/replication?command=snapshoot
+  * Force a snapshot on master. This is useful for taking periodic backups. command : http://master_host:port/solr/replication?command=snapshoot
   * Force a snap pull on slave from master command : http://slave_host:port/solr/replication?command=snappull
    * It is possible to pass an extra attribute 'masterUrl' or other attributes like 'compression' (or any other parameter which is specified in the <lst name="slave"> tag) to do a one-time replication from a master. This obviates the need for hardcoding the master in the slave.
   * Disable polling for snapshot from slave command : http://slave_host:port/solr/replication?command=disablepoll
@@ -116, +116 @@

   * Enable replication on master for all slaves : http://master_host:port/solr/replication?command=enablereplication
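A small helper for building these command URLs might look like the sketch below. The function name `replication_url` and the port 8983 are assumptions for illustration; the command names and the 'masterUrl' attribute are the ones listed above.

```python
from urllib.parse import urlencode


def replication_url(host, port, command, **params):
    """Build a ReplicationHandler command URL of the form shown in the list above."""
    # Extra keyword arguments (e.g. masterUrl) become additional query parameters.
    query = urlencode({"command": command, **params})
    return f"http://{host}:{port}/solr/replication?{query}"


if __name__ == "__main__":
    print(replication_url("slave_host", 8983, "abort"))
    # One-time pull from an explicitly named master (masterUrl is URL-encoded).
    print(replication_url(
        "slave_host", 8983, "snappull",
        masterUrl="http://master_host:8983/solr/replication",
    ))
```

The resulting URL can be requested with any HTTP client, or pasted into a browser as described earlier for checking the masterUrl.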
  
  == Hack to enable/disable master/slave in a node ==
- If a server needs to be turned into a master from a slave or if you wish to use the same solrconfig.xml for master and slave do as follows,
+ If a server needs to be turned into a master from a slave, or if you wish to use the same solrconfig.xml for both master and slave, do as follows:
  {{{
  <requestHandler name="/replication" class="solr.ReplicationHandler" >
    <lst name="${master:master}">
@@ -129, +129 @@

   </lst>
  </requestHandler>
  }}}
- when the master is started , pass in -Dslave=disabled and in the slave pass in -Dmaster=disabled this will change the tag value and it can become disable master/slave automatically
+ When the master is started, pass in -Dslave=disabled and in the slave pass in -Dmaster=disabled. These arguments will change the tag value for the requestHandler and thereby disable the unwanted functionality in each server.
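To see why the -D arguments have this effect, here is a simplified model of the `${name:default}` substitution in Python. It is an illustrative sketch, not Solr's property-substitution code; the function name `substitute` and the regular expression are assumptions.

```python
import re

# Matches ${name} and ${name:default}.
_PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def substitute(text, properties):
    """Replace each ${name:default} with the property value, or the default if unset."""
    def resolve(match):
        name, default = match.group(1), match.group(2)
        if name in properties:
            return properties[name]
        if default is not None:
            return default
        raise KeyError(f"no value or default for property {name!r}")
    return _PLACEHOLDER.sub(resolve, text)


if __name__ == "__main__":
    line = '<lst name="${master:master}">'
    print(substitute(line, {}))                      # no -Dmaster given
    print(substitute(line, {"master": "disabled"}))  # started with -Dmaster=disabled
```

Without the property, the tag keeps the name "master" and the master configuration is active. With -Dmaster=disabled, the list is named "disabled", a name the handler does not recognize as a master section, so that role is switched off.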
  
  == Admin Page for Replication ==
  inline:replication.png