Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2011/08/18 09:02:59 UTC

[Solr Wiki] Trivial Update of "NewSolrCloudDesign" by NoblePaul

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "NewSolrCloudDesign" page has been changed by NoblePaul:
http://wiki.apache.org/solr/NewSolrCloudDesign?action=diff&rev1=1&rev2=2

  = New SolrCloud Design =
+ (Work in progress)
+ == What is SolrCloud? ==
  
+ SolrCloud is an enhancement to existing Solr that manages and operates Solr as a search service in a cloud.
+ 
+ == Glossary ==
+ 
+  * Cluster : A set of Solr nodes managed as a single unit. The entire cluster must share a single schema and solrconfig.
+  * Node : A JVM instance running Solr
+  * Partition : A partition is a subset of the entire document collection. A partition is created such that all of its documents can be contained in a single index.
+  * Shard : A partition needs to be stored on multiple nodes, as specified by the replication factor. All these nodes collectively form a shard. A node may be a part of multiple shards (see the sketch after this glossary).
+  * Leader : Each shard has one node identified as its leader. All writes for documents belonging to a partition should be routed through the leader.
+  * Replication Factor : Minimum number of copies of a document maintained by the cluster
+  * Transaction Log : An append-only log of write operations maintained by each node
+  * Partition version : A counter maintained by the leader of each shard, incremented on each write operation and sent to the peers
+  * Cluster Lock : A global lock that must be acquired in order to change the range -> partition or the partition -> node mappings.
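+ 
+ A minimal sketch (in Java) of how these concepts might fit together. The class and field names are illustrative assumptions, not part of the actual SolrCloud data model:
+ 
+ {{{
+ import java.util.List;
+ import java.util.Map;
+ 
+ // Hypothetical data model, for illustration only.
+ class ClusterState {
+     Map<HashRange, String> rangeToPartition;  // the range -> partition mapping
+     Map<String, Shard> partitionToShard;      // the partition -> node mapping
+     int replicationFactor;                    // minimum number of copies of each document
+ }
+ 
+ class Shard {
+     String leaderNodeId;    // writes for this partition are routed through the leader
+     List<String> nodeIds;   // nodes that each hold a copy of the partition's index
+     long partitionVersion;  // incremented by the leader on every write
+ }
+ 
+ class HashRange {           // equals/hashCode omitted for brevity
+     int start;              // inclusive lower bound of document hash values
+     int end;                // inclusive upper bound
+ }
+ }}}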
+ 
+ == Guiding Principles ==
+ 
+  * Any operation can be invoked on any node in the cluster.
+  * No non-recoverable single points of failure
+  * Cluster should be elastic
+  * Writes must never be lost, i.e. durability is guaranteed
+  * Order of writes should be preserved
+  * If two clients send document "A" to two different replicas at the same time, one should consistently "win" on all replicas (see the sketch after this list).
+  * Cluster configuration should be managed centrally and should be updatable through any node in the cluster. No per-node configuration other than local values, such as the port and index/log storage locations, should be required
+  * Automatic failover of reads
+  * Automatic failover of writes
+  * Automatically honour the replication factor in the event of a node failure
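+ 
+ A rough sketch (in Java) of how a shard leader might version writes so that concurrent updates to the same document resolve consistently on every replica. The class, method, and field names are illustrative assumptions, not the actual implementation:
+ 
+ {{{
+ import java.util.concurrent.atomic.AtomicLong;
+ 
+ // Hypothetical leader-side write ordering, for illustration only.
+ class ShardLeader {
+     private final AtomicLong partitionVersion = new AtomicLong(0);
+ 
+     // Assigns a monotonically increasing version to each write so that all
+     // replicas apply writes in the same order.
+     long acceptWrite(String docId, byte[] document) {
+         long version = partitionVersion.incrementAndGet();
+         // 1. Append (version, docId, document) to the local transaction log.
+         // 2. Forward the versioned write to the other nodes in the shard.
+         // If two clients update the same document concurrently, the write the
+         // leader versions last wins on every replica, so all copies converge.
+         return version;
+     }
+ }
+ }}}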
+