Posted to commits@sling.apache.org by st...@apache.org on 2015/11/26 13:31:04 UTC

svn commit: r1716615 - /sling/site/trunk/content/documentation/bundles/discovery-api-and-impl.mdtext

Author: stefanegli
Date: Thu Nov 26 12:31:04 2015
New Revision: 1716615

URL: http://svn.apache.org/viewvc?rev=1716615&view=rev
Log:
SLING-5021 : added section about different nodes under /var/discovery/impl

Modified:
    sling/site/trunk/content/documentation/bundles/discovery-api-and-impl.mdtext

Modified: sling/site/trunk/content/documentation/bundles/discovery-api-and-impl.mdtext
URL: http://svn.apache.org/viewvc/sling/site/trunk/content/documentation/bundles/discovery-api-and-impl.mdtext?rev=1716615&r1=1716614&r2=1716615&view=diff
==============================================================================
--- sling/site/trunk/content/documentation/bundles/discovery-api-and-impl.mdtext (original)
+++ sling/site/trunk/content/documentation/bundles/discovery-api-and-impl.mdtext Thu Nov 26 12:31:04 2015
@@ -30,7 +30,7 @@ Multiple instances that are connected to
 In the discovery API this cluster concept is represented via a `ClusterView` object. A 'view' because it is a momentary snapshot of the cluster and only contains instances that are currently alive. It's features are:
 
 * each cluster has a stable leader. Stable meaning it won't change unless that leader crashes.
-* it has a list of instances that are part of it, thus currently alive
+* it has an ordered, stable list of instances that are part of it, thus currently alive. The relative order of instances in this list is stable, meaning that an instance only keeps its position or moves up one position if an instance listed 'above' it crashes - a newly started instance will always be added at the end of this list.
 * plus it has a unique id that is persistent across restarts
 
 ### Topology, TopologyView
@@ -38,6 +38,7 @@ In the discovery API this cluster concep
 The topology - or more precisely the `TopologyView` - represents a snapshot (`view`) of a number of loosely coupled Sling instances (`InstanceDescription`)
 and clusters (`ClusterView`) of a particular deployment. A cluster can consist of one or more instances. Each instance
 is always part of a cluster (even if the cluster consists of only one instance). The features are:
+
 * only one: it has a list of clusters
 
 There are no further assumption made on the structure of a topology.
@@ -48,7 +49,7 @@ that model such cluster types or other a
 
 ## Cluster Leader and Instance Ordering
 
-The discovery API introduces support for a `cluster leader`: within each cluster, the API guarantees that one and only one
+As mentioned, the discovery API introduces support for a `cluster leader`: within each cluster, the API guarantees that one and only one
 instance is leader at any time. That leader is guaranteed to be `stable`, ie as long as it stays alive and is visible
 by other instances of the same cluster, it will stay leader. As soon as it leaves the cluster (or the corresponding
 implementation bundle is deactivated), another instance in that cluster is elected leader. The leader can be used to
@@ -144,6 +145,58 @@ Administrative note: All the information
 
 	/var/discovery/impl
 
+#### /var/discovery/impl/clusterInstances/<slingId>
+
+Each instance has its own node under `clusterInstances/` where it stores:
+
+* `lastHeartbeat`: property, which marks the instance alive for another `heartbeatTimeout`
+* `leaderElectionId`: an id which is used to determine the leader: the instance with the lowest such leaderElectionId is the leader.
+Therefore this id is crucial for implementing a stable leader and ordering. The id contains a prefix (to account for a crx2 edge case
+where jobs might want to be executed on slave rather than on master), followed by the bundle activate time (to honour stability)
+and finally by the slingId (to have a discriminator should there be multiple instances started at the same time).
+* `runtimeId`: a plain, random UUID that is created fresh upon bundle activation. It is used to detect situations where
+multiple instances have the same slingId and thus write into the same `/var/discovery/impl/clusterInstances/<slingId>` node.
+* `slingHomePath` and `endpoints`: these are used for logging purposes only
+
+Additionally, there are two sub-nodes:
+
+* `announcements`: this contains announcements of topology connector peers (also see below). An announcement is a json-encoded
+representation of the sub-tree that the connector peer is aware of and is thereby announcing to this instance. Announcements
+are sent in both directions of a topology connector. Discovery.impl takes care of filtering out duplicate instances should
+the structure of topology connectors, and thus these announcements, overlap (which is legal).
+* `properties`: contains all properties as specified by registered `PropertyProvider`
+
+#### /var/discovery/impl/establishedView
+
+This contains the currently valid, agreed/voted upon cluster view that lists all alive instances:
+
+* the name of the node directly under `establishedView` is a unique id of the current incarnation of the cluster view -
+it thus changes whenever an instance joins or leaves, or when a new voting takes place for another reason.
+** `clusterId`: the persistent identifier of this cluster. As this is propagated from cluster view to cluster view
+it stays unchanged forever.
+** `leaderElectionId`: the leaderElectionId that won, ie the lowest one
+** `leaderId`: the slingId of the instance that is leader of this cluster view
+* `members`: just an intermediate node containing all alive instances as child nodes
+* child node of `members`: each child represents a particular alive node (with the name being the slingId) and contains
+the following properties:
+** `leaderElectionId`: the id that will be used to determine the leader - this value is copied from the corresponding
+`/var/discovery/impl/clusterInstances/<slingId>`
+** `initiator`: this marks the instance that originally created this voting
+** `vote`: represents this instance's vote, which is true for a voting that got promoted to established view
+
+#### /var/discovery/impl/ongoingVotings
+
+This area is used for voting. Each instance can initiate a voting when it realizes that the set of live instances - ie
+those instances that have a not-yet-timed-out heartbeat property - does not match the `establishedView`.
+
+Once a voting gets a yes vote from all instances, it is promoted (moved) under `establishedView` by the initiating instance.
+Each establishedView was once a voting, thus the structure is the same as described above.
+
+#### /var/discovery/impl/previousView
+
+The instance that promotes its winning voting to `establishedView` first moves what was there before under `previousView`.
+This is purely for debugging and not used anywhere; it just represents a persisted history of previous views of length 1.
+
 ### Heartbeats, Voting and Intra-Cluster Discovery
 
 `discovery.impl` uses the fact that all instance of a cluster are connected to the same repository as the