Posted to notifications@shardingsphere.apache.org by te...@apache.org on 2020/10/26 11:52:46 UTC

[shardingsphere-elasticjob] branch master updated: Complete English document of features/elastic (#1644) (#1658)

This is an automated email from the ASF dual-hosted git repository.

technoboy pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/shardingsphere-elasticjob.git


The following commit(s) were added to refs/heads/master by this push:
     new b053952  Complete English document of features/elastic (#1644) (#1658)
b053952 is described below

commit b053952301b214df8b7067cfe97d4ff7d07dca6f
Author: wwj <22...@qq.com>
AuthorDate: Mon Oct 26 19:52:32 2020 +0800

    Complete English document of features/elastic (#1644) (#1658)
---
 docs/content/features/elastic.en.md | 69 +++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/docs/content/features/elastic.en.md b/docs/content/features/elastic.en.md
index ae3ab67..5c2e417 100644
--- a/docs/content/features/elastic.en.md
+++ b/docs/content/features/elastic.en.md
@@ -52,3 +52,72 @@ The unfinished job from a crashed server will be transferred and executed contin
 
 Setting the total number of sharding items to 1 and more than 1 servers to execute the jobs makes the job run in the mode of `1` master and `n` slaves.
 Once the servers that are executing jobs are down, the idle servers will take over the jobs and execute them in the next scheduling, or better, if the failover option is enabled, the idle servers can take over the failed jobs immediately.
+
+## ElasticJob-Lite Implementation Principle
+
+ElasticJob-Lite has no central job scheduling node. Instead, each program that deploys the job framework triggers scheduling itself when the scheduled time is reached.
+The registry is used only for job registration and for storing monitoring information. The master job node handles only functions such as sharding and cleanup.
+
+### Elastic Distributed Implementation
+
+- The first server to come online triggers the master server election. If the master server goes offline, the election is triggered again. The election is blocking: other tasks are performed only after it completes.
+- When a job server comes online, it automatically registers its server information in the registry, and its status is updated automatically when it goes offline.
+- The re-sharding flag is set when a master node is elected, when a server goes offline, or when the total number of shards changes.
+- When a scheduled task is triggered and re-sharding is needed, the master server performs the sharding. Sharding is blocking, and the task runs only after sharding ends.
+ If the master server goes offline during sharding, a new master server is elected first and then performs the sharding.
+- As described above, to keep running jobs stable, only the re-sharding flag is set while a job is running, and no re-sharding takes place. Sharding can occur only before the next task is triggered.
+- Each sharding run sorts instances by server IP so that the sharding result does not fluctuate widely.
+- Failover is implemented in two ways: a server that has finished its own execution actively grabs unallocated shards, and when a server goes offline, available servers are actively sought to run its tasks.
+
+### Registry Data Structure
+
+The registry creates a job name node under the defined namespace to distinguish between jobs, so a job's name cannot be changed once the job is created. If the name is changed, the job is treated as a new job.
+There are 5 data sub-nodes under the job name node, namely config, instances, sharding, servers and leader.
+
+### config node
+
+Job configuration information, stored in YAML format.
+
+### instances node
+
+Job running instance information. Each child node is the primary key of a running job instance.
+The primary key is composed of the IP address and PID of the server running the job.
+These primary key nodes are all ephemeral: they are registered when the job instance comes online and cleaned up automatically when it goes offline. The registry monitors changes to these nodes to coordinate the sharding and high availability of distributed jobs.
+You can write TRIGGER to a job running instance node to make that instance execute once immediately.
+
+### sharding node
+
+Job sharding information. Each child node is a sharding item sequence number, starting at zero and ending at the total number of shards minus one.
+The child nodes under each sharding item sequence number store detailed information and are used to control and record the running status of that shard.
+Node details:
+
+| Child node name  | Ephemeral node   | Description |
+| ---------------- |:---------------- |:----------- |
+| instance         | NO               | The primary key of the job running instance that executes the shard |
+| running          | YES              | The running state of the shard item.<br/>Only valid when monitorExecution is configured |
+| failover         | YES              | If the shard item is assigned to another job server by failover, this node value records the job server IP that executes the shard |
+| misfire          | NO               | Whether to restart the missed task |
+| disabled         | NO               | Whether to disable this shard |
+
+### servers node
+
+Job server information. Each child node is the IP address of a job server.
+You can write DISABLED to an IP address node to disable that server.
+Under the new cloud-native architecture, the role of the servers node is greatly reduced: it now only controls whether a server is disabled.
+To keep the job core simpler, the servers node may be removed in the future, with control over disabling servers delegated to the automated deployment system.
+
+### leader node
+
+The master node information of the job server is divided into three sub-nodes: election, sharding and failover.
+They are used for master node election, sharding and failover processing respectively.
+
+The leader node is used internally. If you are not interested in how the job framework works, you can ignore this node.
+
+| Child node name           | Ephemeral node | Description |
+| ------------------------- |:-------------- |:----------- |
+| election\instance         | YES            | The IP address of the master node server.<br />Once the node is deleted, a re-election is triggered.<br />All operations related to the master node are blocked during the re-election. |
+| election\latch            | NO             | Distributed lock for the master node election.<br />Used by the curator distributed lock |
+| sharding\necessary        | NO             | The re-sharding flag. It is set when the total number of shards changes, when a job server node goes online or offline or is enabled or disabled, or when a master node is elected. The master node performs re-sharding without being interrupted midway.<br />Sharding is not triggered while the job is executing |
+| sharding\processing       | YES            | The node held by the master node during sharding.<br />While this node exists, all job execution is blocked until sharding ends.<br />This ephemeral node is deleted when the master node finishes sharding or crashes |
+| failover\items\shard item | NO             | When a job crashes, its shard item is recorded under this node.<br />When a job server becomes idle, it grabs the shard items that need failover from this node |
+| failover\items\latch      | NO             | Distributed lock used when allocating failover shard items.<br />Used by the curator distributed lock |
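
Note following the diff (not part of the commit): as a reading aid for the registry layout the document describes, here is a minimal Python sketch that enumerates the node paths for one job. The node names come from the document. The leading-slash path layout, the function names, and the `demo` and `myJob` values are assumptions made for illustration and are not taken from the ElasticJob codebase.

```python
from typing import List

# Sub-nodes the document lists under each job name node.
JOB_SUBNODES = ["config", "instances", "sharding", "servers", "leader"]

# Child nodes the document lists under each sharding item.
SHARD_CHILDREN = ["instance", "running", "failover", "misfire", "disabled"]


def job_subnode_paths(namespace: str, job_name: str) -> List[str]:
    """Return the five sub-node paths for one job (path layout is assumed)."""
    return [f"/{namespace}/{job_name}/{node}" for node in JOB_SUBNODES]


def sharding_paths(namespace: str, job_name: str, total_shards: int) -> List[str]:
    """Return every per-shard child path; items run from 0 to total_shards - 1."""
    root = f"/{namespace}/{job_name}/sharding"
    return [
        f"{root}/{item}/{child}"
        for item in range(total_shards)
        for child in SHARD_CHILDREN
    ]


if __name__ == "__main__":
    # Hypothetical namespace and job name, used only to print the layout.
    for path in job_subnode_paths("demo", "myJob") + sharding_paths("demo", "myJob", 2):
        print(path)
```

Because the job name is part of every path, renaming a job yields a disjoint set of nodes, which matches the document's statement that a renamed job is regarded as a new job.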