Posted to commits@dolphinscheduler.apache.org by ki...@apache.org on 2021/11/18 03:30:32 UTC
[dolphinscheduler-website] branch master updated: Optimize documents (#526)

This is an automated email from the ASF dual-hosted git repository.

kirs pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/dolphinscheduler-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 5979228  Optimize documents (#526)
5979228 is described below

commit 59792284b72605bbc4be9a7cad1256ed142d4942
Author: lifeng <53...@users.noreply.github.com>
AuthorDate: Thu Nov 18 11:30:17 2021 +0800

    Optimize documents (#526)
    
    * updata listdocs.md
    
    updata listdocs.md
    
    * Optimize documents
    
    add about*.md
    add desi*.md
    updata docs.2.0.0
---
 .../About_DolphinScheduler.md                      | 10 ++++
 .../2.0.0/user_doc/architecture/designplus.md      | 58 ++++++++++++++++++++++
 site_config/docs2-0-0.js                           | 21 ++++++--
 3 files changed, 85 insertions(+), 4 deletions(-)

diff --git a/docs/en-us/2.0.0/user_doc/About_DolphinScheduler/About_DolphinScheduler.md b/docs/en-us/2.0.0/user_doc/About_DolphinScheduler/About_DolphinScheduler.md
new file mode 100644
index 0000000..d56b029
--- /dev/null
+++ b/docs/en-us/2.0.0/user_doc/About_DolphinScheduler/About_DolphinScheduler.md
@@ -0,0 +1,10 @@
+Apache DolphinScheduler is a cloud-native visual Big Data workflow scheduler system, committed to “solving complex big-data task dependencies and triggering relationships in data OPS orchestration so that various types of big data tasks can be used out of the box”.
+
+# High Reliability
+- Decentralized multi-master and multi-worker architecture with built-in high availability (HA) and overload handling
+# User-Friendly
+- All process definition operations are visual, key information about a process definition is visible at a glance, and deployment is one-click
+# Rich Scenarios
+- Supports multi-tenancy and many task types, e.g., Spark, Flink, Hive, MR, Shell, Python, and sub_process
+# High Expansibility
+- Supports custom task types and distributed scheduling; overall scheduling capability increases linearly with the size of the cluster
\ No newline at end of file
diff --git a/docs/en-us/2.0.0/user_doc/architecture/designplus.md b/docs/en-us/2.0.0/user_doc/architecture/designplus.md
new file mode 100644
index 0000000..7ee31f1
--- /dev/null
+++ b/docs/en-us/2.0.0/user_doc/architecture/designplus.md
@@ -0,0 +1,58 @@
+## System Architecture Design
+Before explaining the architecture of the scheduling system, let's first go over the terms it commonly uses.
+
+### 1.Glossary
+**DAG**: Short for Directed Acyclic Graph. Tasks in a workflow are assembled in the form of a directed acyclic graph, and topological traversal starts from the nodes with zero in-degree and continues until there are no successor nodes. An example is shown below:
+
+<p align="center">
+  <img src="/img/dag_examples_cn.jpg" alt="dag example"  width="60%" />
+  <p align="center">
+        <em>dag example</em>
+  </p>
+</p>
+
+**Process definition**: The visual **DAG** formed by dragging task nodes and establishing associations between them
+
+**Process instance**: An instantiation of a process definition, generated either by a manual start or by a scheduled trigger. Each run of a process definition generates one process instance
+
+**Task instance**: An instantiation of a task node in a process definition; it identifies the execution status of that specific task
+
+**Task type**: Currently supported types are SHELL, SQL, SUB_PROCESS (sub-process), PROCEDURE, MR, SPARK, PYTHON, and DEPENDENT (dependency), with dynamic plug-in expansion planned. Note: a **SUB_PROCESS** is itself a separate process definition that can be started and executed on its own
+
+**Scheduling method**: The system supports scheduled scheduling based on cron expressions as well as manual scheduling. Supported command types: start workflow, start execution from current node, resume fault-tolerant workflow, resume paused process, start execution from failed node, complement, timing, rerun, pause, stop, resume waiting thread. Of these, **Resume fault-tolerant workflow** and **Resume waiting thread** are used by the internal control of scheduling, and canno [...]
+
+**Scheduled**: The system uses the **quartz** distributed scheduler and supports visual generation of cron expressions
+
+**Dependency**: The system supports not only simple **DAG** dependencies between predecessor and successor nodes, but also **task dependent** nodes, which support dependencies **between processes**
+
+**Priority**: Priorities are supported for process instances and task instances. If no priority is set, the default is first-in-first-out
+
+**Email alert**: Supports emailing **SQL task** query results, email alerts for process instance run results, and fault-tolerance alert notifications
+
+**Failure strategy**: For tasks running in parallel, two strategies are provided for handling a failed task. **Continue** means the other parallel tasks keep running regardless of the failure, and the process ends as failed once they finish. **End** means that as soon as a failed task is found, the tasks still running in parallel are killed and the process fails and ends
+
+**Complement**: Supplements historical data. Two complement methods over an interval are supported: **parallel and serial**
+
+
+
+### 2.Module introduction
+- dolphinscheduler-alert: alert module, providing the AlertServer service.
+
+- dolphinscheduler-api: web application module, providing the ApiServer service.
+
+- dolphinscheduler-common: general constants, enumerations, utility classes, data structures, and base classes.
+
+- dolphinscheduler-dao: provides operations such as database access.
+
+- dolphinscheduler-remote: client and server based on netty.
+
+- dolphinscheduler-server: the MasterServer and WorkerServer services.
+
+- dolphinscheduler-service: service module, containing Quartz, Zookeeper, and log client access services, for convenient use by the server and api modules.
+
+- dolphinscheduler-ui: front-end module.
+
+### Summary
+From the perspective of scheduling, this article gives a preliminary introduction to the architecture principles and implementation ideas of DolphinScheduler, a distributed workflow scheduling system for big data. To be continued.
+
+
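Reader's note on the DAG glossary entry in designplus.md above: the "topological traversal from nodes with zero in-degree" can be illustrated with Kahn's algorithm. The Python sketch below is only an illustration under stated assumptions; it is not part of this commit and not DolphinScheduler's actual implementation. The function name `topological_order` and the sample task names are hypothetical.

```python
from collections import deque


def topological_order(nodes, edges):
    """Return the nodes in a topological order (Kahn's algorithm).

    nodes: list of task names (hypothetical examples, not real task types)
    edges: list of (upstream, downstream) pairs
    """
    in_degree = {node: 0 for node in nodes}
    successors = {node: [] for node in nodes}
    for upstream, downstream in edges:
        successors[upstream].append(downstream)
        in_degree[downstream] += 1

    # Start from every node that has no predecessor (zero in-degree).
    ready = deque(node for node in nodes if in_degree[node] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        # "Finishing" a node releases each successor whose predecessors are all done.
        for nxt in successors[node]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                ready.append(nxt)

    # Nodes left unvisited mean some in-degree never reached zero: a cycle.
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return order


# Diamond-shaped workflow: A fans out to B and C, which both feed D.
example_nodes = ["A", "B", "C", "D"]
example_edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
example_order = topological_order(example_nodes, example_edges)
```

In this sketch B and C become ready at the same time once A is visited, which is the point at which a scheduler could run them in parallel; the traversal stops when no successor nodes remain.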
diff --git a/site_config/docs2-0-0.js b/site_config/docs2-0-0.js
index 7f2b45b..a8eddfd 100644
--- a/site_config/docs2-0-0.js
+++ b/site_config/docs2-0-0.js
@@ -2,6 +2,23 @@ export default {
   'en-us': {
     sidemenu: [
       {
+        title: 'About Apache DolphinScheduler',
+        children: [
+          {
+            title: 'Introduction',
+            link: '/en-us/docs/About_DolphinScheduler/About_DolphinScheduler.html',
+          },
+          {
+            title: 'Hardware Environment',
+            link: '/en-us/docs/2.0.0/user_doc/guide/installation/hardware.html',
+          },
+          {
+            title: 'Glossary',
+            link: '/en-us/docs/2.0.0/user_doc/About_DolphinScheduler/designplus.html',
+          },
+        ],
+      },
+      {
         title: 'User Manual',
         children: [
           {
@@ -16,10 +33,6 @@ export default {
             title: 'Installation',
             children: [
               {
-                title: 'Hardware Environment',
-                link: '/en-us/docs/2.0.0/user_doc/guide/installation/hardware.html',
-              },
-              {
                 title: 'Standalone Deployment',
                 link: '/en-us/docs/2.0.0/user_doc/guide/installation/standalone.html',
               },