Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/02/15 12:40:29 UTC

[GitHub] [dolphinscheduler-website] QuakeWang commented on a change in pull request #691: [Feature-8021][Document] Add example and notice about task type Spark

QuakeWang commented on a change in pull request #691:
URL: https://github.com/apache/dolphinscheduler-website/pull/691#discussion_r806788580



##########
File path: docs/en-us/2.0.3/user_doc/guide/task/spark.md
##########
@@ -1,22 +1,62 @@
-# SPARK
+# Spark
 
-- Through the SPARK node, you can directly execute the SPARK program. For the spark node, the worker will use the `spark-submit` method to submit tasks
+## Overview
 
-> Drag in the toolbar![PNG](https://analysys.github.io/easyscheduler_docs_cn/images/toolbar_SPARK.png)The task node to the drawing board, as shown in the following figure:
+Spark task type for executing Spark programs. For Spark nodes, the worker submits the task using the `spark-submit` command. See [spark-submit](https://spark.apache.org/docs/3.2.1/submitting-applications.html#launching-applications-with-spark-submit) for more details.
 
-<p align="center">
-   <img src="/img/spark-submit-en.png" width="80%" />
- </p>
+## Create task
 
-- Program type: supports JAVA, Scala and Python three languages
-- The class of the main function: is the full path of the Spark program’s entry Main Class
-- Main jar package: Spark jar package
-- Deployment mode: support three modes of yarn-cluster, yarn-client and local
-- Driver core number: You can set the number of Driver cores and the number of memory
-- Number of Executors: You can set the number of Executors, the number of Executor memory, and the number of Executor cores
-- Command line parameters: Set the input parameters of the Spark program and support the substitution of custom parameter variables.
-- Other parameters: support --jars, --files, --archives, --conf format
-- Resource: If the resource file is referenced in other parameters, you need to select and specify in the resource
-- User-defined parameter: It is a user-defined parameter of the MR part, which will replace the content with \${variable} in the script
+- Click Project Management -> Project Name -> Workflow Definition, and click the "Create Workflow" button to enter the DAG editing page.
+- Drag the <img src="/img/tasks/icons/spark.png" width="15"/> task icon from the toolbar onto the drawing board.
 
-Note: JAVA and Scala are only used for identification, there is no difference, if it is Spark developed by Python, there is no main function class, and the others are the same
+## Task Parameter
+
+- **Node name**: The node name must be unique within its workflow definition.
+- **Run flag**: Indicates whether the node is scheduled normally; turn on the prohibition switch if the node should not be executed.
+- **Descriptive information**: Describes the function of the node.
+- **Task priority**: When the number of worker threads is insufficient, tasks execute in order from high priority to low; tasks with the same priority execute first-in, first-out.
+- **Worker grouping**: The task is assigned to a machine in the selected worker group. If Default is selected, a worker machine is chosen at random.
+- **Environment Name**: Configure the environment name in which to run the script.
+- **Number of failed retry attempts**: The number of times a failed task will be resubmitted.
+- **Failed retry interval**: The interval, in minutes, before a failed task is resubmitted.
+- **Delayed execution time**: The time, in minutes, by which the task's execution is delayed.
+- **Timeout alarm**: Check timeout alarm and timeout failure; when the task runs longer than the "timeout period", an alarm email is sent and the task execution fails.
+- **Program type**: Supports Java, Scala, and Python.
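
To make the mapping concrete, here is a sketch of the kind of `spark-submit` invocation a worker could assemble from parameters like these. The class name, jar path, and program arguments are hypothetical placeholders, not DolphinScheduler defaults; the `--deploy-mode` flag corresponds to the deployment mode option (yarn-cluster / yarn-client / local) described in the earlier version of this page.

```shell
# Hypothetical sketch: mapping Spark task parameters onto a spark-submit call.
# Class, jar path, and /input /output arguments are placeholder values.
spark_submit_cmd="spark-submit \
--master yarn \
--deploy-mode cluster \
--class org.example.WordCount \
--driver-cores 1 \
--driver-memory 512M \
--num-executors 2 \
--executor-cores 2 \
--executor-memory 2G \
/tmp/spark-examples.jar /input /output"

# Print the assembled command instead of running it (no Spark cluster assumed).
echo "$spark_submit_cmd"
```

Custom parameter variables from the task definition would be substituted into the command-line arguments before submission.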

Review comment:
   I have finished revising it. : )




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org