Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/07/14 15:16:33 UTC

[GitHub] [dolphinscheduler] zhongjiajie commented on a diff in pull request #10937: [Feature][Task] Support chunjun (FlinkX) task

zhongjiajie commented on code in PR #10937:
URL: https://github.com/apache/dolphinscheduler/pull/10937#discussion_r921269344


##########
docs/docs/en/guide/task/chunjun.md:
##########
@@ -0,0 +1,73 @@
+# CHUNJUN
+
+## Overview
+
+CHUNJUN task type for executing CHUNJUN programs. For CHUNJUN nodes, the worker will execute `${CHUNJUN_HOME}/bin/start-chunjun` to analyze the input json file.
+
+## Create Task
+
+- Click `Project Management -> Project Name -> Workflow Definition`, and click the `Create Workflow` button to enter the DAG editing page.
+- Drag the <img src="../../../../img/tasks/icons/chunjun.png" width="15"/> from the toolbar to the drawing board.
+
+## Task Parameters
+
+| **Parameter** | **Description** |
+| ------- | ---------- |
+| Node name | The node name in a workflow definition is unique. |
+| Run flag | Identifies whether this node schedules normally, if it does not need to execute, select the prohibition execution. |
+| Task priority | When the number of worker threads is insufficient, execute in the order of priority from high to low, and tasks with the same priority will execute in a first-in first-out order. |
+| Description | Describe the function of the node. |
+| Worker group | Assign tasks to the machines of the worker group to execute. If `Default` is selected, randomly select a worker machine for execution. |
+| Environment Name | Configure the environment name in which run the script. |
+| Number of failed retries | The number of times the task failed to resubmit. |
+| Failed retry interval | The time interval (unit minute) for resubmitting the task after a failed task. |
+| Task group name | The task group name. |
+| Priority | The task priority. |
+| Delayed execution time |  The time, in cents, that a task is delayed in execution. |
+| Timeout alarm | Check the timeout alarm and timeout failure. When the task exceeds the "timeout period", an alarm email will be sent and the task execution will fail. |
+| Custom template | Custom the content of the CHUNJUN node's json profile. |
+| json | json configuration file for CHUNJUN synchronization. |
+| Custom parameters | SQL task type, and stored procedure is a custom parameter order to set values for the method. The custom parameter type and data type are the same as the stored procedure task type. The difference is that the SQL task type custom parameter will replace the \${variable} in the SQL statement. |
+| Deploy mode | Execute chunjun task mode, eg local standalone. |
+| Option Parameters | Support such as `-confProp "{\"flink.checkpoint.interval\":60000}"` |
+| Predecessor task | Selecting a predecessor task for the current task will set the selected predecessor task as upstream of the current task. |
+
+## Task Example
+
+This example demonstrates importing data from Hive into MySQL.
+
+### Configuring the CHUNJUN environment in DolphinScheduler
+
+If you are using the CHUNJUN task type in a production environment, it is necessary to configure the required environment first. The configuration file is as follows: `/dolphinscheduler/conf/env/dolphinscheduler_env.sh`.
+
+![chunjun_task01](../../../../img/tasks/demo/chunjun_task01.png)
+
+After the environment has been configured, DolphinScheduler needs to be restarted.
+
+### Configuring CHUNJUN Task Node
+
+As the data to be read from Hive, a custom json is required, refer to: the template json in directory chunjun/chunjun-examples/json/hive.
+
+After writing the required json file, you can configure the node content by following the steps in the diagram below.
+
+![chunjun_task02](../../../../img/tasks/demo/chunjun_task02.png)
+
+### View run results
+
+![chunjun_task03](../../../../img/tasks/demo/chunjun_task03.png)
+
+### Note
+
+Before execute ${CHUNJUN_HOME}/bin/start-chunjun, need to change the shell ${CHUNJUN_HOME}/bin/start-chunjun, remove '&' in order to run in front. 
+
+ such as:
+
+```shell
+nohup $JAVA_RUN -cp $JAR_DIR $CLASS_NAME $@ &
+```
+
+update to following:
+
+```shell
+nohup $JAVA_RUN -cp $JAR_DIR $CLASS_NAME $@

Review Comment:
   It seems you are using different formats for the English and Chinese docs. Could you migrate them to the same format?



##########
dolphinscheduler-task-plugin/dolphinscheduler-task-chunjun/src/main/java/org/apache/dolphinscheduler/plugin/task/chunjun/ChunJunParameters.java:
##########
@@ -0,0 +1,264 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.dolphinscheduler.plugin.task.chunjun;
+
+import org.apache.dolphinscheduler.plugin.task.api.enums.ResourceType;
+import org.apache.dolphinscheduler.plugin.task.api.model.ResourceInfo;
+import org.apache.dolphinscheduler.plugin.task.api.parameters.AbstractParameters;
+import org.apache.dolphinscheduler.plugin.task.api.parameters.resource.ResourceParametersHelper;
+import org.apache.dolphinscheduler.spi.enums.Flag;
+import org.apache.dolphinscheduler.spi.utils.StringUtils;
+
+import java.util.ArrayList;
+import java.util.List;
+
+/**
+ * chunjun parameters
+ */
+public class ChunJunParameters extends AbstractParameters {

Review Comment:
   I wonder whether we can use `@Data` here to generate the getters and setters?



##########
docs/docs/en/guide/task/chunjun.md:
##########
@@ -0,0 +1,73 @@
+# CHUNJUN
+
+## Overview
+
+CHUNJUN task type for executing CHUNJUN programs. For CHUNJUN nodes, the worker will execute `${CHUNJUN_HOME}/bin/start-chunjun` to analyze the input json file.

Review Comment:
   Could you add a hyperlink for `CHUNJUN`? Also, I wonder whether it has to be written in all uppercase.



##########
docs/docs/en/guide/task/chunjun.md:
##########
@@ -0,0 +1,73 @@
+# CHUNJUN
+
+## Overview
+
+CHUNJUN task type for executing CHUNJUN programs. For CHUNJUN nodes, the worker will execute `${CHUNJUN_HOME}/bin/start-chunjun` to analyze the input json file.

Review Comment:
   Also, you have to add `ChunJun` to https://github.com/apache/dolphinscheduler/blob/81930e54208f7c0f305c0b9f8846149bfc38f428/docs/configs/docsdev.js#L603 and https://github.com/apache/dolphinscheduler/blob/81930e54208f7c0f305c0b9f8846149bfc38f428/docs/configs/docsdev.js#L183 so that it is displayed in the DolphinScheduler website sidebar.



##########
dolphinscheduler-task-plugin/dolphinscheduler-task-chunjun/src/main/java/org/apache/dolphinscheduler/plugin/task/chunjun/ChunJunConstants.java:
##########
@@ -0,0 +1,31 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.dolphinscheduler.plugin.task.chunjun;
+
+/**
+ * ChunJun constants
+ */
+public class ChunJunConstants {
+
+    public static final String FLINK_CONF_DIR = "${FLINK_HOME}/conf";
+
+    public static final String FLINK_LIB_DIR = "${FLINK_HOME}/lib";
+
+    public static final String HADOOP_CONF_DIR = "${HADOOP_HOME}/etc/hadoop";

Review Comment:
   Does ChunJun have to run on top of Flink or Hadoop? And does it require both the Flink and Hadoop homes to be configured? If so, please add them to our docs.
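
   If the answer is yes, the docs could show something like the following sketch for `dolphinscheduler_env.sh`. This is only an illustration: the install paths below are placeholders I am assuming, not values taken from this PR, and should be replaced with the real locations on the worker.

```shell
# Hypothetical install locations (assumptions, not from the PR).
# Each variable keeps a value that is already set and only falls back
# to the placeholder path when it is unset or empty.
export HADOOP_HOME=${HADOOP_HOME:-/opt/soft/hadoop}
export FLINK_HOME=${FLINK_HOME:-/opt/soft/flink}
export CHUNJUN_HOME=${CHUNJUN_HOME:-/opt/soft/chunjun}

# ChunJunConstants builds these paths from the variables above,
# so print them to make a misconfigured home easy to spot.
echo "flink conf:  ${FLINK_HOME}/conf"
echo "flink lib:   ${FLINK_HOME}/lib"
echo "hadoop conf: ${HADOOP_HOME}/etc/hadoop"
```

   Listing the three variables together would make it clear to users which homes the `${FLINK_HOME}` and `${HADOOP_HOME}` references in `ChunJunConstants` depend on.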


