Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/05/01 10:52:02 UTC

[GitHub] [dolphinscheduler] sq-q commented on a diff in pull request #9851: [Improvement-9772][docs/docs-zh] add spark sql docs for Docs

sq-q commented on code in PR #9851:
URL: https://github.com/apache/dolphinscheduler/pull/9851#discussion_r862457003


##########
docs/docs/en/guide/task/spark.md:
##########
@@ -39,30 +45,42 @@ Spark task type used to execute Spark program. For Spark nodes, the worker submi
 
 ## Task Example
 
-### Execute the WordCount Program
+### (1) Spark Submit
+
+#### Execute the WordCount Program
 
 This is a common introductory case in the big data ecosystem, and it often applies to computational frameworks such as MapReduce, Flink, and Spark. Its main purpose is to count the number of identical words in the input text. (Spark releases bundle this example job.)
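+
+As a rough illustration, the worker-side submission of such a job could look like the following `spark-submit` sketch. The example class and jar ship with Spark itself; the jar's version suffix, the master/deploy-mode flags, and the input path are placeholders to adjust for your cluster:
+
+```shell
+# Submit Spark's bundled WordCount example to YARN in cluster mode.
+${SPARK_HOME}/bin/spark-submit \
+    --master yarn \
+    --deploy-mode cluster \
+    --class org.apache.spark.examples.JavaWordCount \
+    ${SPARK_HOME}/examples/jars/spark-examples_2.12-3.2.1.jar \
+    hdfs:///tmp/wordcount/input.txt
+```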
 
-#### Configure the Spark Environment in DolphinScheduler
+##### Configure the Spark Environment in DolphinScheduler
 
 If you are using the Spark task type in a production environment, you must configure the required environment first. The configuration file is `bin/env/dolphinscheduler_env.sh`.
 
 ![spark_configure](/img/tasks/demo/spark_task01.png)
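+
+For reference, a hypothetical excerpt of that file is shown below; the install path is an assumption and should be adjusted to match your environment:
+
+```shell
+# bin/env/dolphinscheduler_env.sh -- make the Spark binaries resolvable for the worker
+export SPARK_HOME=${SPARK_HOME:-/opt/soft/spark}
+export PATH=$SPARK_HOME/bin:$PATH
+```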
 
-#### Upload the Main Package
+##### Upload the Main Package
 
 When using the Spark task node, you need to upload the JAR package to the Resource Center before execution; refer to the [resource center](../resource.md) for details.
 
 After finishing the Resource Center configuration, you can upload the required target files directly by dragging and dropping.
 
 ![resource_upload](/img/tasks/demo/upload_jar.png)
 
-#### Configure Spark Nodes
+##### Configure Spark Nodes
 
 Configure the required content according to the parameter descriptions above.
 
 ![demo-spark-simple](/img/tasks/demo/spark_task02.png)
 
+### (2) Spark SQL
+
+#### Execute DDL and DML Statements
+
+This case creates a view `terms`, writes three rows of data into it, creates a table `wc` in Parquet format after checking whether it already exists, and then inserts the data from the view `terms` into the Parquet table `wc`. The program type is SQL.
+
+![spark_sql](/img/tasks/demo/spark_sql.png)
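+
+Expressed as plain SQL, the script described above might look roughly like the following sketch; the three literal rows are placeholders, and running it through `spark-sql -e` is just one way to try the statements outside DolphinScheduler:
+
+```shell
+# Create the view, create the Parquet table if it does not exist, then copy the rows.
+${SPARK_HOME}/bin/spark-sql -e "
+CREATE OR REPLACE TEMPORARY VIEW terms AS
+SELECT * FROM VALUES ('hello'), ('spark'), ('sql') AS t(word);
+CREATE TABLE IF NOT EXISTS wc (word STRING) USING PARQUET;
+INSERT INTO wc SELECT word FROM terms;
+"
+```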
+
 ## Notice
 
-JAVA and Scala only used for identification, there is no difference. If you use Python to develop Spark application, there is no class of the main function and the rest is the same.
+JAVA and Scala are only used for identification, and there is no difference between them. If the Spark application is developed in Python, there is no class for the main function; everything else is the same. SQL scripts cannot be used with the JAVA, Scala, and Python program types.

Review Comment:
   OK, I'll revise it now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org