Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/08/10 02:37:16 UTC

[GitHub] [dolphinscheduler] Chris-Arith commented on issue #11349: [Bug] [Data Quality] `Error Output Path` doesn't created on HDFS

Chris-Arith commented on issue #11349:
URL: https://github.com/apache/dolphinscheduler/issues/11349#issuecomment-1210082611

   > Please provide the following task execution log, thanks
   
   The task execution log is below:
   ```java
   [INFO] 2022-08-08 14:51:08.638 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[83] - data quality task params {"localParams":[],"resourceList":[],"ruleId":10,"ruleInputParameter":{"check_type":"1","comparison_type":1,"comparison_name":"0","failure_strategy":"0","operator":"3","src_connector_type":5,"src_datasource_id":11,"src_field":null,"src_table":"BW_BI0_TSTOR_LOC","threshold":"0"},"sparkParameters":{"deployMode":"cluster","driverCores":1,"driverMemory":"512M","executorCores":2,"executorMemory":"2G","numExecutors":2,"others":"--conf spark.yarn.maxAppAttempts=1"}}
   [INFO] 2022-08-08 14:51:08.694 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[181] - data quality task command: ${SPARK_HOME2}/bin/spark-submit --master yarn --deploy-mode cluster --driver-cores 1 --driver-memory 512M --num-executors 2 --executor-cores 2 --executor-memory 2G --queue default --conf spark.yarn.maxAppAttempts=1 /home/dolphinscheduler/dolphinscheduler/worker-server/libs/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar "{\"name\":\"$t(table_count_check)\",\"env\":{\"type\":\"batch\",\"config\":null},\"readers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"BDDB\",\"password\":\"*************\",\"driver\":\"oracle.jdbc.OracleDriver\",\"user\":\"BDDB\",\"output_table\":\"BDDB_BW_BI0_TSTOR_LOC\",\"table\":\"BW_BI0_TSTOR_LOC\",\"url\":\"jdbc:oracle:thin:@//10.97.1.230:1521/BDDB\"} }],\"transformers\":[{\"type\":\"sql\",\"config\":{\"index\":1,\"output_table\":\"table_count\",\"sql\":\"SELECT COUNT(*) AS total FROM BDDB_BW_BI0_TSTOR_LOC \"} }],\"writers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_execute_result\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as rule_type,'$t(table_count_check)' as rule_name,0 as process_definition_id,775 as process_instance_id,1896 as task_instance_id,table_count.total AS statistics_value,0 AS comparison_value,1 AS comparison_type,1 as check_type,0 as threshold,3 as operator,0 as failure_strategy,'hdfs://haNameservice:8020/tmp/data-quality-error-data/0_775_chris_data_quality_test' as error_output_path,'2022-08-08 14:51:08' as create_time,'2022-08-08 14:51:08' as update_time from table_count \"} },{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_task_statistics_value\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as process_definition_id,1896 as task_instance_id,10 as rule_id,'I+PSCKKFG0Y7KVBI3J8DFQ1CVEDLPYJBINDXQERK7AU=' as unique_code,'table_count.total'AS statistics_name,table_count.total AS statistics_value,'2022-08-08 14:51:08' as data_time,'2022-08-08 14:51:08' as create_time,'2022-08-08 14:51:08' as update_time from table_count\"} }]}"
   [INFO] 2022-08-08 14:51:08.696 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[85] - tenantCode user:dolphinscheduler, task dir:775_1896
   [INFO] 2022-08-08 14:51:08.696 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[90] - create command file:/tmp/dolphinscheduler/exec/process/6001262888864/6277368089120_4/775/1896/775_1896.command
   [INFO] 2022-08-08 14:51:08.696 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[116] - command : #!/bin/sh
   BASEDIR=$(cd `dirname $0`; pwd)
   cd $BASEDIR
   source /home/dolphinscheduler/dolphinscheduler/worker-server/conf/dolphinscheduler_env.sh
   ${SPARK_HOME2}/bin/spark-submit --master yarn --deploy-mode cluster --driver-cores 1 --driver-memory 512M --num-executors 2 --executor-cores 2 --executor-memory 2G --queue default --conf spark.yarn.maxAppAttempts=1 /home/dolphinscheduler/dolphinscheduler/worker-server/libs/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar "{\"name\":\"$t(table_count_check)\",\"env\":{\"type\":\"batch\",\"config\":null},\"readers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"BDDB\",\"password\":\"*************\",\"driver\":\"oracle.jdbc.OracleDriver\",\"user\":\"BDDB\",\"output_table\":\"BDDB_BW_BI0_TSTOR_LOC\",\"table\":\"BW_BI0_TSTOR_LOC\",\"url\":\"jdbc:oracle:thin:@//10.97.1.230:1521/BDDB\"} }],\"transformers\":[{\"type\":\"sql\",\"config\":{\"index\":1,\"output_table\":\"table_count\",\"sql\":\"SELECT COUNT(*) AS total FROM BDDB_BW_BI0_TSTOR_LOC \"} }],\"writers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_execute_result\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as rule_type,'$t(table_count_check)' as rule_name,0 as process_definition_id,775 as process_instance_id,1896 as task_instance_id,table_count.total AS statistics_value,0 AS comparison_value,1 AS comparison_type,1 as check_type,0 as threshold,3 as operator,0 as failure_strategy,'hdfs://haNameservice:8020/tmp/data-quality-error-data/0_775_chris_data_quality_test' as error_output_path,'2022-08-08 14:51:08' as create_time,'2022-08-08 14:51:08' as update_time from table_count \"} },{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_task_statistics_value\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as process_definition_id,1896 as task_instance_id,10 as rule_id,'I+PSCKKFG0Y7KVBI3J8DFQ1CVEDLPYJBINDXQERK7AU=' as unique_code,'table_count.total'AS statistics_name,table_count.total AS statistics_value,'2022-08-08 14:51:08' as data_time,'2022-08-08 14:51:08' as create_time,'2022-08-08 14:51:08' as update_time from table_count\"} }]}"
   [INFO] 2022-08-08 14:51:08.704 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[290] - task run command: sudo -u dolphinscheduler sh /tmp/dolphinscheduler/exec/process/6001262888864/6277368089120_4/775/1896/775_1896.command
   [INFO] 2022-08-08 14:51:08.706 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[181] - process start, process id is: 18551
   [INFO] 2022-08-08 14:51:09.706 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> WARNING: User-defined SPARK_HOME (/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/spark) overrides detected (/opt/cloudera/parcels/CDH/lib/spark).
   	WARNING: Running spark-class from user-defined location.
   [INFO] 2022-08-08 14:51:10.707 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:10 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm124
   	22/08/08 14:51:10 INFO yarn.Client: Requesting a new application from cluster with 3 NodeManagers
   	22/08/08 14:51:10 INFO conf.Configuration: resource-types.xml not found
   	22/08/08 14:51:10 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
   	22/08/08 14:51:10 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (61440 MB per container)
   	22/08/08 14:51:10 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
   	22/08/08 14:51:10 INFO yarn.Client: Setting up container launch context for our AM
   	22/08/08 14:51:10 INFO yarn.Client: Setting up the launch environment for our AM container
   	22/08/08 14:51:10 INFO yarn.Client: Preparing resources for our AM container
   	22/08/08 14:51:10 INFO yarn.Client: Uploading resource file:/home/dolphinscheduler/dolphinscheduler/worker-server/libs/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar -> hdfs://haNameservice/user/dolphinscheduler/.sparkStaging/application_1657523889744_0915/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar
   [INFO] 2022-08-08 14:51:11.708 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:11 INFO yarn.Client: Uploading resource file:/tmp/spark-53fcb4bc-b0b4-4495-93ed-ff43dbbf670a/__spark_conf__1110115918348202718.zip -> hdfs://haNameservice/user/dolphinscheduler/.sparkStaging/application_1657523889744_0915/__spark_conf__.zip
   	22/08/08 14:51:11 INFO spark.SecurityManager: Changing view acls to: dolphinscheduler
   	22/08/08 14:51:11 INFO spark.SecurityManager: Changing modify acls to: dolphinscheduler
   	22/08/08 14:51:11 INFO spark.SecurityManager: Changing view acls groups to: 
   	22/08/08 14:51:11 INFO spark.SecurityManager: Changing modify acls groups to: 
   	22/08/08 14:51:11 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(dolphinscheduler); groups with view permissions: Set(); users  with modify permissions: Set(dolphinscheduler); groups with modify permissions: Set()
   	22/08/08 14:51:11 INFO conf.HiveConf: Found configuration file file:/etc/hive/conf.cloudera.hive/hive-site.xml
   	22/08/08 14:51:11 INFO security.YARNHadoopDelegationTokenManager: Attempting to load user's ticket cache.
   	22/08/08 14:51:11 INFO yarn.Client: Submitting application application_1657523889744_0915 to ResourceManager
   [INFO] 2022-08-08 14:51:12.709 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:11 INFO impl.YarnClientImpl: Submitted application application_1657523889744_0915
   [INFO] 2022-08-08 14:51:13.710 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:12 INFO yarn.Client: Application report for application_1657523889744_0915 (state: ACCEPTED)
   	22/08/08 14:51:12 INFO yarn.Client: 
   		 client token: N/A
   		 diagnostics: AM container is launched, waiting for AM container to Register with RM
   		 ApplicationMaster host: N/A
   		 ApplicationMaster RPC port: -1
   		 queue: root.users.dolphinscheduler
   		 start time: 1659941471620
   		 final status: UNDEFINED
   		 tracking URL: http://host:8088/proxy/application_1657523889744_0915/
   		 user: dolphinscheduler
   [INFO] 2022-08-08 14:51:14.711 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:13 INFO yarn.Client: Application report for application_1657523889744_0915 (state: ACCEPTED)
   [INFO] 2022-08-08 14:51:15.712 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:14 INFO yarn.Client: Application report for application_1657523889744_0915 (state: ACCEPTED)
   [INFO] 2022-08-08 14:51:16.713 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:15 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
   	22/08/08 14:51:15 INFO yarn.Client: 
   		 client token: N/A
   		 diagnostics: N/A
   		 ApplicationMaster host: slbdcompute2
   		 ApplicationMaster RPC port: 38184
   		 queue: root.users.dolphinscheduler
   		 start time: 1659941471620
   		 final status: UNDEFINED
   		 tracking URL: http://host:8088/proxy/application_1657523889744_0915/
   		 user: dolphinscheduler
   [INFO] 2022-08-08 14:51:17.714 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:16 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
   [INFO] 2022-08-08 14:51:18.715 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:17 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
   [INFO] 2022-08-08 14:51:19.716 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:18 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
   [INFO] 2022-08-08 14:51:20.717 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:19 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
   [INFO] 2022-08-08 14:51:21.718 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:20 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
   [INFO] 2022-08-08 14:51:22.719 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:21 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
   [INFO] 2022-08-08 14:51:23.720 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:22 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
   [INFO] 2022-08-08 14:51:24.721 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:23 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
   [INFO] 2022-08-08 14:51:25.722 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:24 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
   [INFO] 2022-08-08 14:51:26.723 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:25 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
   [INFO] 2022-08-08 14:51:27.724 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:26 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
   [INFO] 2022-08-08 14:51:28.725 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:27 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
   [INFO] 2022-08-08 14:51:29.726 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:28 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
   [INFO] 2022-08-08 14:51:30.727 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:29 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
   [INFO] 2022-08-08 14:51:31.728 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:30 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
   [INFO] 2022-08-08 14:51:32.302 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[375] - find app id: application_1657523889744_0915
   [INFO] 2022-08-08 14:51:32.302 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[205] - process has exited, execute path:/tmp/dolphinscheduler/exec/process/6001262888864/6277368089120_4/775/1896, processId:18551 ,exitStatusCode:0 ,processWaitForStatus:true ,processExitValue:0
   [INFO] 2022-08-08 14:51:32.729 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:31 INFO yarn.Client: Application report for application_1657523889744_0915 (state: FINISHED)
   	22/08/08 14:51:31 INFO yarn.Client: 
   		 client token: N/A
   		 diagnostics: N/A
   		 ApplicationMaster host: slbdcompute2
   		 ApplicationMaster RPC port: 38184
   		 queue: root.users.dolphinscheduler
   		 start time: 1659941471620
   		 final status: SUCCEEDED
   		 tracking URL: http://host:8088/proxy/application_1657523889744_0915/
   		 user: dolphinscheduler
   	22/08/08 14:51:31 INFO util.ShutdownHookManager: Shutdown hook called
   	22/08/08 14:51:31 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-53fcb4bc-b0b4-4495-93ed-ff43dbbf670a
   	22/08/08 14:51:31 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-79b34451-e157-41e4-9a2a-b9fad415244c
   [INFO] 2022-08-08 14:51:32.730 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[57] - FINALIZE_SESSION
   
   ```
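
For anyone triaging a log like the one above, here is a minimal sketch that pulls out the values most relevant to this issue: the `error_output_path` embedded in the writer SQL, the YARN application id, and the last reported `final status`. The function name `summarize_dq_log` and the regular expressions are assumptions made for illustration; they are not part of DolphinScheduler and only match the log layout shown in this comment.

```python
import re


def summarize_dq_log(log_text: str) -> dict:
    """Extract a few fields from a data quality task log (hypothetical helper).

    Assumes the log layout shown above: the HDFS path appears as a quoted
    literal followed by "as error_output_path", and the YARN client prints
    "final status: <STATE>" lines.
    """
    # Quoted hdfs:// literals aliased as error_output_path in the writer SQL.
    paths = sorted(set(
        re.findall(r"'(hdfs://[^']+)'\s+as\s+error_output_path", log_text)
    ))
    # YARN application ids such as application_1657523889744_0915.
    app_ids = sorted(set(re.findall(r"application_\d+_\d+", log_text)))
    # Every reported final status; the last one is the outcome.
    statuses = re.findall(r"final status:\s*(\w+)", log_text)
    return {
        "error_output_paths": paths,
        "application_ids": app_ids,
        "final_status": statuses[-1] if statuses else None,
    }


if __name__ == "__main__":
    # Small excerpt in the same shape as the log above.
    sample = (
        "3 as operator,0 as failure_strategy,"
        "'hdfs://haNameservice:8020/tmp/data-quality-error-data/"
        "0_775_chris_data_quality_test' as error_output_path,\n"
        "Submitted application application_1657523889744_0915\n"
        "\t\t final status: UNDEFINED\n"
        "\t\t final status: SUCCEEDED\n"
    )
    print(summarize_dq_log(sample))
```

Each extracted path can then be checked on the cluster with `hdfs dfs -ls <path>` or `hdfs dfs -test -e <path>`, which makes it easy to confirm whether the directory is missing even though the application finished with `SUCCEEDED`.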

