Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/07/25 05:42:44 UTC

[GitHub] [dolphinscheduler] ulnit opened a new issue, #11134: [Bug] [DataQuality] The data quality task is abnormal

ulnit opened a new issue, #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### What happened
   
   The error message is:
   【Caused by: java.sql.BatchUpdateException: Batch entry 0 INSERT INTO t_ds_dq_execute_result ("rule_type","rule_name","process_definition_id","process_instance_id","task_instance_id","statistics_value","comparison_value","comparison_type","check_type","threshold","operator","failure_strategy","error_output_path","create_time","update_time") VALUES (0,'(null_check)',0,796,859,0,0,2,0,0,3,0,'hdfs://mycluster:8020/user/dolphinscheduler/data_quality_error_data/0_796_dq002','2022-07-21 05:20:53','2022-07-21 05:20:53') was aborted: ERROR: column "create_time" is of type timestamp without time zone but expression is of type character varying
   	  Hint: You will need to rewrite or cast the expression.
   	  Position: 337  Call getNextException to see other errors in the batch.
   		at org.postgresql.jdbc.BatchResultHandler.handleError(BatchResultHandler.java:165)
   		at org.postgresql.core.ResultHandlerDelegate.handleError(ResultHandlerDelegate.java:52)
   		at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2366)
   		at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:559)
   		at org.postgresql.jdbc.PgStatement.internalExecuteBatch(PgStatement.java:887)
   		at org.postgresql.jdbc.PgStatement.executeBatch(PgStatement.java:910)
   		at org.postgresql.jdbc.PgPreparedStatement.executeBatch(PgPreparedStatement.java:1649)
   		at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:713)
   		at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$saveTable$1(JdbcUtils.scala:868)
   		at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$saveTable$1$adapted(JdbcUtils.scala:867)
   		at org.apache.spark.rdd.RDD.$anonfun$foreachPartition$2(RDD.scala:1011)
   		at org.apache.spark.rdd.RDD.$anonfun$foreachPartition$2$adapted(RDD.scala:1011)
   		at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2268)
   		at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
   		at org.apache.spark.scheduler.Task.run(Task.scala:136)
   		at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
   		at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
   		at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
   		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   		at java.lang.Thread.run(Thread.java:748)
   	Caused by: org.postgresql.util.PSQLException: ERROR: column "create_time" is of type timestamp without time zone but expression is of type character varying
   	  Hint: You will need to rewrite or cast the expression.
   	  Position: 337
   		at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2675)
   		at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2365)
   		... 18 more】
   
   ### What you expected to happen
   
   Data quality task nodes should run normally, and their results should be inserted into the database without errors.
   
   ### How to reproduce
   
   The problem reproduces with a normal deployment and normal use of a data quality task.
   
   ### Anything else
   
   _No response_
   
   ### Version
   
   3.0.0-beta-2
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #11134: [Bug] [DataQuality] The data quality task is abnormal

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1193606004





[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #11134: [Bug] [DataQuality] The data quality task is abnormal

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1234936313

   This issue has been closed because it has not received a response for too long. You can reopen it if you encounter similar problems in the future.




[GitHub] [dolphinscheduler] github-actions[bot] closed issue #11134: [Bug] [DataQuality] The data quality task is abnormal

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed issue #11134: [Bug] [DataQuality] The data quality task is abnormal
URL: https://github.com/apache/dolphinscheduler/issues/11134




Re: [I] [Bug] [DataQuality] The data quality task is abnormal [dolphinscheduler]

Posted by "wangbowen1024 (via GitHub)" <gi...@apache.org>.
wangbowen1024 commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1871252134

   You can try adding `stringtype=unspecified` to the PostgreSQL JDBC URL, for example: jdbc:postgresql://localhost:5432/databaseName?stringtype=unspecified
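
A minimal sketch of why this parameter may help, and of how it could be added to an existing URL. By default the PostgreSQL JDBC driver sends parameters bound with setString() as varchar, and PostgreSQL does not implicitly cast varchar to timestamp, which matches the error reported in this issue. With stringtype=unspecified the values are sent untyped and the server tries to infer the type from the target column. The helper name `with_stringtype_unspecified` is hypothetical and is not part of DolphinScheduler or the driver; it only rewrites the URL string. Where the URL is configured in a given deployment is not stated in this thread.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

JDBC_PREFIX = "jdbc:"


def with_stringtype_unspecified(jdbc_url: str) -> str:
    """Return jdbc_url with stringtype=unspecified set in its query string.

    Hypothetical helper for illustration only.
    """
    if not jdbc_url.startswith(JDBC_PREFIX):
        raise ValueError("expected a URL starting with 'jdbc:'")
    # Strip the 'jdbc:' prefix so urlsplit sees a regular 'postgresql://...' URL.
    parts = urlsplit(jdbc_url[len(JDBC_PREFIX):])
    # Keep the existing parameters, dropping any previous stringtype value.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "stringtype"
    ]
    params.append(("stringtype", "unspecified"))
    return JDBC_PREFIX + urlunsplit(parts._replace(query=urlencode(params)))


print(with_stringtype_unspecified("jdbc:postgresql://localhost:5432/databaseName"))
# prints: jdbc:postgresql://localhost:5432/databaseName?stringtype=unspecified
```

Note that `urlencode` percent-encodes special characters in existing parameter values, so a URL with unusual parameters should be checked after rewriting.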




[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #11134: [Bug] [DataQuality] The data quality task is abnormal

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1193606154

   Thank you for your feedback. We have received your issue; please wait patiently for a reply.
   * So that we can understand your request as soon as possible, please provide detailed information, the version, or pictures.
   * If you haven't received a reply for a long time, you can [join our Slack](https://s.apache.org/dolphinscheduler-slack) and send your question to the `#troubleshooting` channel.




[GitHub] [dolphinscheduler] aiwhj commented on issue #11134: [Bug] [DataQuality] The data quality task is abnormal

Posted by "aiwhj (via GitHub)" <gi...@apache.org>.
aiwhj commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1501602236

   The problem still exists in 3.1.4.




[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #11134: [Bug] [DataQuality] The data quality task is abnormal

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1226628856

   This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in the next 7 days if no further activity occurs.




Re: [I] [Bug] [DataQuality] The data quality task is abnormal [dolphinscheduler]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1793589970

   This issue has been closed because it has not received a response for too long. You can reopen it if you encounter similar problems in the future.




[GitHub] [dolphinscheduler] himper commented on issue #11134: [Bug] [DataQuality] The data quality task is abnormal

Posted by "himper (via GitHub)" <gi...@apache.org>.
himper commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1663369939

   The problem still exists in 3.1.7.
   
   23/08/03 14:17:12 INFO Executor: Running task 0.0 in stage 2.0 (TID 2)
   [INFO] 2023-08-03 14:17:14.660 +0800 -  -> 23/08/03 14:17:14 INFO JDBCRDD: closed connection
   	23/08/03 14:17:14 INFO Executor: Finished task 0.0 in stage 2.0 (TID 2). 1669 bytes result sent to driver
   	23/08/03 14:17:14 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 2) in 1604 ms on localhost (executor driver) (1/1)
   	23/08/03 14:17:14 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool 
   	23/08/03 14:17:14 INFO DAGScheduler: ShuffleMapStage 2 (save at JdbcWriter.java:86) finished in 1.690 s
   	23/08/03 14:17:14 INFO DAGScheduler: looking for newly runnable stages
   	23/08/03 14:17:14 INFO DAGScheduler: running: Set()
   	23/08/03 14:17:14 INFO DAGScheduler: waiting: Set(ResultStage 3)
   	23/08/03 14:17:14 INFO DAGScheduler: failed: Set()
   	23/08/03 14:17:14 INFO DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[16] at save at JdbcWriter.java:86), which has no missing parents
   	23/08/03 14:17:14 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 36.6 KB, free 116.9 MB)
   	23/08/03 14:17:14 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 15.8 KB, free 116.9 MB)
   	23/08/03 14:17:14 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on dolphinscheduler-worker-1.dolphinscheduler-worker-headless.dolphinscheduler.svc.cluster.local:36171 (size: 15.8 KB, free: 116.9 MB)
   	23/08/03 14:17:14 INFO SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1184
   	23/08/03 14:17:14 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 3 (MapPartitionsRDD[16] at save at JdbcWriter.java:86) (first 15 tasks are for partitions Vector(0))
   	23/08/03 14:17:14 INFO TaskSchedulerImpl: Adding task set 3.0 with 1 tasks
   	23/08/03 14:17:14 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 3, localhost, executor driver, partition 0, ANY, 7767 bytes)
   	23/08/03 14:17:14 INFO Executor: Running task 0.0 in stage 3.0 (TID 3)
   [INFO] 2023-08-03 14:17:15.737 +0800 -  -> 23/08/03 14:17:14 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks including 1 local blocks and 0 remote blocks
   	23/08/03 14:17:14 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
   	23/08/03 14:17:14 INFO CodeGenerator: Code generated in 98.387902 ms
   	23/08/03 14:17:15 INFO CodeGenerator: Code generated in 201.736892 ms
   	23/08/03 14:17:15 ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 3)
   	java.sql.BatchUpdateException: Batch entry 0 INSERT INTO t_ds_dq_execute_result ("rule_type","rule_name","process_definition_id","process_instance_id","task_instance_id","statistics_value","comparison_value","comparison_type","check_type","threshold","operator","failure_strategy","error_output_path","create_time","update_time") VALUES (3,'(multi_table_value_comparison)',0,22,24,1691033463,1691033476,0,0,0,0,0,'s3a://dolphinscheduler/user/root/data_quality_error_data/0_22_tag_quality','2023-08-03 14:16:03','2023-08-03 14:16:03') was aborted: ERROR: column "create_time" is of type timestamp without time zone but expression is of type character varying
   	  Hint: You will need to rewrite or cast the expression.
   	  Position: 337  Call getNextException to see other errors in the batch.
   		at org.postgresql.jdbc.BatchResultHandler.handleError(BatchResultHandler.java:165)
   		at org.postgresql.core.ResultHandlerDelegate.handleError(ResultHandlerDelegate.java:52)
   		at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2367)
   		at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:560)
   		at org.postgresql.jdbc.PgStatement.internalExecuteBatch(PgStatement.java:887)
   		at org.postgresql.jdbc.PgStatement.executeBatch(PgStatement.java:910)
   		at org.postgresql.jdbc.PgPreparedStatement.executeBatch(PgPreparedStatement.java:1663)
   		at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:676)
   		at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:838)
   		at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:838)
   		at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:980)
   		at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:980)
   		at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2107)
   		at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2107)
   		at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
   		at org.apache.spark.scheduler.Task.run(Task.scala:123)
   		at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:411)
   		at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
   		at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:417)
   		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   		at java.lang.Thread.run(Thread.java:750)
   	Caused by: org.postgresql.util.PSQLException: ERROR: column "create_time" is of type timestamp without time zone but expression is of type character varying
   	  Hint: You will need to rewrite or cast the expression.
   	  Position: 337
   		at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2676)
   		at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2366)
   		... 19 more
   	23/08/03 14:17:15 WARN TaskSetManager: Lost task 0.0 in stage 3.0 (TID 3, localhost, executor driver): java.sql.BatchUpdateException: Batch entry 0 INSERT INTO t_ds_dq_execute_result ("rule_type","rule_name","process_definition_id","process_instance_id","task_instance_id","statistics_value","comparison_value","comparison_type","check_type","threshold","operator","failure_strategy","error_output_path","create_time","update_time") VALUES (3,'(multi_table_value_comparison)',0,22,24,1691033463,1691033476,0,0,0,0,0,'s3a://dolphinscheduler/user/root/data_quality_error_data/0_22_tag_quality','2023-08-03 14:16:03','2023-08-03 14:16:03') was aborted: ERROR: column "create_time" is of type timestamp without time zone but expression is of type character varying




Re: [I] [Bug] [DataQuality] The data quality task is abnormal [dolphinscheduler]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1783950762

   This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in the next 7 days if no further activity occurs.




Re: [I] [Bug] [DataQuality] The data quality task is abnormal [dolphinscheduler]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed issue #11134: [Bug] [DataQuality] The data quality task is abnormal
URL: https://github.com/apache/dolphinscheduler/issues/11134

