Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/07/25 05:42:44 UTC
[GitHub] [dolphinscheduler] ulnit opened a new issue, #11134: [Bug] [DataQuality] The data quality task is abnormal
ulnit opened a new issue, #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134
### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.
### What happened
The error message is:
【Caused by: java.sql.BatchUpdateException: Batch entry 0 INSERT INTO t_ds_dq_execute_result ("rule_type","rule_name","process_definition_id","process_instance_id","task_instance_id","statistics_value","comparison_value","comparison_type","check_type","threshold","operator","failure_strategy","error_output_path","create_time","update_time") VALUES (0,'(null_check)',0,796,859,0,0,2,0,0,3,0,'hdfs://mycluster:8020/user/dolphinscheduler/data_quality_error_data/0_796_dq002','2022-07-21 05:20:53','2022-07-21 05:20:53') was aborted: ERROR: column "create_time" is of type timestamp without time zone but expression is of type character varying
Hint: You will need to rewrite or cast the expression.
Position: 337 Call getNextException to see other errors in the batch.
at org.postgresql.jdbc.BatchResultHandler.handleError(BatchResultHandler.java:165)
at org.postgresql.core.ResultHandlerDelegate.handleError(ResultHandlerDelegate.java:52)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2366)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:559)
at org.postgresql.jdbc.PgStatement.internalExecuteBatch(PgStatement.java:887)
at org.postgresql.jdbc.PgStatement.executeBatch(PgStatement.java:910)
at org.postgresql.jdbc.PgPreparedStatement.executeBatch(PgPreparedStatement.java:1649)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:713)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$saveTable$1(JdbcUtils.scala:868)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$saveTable$1$adapted(JdbcUtils.scala:867)
at org.apache.spark.rdd.RDD.$anonfun$foreachPartition$2(RDD.scala:1011)
at org.apache.spark.rdd.RDD.$anonfun$foreachPartition$2$adapted(RDD.scala:1011)
at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2268)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:136)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.postgresql.util.PSQLException: ERROR: column "create_time" is of type timestamp without time zone but expression is of type character varying
Hint: You will need to rewrite or cast the expression.
Position: 337
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2675)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2365)
... 18 more】
### What you expected to happen
Data quality task nodes should run properly, and their results should be inserted into the database without errors.
### How to reproduce
Deploy normally and run a data quality task; the error reproduces.
### Anything else
_No response_
### Version
3.0.0-beta-2
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #11134: [Bug] [DataQuality] The data quality task is abnormal
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1193606004
[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #11134: [Bug] [DataQuality] The data quality task is abnormal
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1234936313
This issue has been closed because it has not received a response for a long time. You can reopen it if you encounter similar problems in the future.
[GitHub] [dolphinscheduler] github-actions[bot] closed issue #11134: [Bug] [DataQuality] The data quality task is abnormal
Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed issue #11134: [Bug] [DataQuality] The data quality task is abnormal
URL: https://github.com/apache/dolphinscheduler/issues/11134
Re: [I] [Bug] [DataQuality] The data quality task is abnormal [dolphinscheduler]
Posted by "wangbowen1024 (via GitHub)" <gi...@apache.org>.
wangbowen1024 commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1871252134
You can try jdbc:postgresql://localhost:5432/databaseName?stringtype=unspecified
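For context on the suggestion above: in the PostgreSQL JDBC driver, the `stringtype` connection parameter controls how string parameters are sent to the server. With the default (`varchar`) they are typed as character varying, which matches the error in this issue; with `unspecified` they are sent untyped and the server infers the column type. The snippet below is only a minimal sketch of appending that parameter to an existing JDBC URL. The helper name `with_stringtype_unspecified` is hypothetical and is not part of DolphinScheduler or the driver.

```python
def with_stringtype_unspecified(jdbc_url: str) -> str:
    """Return jdbc_url with stringtype=unspecified set exactly once.

    Hypothetical helper for illustration only: it does plain string
    handling on the URL and does not open any database connection.
    """
    # Split the URL into the part before '?' and the query string (if any).
    base, _, query = jdbc_url.partition("?")
    # Keep existing parameters, dropping any previous stringtype setting.
    params = [p for p in query.split("&") if p and not p.startswith("stringtype=")]
    params.append("stringtype=unspecified")
    return base + "?" + "&".join(params)


if __name__ == "__main__":
    print(with_stringtype_unspecified("jdbc:postgresql://localhost:5432/databaseName"))
```

Whether this workaround is appropriate depends on the deployment; it changes how every string parameter on that connection is typed, not only `create_time` and `update_time`.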
[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #11134: [Bug] [DataQuality] The data quality task is abnormal
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1193606154
Thank you for your feedback. We have received your issue; please wait patiently for a reply.
* So that we can understand your request as soon as possible, please provide detailed information, the version, or pictures.
* If you haven't received a reply for a long time, you can [join our slack](https://s.apache.org/dolphinscheduler-slack) and send your question to the channel `#troubleshooting`.
[GitHub] [dolphinscheduler] aiwhj commented on issue #11134: [Bug] [DataQuality] The data quality task is abnormal
Posted by "aiwhj (via GitHub)" <gi...@apache.org>.
aiwhj commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1501602236
The problem still exists in 3.1.4.
[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #11134: [Bug] [DataQuality] The data quality task is abnormal
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1226628856
This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in the next 7 days if no further activity occurs.
Re: [I] [Bug] [DataQuality] The data quality task is abnormal [dolphinscheduler]
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1793589970
This issue has been closed because it has not received a response for a long time. You can reopen it if you encounter similar problems in the future.
[GitHub] [dolphinscheduler] himper commented on issue #11134: [Bug] [DataQuality] The data quality task is abnormal
Posted by "himper (via GitHub)" <gi...@apache.org>.
himper commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1663369939
The problem still exists in 3.1.7:
23/08/03 14:17:12 INFO Executor: Running task 0.0 in stage 2.0 (TID 2)
[INFO] 2023-08-03 14:17:14.660 +0800 - -> 23/08/03 14:17:14 INFO JDBCRDD: closed connection
23/08/03 14:17:14 INFO Executor: Finished task 0.0 in stage 2.0 (TID 2). 1669 bytes result sent to driver
23/08/03 14:17:14 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 2) in 1604 ms on localhost (executor driver) (1/1)
23/08/03 14:17:14 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool
23/08/03 14:17:14 INFO DAGScheduler: ShuffleMapStage 2 (save at JdbcWriter.java:86) finished in 1.690 s
23/08/03 14:17:14 INFO DAGScheduler: looking for newly runnable stages
23/08/03 14:17:14 INFO DAGScheduler: running: Set()
23/08/03 14:17:14 INFO DAGScheduler: waiting: Set(ResultStage 3)
23/08/03 14:17:14 INFO DAGScheduler: failed: Set()
23/08/03 14:17:14 INFO DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[16] at save at JdbcWriter.java:86), which has no missing parents
23/08/03 14:17:14 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 36.6 KB, free 116.9 MB)
23/08/03 14:17:14 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 15.8 KB, free 116.9 MB)
23/08/03 14:17:14 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on dolphinscheduler-worker-1.dolphinscheduler-worker-headless.dolphinscheduler.svc.cluster.local:36171 (size: 15.8 KB, free: 116.9 MB)
23/08/03 14:17:14 INFO SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1184
23/08/03 14:17:14 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 3 (MapPartitionsRDD[16] at save at JdbcWriter.java:86) (first 15 tasks are for partitions Vector(0))
23/08/03 14:17:14 INFO TaskSchedulerImpl: Adding task set 3.0 with 1 tasks
23/08/03 14:17:14 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 3, localhost, executor driver, partition 0, ANY, 7767 bytes)
23/08/03 14:17:14 INFO Executor: Running task 0.0 in stage 3.0 (TID 3)
[INFO] 2023-08-03 14:17:15.737 +0800 - -> 23/08/03 14:17:14 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks including 1 local blocks and 0 remote blocks
23/08/03 14:17:14 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
23/08/03 14:17:14 INFO CodeGenerator: Code generated in 98.387902 ms
23/08/03 14:17:15 INFO CodeGenerator: Code generated in 201.736892 ms
23/08/03 14:17:15 ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 3)
java.sql.BatchUpdateException: Batch entry 0 INSERT INTO t_ds_dq_execute_result ("rule_type","rule_name","process_definition_id","process_instance_id","task_instance_id","statistics_value","comparison_value","comparison_type","check_type","threshold","operator","failure_strategy","error_output_path","create_time","update_time") VALUES (3,'(multi_table_value_comparison)',0,22,24,1691033463,1691033476,0,0,0,0,0,'s3a://dolphinscheduler/user/root/data_quality_error_data/0_22_tag_quality','2023-08-03 14:16:03','2023-08-03 14:16:03') was aborted: ERROR: column "create_time" is of type timestamp without time zone but expression is of type character varying
Hint: You will need to rewrite or cast the expression.
Position: 337 Call getNextException to see other errors in the batch.
at org.postgresql.jdbc.BatchResultHandler.handleError(BatchResultHandler.java:165)
at org.postgresql.core.ResultHandlerDelegate.handleError(ResultHandlerDelegate.java:52)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2367)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:560)
at org.postgresql.jdbc.PgStatement.internalExecuteBatch(PgStatement.java:887)
at org.postgresql.jdbc.PgStatement.executeBatch(PgStatement.java:910)
at org.postgresql.jdbc.PgPreparedStatement.executeBatch(PgPreparedStatement.java:1663)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:676)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:838)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:838)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:980)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:980)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2107)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2107)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:411)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:417)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: org.postgresql.util.PSQLException: ERROR: column "create_time" is of type timestamp without time zone but expression is of type character varying
Hint: You will need to rewrite or cast the expression.
Position: 337
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2676)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2366)
... 19 more
23/08/03 14:17:15 WARN TaskSetManager: Lost task 0.0 in stage 3.0 (TID 3, localhost, executor driver): java.sql.BatchUpdateException: Batch entry 0 INSERT INTO t_ds_dq_execute_result ("rule_type","rule_name","process_definition_id","process_instance_id","task_instance_id","statistics_value","comparison_value","comparison_type","check_type","threshold","operator","failure_strategy","error_output_path","create_time","update_time") VALUES (3,'(multi_table_value_comparison)',0,22,24,1691033463,1691033476,0,0,0,0,0,'s3a://dolphinscheduler/user/root/data_quality_error_data/0_22_tag_quality','2023-08-03 14:16:03','2023-08-03 14:16:03') was aborted: ERROR: column "create_time" is of type timestamp without time zone but expression is of type character varying
Re: [I] [Bug] [DataQuality] The data quality task is abnormal [dolphinscheduler]
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #11134:
URL: https://github.com/apache/dolphinscheduler/issues/11134#issuecomment-1783950762
This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in the next 7 days if no further activity occurs.
Re: [I] [Bug] [DataQuality] The data quality task is abnormal [dolphinscheduler]
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed issue #11134: [Bug] [DataQuality] The data quality task is abnormal
URL: https://github.com/apache/dolphinscheduler/issues/11134