Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/06/25 14:03:58 UTC
[GitHub] [dolphinscheduler] yang227 opened a new issue, #10615: [Bug] 3.0 data quality module cannot be used normally
yang227 opened a new issue, #10615:
URL: https://github.com/apache/dolphinscheduler/issues/10615
### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.
### What happened
The Spark environment configured for data quality runs normally, and the MySQL 8.0.16 driver is present in all five services, but the task still fails with a driver error:
org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/06/25 21:40:31 INFO yarn.Client: Application report for application_1656159837056_0001 (state: FINISHED)
22/06/25 21:40:31 INFO yarn.Client:
client token: N/A
diagnostics: User class threw exception: java.lang.ClassNotFoundException: com.mysql.cj.jdbc.Driver
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:45)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$5.apply(JDBCOptions.scala:99)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$5.apply(JDBCOptions.scala:99)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:99)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:35)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:32)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:317)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
at org.apache.dolphinscheduler.data.quality.flow.batch.reader.JdbcReader.read(JdbcReader.java:72)
at org.apache.dolphinscheduler.data.quality.execution.SparkBatchExecution.registerInputTempView(SparkBatchExecution.java:77)
at org.apache.dolphinscheduler.data.quality.execution.SparkBatchExecution.lambda$execute$0(SparkBatchExecution.java:48)
at java.util.ArrayList.forEach(ArrayList.java:1257)
at org.apache.dolphinscheduler.data.quality.execution.SparkBatchExecution.execute(SparkBatchExecution.java:48)
at org.apache.dolphinscheduler.data.quality.context.DataQualityContext.execute(DataQualityContext.java:62)
at org.apache.dolphinscheduler.data.quality.DataQualityApplication.main(DataQualityApplication.java:70)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:673)
ApplicationMaster host: hadoop103.test.com
ApplicationMaster RPC port: 37145
queue: root.users.hive
start time: 1656164390942
final status: FAILED
tracking URL: http://hadoop102.test.com:8088/proxy/application_1656159837056_0001/
user: hive
22/06/25 21:40:31 ERROR yarn.Client: Application diagnostics message: User class threw exception: java.lang.ClassNotFoundException: com.mysql.cj.jdbc.Driver
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:45)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$5.apply(JDBCOptions.scala:99)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$5.apply(JDBCOptions.scala:99)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:99)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:35)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:32)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:317)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
at org.apache.dolphinscheduler.data.quality.flow.batch.reader.JdbcReader.read(JdbcReader.java:72)
at org.apache.dolphinscheduler.data.quality.execution.SparkBatchExecution.registerInputTempView(SparkBatchExecution.java:77)
at org.apache.dolphinscheduler.data.quality.execution.SparkBatchExecution.lambda$execute$0(SparkBatchExecution.java:48)
at java.util.ArrayList.forEach(ArrayList.java:1257)
at org.apache.dolphinscheduler.data.quality.execution.SparkBatchExecution.execute(SparkBatchExecution.java:48)
at org.apache.dolphinscheduler.data.quality.context.DataQualityContext.execute(DataQualityContext.java:62)
at org.apache.dolphinscheduler.data.quality.DataQualityApplication.main(DataQualityApplication.java:70)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:673)
Exception in thread "main" org.apache.spark.SparkException: Application application_1656159837056_0001 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1158)
at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1606)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:851)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:926)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:935)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
22/06/25 21:40:31 INFO util.ShutdownHookManager: Shutdown hook called
22/06/25 21:40:31 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-7297abfa-2325-4aa2-84d0-16de02227b0f
22/06/25 21:40:31 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-7862c891-06aa-4437-ace2-56bb0e668406
[INFO] 2022-06-25 13:40:31.928 +0000 [taskAppId=TASK-20220625-5981895327712_2-2-4] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[375] - find app id: application_1656159837056_0001
[INFO] 2022-06-25 13:40:31.929 +0000 [taskAppId=TASK-20220625-5981895327712_2-2-4] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[205] - process has exited, execute path:/tmp/dolphinscheduler/exec/process/5981809635040/5981895327712_2/2/4, processId:112109 ,exitStatusCode:1 ,processWaitForStatus:true ,processExitValue:1
[INFO] 2022-06-25 13:40:32.864 +0000 [taskAppId=TASK-20220625-5981895327712_2-2-4] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[57] - FINALIZE_SESSION
### What you expected to happen
After the test task runs, the test results should be displayed normally.
### How to reproduce
The problem occurs on an installation set up and run by following the official website.
### Anything else
_No response_
### Version
3.0.0-beta-1
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [dolphinscheduler] yang227 commented on issue #10615: [Bug] 3.0 beta quality module cannot be used normally
Posted by GitBox <gi...@apache.org>.
yang227 commented on issue #10615:
URL: https://github.com/apache/dolphinscheduler/issues/10615#issuecomment-1166450087
Creating the data source is not the problem: all the data sources connect normally. The error is still reported when I run a data quality check.
[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #10615: [Bug] 3.0 data quality module cannot be used normally
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #10615:
URL: https://github.com/apache/dolphinscheduler/issues/10615#issuecomment-1166294361
Thank you for your feedback. We have received your issue; please wait patiently for a reply.
* So that we can understand your request as soon as possible, please provide detailed information, the version, or pictures.
* If you haven't received a reply for a long time, you can [join our slack](https://s.apache.org/dolphinscheduler-slack) and send your question to channel `#troubleshooting`
[GitHub] [dolphinscheduler] DriftF commented on issue #10615: [Bug] 3.0 beta quality module cannot be used normally
Posted by GitBox <gi...@apache.org>.
DriftF commented on issue #10615:
URL: https://github.com/apache/dolphinscheduler/issues/10615#issuecomment-1204963983
I think your Spark environment does not contain the MySQL 8.0.16 driver.
Both the Spark driver and the executors need the MySQL driver on their class path, so specify:
spark.driver.extraClassPath = <path>/mysql-connector-java-8.0.16.jar
spark.executor.extraClassPath = <path>/mysql-connector-java-8.0.16.jar
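
One way to confirm this diagnosis is to probe whether the driver class is visible to a JVM started with the same class path as the Spark job. The following is only a minimal sketch: `DriverCheck` and `isLoadable` are hypothetical names and not part of DolphinScheduler or Spark; only the class name `com.mysql.cj.jdbc.Driver` is taken from the stack trace in the issue.

```java
// Hypothetical diagnostic helper (an assumption for illustration),
// not part of DolphinScheduler or Spark.
final class DriverCheck {

    /** Returns true if the named class can be located by this class's class loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, DriverCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class (or one of its dependencies) is not on the class path.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the driver class named in the ClassNotFoundException above.
        String name = args.length > 0 ? args[0] : "com.mysql.cj.jdbc.Driver";
        System.out.println(name + (isLoadable(name) ? " is loadable" : " is NOT on the class path"));
    }
}
```

If the probe reports that the class is not on the class path when run with the Spark job's class path, but reports it as loadable when the connector jar is added, that would be consistent with the jar being present for the DolphinScheduler services yet missing for the Spark driver and executors.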
[GitHub] [dolphinscheduler] yang227 commented on issue #10615: [Bug] 3.0 beta quality module cannot be used normally
Posted by GitBox <gi...@apache.org>.
yang227 commented on issue #10615:
URL: https://github.com/apache/dolphinscheduler/issues/10615#issuecomment-1208942934
Thank you. I'll have an interview later
[GitHub] [dolphinscheduler] SbloodyS closed issue #10615: [Bug] 3.0 beta quality module cannot be used normally
Posted by GitBox <gi...@apache.org>.
SbloodyS closed issue #10615: [Bug] 3.0 beta quality module cannot be used normally
URL: https://github.com/apache/dolphinscheduler/issues/10615
[GitHub] [dolphinscheduler] SbloodyS commented on issue #10615: [Bug] 3.0 beta quality module cannot be used normally
Posted by GitBox <gi...@apache.org>.
SbloodyS commented on issue #10615:
URL: https://github.com/apache/dolphinscheduler/issues/10615#issuecomment-1166396456
Duplicate of #9860.