Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/06/25 14:03:58 UTC

[GitHub] [dolphinscheduler] yang227 opened a new issue, #10615: [Bug] 3.0 data quality module cannot be used normally

yang227 opened a new issue, #10615:
URL: https://github.com/apache/dolphinscheduler/issues/10615

   ### Search before asking
   
   - [X] I have searched the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### What happened
   
   The configured data quality Spark environment runs normally, and the MySQL 8.0.16 driver jar is present in all five services, but the data quality task still fails with `java.lang.ClassNotFoundException: com.mysql.cj.jdbc.Driver`:
   
   
   org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/06/25 21:40:31 INFO yarn.Client: Application report for application_1656159837056_0001 (state: FINISHED)
   	22/06/25 21:40:31 INFO yarn.Client: 
   		 client token: N/A
   		 diagnostics: User class threw exception: java.lang.ClassNotFoundException: com.mysql.cj.jdbc.Driver
   		at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
   		at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
   		at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
   		at org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:45)
   		at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$5.apply(JDBCOptions.scala:99)
   		at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$5.apply(JDBCOptions.scala:99)
   		at scala.Option.foreach(Option.scala:257)
   		at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:99)
   		at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:35)
   		at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:32)
   		at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:317)
   		at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
   		at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
   		at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
   		at org.apache.dolphinscheduler.data.quality.flow.batch.reader.JdbcReader.read(JdbcReader.java:72)
   		at org.apache.dolphinscheduler.data.quality.execution.SparkBatchExecution.registerInputTempView(SparkBatchExecution.java:77)
   		at org.apache.dolphinscheduler.data.quality.execution.SparkBatchExecution.lambda$execute$0(SparkBatchExecution.java:48)
   		at java.util.ArrayList.forEach(ArrayList.java:1257)
   		at org.apache.dolphinscheduler.data.quality.execution.SparkBatchExecution.execute(SparkBatchExecution.java:48)
   		at org.apache.dolphinscheduler.data.quality.context.DataQualityContext.execute(DataQualityContext.java:62)
   		at org.apache.dolphinscheduler.data.quality.DataQualityApplication.main(DataQualityApplication.java:70)
   		at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   		at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   		at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   		at java.lang.reflect.Method.invoke(Method.java:498)
   		at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:673)
   	
   		 ApplicationMaster host: hadoop103.test.com
   		 ApplicationMaster RPC port: 37145
   		 queue: root.users.hive
   		 start time: 1656164390942
   		 final status: FAILED
   		 tracking URL: http://hadoop102.test.com:8088/proxy/application_1656159837056_0001/
   		 user: hive
   	22/06/25 21:40:31 ERROR yarn.Client: Application diagnostics message: User class threw exception: java.lang.ClassNotFoundException: com.mysql.cj.jdbc.Driver
   		at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
   		at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
   		at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
   		at org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:45)
   		at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$5.apply(JDBCOptions.scala:99)
   		at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$5.apply(JDBCOptions.scala:99)
   		at scala.Option.foreach(Option.scala:257)
   		at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:99)
   		at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:35)
   		at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:32)
   		at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:317)
   		at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
   		at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
   		at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
   		at org.apache.dolphinscheduler.data.quality.flow.batch.reader.JdbcReader.read(JdbcReader.java:72)
   		at org.apache.dolphinscheduler.data.quality.execution.SparkBatchExecution.registerInputTempView(SparkBatchExecution.java:77)
   		at org.apache.dolphinscheduler.data.quality.execution.SparkBatchExecution.lambda$execute$0(SparkBatchExecution.java:48)
   		at java.util.ArrayList.forEach(ArrayList.java:1257)
   		at org.apache.dolphinscheduler.data.quality.execution.SparkBatchExecution.execute(SparkBatchExecution.java:48)
   		at org.apache.dolphinscheduler.data.quality.context.DataQualityContext.execute(DataQualityContext.java:62)
   		at org.apache.dolphinscheduler.data.quality.DataQualityApplication.main(DataQualityApplication.java:70)
   		at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   		at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   		at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   		at java.lang.reflect.Method.invoke(Method.java:498)
   		at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:673)
   	
   	Exception in thread "main" org.apache.spark.SparkException: Application application_1656159837056_0001 finished with failed status
   		at org.apache.spark.deploy.yarn.Client.run(Client.scala:1158)
   		at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1606)
   		at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:851)
   		at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
   		at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
   		at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
   		at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:926)
   		at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:935)
   		at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   	22/06/25 21:40:31 INFO util.ShutdownHookManager: Shutdown hook called
   	22/06/25 21:40:31 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-7297abfa-2325-4aa2-84d0-16de02227b0f
   	22/06/25 21:40:31 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-7862c891-06aa-4437-ace2-56bb0e668406
   [INFO] 2022-06-25 13:40:31.928 +0000 [taskAppId=TASK-20220625-5981895327712_2-2-4] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[375] - find app id: application_1656159837056_0001
   [INFO] 2022-06-25 13:40:31.929 +0000 [taskAppId=TASK-20220625-5981895327712_2-2-4] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[205] - process has exited, execute path:/tmp/dolphinscheduler/exec/process/5981809635040/5981895327712_2/2/4, processId:112109 ,exitStatusCode:1 ,processWaitForStatus:true ,processExitValue:1
   [INFO] 2022-06-25 13:40:32.864 +0000 [taskAppId=TASK-20220625-5981895327712_2-2-4] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[57] - FINALIZE_SESSION
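
The trace ends in `org.apache.spark.deploy.yarn.ApplicationMaster`, so the class lookup fails inside the Spark application running on YARN, not inside a DolphinScheduler service. One way to confirm that a particular jar really provides `com.mysql.cj.jdbc.Driver` is to look for the class entry inside the archive. The Python sketch below is only an illustration: `jar_has_class` and `make_demo_jar` are hypothetical helpers, not part of DolphinScheduler or Spark, and the jar inspected in the demo is a throwaway archive created on the spot.

```python
import os
import tempfile
import zipfile


def jar_has_class(jar_path: str, class_name: str) -> bool:
    """Return True if the jar (a zip archive) contains the given class."""
    # A class a.b.C is stored in a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


def make_demo_jar(directory: str) -> str:
    """Create a tiny stand-in jar for the demo; not a real MySQL connector."""
    jar_path = os.path.join(directory, "demo-connector.jar")
    with zipfile.ZipFile(jar_path, "w") as jar:
        jar.writestr("com/mysql/cj/jdbc/Driver.class", b"")
    return jar_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = make_demo_jar(tmp)
        print(jar_has_class(demo, "com.mysql.cj.jdbc.Driver"))
        print(jar_has_class(demo, "org.postgresql.Driver"))
```

For this error, the jar worth checking would be the one visible to Spark on the cluster nodes, rather than the copies placed in the DolphinScheduler service directories.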
   
   
   ### What you expected to happen
   
   After running the test task, the test results should be displayed normally.
   
   ### How to reproduce
   
   This problem occurs after installing and running DolphinScheduler as described on the official website.
   
   ### Anything else
   
   _No response_
   
   ### Version
   
   3.0.0-beta-1
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] yang227 commented on issue #10615: [Bug] 3.0 beta quality module cannot be used normally

Posted by GitBox <gi...@apache.org>.
yang227 commented on issue #10615:
URL: https://github.com/apache/dolphinscheduler/issues/10615#issuecomment-1166450087

   The problem is not that creating the data source failed; all the data sources can be connected normally. The problem is that an error is still reported when I run the data quality check.




[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #10615: [Bug] 3.0 data quality module cannot be used normally

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #10615:
URL: https://github.com/apache/dolphinscheduler/issues/10615#issuecomment-1166294361

   Thank you for your feedback. We have received your issue; please wait patiently for a reply.
   * So that we can understand your request as soon as possible, please provide detailed information, the version, or pictures.
   * If you haven't received a reply for a long time, you can [join our slack](https://s.apache.org/dolphinscheduler-slack) and send your question to the `#troubleshooting` channel.




[GitHub] [dolphinscheduler] DriftF commented on issue #10615: [Bug] 3.0 beta quality module cannot be used normally

Posted by GitBox <gi...@apache.org>.
DriftF commented on issue #10615:
URL: https://github.com/apache/dolphinscheduler/issues/10615#issuecomment-1204963983

   I think your Spark environment does not contain the MySQL 8.0.16 driver.
   Both the Spark driver and the executors need the MySQL driver on their class path, so specify:
   spark.driver.extraClassPath = <path>/mysql-connector-java-8.0.16.jar
   spark.executor.extraClassPath = <path>/mysql-connector-java-8.0.16.jar
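
As a rough illustration of the suggestion above, the two properties can be passed to `spark-submit` as `--conf` arguments. The Python sketch below only assembles that argument list; `jdbc_classpath_conf` is a hypothetical helper, and the jar location is a placeholder because the thread leaves `<path>` unspecified. How the arguments reach the data quality task's `spark-submit` call depends on the deployment and is not shown in this thread.

```python
from typing import List


def jdbc_classpath_conf(jar_path: str) -> List[str]:
    """Build spark-submit arguments that add one jar to both class paths."""
    return [
        "--conf", f"spark.driver.extraClassPath={jar_path}",
        "--conf", f"spark.executor.extraClassPath={jar_path}",
    ]


# Placeholder location (assumption): substitute the real directory for <path>.
EXAMPLE_JAR = "<path>/mysql-connector-java-8.0.16.jar"

if __name__ == "__main__":
    # Print the arguments as they would appear on a spark-submit command line.
    print(" ".join(jdbc_classpath_conf(EXAMPLE_JAR)))
```

With `extraClassPath`, the jar must already exist at that path on every node that runs the driver or an executor. The `--jars` option of `spark-submit` is an alternative that ships the jar with the application.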
   
   




[GitHub] [dolphinscheduler] yang227 commented on issue #10615: [Bug] 3.0 beta quality module cannot be used normally

Posted by GitBox <gi...@apache.org>.
yang227 commented on issue #10615:
URL: https://github.com/apache/dolphinscheduler/issues/10615#issuecomment-1208942934

   Thank you. I'll have an interview later




[GitHub] [dolphinscheduler] SbloodyS closed issue #10615: [Bug] 3.0 beta quality module cannot be used normally

Posted by GitBox <gi...@apache.org>.
SbloodyS closed issue #10615: [Bug] 3.0 beta quality module cannot be used normally
URL: https://github.com/apache/dolphinscheduler/issues/10615




[GitHub] [dolphinscheduler] SbloodyS commented on issue #10615: [Bug] 3.0 beta quality module cannot be used normally

Posted by GitBox <gi...@apache.org>.
SbloodyS commented on issue #10615:
URL: https://github.com/apache/dolphinscheduler/issues/10615#issuecomment-1166396456

   Duplicate of #9860.

