Posted to issues@spark.apache.org by "zery (Jira)" <ji...@apache.org> on 2022/04/07 09:31:00 UTC

[jira] [Updated] (SPARK-38815) Not found jdbc driver class when enable hive and call multiple jdbc action function

     [ https://issues.apache.org/jira/browse/SPARK-38815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zery updated SPARK-38815:
-------------------------
    Summary: Not found jdbc driver class when enable hive and call multiple jdbc action function  (was: Not found jdbc driver class when enable hive and call multiple jdbc action)

> Not found jdbc driver class when enable hive and call multiple jdbc action function
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-38815
>                 URL: https://issues.apache.org/jira/browse/SPARK-38815
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes, Spark Core, SQL
>    Affects Versions: 3.1.1
>         Environment: k8s: v1.20.4
> spark: v3.1.1
> hive: 4.0.0-SNAPSHOT
>            Reporter: zery
>            Priority: Major
>
> Hello, here is the Spark code:
> {code:java}
> // Read the same ClickHouse table twice over JDBC, with Hive support enabled
> import org.apache.spark.sql.SparkSession
> def main(args : Array[String]): Unit = {
>     val spark = SparkSession.builder
>       .appName("TestCKJdbc")
>       .enableHiveSupport()
>       .getOrCreate()
>     spark.read
>       .format("jdbc")
>       .option("driver","ru.yandex.clickhouse.ClickHouseDriver")
>       .option("url", "jdbc:clickhouse://clickhouse-server-svc.admin.svc.cluster.local:8123/aaa")
>       .option("dbtable", "A")
>       .option("user", "default")
>       .option("password", "abc")
>       .load()
>       .show()
>     spark.read
>       .format("jdbc")
>       .option("driver","ru.yandex.clickhouse.ClickHouseDriver")
>       .option("url", "jdbc:clickhouse://clickhouse-server-svc.admin.svc.cluster.local:8123/aaa")
>       .option("dbtable", "A")
>       .option("user", "default")
>       .option("password", "abc")
>       .load()
>       .show()
>     spark.stop()
>   }
> {code}
> When I submit it to Kubernetes, it fails with the following error:
> {code:java}
> 22/04/07 09:12:08 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0) (10.233.67.129, executor 1, partition 0, PROCESS_LOCAL, 4318 bytes) taskResourceAssignments Map()
> 22/04/07 09:12:09 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.233.67.129:41816 (size: 5.1 KiB, free: 413.9 MiB)
> 22/04/07 09:12:15 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 6596 ms on 10.233.67.129 (executor 1) (1/1)
> 22/04/07 09:12:15 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
> 22/04/07 09:12:15 INFO DAGScheduler: ResultStage 0 (show at App.scala:24) finished in 8.524 s
> 22/04/07 09:12:15 INFO DAGScheduler: Job 0 is finished. Cancelling potential speculative or zombie tasks for this job
> 22/04/07 09:12:15 INFO TaskSchedulerImpl: Killing all running tasks in stage 0: Stage finished
> 22/04/07 09:12:15 INFO DAGScheduler: Job 0 finished: show at App.scala:24, took 8.611591 s
> 22/04/07 09:12:15 INFO CodeGenerator: Code generated in 47.462472 ms
> +---+---------+---------+
> | id|discrete1|discrete2|
> +---+---------+---------+
> |  1|        A|        a|
> |  1|        B|        b|
> |  1|        A|        c|
> |  2|        C|        a|
> |  1|        A|        a|
> +---+---------+---------+
> 22/04/07 09:12:15 INFO SparkUI: Stopped Spark web UI at http://spark-7f8c3c80034ac34e-driver-svc.spark-operator.svc:4040
> 22/04/07 09:12:15 INFO KubernetesClusterSchedulerBackend: Shutting down all executors
> 22/04/07 09:12:15 INFO KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Asking each executor to shut down
> 22/04/07 09:12:15 WARN ExecutorPodsWatchSnapshotSource: Kubernetes client has been closed (this is expected if the application is shutting down.)
> 22/04/07 09:12:15 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
> 22/04/07 09:12:15 INFO MemoryStore: MemoryStore cleared
> 22/04/07 09:12:15 INFO BlockManager: BlockManager stopped
> 22/04/07 09:12:15 INFO BlockManagerMaster: BlockManagerMaster stopped
> 22/04/07 09:12:15 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
> 22/04/07 09:12:15 INFO SparkContext: Successfully stopped SparkContext
> Exception in thread "main" java.lang.ClassNotFoundException: ru.yandex.clickhouse.ClickHouseDriver
>     at java.base/java.net.URLClassLoader.findClass(Unknown Source)
>     at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
>     at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
>     at org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:46)
>     at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.$anonfun$driverClass$1(JDBCOptions.scala:102)
>     at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.$anonfun$driverClass$1$adapted(JDBCOptions.scala:102)
>     at scala.Option.foreach(Option.scala:407)
>     at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:102)
>     at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:38)
>     at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:32)
>     at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:354)
>     at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:326)
>     at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:308)
>     at scala.Option.getOrElse(Option.scala:189)
>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:308)
>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:226)
>     at org.example.App$.main(App.scala:34)
>     at org.example.App.main(App.scala)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     at java.base/java.lang.reflect.Method.invoke(Unknown Source)
>     at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>     at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
>     at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>     at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>     at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>     at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 22/04/07 09:12:15 INFO ShutdownHookManager: Shutdown hook called
> 22/04/07 09:12:15 INFO ShutdownHookManager: Deleting directory /tmp/spark-77261146-995b-42e6-b5e8-eb7cbf8eda49
> 22/04/07 09:12:15 INFO ShutdownHookManager: Deleting directory /var/data/spark-5afa86cb-67c6-493a-90f5-ea1c23089c9f/spark-d1c96f2e-c40c-4942-bd99-0451d23159b3
> 22/04/07 09:12:15 INFO AuditProviderFactory: ==> JVMShutdownHook.run()
> 22/04/07 09:12:15 INFO AuditProviderFactory: JVMShutdownHook: Signalling async audit cleanup to start.
> 22/04/07 09:12:15 INFO AuditProviderFactory: JVMShutdownHook: Waiting up to 30 seconds for audit cleanup to finish.
> 22/04/07 09:12:15 INFO AuditProviderFactory: RangerAsyncAuditCleanup: Starting cleanup
> 22/04/07 09:12:15 INFO AuditAsyncQueue: Stop called. name=hiveCLI.async
> 22/04/07 09:12:15 INFO AuditAsyncQueue: Interrupting consumerThread. name=hiveCLI.async, consumer=hiveCLI.async.batch
> 22/04/07 09:12:15 INFO AuditProviderFactory: RangerAsyncAuditCleanup: Done cleanup
> {code}
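> From the output above, the first read and show() complete; only the second load() call fails. Assuming that, one possible workaround sketch (not verified in this environment) is to resolve the JDBC relation once and reuse the resulting DataFrame, so the driver class only has to be registered once:
> {code:java}
> // Unverified workaround sketch: call load() once and reuse the DataFrame,
> // so DriverRegistry.register is only reached a single time.
> val df = spark.read
>   .format("jdbc")
>   .option("driver", "ru.yandex.clickhouse.ClickHouseDriver")
>   .option("url", "jdbc:clickhouse://clickhouse-server-svc.admin.svc.cluster.local:8123/aaa")
>   .option("dbtable", "A")
>   .option("user", "default")
>   .option("password", "abc")
>   .load()
> df.show()
> df.show() // second action reuses the same relation; no second driver lookup
> {code}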
> *But if I delete the ".enableHiveSupport()" line, it runs successfully*, like this:
> {code:java}
> def main(args : Array[String]): Unit = {
>     val spark = SparkSession.builder
>       .appName("TestCKJdbc")
>       //.enableHiveSupport()
>       .getOrCreate()
>     spark.read
>       .format("jdbc")
>       .option("driver","ru.yandex.clickhouse.ClickHouseDriver")
>       .option("url", "jdbc:clickhouse://clickhouse-server-svc.admin.svc.cluster.local:8123/abc")
>       .option("dbtable", "A")
>       .option("user", "default")
>       .option("password", "aaa")
>       .load()
>       .show()
>     spark.read
>       .format("jdbc")
>       .option("driver","ru.yandex.clickhouse.ClickHouseDriver")
>       .option("url", "jdbc:clickhouse://clickhouse-server-svc.admin.svc.cluster.local:8123/abc")
>       .option("dbtable", "A")
>       .option("user", "default")
>       .option("password", "aaa")
>       .load()
>       .show()
>     spark.stop()
>   }
> {code}
> This happens only when Hive support is enabled and more than one JDBC action is called.
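> If Hive support has to stay enabled, another untested idea is to force the driver class to load through the current thread's context classloader before the second read; the class name below is just the one passed in the "driver" option above:
> {code:java}
> // Untested sketch: pre-load the JDBC driver class via the context classloader
> // before the second spark.read...load() call.
> Class.forName(
>   "ru.yandex.clickhouse.ClickHouseDriver",
>   true,
>   Thread.currentThread().getContextClassLoader)
> {code}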


