Posted to dev@phoenix.apache.org by "iteblog (Jira)" <ji...@apache.org> on 2019/11/21 03:39:00 UTC

[jira] [Created] (PHOENIX-5582) When we use Phoenix-Spark module to R/W phoenix data in Spark cluster mode, No suitable driver found exception will occur

iteblog created PHOENIX-5582:
--------------------------------

             Summary: When we use Phoenix-Spark module to R/W phoenix data in Spark cluster mode, No suitable driver found exception will occur
                 Key: PHOENIX-5582
                 URL: https://issues.apache.org/jira/browse/PHOENIX-5582
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 5.0.0
            Reporter: iteblog


When we use the Phoenix-Spark module to read and write Phoenix data in Spark cluster mode, a "No suitable driver found" exception occurs.

Maven dependencies:
{code:xml}
<dependency>
      <groupId>org.apache.phoenix</groupId>
      <artifactId>phoenix-spark</artifactId>
      <version>5.0.0-HBase-2.0</version>
</dependency>

<dependency>
      <groupId>org.apache.phoenix</groupId>
      <artifactId>phoenix-core</artifactId>
      <version>5.0.0-HBase-2.0</version>
</dependency>
{code}
My test code is as follows:
{code:scala}
package phoenix.datasource

import org.apache.spark.sql.SparkSession

object Test {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("PhoenixDataSource")
      .getOrCreate()

    val zk = "test-master1-001.com,test-master2-001.com,test-master3-001.com:2181"

    val df = spark.read.format("org.apache.phoenix.spark")
      .option("table", "search_info_test").option("zkUrl", zk).load()

    df.selectExpr("ID", "NAME").show(20, 100)
  }
}
{code}
If you run the above code in local mode, everything is fine. However, if you run it in cluster mode, the following exception occurs:
{code}
19/11/21 11:20:17 ERROR PhoenixInputFormat: Failed to get the query plan with error [No suitable driver found for jdbc:phoenix:test-master1-001.com,test-master2-001.com,test-master3-001.com:2181:/hbase;]
 19/11/21 11:20:17 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
 java.lang.RuntimeException: java.sql.SQLException: No suitable driver found for jdbc:phoenix:test-master1-001.com,test-master2-001.com,test-master3-001.com:2181:/hbase;
 at org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:208)
 at org.apache.phoenix.mapreduce.PhoenixInputFormat.createRecordReader(PhoenixInputFormat.java:76)
 at org.apache.spark.rdd.NewHadoopRDD$$anon$1.liftedTree1$1(NewHadoopRDD.scala:197)
 at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:196)
 at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:151)
 at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:70)
 at org.apache.phoenix.spark.PhoenixRDD.compute(PhoenixRDD.scala:64)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
 at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
 at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
 at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
 at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
 at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
 at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
 at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
 at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
 at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
 at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
 at org.apache.spark.scheduler.Task.run(Task.scala:121)
 at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
 at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
 at java.lang.Thread.run(Thread.java:834)
 Caused by: java.sql.SQLException: No suitable driver found for jdbc:phoenix:test-master1-001.com,test-master2-001.com,test-master3-001.com:2181:/hbase;
 at java.sql.DriverManager.getConnection(DriverManager.java:699)
 at java.sql.DriverManager.getConnection(DriverManager.java:217)
 at org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:113)
 at org.apache.phoenix.mapreduce.util.ConnectionUtil.getInputConnection(ConnectionUtil.java:58)
 at org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:180)
 ... 43 more
{code}
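
In plain JDBC terms, this exception means that no {{java.sql.Driver}} registered in the executor JVM (and visible to the calling code) accepts a {{jdbc:phoenix:}} URL. A minimal standalone illustration of the mechanism (the quorum string is a placeholder):
{code:scala}
import java.sql.DriverManager

object DriverLookupDemo {
  def main(args: Array[String]): Unit = {
    // In a JVM where org.apache.phoenix.jdbc.PhoenixDriver has never been
    // loaded, DriverManager.getConnection("jdbc:phoenix:...") throws
    // java.sql.SQLException: No suitable driver found.
    //
    // Class.forName runs the driver's static initializer, which calls
    // DriverManager.registerDriver(...); the same lookup then succeeds.
    Class.forName("org.apache.phoenix.jdbc.PhoenixDriver")
    val conn = DriverManager.getConnection("jdbc:phoenix:zk1,zk2,zk3:2181")
    conn.close()
  }
}
{code}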

The reason is that the {{org.apache.phoenix.spark.PhoenixRDD}} class registers the {{PhoenixDriver}} only in the driver JVM, not on the executor side where {{PhoenixInputFormat.getQueryPlan}} opens the JDBC connection, which causes the above problem. Using the phoenix-spark module to write data to Phoenix in Spark cluster mode triggers the same exception. A possible workaround is sketched below.
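
A minimal workaround sketch, not the official fix: run a trivial warm-up job that loads the driver class on the executors before the Phoenix read (or write) executes. This is best-effort, since Spark does not guarantee the warm-up tasks reach every executor (e.g., under dynamic allocation):
{code:scala}
package phoenix.datasource

import org.apache.spark.sql.SparkSession

object TestWithWorkaround {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("PhoenixDataSource")
      .getOrCreate()
    val sc = spark.sparkContext

    // Best-effort warm-up: loading PhoenixDriver triggers its static
    // initializer, which registers it with java.sql.DriverManager in
    // whichever executor JVM runs the task.
    val n = math.max(sc.defaultParallelism, 1)
    sc.parallelize(1 to n, n).foreachPartition { _ =>
      Class.forName("org.apache.phoenix.jdbc.PhoenixDriver")
    }

    val zk = "test-master1-001.com,test-master2-001.com,test-master3-001.com:2181"

    val df = spark.read.format("org.apache.phoenix.spark")
      .option("table", "search_info_test").option("zkUrl", zk).load()

    df.selectExpr("ID", "NAME").show(20, 100)
  }
}
{code}
The proper fix belongs in phoenix-spark itself: register the driver wherever the executor-side code opens the JDBC connection, instead of only in the driver JVM.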



--
This message was sent by Atlassian Jira
(v8.3.4#803005)