Posted to issues@phoenix.apache.org by "jaanai (JIRA)" <ji...@apache.org> on 2019/04/01 02:55:05 UTC

[jira] [Commented] (PHOENIX-5222) java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame

    [ https://issues.apache.org/jira/browse/PHOENIX-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806344#comment-16806344 ] 

jaanai commented on PHOENIX-5222:
---------------------------------

Looks like a version incompatibility. In Spark 2.x, org.apache.spark.sql.DataFrame is only a type alias for Dataset[Row], not a real class, so a phoenix-spark jar compiled against an older Spark fails with exactly this NoClassDefFoundError as soon as it references the class. The 4.14.0-cdh5.14.2 artifact you depend on was presumably built against CDH's bundled Spark 1.6, while the Apache 4.14.0-HBase-1.1 release was built against Spark 2.0.2; see: https://search.maven.org/artifact/org.apache.phoenix/phoenix/4.14.0-HBase-1.1/pom
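
For illustration, here is a minimal sketch (the object name ClasspathProbe is just an illustrative name, not part of Phoenix or Spark) that checks which Spark API generation is actually on the runtime classpath; on a Spark 2.x classpath it reports the class as missing, which is the mismatch behind the error above:

{code:java}
// Minimal sketch: probe the runtime classpath for the Spark 1.x
// org.apache.spark.sql.DataFrame class. In Spark 2.x, DataFrame is only
// a type alias for Dataset[Row], so no such class file exists and
// Class.forName throws ClassNotFoundException.
object ClasspathProbe {
  def main(args: Array[String]): Unit = {
    try {
      Class.forName("org.apache.spark.sql.DataFrame")
      println("DataFrame class found: Spark 1.x on the classpath")
    } catch {
      case _: ClassNotFoundException =>
        println("DataFrame class missing: Spark 2.x on the classpath")
    }
  }
}
{code}

If it reports the class as missing, the fix is to make the two sides agree: either run against the Spark version your phoenix-spark jar was compiled for, or switch to a phoenix-spark artifact built against a Spark 2.x runtime.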

> java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
> --------------------------------------------------------------
>
>                 Key: PHOENIX-5222
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5222
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.14.0
>            Reporter: unknowspeople
>            Priority: Major
>              Labels: phoenix, spark
>   Original Estimate: 2,160h
>  Remaining Estimate: 2,160h
>
> I am running Spark code that reads data from Phoenix, on a machine with Spark 2.3.0 installed. When I run it in IntelliJ, it fails with this error:
> {color:red}java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame{color}
> spark code:
> {code:java}
> package com.ahct.hbase
> import org.apache.spark.sql._
> object Test1 {
>   def main(args: Array[String]): Unit = {
>     val zkUrl = "192.168.240.101:2181"
>     val spark = SparkSession.builder()
>       .appName("SparkPhoenixTest1")
>       .master("local[2]")
>       .getOrCreate()
>     val df = spark.read.format("org.apache.phoenix.spark")
>       .option("zkurl", zkUrl)
>       .option("table","\"bigdata\".\"tbs1\"")
>       .load()
>     df.show()
>   }
> }
> {code}
> My pom.xml, which correctly specifies Spark version 2.3.0:
> {code:java}
>     <properties>
>         <maven.compiler.source>1.8</maven.compiler.source>
>         <maven.compiler.target>1.8</maven.compiler.target>
>         <encoding>UTF-8</encoding>
>         <scala.version>2.11.8</scala.version>
>         <spark.version>2.3.0</spark.version>
>         <hadoop.version>2.6.0-cdh5.14.2</hadoop.version>
>         <hive.version>1.1.0-cdh5.14.2</hive.version>
>     </properties>
>     <dependencies>
>         <dependency>
>             <groupId>org.apache.hadoop</groupId>
>             <artifactId>hadoop-client</artifactId>
>             <version>${hadoop.version}</version>
>         </dependency>
>         <dependency>
>             <groupId>org.apache.spark</groupId>
>             <artifactId>spark-sql_2.11</artifactId>
>             <version>${spark.version}</version>
>         </dependency>
>         <dependency>
>             <groupId>org.scala-lang</groupId>
>             <artifactId>scala-library</artifactId>
>             <version>${scala.version}</version>
>         </dependency>
>         <dependency>
>             <groupId>org.apache.hive</groupId>
>             <artifactId>hive-exec</artifactId>
>             <version>${hive.version}</version>
>         </dependency>
>         <dependency>
>             <groupId>org.apache.hive</groupId>
>             <artifactId>hive-jdbc</artifactId>
>             <version>${hive.version}</version>
>         </dependency>
>         <dependency>
>             <groupId>org.postgresql</groupId>
>             <artifactId>postgresql</artifactId>
>             <version>42.2.5</version>
>         </dependency>
>         <dependency>
>             <groupId>org.apache.phoenix</groupId>
>             <artifactId>phoenix-spark</artifactId>
>             <version>4.14.0-cdh5.14.2</version>
>         </dependency>
>         <dependency>
>             <groupId>org.apache.twill</groupId>
>             <artifactId>twill-api</artifactId>
>             <version>0.8.0</version>
>         </dependency>
>         <dependency>
>             <groupId>joda-time</groupId>
>             <artifactId>joda-time</artifactId>
>             <version>2.9.9</version>
>         </dependency>
>         <!-- Test -->
>         <dependency>
>            <groupId>junit</groupId>
>            <artifactId>junit</artifactId>
>            <version>4.8.1</version>
>            <scope>test</scope>
>         </dependency>
>     </dependencies>
> {code}
> Here is the stack trace from IntelliJ showing the error:
> {code:java}
> "C:\Program Files\Java\jdk1.8.0_111\bin\java" "-javaagent:D:\Program Files\JetBrains\IntelliJ IDEA Community Edition 2017.3.5\lib\idea_rt.jar=61050:D:\Program Files\JetBrains\IntelliJ IDEA Community Edition 2017.3.5\bin" -Dfile.encoding=UTF-8 -classpath C:\Users\ZX~1\AppData\Local\Temp\classpath.jar com.ahct.hbase.Test3
> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
> 19/03/31 19:59:49 INFO SparkContext: Running Spark version 2.3.0
> 19/03/31 19:59:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 19/03/31 19:59:50 INFO SparkContext: Submitted application: SparkPhoenixTest3
> 19/03/31 19:59:50 INFO SecurityManager: Changing view acls to: ZX
> 19/03/31 19:59:51 INFO SecurityManager: Changing modify acls to: ZX
> 19/03/31 19:59:51 INFO SecurityManager: Changing view acls groups to: 
> 19/03/31 19:59:51 INFO SecurityManager: Changing modify acls groups to: 
> 19/03/31 19:59:51 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(ZX); groups with view permissions: Set(); users  with modify permissions: Set(ZX); groups with modify permissions: Set()
> 19/03/31 19:59:53 INFO Utils: Successfully started service 'sparkDriver' on port 61072.
> 19/03/31 19:59:53 INFO SparkEnv: Registering MapOutputTracker
> 19/03/31 19:59:53 INFO SparkEnv: Registering BlockManagerMaster
> 19/03/31 19:59:53 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
> 19/03/31 19:59:53 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
> 19/03/31 19:59:53 INFO DiskBlockManager: Created local directory at C:\Users\ZX\AppData\Local\Temp\blockmgr-7386bf6d-b0f4-40b0-b015-ed0191990e1c
> 19/03/31 19:59:53 INFO MemoryStore: MemoryStore started with capacity 899.7 MB
> 19/03/31 19:59:53 INFO SparkEnv: Registering OutputCommitCoordinator
> 19/03/31 19:59:54 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
> 19/03/31 19:59:54 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
> 19/03/31 19:59:54 INFO Utils: Successfully started service 'SparkUI' on port 4042.
> 19/03/31 19:59:54 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://DESKTOP-7M1BH3H:4042
> 19/03/31 19:59:54 INFO Executor: Starting executor ID driver on host localhost
> 19/03/31 19:59:54 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 61085.
> 19/03/31 19:59:54 INFO NettyBlockTransferService: Server created on DESKTOP-7M1BH3H:61085
> 19/03/31 19:59:54 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
> 19/03/31 19:59:54 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, DESKTOP-7M1BH3H, 61085, None)
> 19/03/31 19:59:54 INFO BlockManagerMasterEndpoint: Registering block manager DESKTOP-7M1BH3H:61085 with 899.7 MB RAM, BlockManagerId(driver, DESKTOP-7M1BH3H, 61085, None)
> 19/03/31 19:59:54 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, DESKTOP-7M1BH3H, 61085, None)
> 19/03/31 19:59:54 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, DESKTOP-7M1BH3H, 61085, None)
> 19/03/31 19:59:55 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/D:/ahty/AHCT/code/scala-test/spark-warehouse/').
> 19/03/31 19:59:55 INFO SharedState: Warehouse path is 'file:/D:/ahty/AHCT/code/scala-test/spark-warehouse/'.
> 19/03/31 19:59:56 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
> 19/03/31 19:59:58 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 419.0 KB, free 899.3 MB)
> 19/03/31 19:59:59 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 29.4 KB, free 899.3 MB)
> 19/03/31 19:59:59 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on DESKTOP-7M1BH3H:61085 (size: 29.4 KB, free: 899.7 MB)
> 19/03/31 19:59:59 INFO SparkContext: Created broadcast 0 from newAPIHadoopRDD at PhoenixRDD.scala:49
> 19/03/31 19:59:59 INFO deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
> 19/03/31 19:59:59 INFO deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
> 19/03/31 19:59:59 INFO QueryLoggerDisruptor: Starting  QueryLoggerDisruptor for with ringbufferSize=8192, waitStrategy=BlockingWaitStrategy, exceptionHandler=org.apache.phoenix.log.QueryLoggerDefaultExceptionHandler@7b5cc918...
> 19/03/31 19:59:59 INFO ConnectionQueryServicesImpl: An instance of ConnectionQueryServices was created.
> 19/03/31 20:00:00 INFO RecoverableZooKeeper: Process identifier=hconnection-0x1cbc5693 connecting to ZooKeeper ensemble=192.168.240.101:2181
> 19/03/31 20:00:00 INFO ZooKeeper: Client environment:zookeeper.version=3.4.5-cdh5.14.2--1, built on 03/27/2018 20:39 GMT
> 19/03/31 20:00:00 INFO ZooKeeper: Client environment:host.name=DESKTOP-7M1BH3H
> 19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.version=1.8.0_111
> 19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.vendor=Oracle Corporation
> 19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.home=C:\Program Files\Java\jdk1.8.0_111\jre
> 19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.class.path=C:\Users\ZX~1\AppData\Local\Temp\classpath.jar;D:\Program Files\JetBrains\IntelliJ IDEA Community Edition 2017.3.5\lib\idea_rt.jar
> 19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.library.path=C:\Program Files\Java\jdk1.8.0_111\bin;C:\WINDOWS\Sun\Java\bin;C:\WINDOWS\system32;C:\WINDOWS;C:\Python27\;C:\Python27\Scripts;C:\Program Files (x86)\Intel\iCLS Client\;C:\ProgramData\Oracle\Java\javapath;D:\app\ZX\product\11.2.0\client_1\bin;C:\Program Files (x86)\RSA SecurID Token Common;C:\Program Files\Intel\iCLS Client\;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;c:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\VSShell\Common7\IDE\;c:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\;c:\Program Files (x86)\Microsoft SQL Server\100\DTS\Binn\;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;D:\Program Files\Git\cmd;C:\Program Files\Mercurial\;D:\Go\bin;C:\TDM-GCC-64\bin;D:\Program Files (x86)\scala\bin;D:\python;D:\python\Scripts;C:\WINDOWS\System32\OpenSSH\;c:\program files\Mozilla Firefox;D:\Program Files\wkhtmltox\bin;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\IPT;C:\Program Files\Intel\Intel(R) Management Engine Components\IPT;D:\Program Files\nodejs\;C:\ProgramData\chocolatey\bin;D:\code\ahswww\play5;D:\code\mysql-5.7.24\bin;C:\VisualSVN Server\bin;C:\Users\ZX\AppData\Local\Microsoft\WindowsApps;C:\Program Files (x86)\SSH Communications Security\SSH Secure Shell;C:\Users\ZX\AppData\Local\GitHubDesktop\bin;C:\Users\ZX\AppData\Local\Microsoft\WindowsApps;;D:\Program Files\Microsoft VS Code\bin;C:\Users\ZX\AppData\Roaming\npm;C:\Program Files\JetBrains\PyCharm 2018.3.1\bin;;.
> 19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.io.tmpdir=C:\Users\ZX~1\AppData\Local\Temp\
> 19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.compiler=<NA>
> 19/03/31 20:00:00 INFO ZooKeeper: Client environment:os.name=Windows 10
> 19/03/31 20:00:00 INFO ZooKeeper: Client environment:os.arch=amd64
> 19/03/31 20:00:00 INFO ZooKeeper: Client environment:os.version=10.0
> 19/03/31 20:00:00 INFO ZooKeeper: Client environment:user.name=ZX
> 19/03/31 20:00:00 INFO ZooKeeper: Client environment:user.home=C:\Users\ZX
> 19/03/31 20:00:00 INFO ZooKeeper: Client environment:user.dir=D:\ahty\AHCT\code\scala-test
> 19/03/31 20:00:00 INFO ZooKeeper: Initiating client connection, connectString=192.168.240.101:2181 sessionTimeout=90000 watcher=hconnection-0x1cbc56930x0, quorum=192.168.240.101:2181, baseZNode=/hbase
> 19/03/31 20:00:00 INFO ClientCnxn: Opening socket connection to server hadoop001.local/192.168.240.101:2181. Will not attempt to authenticate using SASL (unknown error)
> 19/03/31 20:00:00 INFO ClientCnxn: Socket connection established, initiating session, client: /192.168.240.101:61089, server: hadoop001.local/192.168.240.101:2181
> 19/03/31 20:00:00 INFO ClientCnxn: Session establishment complete on server hadoop001.local/192.168.240.101:2181, sessionid = 0x169cc35e45e0013, negotiated timeout = 40000
> 19/03/31 20:00:01 INFO ConnectionQueryServicesImpl: HConnection established. Stacktrace for informational purposes: hconnection-0x1cbc5693 java.lang.Thread.getStackTrace(Thread.java:1556)
> org.apache.phoenix.util.LogUtil.getCallerStackTrace(LogUtil.java:55)
> org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:427)
> org.apache.phoenix.query.ConnectionQueryServicesImpl.access$400(ConnectionQueryServicesImpl.java:267)
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2515)
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2491)
> org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
> org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2491)
> org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)
> org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
> org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
> java.sql.DriverManager.getConnection(DriverManager.java:664)
> java.sql.DriverManager.getConnection(DriverManager.java:208)
> org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:113)
> org.apache.phoenix.mapreduce.util.ConnectionUtil.getInputConnection(ConnectionUtil.java:58)
> org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil.getSelectColumnMetadataList(PhoenixConfigurationUtil.java:354)
> org.apache.phoenix.spark.PhoenixRDD.toDataFrame(PhoenixRDD.scala:118)
> org.apache.phoenix.spark.PhoenixRelation.schema(PhoenixRelation.scala:60)
> org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:431)
> org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
> com.ahct.hbase.Test3$.main(Test3.scala:25)
> com.ahct.hbase.Test3.main(Test3.scala)
> 19/03/31 20:00:02 INFO deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
> 	at java.lang.Class.getDeclaredMethods0(Native Method)
> 	at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> 	at java.lang.Class.getDeclaredMethod(Class.java:2128)
> 	at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1475)
> 	at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72)
> 	at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:498)
> 	at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:472)
> 	at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:369)
> 	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1134)
> 	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
> 	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
> 	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
> 	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
> 	at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
> 	at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43)
> 	at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
> 	at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:342)
> 	at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:335)
> 	at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:159)
> 	at org.apache.spark.SparkContext.clean(SparkContext.scala:2292)
> 	at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:371)
> 	at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:370)
> 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
> 	at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
> 	at org.apache.spark.rdd.RDD.map(RDD.scala:370)
> 	at org.apache.phoenix.spark.PhoenixRDD.toDataFrame(PhoenixRDD.scala:131)
> 	at org.apache.phoenix.spark.PhoenixRelation.schema(PhoenixRelation.scala:60)
> 	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:431)
> 	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
> 	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
> 	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
> 	at com.ahct.hbase.Test3$.main(Test3.scala:25)
> 	at com.ahct.hbase.Test3.main(Test3.scala)
> Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.DataFrame
> 	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> 	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> 	... 36 more
> 19/03/31 20:00:08 INFO SparkContext: Invoking stop() from shutdown hook
> 19/03/31 20:00:08 INFO SparkUI: Stopped Spark web UI at http://DESKTOP-7M1BH3H:4042
> 19/03/31 20:00:08 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
> 19/03/31 20:00:08 INFO MemoryStore: MemoryStore cleared
> 19/03/31 20:00:08 INFO BlockManager: BlockManager stopped
> 19/03/31 20:00:08 INFO BlockManagerMaster: BlockManagerMaster stopped
> 19/03/31 20:00:08 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
> 19/03/31 20:00:08 INFO SparkContext: Successfully stopped SparkContext
> 19/03/31 20:00:08 INFO ShutdownHookManager: Shutdown hook called
> Process finished with exit code 1
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)