Posted to user@phoenix.apache.org by mengfei <sa...@outlook.com> on 2016/01/05 11:38:33 UTC
Re: Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by spark 1.5
hi Josh,
Thank you for your advice; it did work. I built the client-spark jar by applying the patch to the CDH code, and the build succeeded. Then I ran some code in "local" mode and the result was correct. But in "yarn-client" mode, this error happened:
java.lang.RuntimeException: java.sql.SQLException: No suitable driver found for jdbc:phoenix:cdhzk1.boloomo.com,cdhzk2.boloomo.com,cdhzk3.boloomo.com:2181;
I did try every way I know of or could find in the community, but nothing helped, so I am asking for your help. Thank you for your patience. My code is:
import org.apache.spark.sql.SQLContext
import org.apache.phoenix.spark._
import org.apache.spark.SparkContext
import org.apache.phoenix.jdbc.PhoenixDriver
import java.sql.DriverManager
DriverManager.registerDriver(new PhoenixDriver)
val pred = s"MMSI = '002190048'"
val rdd = sc.phoenixTableAsRDD(
  "AIS_WS",
  Seq("MMSI","LON","LAT","RID"),
  predicate = Some(pred),
  zkUrl = Some("cdhzk1.boloomo.com,cdhzk2.boloomo.com,cdhzk3.boloomo.com"))
println(rdd.count())
my scripts are:
spark-submit \
--master yarn-cluster \
--driver-class-path "/data/public/mengfei/lib/phoenix-1.2.0-client-spark.jar" \
--conf "spark.executor.extraClassPath=/data/public/mengfei/lib/phoenix-1.2.0-client-spark.jar" \
--conf "spark.driver.extraClassPath=/data/public/mengfei/lib/phoenix-1.2.0-client-spark.jar" \
--jars /data/public/mengfei/lib/phoenix-1.2.0-client-spark.jar
spark-shell \
--master yarn-client -v \
--driver-class-path "/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar" \
--conf "spark.executor.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar" \
--conf "spark.driver.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar" \
--jars /opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar
PS: I did copy the jar to every node, and even gave it 777 permissions.
sacuba@Outlook.com
From: Josh Mahonin
Date: 2015-12-30 00:56
To: user
Subject: Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by spark 1.5
Hi,
This issue is fixed with the following patch, and using the resulting 'client-spark' JAR after compilation:
https://issues.apache.org/jira/browse/PHOENIX-2503
As an alternative, you may have some luck also including updated com.fasterxml.jackson jackson-databind JARs in your app that are in sync with Spark's versions. Unfortunately the client JAR right now is shipping fasterxml jars that conflict with the Spark runtime.
Another user has also had success by bundling their own Phoenix dependencies, if you want to try that out instead:
http://mail-archives.apache.org/mod_mbox/incubator-phoenix-user/201512.mbox/%3C0F96D592-74D7-431A-B301-015374A6B4BC@sandia.gov%3E
Josh
On Tue, Dec 29, 2015 at 9:11 AM, sacuba@Outlook.com <sa...@outlook.com> wrote:
The error is
java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.Module$SetupContext.setClassIntrospector(Lcom/fasterxml/jackson/databind/introspect/ClassIntrospector;)V
at com.fasterxml.jackson.module.scala.introspect.ScalaClassIntrospectorModule$$anonfun$1.apply(ScalaClassIntrospector.scala:32)
at com.fasterxml.jackson.module.scala.introspect.ScalaClassIntrospectorModule$$anonfun$1.apply(ScalaClassIntrospector.scala:32)
at com.fasterxml.jackson.module.scala.JacksonModule$$anonfun$setupModule$1.apply(JacksonModule.scala:47)
…
The Scala code is
val df = sqlContext.load(
  "org.apache.phoenix.spark",
  Map("table" -> "AIS ", "zkUrl" -> "cdhzk1.ccco.com:2181"))
Maybe I have found the reason: the Phoenix 4.5.2 on CDH 5.5.x is built with Spark 1.4, and CDH 5.5's default Spark version is 1.5.
So what should I do? Rebuild Phoenix 4.5.2 with Spark 1.5, or change the CDH Spark to 1.4? Apparently both are difficult for me. Could someone help me? Thank you very much.
Re: Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by spark 1.5
Posted by "sacuba@Outlook.com" <sa...@Outlook.com>.
hi Josh,
I met another two problems when 'copying' a table to another with Spark. The trouble can be described as "some null and empty values after copying from another table".
The main code is as follows:
val sqlContext = new SQLContext(sc)
val pred = s"mmsi like '0%'"
val df = sqlContext.phoenixTableAsDataFrame("ais_mmsi", Array("MMSI","TIME","C1.RID","C3.NAV_STATUS","C2.ROT....(total 13 columns)"),predicate = Some(pred),conf = configuration)
df.saveToPhoenix("ais_area_test3", conf = configuration)
When it finished, I checked the table 'ais_area_test3': all the rid values seem to be empty, and some of the other columns are null. I really do not know what happened.
And if I use phoenixTableAsDataFrame without any predicate, it causes another exception: "org.apache.phoenix.schema.StaleRegionBoundaryCacheException: ERROR 1108 (XCL08): Cache of region boundaries are out of date." I have searched the mailing list and Googled it, but found nothing that helps.
Here is the code that generates the two tables (both have 13 columns and 3 column families; the differences are the primary key and one column):
create table ais_area_test3 (
rid varchar,
time INTEGER(10) not null,
mmsi varchar(9),
c3.nav_status INTEGER(2),.......etc.
create table ais_mmsi (
mmsi varchar(9) not null,
time INTEGER(10) not null,
c1.rid varchar(10),
c3.nav_status INTEGER(2),......etc.
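One thing worth checking, sketched here purely as an assumption on my part rather than a confirmed diagnosis: the column names the DataFrame comes back with may not match the names the target table expects, since the source columns are family-qualified (e.g. "C1.RID") while the target defines RID differently. A minimal sketch that inspects and normalizes the names before the save (table and column names follow the thread; everything else is hypothetical):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.phoenix.spark._
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical sketch: read with the family-qualified names, look at what
// the DataFrame actually calls the columns, and rename them to match the
// target table before saving.
val sc = new SparkContext(new SparkConf().setAppName("copy-sketch"))
val sqlContext = new SQLContext(sc)
val configuration = new Configuration()

val df = sqlContext.phoenixTableAsDataFrame(
  "AIS_MMSI",
  Seq("MMSI", "TIME", "C1.RID", "C3.NAV_STATUS"), // plus the other columns
  predicate = Some("MMSI like '0%'"),
  conf = configuration)

df.printSchema() // check whether the column came back as "RID" or "C1.RID"

// If the names differ from the target table's, rename before the save:
val renamed = df.columns.foldLeft(df) { (d, c) =>
  d.withColumnRenamed(c, c.split('.').last) // strip any family prefix
}
renamed.saveToPhoenix("AIS_AREA_TEST3", conf = configuration)
```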
Yours very sincerely
sacuba@Outlook.com
From: Josh Mahonin
Date: 2016-01-08 23:23
To: user
Subject: Re: Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by spark 1.5
[Sent this same message from another account, apologies if anyone gets a double-post]
Thanks for the update. I hadn’t even seen the ‘SPARK_DIST_CLASSPATH’ setting until just now, but I suspect for CDH that might be the only way to do it.
The reason for the class path errors you see is that the ‘client’ JAR on its own ships a few extra dependencies (i.e. com.fasterxml.jackson) which are incompatible with newer versions of Spark. The ‘client-spark’ JAR attempts to remove those dependencies which would conflict with Spark, although a more elegant solution will likely come with PHOENIX-2535 (https://issues.apache.org/jira/browse/PHOENIX-2535)
Re: speed, the spark integration should be just about as fast as the MapReduce and Pig integration. At the 3T level, your likely bottleneck is disk IO just to load the data in, although network IO is also a possibility here as well. Assuming you have a sufficient number of Spark workers with enough RAM allocated, once the data is loaded into Spark initially, operations on that dataset should proceed much faster, as much of the data will be available in RAM vs disk.
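For the re-key itself, one alternative worth trying (just a sketch on my part, untested against your schema) is to keep the whole copy server-side with a Phoenix UPSERT SELECT over JDBC, so the 3T never round-trips through Spark at all:

```scala
import java.sql.DriverManager

// Sketch only: table and column names follow the thread; the statement must
// be extended with the remaining columns from the real 13-column schema.
val conn = DriverManager.getConnection(
  "jdbc:phoenix:cdhzk1.boloomo.com,cdhzk2.boloomo.com,cdhzk3.boloomo.com:2181")
conn.setAutoCommit(true) // lets Phoenix commit in batches as it scans
val stmt = conn.createStatement()
stmt.executeUpdate(
  "UPSERT INTO AIS_AREA_TEST3 (RID, TIME, MMSI) " + // extend the column list
  "SELECT C1.RID, TIME, MMSI FROM AIS_MMSI")
conn.close()
```

Whether this beats Spark depends on your region servers, but it avoids shipping every row to the Spark workers and back.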
Best of luck,
Josh
On Fri, Jan 8, 2016 at 7:34 AM, sacuba@Outlook.com <sa...@outlook.com> wrote:
hi Josh,
Yes, it is still the same 'No suitable driver' exception.
But my boss may have solved the problem. The method is surprising, but it did work:
he added "export SPARK_DIST_CLASSPATH=$SPARK_DIST_CLASSPATH:/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client.jar" to
spark-env.sh and restarted Spark, and afterwards everything seems good. What he added is the "phoenix-1.2.0-client.jar", the very jar which caused the exception "java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.Module$SetupContext.setClassIntrospector". It is incredible. I really appreciate what you have done for me. If you could tell me why this happens, I would be even happier.
Besides, this afternoon I met another problem. I have made a table with 13 columns, such as (A, B, C, ...), with primary key (A, B). When I imported all the data into this table, I deleted the original data. But now I want another table with the same 13 columns but a different primary key, which should be (B, C). The data is big, about 3 TB. Could you tell me how to do it quickly? I have tried to do this with Spark, but it does not seem fast.
Best wishes for you!
sacuba@Outlook.com
From: Josh Mahonin
Date: 2016-01-06 23:02
To: user
Subject: Re: Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by spark 1.5
Hi,
Is it still the same 'No suitable driver' exception, or is it something else?
Have you tried using the 'yarn-cluster' mode? I've had success with that personally, although I don't have any experience on the CDH stack.
Josh
On Wed, Jan 6, 2016 at 2:59 AM, sacuba@Outlook.com <sa...@outlook.com> wrote:
hi Josh,
I did what you said, and now I can run my code in spark-shell --master local without any other confs, but with 'yarn-client' the error is still the same.
To give you more information: we have 11 nodes, 3 ZooKeepers, and 2 masters. I am sure all 11 nodes have the client-spark jar at the path referenced in spark-defaults.conf.
We use CDH 5.5 with Spark 1.5. Our Phoenix version is "http://archive.cloudera.com/cloudera-labs/phoenix/parcels/1.2/", which is based on Phoenix 4.5.2.
I built the spark-client jar in the following steps:
1. download the base code from "https://github.com/cloudera-labs/phoenix"
2. apply the PHOENIX-2503.patch manually
3. build
It is strange that it did work in local mode but not in yarn-client mode. Waiting for your reply eagerly. Thank you very much.
my spark-defaults.conf is
spark.authenticate=false
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.executorIdleTimeout=60
spark.dynamicAllocation.minExecutors=0
spark.dynamicAllocation.schedulerBacklogTimeout=1
spark.eventLog.dir=hdfs://cdhcluster1/user/spark/applicationHistory
spark.eventLog.enabled=true
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.shuffle.service.enabled=true
spark.shuffle.service.port=7337
spark.executor.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar
spark.driver.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar
spark.yarn.historyServer.address=http://cdhmaster1.boloomo.com:18088
spark.yarn.jar=local:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/spark/lib/spark-assembly.jar
spark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hadoop/lib/native
spark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hadoop/lib/native
spark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hadoop/lib/native
spark.yarn.config.gatewayPath=/opt/cloudera/parcels
spark.yarn.config.replacementPath={{HADOOP_COMMON_HOME}}/../../..
spark.master=yarn-client
And here is my code again
import org.apache.spark.sql.SQLContext
import org.apache.phoenix.spark._
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.phoenix.jdbc.PhoenixDriver
import java.sql.DriverManager
DriverManager.registerDriver(new PhoenixDriver)
Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
val pred = s"RID like 'wtwb2%' and TIME between 1325440922 and 1336440922"
val rdd = sc.phoenixTableAsRDD(
"AIS_AREA",
Seq("MMSI","LON","LAT","RID"),
predicate = Some(pred),
zkUrl = Some("cdhzk3.boloomo.com:2181"))
and I used only spark-shell to run the code this time.
sacuba@Outlook.com
From: Josh Mahonin
Date: 2016-01-05 23:41
To: user
Subject: Re: Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by spark 1.5
Hi,
The error "java.sql.SQLException: No suitable driver found..." is typically thrown when the worker nodes can't find Phoenix on the class path.
I'm not certain that passing those values using '--conf' actually works or not with Spark. I tend to set them in my 'spark-defaults.conf' in the Spark configuration folder. I think restarting the master and workers may be required as well.
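The failure mode itself is plain JDK behaviour and easy to reproduce without a cluster: DriverManager only succeeds if some registered driver accepts the URL, so any JVM (e.g. a YARN executor) that never loaded the Phoenix jar fails this way before any network I/O happens. A minimal illustration (the ZooKeeper host is a placeholder):

```scala
import java.sql.{DriverManager, SQLException}

// In a JVM where no registered driver claims the "jdbc:phoenix:" scheme
// (i.e. the Phoenix client jar is absent from that JVM's class path),
// getConnection throws "No suitable driver" immediately.
try {
  DriverManager.getConnection("jdbc:phoenix:some-zk-host:2181")
} catch {
  case e: SQLException => println(e.getMessage)
}
```

That is why setting the path only on the driver works in local mode but not in yarn-client mode: the executors are separate JVMs with their own class paths.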
Josh
Re: Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by spark 1.5
Posted by Josh Mahonin <jm...@gmail.com>.
[Sent this same message from another account, apologies if anyone gets a
double-post]
Thanks for the update. I hadn’t even seen the ‘SPARK_DIST_CLASSPATH’
setting until just now, but I suspect for CDH that might be the only way to
do it.
The reason for the class path errors you see is that the ‘client’ JAR on
its own ships a few extra dependencies (i.e. com.fasterxml.jackson) which
are incompatible with newer versions of Spark. The ‘client-spark’ JAR
attempts to remove those dependencies which would conflict with Spark,
although a more elegant solution will likely come with PHOENIX-2535 (
https://issues.apache.org/jira/browse/PHOENIX-2535)
Re: speed, the spark integration should be just about as fast as the
MapReduce and Pig integration. At the 3T level, your likely bottleneck is
disk IO just to load the data in, although network IO is also a possibility
here as well. Assuming you have a sufficient number of Spark workers with
enough RAM allocated, once the data is loaded into Spark initially,
operations on that dataset should proceed much faster, as much of the data
will be available in RAM vs disk.
Best of luck,
Josh
On Fri, Jan 8, 2016 at 7:34 AM, sacuba@Outlook.com <sa...@outlook.com>
wrote:
> hi josh
>
> Yes ,it is still the same 'No suitable driver' exception.
> And my boss may solve the problem. the method is amazing but
> it did succeed.
> he add the "export SPARK_DIST_CLASSPATH=$SPARK_DIST_CLASSPATH:/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client.jar"
> in
> spark-env.sh ,and restart the spark . And later ,everythig seems good.
> What he add is the "phoenix-1.2.0-client.jar" which cause the exception
> "java.lang.NoSuchMethodError:
> com.fasterxml.jackson.databind.Module$SetupContext.setClassIntrospector"
> 。 It is incredible. I really appreciate what you have done for me . If
> you could tell me why it happens, i will be more happy.
>
> Besides ,this afternoon i met anothor problem. I have
> made a table which include 13 columns, such as (A.B.C.....) and the
> primary key is (A,B), When i import all the data to this table ,i delete
> the original data. But now i want to get another table ,which should
> inculde the same 13 columns ,but the diffrent primary key which should be
> (B,C). The data is big ,about 3T . Could you tell me how to do it
> fastly. I have tried to do this by spark, but it seems not fast.
>
>
>
> Best wishes for you!
>
>
>
>
>
> ------------------------------
> sacuba@Outlook.com
>
>
> *From:* Josh Mahonin <jm...@gmail.com>
> *Date:* 2016-01-06 23:02
> *To:* user <us...@phoenix.apache.org>
> *Subject:* Re: Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by
> spark 1.5
> Hi,
>
> Is it still the same 'No suitable driver' exception, or is it something
> else?
>
> Have you tried using the 'yarn-cluster' mode? I've had success with that
> personally, although I don't have any experience on the CDH stack.
>
> Josh
>
>
>
> On Wed, Jan 6, 2016 at 2:59 AM, sacuba@Outlook.com <sa...@outlook.com>
> wrote:
>
>> hi josh:
>>
>> I did what you say, and now i can run my codes in
>> spark-shell --master local without any other confs , but when it comes
>> to 'yarn-client' ,the error is as the same .
>>
>> I try to give you more informations, we have 11 nodes,3
>> zks,2 masters. I am sure all the 11nodes have the client-spark.jar in
>> the path referred in the spark-defaults.conf.
>> We use the CDH5.5 with spark1.5. our phoenix version is "
>> http://archive.cloudera.com/cloudera-labs/phoenix/parcels/1.2/"
>> <http://archive.cloudera.com/cloudera-labs/phoenix/parcels/1.2/> which
>> based phoenix 4.5.2.
>> I built the spark-client jar in the following steps
>> 1.download the base code in "
>> https://github.com/cloudera-labs/phoenix"
>> <https://github.com/cloudera-labs/phoenix>
>> 2. path the PHOENIX-2503.patch manualy
>> 3 build
>>
>> it is amazing that it did work in local ,but not in
>> yarn-client mode . Waiting for your reply eagerly.Thank you very much.
>>
>>
>> my spark-defaults.conf is
>>
>> spark.authenticate=false
>> spark.dynamicAllocation.enabled=true
>> spark.dynamicAllocation.executorIdleTimeout=60
>> spark.dynamicAllocation.minExecutors=0
>> spark.dynamicAllocation.schedulerBacklogTimeout=1
>> spark.eventLog.dir=hdfs://cdhcluster1/user/spark/applicationHistory
>> spark.eventLog.enabled=true
>> spark.serializer=org.apache.spark.serializer.KryoSerializer
>> spark.shuffle.service.enabled=true
>> spark.shuffle.service.port=7337
>>
>> spark.executor.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar
>>
>> spark.driver.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar
>> spark.yarn.historyServer.address=http://cdhmaster1.boloomo.com:18088
>>
>> spark.yarn.jar=local:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/spark/lib/spark-assembly.jar
>>
>> spark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hadoop/lib/native
>>
>> spark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hadoop/lib/native
>>
>> spark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hadoop/lib/native
>> spark.yarn.config.gatewayPath=/opt/cloudera/parcels
>> spark.yarn.config.replacementPath={{HADOOP_COMMON_HOME}}/../../..
>> spark.master=yarn-client
>>
>> And here is my code again
>> import org.apache.spark.sql.SQLContext
>> import org.apache.phoenix.spark._
>> import org.apache.spark.SparkContext
>> import org.apache.spark.sql.SQLContext
>> import org.apache.phoenix.jdbc.PhoenixDriver
>> import java.sql.DriverManager
>> DriverManager.registerDriver(new PhoenixDriver)
>> Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
>> val pred = s"RID like 'wtwb2%' and TIME between 1325440922 and 1336440922"
>> val rdd = sc.phoenixTableAsRDD(
>> "AIS_AREA",
>> Seq("MMSI","LON","LAT","RID"),
>> predicate = Some(pred),
>> zkUrl = Some("cdhzk3.boloomo.com:2181"))
>>
>> and i use only spark-shell to run the code this time.
>>
>>
>> ------------------------------
>> sacuba@Outlook.com
>>
>>
>> *From:* Josh Mahonin <jm...@gmail.com>
>> *Date:* 2016-01-05 23:41
>> *To:* user <us...@phoenix.apache.org>
>> *Subject:* Re: Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x
>> by spark 1.5
>> Hi,
>>
>> The error "java.sql.SQLException: No suitable driver found..." is
>> typically thrown when the worker nodes can't find Phoenix on the class path.
>>
>> I'm not certain that passing those values using '--conf' actually works
>> or not with Spark. I tend to set them in my 'spark-defaults.conf' in the
>> Spark configuration folder. I think restarting the master and workers may
>> be required as well.
>>
>> Josh
>>
>> On Tue, Jan 5, 2016 at 5:38 AM, mengfei <sa...@outlook.com> wrote:
>>
>>> hi josh:
>>>
>>> thank you for your advice,and it did work . i build the
>>> client-spark jar refreed the patch with thr CDH code and it succeed.
>>> Then i run some code with the "local" mode ,and the
>>> result is correct. But when it comes to the "yarn-client" mode ,some error
>>> happend:
>>>
>>>
>>> java.lang.RuntimeException: java.sql.SQLException: No suitable driver found for jdbc:phoenix:
>>> cdhzk1.boloomo.com,cdhzk2.boloomo.com,cdhzk3.boloomo.com:2181;
>>>
>>> I did try every way i know or i find in the commities: but
>>> they don`t help. So i want to get help from you ,Thank you for your
>>> patience.
>>>
>>> My code is :
>>>
>>> import org.apache.spark.sql.SQLContext
>>>
>>> import org.apache.phoenix.spark._
>>>
>>> import org.apache.spark.SparkContext
>>>
>>> import org.apache.spark.sql.SQLContext
>>>
>>> import org.apache.phoenix.jdbc.PhoenixDriver
>>>
>>> import java.sql.DriverManager
>>>
>>> DriverManager.registerDriver(new PhoenixDriver)
>>>
>>> val pred = s"MMSI = '002190048'"
>>>
>>> val rdd = sc.phoenixTableAsRDD(
>>>
>>> "AIS_WS",
>>>
>>> Seq("MMSI","LON","LAT","RID"),
>>>
>>> predicate = Some(pred),
>>>
>>> zkUrl = Some("cdhzk1.boloomo.com,cdhzk2.boloomo.com,
>>> cdhzk3.boloomo.com"))
>>>
>>> println(rdd.count())
>>>
>>>
>>> my scripts are:
>>>
>>> spark-submit \
>>>
>>> --master yarn-cluster \
>>>
>>>
>>> --driver-class-path "/data/public/mengfei/lib/phoenix-1.2.0-client-spark.jar" \
>>>
>>>
>>> --conf "spark.executor.extraClassPath=/data/public/mengfei/lib/phoenix-1.2.0-client-spark.jar" \
>>>
>>>
>>> --conf "spark.driver.extraClassPath=/data/public/mengfei/lib/phoenix-1.2.0-client-spark.jar" \
>>>
>>> --jars /data/public/mengfei/lib/phoenix-1.2.0-client-spark.jar \
>>>
>>>
>>> spark-shell \
>>> --master yarn-client -v \
>>> --driver-class-path "/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar"
>>> \
>>>
>>> --conf "spark.executor.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar" \
>>>
>>> --conf "spark.driver.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar" \
>>> --jars /opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar
>>>
>>>
>>>
>>> ps: i did copied the jar to every node,and give the even the 777 rghts.
>>>
>>> ------------------------------
>>> sacuba@Outlook.com
>>>
>>>
>>> *From:* Josh Mahonin <jm...@gmail.com>
>>> *Date:* 2015-12-30 00:56
>>> *To:* user <us...@phoenix.apache.org>
>>> *Subject:* Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by
>>> spark 1.5
>>> Hi,
>>>
>>> This issue is fixed with the following patch, and using the resulting
>>> 'client-spark' JAR after compilation:
>>> https://issues.apache.org/jira/browse/PHOENIX-2503
>>>
>>> As an alternative, you may have some luck also including updated
>>> com.fasterxml.jackson jackson-databind JARs in your app that are in sync
>>> with Spark's versions. Unfortunately the client JAR right now is shipping
>>> fasterxml jars that conflict with the Spark runtime.
>>>
>>> Another user has also had success by bundling their own Phoenix
>>> dependencies, if you want to try that out instead:
>>>
>>> http://mail-archives.apache.org/mod_mbox/incubator-phoenix-user/201512.mbox/%3C0F96D592-74D7-431A-B301-015374A6B4BC@sandia.gov%3E
>>>
>>> Josh
>>>
>>>
>>>
>>> On Tue, Dec 29, 2015 at 9:11 AM, sacuba@Outlook.com <sa...@outlook.com>
>>> wrote:
>>>
>>>> The error is
>>>>
>>>> java.lang.NoSuchMethodError:
>>>> com.fasterxml.jackson.databind.Module$SetupContext.setClassIntrospector(Lcom/fasterxml/jackson/databind/introspect/ClassIntrospector;)V
>>>>
>>>> at
>>>> com.fasterxml.jackson.module.scala.introspect.ScalaClassIntrospectorModule$$anonfun$1.apply(ScalaClassIntrospector.scala:32)
>>>>
>>>> at
>>>> com.fasterxml.jackson.module.scala.introspect.ScalaClassIntrospectorModule$$anonfun$1.apply(ScalaClassIntrospector.scala:32)
>>>>
>>>> at
>>>> com.fasterxml.jackson.module.scala.JacksonModule$$anonfun$setupModule$1.apply(JacksonModule.scala:47)
>>>>
>>>> …..
>>>>
>>>> The scala code is
>>>>
>>>> val df = sqlContext.load(
>>>>
>>>> "org.apache.phoenix.spark",
>>>>
>>>> Map("table" -> "AIS ", "zkUrl" -> "cdhzk1.ccco.com:2181")
>>>>
>>>> )
>>>>
>>>>
>>>>
>>>> Maybe I got the resoon ,the Phoenix 4.5.2 on CDH 5.5.x is build with
>>>> spark 1.4 ,and cdh5.5`defalut spark version is 1.5.
>>>>
>>>> So how could I do? To rebuild a phoenix 4.5.2 version with spark 1.5
>>>> Or change the cdh spark to 1.4. Apreantly these are difficult for me .
>>>> Could someone help me ? Thank you vey much.
>>>>
>>>>
>>>>
>>>
>>>
>>
>
Re: Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by spark 1.5
Posted by "sacuba@Outlook.com" <sa...@Outlook.com>.
hi josh
Yes ,it is still the same 'No suitable driver' exception.
And my boss may solve the problem. the method is amazing but it did succeed.
he add the "export SPARK_DIST_CLASSPATH=$SPARK_DIST_CLASSPATH:/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client.jar" in
spark-env.sh ,and restart the spark . And later ,everythig seems good. What he add is the "phoenix-1.2.0-client.jar" which cause the exception "java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.Module$SetupContext.setClassIntrospector" 。 It is incredible. I really appreciate what you have done for me . If you could tell me why it happens, i will be more happy.
Besides ,this afternoon i met anothor problem. I have made a table which include 13 columns, such as (A.B.C.....) and the primary key is (A,B), When i import all the data to this table ,i delete the original data. But now i want to get another table ,which should inculde the same 13 columns ,but the diffrent primary key which should be (B,C). The data is big ,about 3T . Could you tell me how to do it fastly. I have tried to do this by spark, but it seems not fast.
Best wishes for you!
sacuba@Outlook.com
From: Josh Mahonin
Date: 2016-01-06 23:02
To: user
Subject: Re: Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by spark 1.5
Hi,
Is it still the same 'No suitable driver' exception, or is it something else?
Have you tried using the 'yarn-cluster' mode? I've had success with that personally, although I don't have any experience on the CDH stack.
Josh
On Wed, Jan 6, 2016 at 2:59 AM, sacuba@Outlook.com <sa...@outlook.com> wrote:
hi josh:
I did what you say, and now i can run my codes in spark-shell --master local without any other confs , but when it comes to 'yarn-client' ,the error is as the same .
I try to give you more informations, we have 11 nodes,3 zks,2 masters. I am sure all the 11nodes have the client-spark.jar in the path referred in the spark-defaults.conf.
We use the CDH5.5 with spark1.5. our phoenix version is "http://archive.cloudera.com/cloudera-labs/phoenix/parcels/1.2/" which based phoenix 4.5.2.
I built the spark-client jar in the following steps
1.download the base code in "https://github.com/cloudera-labs/phoenix"
2. path the PHOENIX-2503.patch manualy
3 build
it is amazing that it did work in local ,but not in yarn-client mode . Waiting for your reply eagerly.Thank you very much.
my spark-defaults.conf is
spark.authenticate=false
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.executorIdleTimeout=60
spark.dynamicAllocation.minExecutors=0
spark.dynamicAllocation.schedulerBacklogTimeout=1
spark.eventLog.dir=hdfs://cdhcluster1/user/spark/applicationHistory
spark.eventLog.enabled=true
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.shuffle.service.enabled=true
spark.shuffle.service.port=7337
spark.executor.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar
spark.driver.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar
spark.yarn.historyServer.address=http://cdhmaster1.boloomo.com:18088
spark.yarn.jar=local:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/spark/lib/spark-assembly.jar
spark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hadoop/lib/native
spark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hadoop/lib/native
spark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hadoop/lib/native
spark.yarn.config.gatewayPath=/opt/cloudera/parcels
spark.yarn.config.replacementPath={{HADOOP_COMMON_HOME}}/../../..
spark.master=yarn-client
And here is my code again
import org.apache.spark.sql.SQLContext
import org.apache.phoenix.spark._
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.phoenix.jdbc.PhoenixDriver
import java.sql.DriverManager
DriverManager.registerDriver(new PhoenixDriver)
Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
val pred = s"RID like 'wtwb2%' and TIME between 1325440922 and 1336440922"
val rdd = sc.phoenixTableAsRDD(
"AIS_AREA",
Seq("MMSI","LON","LAT","RID"),
predicate = Some(pred),
zkUrl = Some("cdhzk3.boloomo.com:2181"))
and i use only spark-shell to run the code this time.
sacuba@Outlook.com
From: Josh Mahonin
Date: 2016-01-05 23:41
To: user
Subject: Re: Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by spark 1.5
Hi,
The error "java.sql.SQLException: No suitable driver found..." is typically thrown when the worker nodes can't find Phoenix on the class path.
I'm not certain that passing those values using '--conf' actually works or not with Spark. I tend to set them in my 'spark-defaults.conf' in the Spark configuration folder. I think restarting the master and workers may be required as well.
Josh
On Tue, Jan 5, 2016 at 5:38 AM, mengfei <sa...@outlook.com> wrote:
Hi Josh:
Thank you for your advice, and it did work. I built the client-spark JAR by applying the patch to the CDH code, and the build succeeded.
Then I ran some code in "local" mode and the result was correct. But in "yarn-client" mode this error occurred:
java.lang.RuntimeException: java.sql.SQLException: No suitable driver found for jdbc:phoenix:cdhzk1.boloomo.com,cdhzk2.boloomo.com,cdhzk3.boloomo.com:2181;
I have tried every approach I know of or could find in the community, but none of them helped, so I am turning to you. Thank you for your patience.
My code is:
import org.apache.spark.sql.SQLContext
import org.apache.phoenix.spark._
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.phoenix.jdbc.PhoenixDriver
import java.sql.DriverManager
DriverManager.registerDriver(new PhoenixDriver)
val pred = s"MMSI = '002190048'"
val rdd = sc.phoenixTableAsRDD(
"AIS_WS",
Seq("MMSI","LON","LAT","RID"),
predicate = Some(pred),
zkUrl = Some("cdhzk1.boloomo.com,cdhzk2.boloomo.com,cdhzk3.boloomo.com"))
println(rdd.count())
my scripts are:
spark-submit \
--master yarn-cluster \
--driver-class-path "/data/public/mengfei/lib/phoenix-1.2.0-client-spark.jar" \
--conf "spark.executor.extraClassPath=/data/public/mengfei/lib/phoenix-1.2.0-client-spark.jar" \
--conf "spark.driver.extraClassPath=/data/public/mengfei/lib/phoenix-1.2.0-client-spark.jar" \
--jars /data/public/mengfei/lib/phoenix-1.2.0-client-spark.jar \
spark-shell \
--master yarn-client -v \
--driver-class-path "/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar" \
--conf "spark.executor.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar" \
--conf "spark.driver.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar" \
--jars /opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar
PS: I did copy the JAR to every node, and even gave it 777 permissions.
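A quick way to check that claim from inside the shell (a sketch added for illustration, not taken from the original thread) is to ask each executor whether it can actually load the driver class:

```scala
// Diagnostic sketch: returns true from each partition whose executor JVM
// can see org.apache.phoenix.jdbc.PhoenixDriver on its classpath.
val found = sc.parallelize(1 to sc.defaultParallelism).map { _ =>
  try { Class.forName("org.apache.phoenix.jdbc.PhoenixDriver"); true }
  catch { case _: ClassNotFoundException => false }
}.collect()
println(found.mkString(","))
```

If any entry prints false, the JAR is not reaching that executor's classpath, whatever the driver-side configuration says.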
sacuba@Outlook.com
From: Josh Mahonin
Date: 2015-12-30 00:56
To: user
Subject: Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by spark 1.5
Hi,
This issue is fixed with the following patch, and using the resulting 'client-spark' JAR after compilation:
https://issues.apache.org/jira/browse/PHOENIX-2503
As an alternative, you may have some luck also including updated com.fasterxml.jackson jackson-databind JARs in your app that are in sync with Spark's versions. Unfortunately the client JAR right now is shipping fasterxml jars that conflict with the Spark runtime.
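One way to do that version sync in an sbt build (a sketch; the 2.4.x version number is an assumption based on what Spark 1.5 is believed to ship, so verify it against your distribution before pinning):

```scala
// build.sbt fragment: force jackson to match the Spark runtime's version
// (2.4.4 is assumed here -- check your Spark distribution's jars first)
dependencyOverrides ++= Set(
  "com.fasterxml.jackson.core"   %  "jackson-databind"     % "2.4.4",
  "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.4.4"
)
```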
Another user has also had success by bundling their own Phoenix dependencies, if you want to try that out instead:
http://mail-archives.apache.org/mod_mbox/incubator-phoenix-user/201512.mbox/%3C0F96D592-74D7-431A-B301-015374A6B4BC@sandia.gov%3E
Josh
On Tue, Dec 29, 2015 at 9:11 AM, sacuba@Outlook.com <sa...@outlook.com> wrote:
The error is
java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.Module$SetupContext.setClassIntrospector(Lcom/fasterxml/jackson/databind/introspect/ClassIntrospector;)V
at com.fasterxml.jackson.module.scala.introspect.ScalaClassIntrospectorModule$$anonfun$1.apply(ScalaClassIntrospector.scala:32)
at com.fasterxml.jackson.module.scala.introspect.ScalaClassIntrospectorModule$$anonfun$1.apply(ScalaClassIntrospector.scala:32)
at com.fasterxml.jackson.module.scala.JacksonModule$$anonfun$setupModule$1.apply(JacksonModule.scala:47)
…..
The scala code is
val df = sqlContext.load(
"org.apache.phoenix.spark",
Map("table" -> "AIS ", "zkUrl" -> "cdhzk1.ccco.com:2181")
)
Maybe I found the reason: Phoenix 4.5.2 on CDH 5.5.x is built against Spark 1.4, while CDH 5.5's default Spark version is 1.5.
So what should I do? Rebuild Phoenix 4.5.2 against Spark 1.5, or change the CDH Spark to 1.4? Apparently both are difficult for me. Could someone help me? Thank you very much.
Re: Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by spark 1.5
Posted by Josh Mahonin <jm...@gmail.com>.
Hi,
Is it still the same 'No suitable driver' exception, or is it something
else?
Have you tried using the 'yarn-cluster' mode? I've had success with that
personally, although I don't have any experience on the CDH stack.
Josh
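If the JAR really is present on every node, another workaround sometimes reported for this class of "No suitable driver" error (a sketch added for illustration, not a fix from this thread) is to force the driver class to load inside a task, so registration happens in each executor JVM rather than only on the driver:

```scala
// Sketch: Class.forName runs inside the executor JVMs, registering
// PhoenixDriver with each executor's own java.sql.DriverManager before
// phoenixTableAsRDD issues any JDBC connections there.
sc.parallelize(1 to 100, sc.defaultParallelism).foreachPartition { _ =>
  Class.forName("org.apache.phoenix.jdbc.PhoenixDriver")
}
```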
On Wed, Jan 6, 2016 at 2:59 AM, sacuba@Outlook.com <sa...@outlook.com>
wrote:
> hi josh:
>
> I did what you say, and now i can run my codes in
> spark-shell --master local without any other confs , but when it comes
> to 'yarn-client' ,the error is as the same .
>
> I try to give you more informations, we have 11 nodes,3
> zks,2 masters. I am sure all the 11nodes have the client-spark.jar in
> the path referred in the spark-defaults.conf.
> We use the CDH5.5 with spark1.5. our phoenix version is "
> http://archive.cloudera.com/cloudera-labs/phoenix/parcels/1.2/"
> <http://archive.cloudera.com/cloudera-labs/phoenix/parcels/1.2/> which
> based phoenix 4.5.2.
> I built the spark-client jar in the following steps
> 1.download the base code in "
> https://github.com/cloudera-labs/phoenix"
> <https://github.com/cloudera-labs/phoenix>
> 2. path the PHOENIX-2503.patch manualy
> 3 build
>
> it is amazing that it did work in local ,but not in
> yarn-client mode . Waiting for your reply eagerly.Thank you very much.
>
>
> my spark-defaults.conf is
>
> spark.authenticate=false
> spark.dynamicAllocation.enabled=true
> spark.dynamicAllocation.executorIdleTimeout=60
> spark.dynamicAllocation.minExecutors=0
> spark.dynamicAllocation.schedulerBacklogTimeout=1
> spark.eventLog.dir=hdfs://cdhcluster1/user/spark/applicationHistory
> spark.eventLog.enabled=true
> spark.serializer=org.apache.spark.serializer.KryoSerializer
> spark.shuffle.service.enabled=true
> spark.shuffle.service.port=7337
>
> spark.executor.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar
>
> spark.driver.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar
> spark.yarn.historyServer.address=http://cdhmaster1.boloomo.com:18088
>
> spark.yarn.jar=local:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/spark/lib/spark-assembly.jar
>
> spark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hadoop/lib/native
>
> spark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hadoop/lib/native
>
> spark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hadoop/lib/native
> spark.yarn.config.gatewayPath=/opt/cloudera/parcels
> spark.yarn.config.replacementPath={{HADOOP_COMMON_HOME}}/../../..
> spark.master=yarn-client
>
> And here is my code again
> import org.apache.spark.sql.SQLContext
> import org.apache.phoenix.spark._
> import org.apache.spark.SparkContext
> import org.apache.spark.sql.SQLContext
> import org.apache.phoenix.jdbc.PhoenixDriver
> import java.sql.DriverManager
> DriverManager.registerDriver(new PhoenixDriver)
> Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
> val pred = s"RID like 'wtwb2%' and TIME between 1325440922 and 1336440922"
> val rdd = sc.phoenixTableAsRDD(
> "AIS_AREA",
> Seq("MMSI","LON","LAT","RID"),
> predicate = Some(pred),
> zkUrl = Some("cdhzk3.boloomo.com:2181"))
>
> and i use only spark-shell to run the code this time.
>
>
> ------------------------------
> sacuba@Outlook.com
>
>
> *From:* Josh Mahonin <jm...@gmail.com>
> *Date:* 2016-01-05 23:41
> *To:* user <us...@phoenix.apache.org>
> *Subject:* Re: Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by
> spark 1.5
> Hi,
>
> The error "java.sql.SQLException: No suitable driver found..." is
> typically thrown when the worker nodes can't find Phoenix on the class path.
>
> I'm not certain that passing those values using '--conf' actually works or
> not with Spark. I tend to set them in my 'spark-defaults.conf' in the Spark
> configuration folder. I think restarting the master and workers may be
> required as well.
>
> Josh
>
> On Tue, Jan 5, 2016 at 5:38 AM, mengfei <sa...@outlook.com> wrote:
>
>> hi josh:
>>
>> thank you for your advice,and it did work . i build the
>> client-spark jar refreed the patch with thr CDH code and it succeed.
>> Then i run some code with the "local" mode ,and the
>> result is correct. But when it comes to the "yarn-client" mode ,some error
>> happend:
>>
>>
>> java.lang.RuntimeException: java.sql.SQLException: No suitable driver found for jdbc:phoenix:
>> cdhzk1.boloomo.com,cdhzk2.boloomo.com,cdhzk3.boloomo.com:2181;
>>
>> I did try every way i know or i find in the commities: but
>> they don`t help. So i want to get help from you ,Thank you for your
>> patience.
>>
>> My code is :
>>
>> import org.apache.spark.sql.SQLContext
>>
>> import org.apache.phoenix.spark._
>>
>> import org.apache.spark.SparkContext
>>
>> import org.apache.spark.sql.SQLContext
>>
>> import org.apache.phoenix.jdbc.PhoenixDriver
>>
>> import java.sql.DriverManager
>>
>> DriverManager.registerDriver(new PhoenixDriver)
>>
>> val pred = s"MMSI = '002190048'"
>>
>> val rdd = sc.phoenixTableAsRDD(
>>
>> "AIS_WS",
>>
>> Seq("MMSI","LON","LAT","RID"),
>>
>> predicate = Some(pred),
>>
>> zkUrl = Some("cdhzk1.boloomo.com,cdhzk2.boloomo.com,
>> cdhzk3.boloomo.com"))
>>
>> println(rdd.count())
>>
>>
>> my scripts are:
>>
>> spark-submit \
>>
>> --master yarn-cluster \
>>
>>
>> --driver-class-path "/data/public/mengfei/lib/phoenix-1.2.0-client-spark.jar" \
>>
>>
>> --conf "spark.executor.extraClassPath=/data/public/mengfei/lib/phoenix-1.2.0-client-spark.jar" \
>>
>>
>> --conf "spark.driver.extraClassPath=/data/public/mengfei/lib/phoenix-1.2.0-client-spark.jar" \
>>
>> --jars /data/public/mengfei/lib/phoenix-1.2.0-client-spark.jar \
>>
>>
>> spark-shell \
>> --master yarn-client -v \
>> --driver-class-path "/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar"
>> \
>>
>> --conf "spark.executor.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar" \
>>
>> --conf "spark.driver.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar" \
>> --jars /opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar
>>
>>
>>
>> ps: i did copied the jar to every node,and give the even the 777 rghts.
>>
>> ------------------------------
>> sacuba@Outlook.com
>>
>>
>> *From:* Josh Mahonin <jm...@gmail.com>
>> *Date:* 2015-12-30 00:56
>> *To:* user <us...@phoenix.apache.org>
>> *Subject:* Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by
>> spark 1.5
>> Hi,
>>
>> This issue is fixed with the following patch, and using the resulting
>> 'client-spark' JAR after compilation:
>> https://issues.apache.org/jira/browse/PHOENIX-2503
>>
>> As an alternative, you may have some luck also including updated
>> com.fasterxml.jackson jackson-databind JARs in your app that are in sync
>> with Spark's versions. Unfortunately the client JAR right now is shipping
>> fasterxml jars that conflict with the Spark runtime.
>>
>> Another user has also had success by bundling their own Phoenix
>> dependencies, if you want to try that out instead:
>>
>> http://mail-archives.apache.org/mod_mbox/incubator-phoenix-user/201512.mbox/%3C0F96D592-74D7-431A-B301-015374A6B4BC@sandia.gov%3E
>>
>> Josh
>>
>>
>>
>> On Tue, Dec 29, 2015 at 9:11 AM, sacuba@Outlook.com <sa...@outlook.com>
>> wrote:
>>
>>> The error is
>>>
>>> java.lang.NoSuchMethodError:
>>> com.fasterxml.jackson.databind.Module$SetupContext.setClassIntrospector(Lcom/fasterxml/jackson/databind/introspect/ClassIntrospector;)V
>>>
>>> at
>>> com.fasterxml.jackson.module.scala.introspect.ScalaClassIntrospectorModule$$anonfun$1.apply(ScalaClassIntrospector.scala:32)
>>>
>>> at
>>> com.fasterxml.jackson.module.scala.introspect.ScalaClassIntrospectorModule$$anonfun$1.apply(ScalaClassIntrospector.scala:32)
>>>
>>> at
>>> com.fasterxml.jackson.module.scala.JacksonModule$$anonfun$setupModule$1.apply(JacksonModule.scala:47)
>>>
>>> …..
>>>
>>> The scala code is
>>>
>>> val df = sqlContext.load(
>>>
>>> "org.apache.phoenix.spark",
>>>
>>> Map("table" -> "AIS ", "zkUrl" -> "cdhzk1.ccco.com:2181")
>>>
>>> )
>>>
>>>
>>>
>>> Maybe I got the resoon ,the Phoenix 4.5.2 on CDH 5.5.x is build with
>>> spark 1.4 ,and cdh5.5`defalut spark version is 1.5.
>>>
>>> So how could I do? To rebuild a phoenix 4.5.2 version with spark 1.5
>>> Or change the cdh spark to 1.4. Apreantly these are difficult for me .
>>> Could someone help me ? Thank you vey much.
>>>
>>>
>>>
>>
>>
>
Re: Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by spark 1.5
Posted by "sacuba@Outlook.com" <sa...@Outlook.com>.
Hi Josh:
I did what you said, and now I can run my code in spark-shell --master local without any other configuration, but with 'yarn-client' the error is the same.
To give you more information: we have 11 nodes, 3 ZooKeepers, and 2 masters. I am sure all 11 nodes have the client-spark JAR at the path referenced in spark-defaults.conf.
We use CDH 5.5 with Spark 1.5. Our Phoenix version is "http://archive.cloudera.com/cloudera-labs/phoenix/parcels/1.2/", which is based on Phoenix 4.5.2.
I built the spark-client JAR in the following steps:
1. download the base code from "https://github.com/cloudera-labs/phoenix"
2. apply the PHOENIX-2503.patch manually
3. build
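For reference, those three steps might look like the following (a sketch; the patch filename, its location, and the maven flags are assumptions, not taken from the thread):

```shell
git clone https://github.com/cloudera-labs/phoenix
cd phoenix
# apply PHOENIX-2503 by hand (download the .patch file from the JIRA first)
patch -p1 < PHOENIX-2503.patch
# build, skipping tests; the client-spark JAR should land under the assembly module
mvn clean package -DskipTests
```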
It is strange that it works in local mode but not in yarn-client mode. I am waiting eagerly for your reply. Thank you very much.
my spark-defaults.conf is
spark.authenticate=false
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.executorIdleTimeout=60
spark.dynamicAllocation.minExecutors=0
spark.dynamicAllocation.schedulerBacklogTimeout=1
spark.eventLog.dir=hdfs://cdhcluster1/user/spark/applicationHistory
spark.eventLog.enabled=true
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.shuffle.service.enabled=true
spark.shuffle.service.port=7337
spark.executor.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar
spark.driver.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-1.2.0-client-spark.jar
spark.yarn.historyServer.address=http://cdhmaster1.boloomo.com:18088
spark.yarn.jar=local:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/spark/lib/spark-assembly.jar
spark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hadoop/lib/native
spark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hadoop/lib/native
spark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hadoop/lib/native
spark.yarn.config.gatewayPath=/opt/cloudera/parcels
spark.yarn.config.replacementPath={{HADOOP_COMMON_HOME}}/../../..
spark.master=yarn-client
And here is my code again
import org.apache.spark.sql.SQLContext
import org.apache.phoenix.spark._
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.phoenix.jdbc.PhoenixDriver
import java.sql.DriverManager
DriverManager.registerDriver(new PhoenixDriver)
Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
val pred = s"RID like 'wtwb2%' and TIME between 1325440922 and 1336440922"
val rdd = sc.phoenixTableAsRDD(
"AIS_AREA",
Seq("MMSI","LON","LAT","RID"),
predicate = Some(pred),
zkUrl = Some("cdhzk3.boloomo.com:2181"))
And I used only spark-shell to run the code this time.
Re: Re: error when get data from Phoenix 4.5.2 on CDH 5.5.x by spark 1.5
Posted by Josh Mahonin <jm...@gmail.com>.
Hi,
The error "java.sql.SQLException: No suitable driver found..." is typically
thrown when the worker nodes can't find Phoenix on the class path.
I'm not certain that passing those values using '--conf' actually works or
not with Spark. I tend to set them in my 'spark-defaults.conf' in the Spark
configuration folder. I think restarting the master and workers may be
required as well.
Josh