Posted to user@spark.apache.org by morfious902002 <an...@gmail.com> on 2017/05/31 14:51:50 UTC

Creating Dataframe by querying Impala

Hi,
I am trying to create a DataFrame by querying an Impala table. It works fine in
my local environment, but when I try to run it on the cluster I get either

Error:java.lang.ClassNotFoundException: com.cloudera.impala.jdbc41.Driver

or

No Suitable Driver found. 

Can someone help me or direct me to how I can accomplish this?

I am using Spark 1.6.1. Here is my command (this is the one that produces the
"No Suitable Driver found" error):
'/appserver/spark/spark-1.6.1-bin-hadoop2.6/bin/spark-submit' '--master'
'yarn' '--deploy-mode' 'cluster' '--name' 'Livy' '--jars'
"hdfs:///user/lib/ImpalaJDBC41.jar,hdfs:///user/lib/TCLIServiceClient.jar,hdfs:///user/lib/libfb303-0.9.0.jar,hdfs:///user/lib/libthrift-0.9.0.jar,hdfs:///user/lib/hive_metastore.jar,hdfs:///user/lib/hive_service.jar"
'--class' 'Main.class' '--driver-memory' '5G' '--driver-cores' '2'
'--executor-memory' '8G' '--driver-cores' '2' '--executor-cores' '3'
'--num-executors' '2' 'my.jar' 'arg' 'arg' 'arg'
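For context on the second error: the "No Suitable Driver found" message comes from
`java.sql.DriverManager`, which walks the list of drivers that have registered
themselves and asks each one whether it accepts the URL. A driver class that was
never loaded (or was loaded by a class loader DriverManager cannot see, which is
what happens with jars shipped via --jars) is simply not in that list. A minimal,
self-contained sketch of that failure mode, with a placeholder Impala URL and no
driver jar on the classpath:

```java
import java.sql.DriverManager;
import java.sql.SQLException;

public class NoSuitableDriverDemo {
    public static void main(String[] args) {
        // Placeholder URL; no Impala JDBC driver is on the classpath here.
        String url = "jdbc:impala://impala-host:21050/default";
        try {
            DriverManager.getDriver(url);
            System.out.println("driver found");
        } catch (SQLException e) {
            // No registered driver accepts the URL, so DriverManager
            // reports "No suitable driver".
            System.out.println(e.getMessage());
        }
    }
}
```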



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Creating-Dataframe-by-querying-Impala-tp28723.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: Creating Dataframe by querying Impala

Posted by Anubhav Agarwal <an...@gmail.com>.
The issue seems to be with the primordial class loader. I cannot place the
drivers on all the nodes at the same location, but I have loaded the jars to
HDFS. I have tried SPARK_YARN_DIST_FILES as well as SPARK_CLASSPATH on the
edge node with no luck. Is there another way to load these jars through the
primordial class loader in YARN cluster mode, or do I have to add them inside
the Spark assembly jar?

Thank You for all the help.
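One workaround worth trying that does not require the jar to live at the same
path on every node: Spark's JDBC data source accepts a `driver` option, and when
it is set, Spark loads and registers the class through its own class loader
rather than relying on DriverManager's lookup, which is exactly the step that
fails when the jar arrives via --jars. A sketch against the Spark 1.6 Java API
(the host and table names are placeholders, and this assumes an existing
SQLContext):

```java
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class ImpalaRead {
    // Sketch only: specifying "driver" lets Spark register the class itself,
    // sidestepping the DriverManager / primordial class loader problem.
    public static DataFrame readTable(SQLContext sqlContext) {
        return sqlContext.read()
            .format("jdbc")
            .option("url", "jdbc:impala://impala-host:21050/default")
            .option("driver", "com.cloudera.impala.jdbc41.Driver")
            .option("dbtable", "some_table")
            .load();
    }
}
```

Alternatively, if node-local paths ever become an option, setting
`spark.driver.extraClassPath` and `spark.executor.extraClassPath` to the jar's
location puts the driver on the system classpath directly, but that does require
the jar to exist at the same path on every node.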

