Posted to users@zeppelin.apache.org by "michal.petrov@mtrust.eu" <mi...@mtrust.eu> on 2016/05/19 20:41:34 UTC

pyspark cassandra - problem with python imports

Hi,
I'm trying to use a pyspark notebook with the Cassandra connector
(https://github.com/TargetHolding/pyspark-cassandra).

To load the Cassandra connector you need to add the package via --packages
(--packages TargetHolding:pyspark-cassandra:0.3.4). Then you just import
pyspark_cassandra, which adds methods like cassandraTable to your
SparkContext. Everything works when I run it via spark-submit.

Now I want to try it with Zeppelin, so I added
export SPARK_SUBMIT_OPTIONS="--packages TargetHolding:pyspark-cassandra:0.3.4"
to conf/zeppelin-env.sh.

Now when I start a notebook with %pyspark, I can see from the logs that
Zeppelin connects to my Spark master and the Cassandra connector is loaded.
The problem occurs when I try to import pyspark_cassandra: I get the error
ImportError: No module named pyspark_cassandra.
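
A minimal sketch for narrowing this down from a %pyspark paragraph. It assumes
the import only succeeds if the connector's Python sources (which, as far as I
understand, ship inside the package jar) end up on the interpreter's sys.path.
The helper name diagnose_import is hypothetical and not part of Zeppelin or
pyspark-cassandra.

```python
import importlib
import sys


def diagnose_import(module_name):
    """Try to import module_name and list archive-like sys.path entries.

    Returns (importable, archives). archives holds the .jar/.zip/.egg
    entries on sys.path, which is where a package pulled in via
    --packages would be expected to show up (an assumption).
    """
    try:
        importlib.import_module(module_name)
        importable = True
    except ImportError:
        importable = False

    # Archives on sys.path are importable through Python's zipimport.
    archives = [
        p for p in sys.path
        if str(p).lower().endswith((".jar", ".zip", ".egg"))
    ]
    return importable, archives


# In a notebook paragraph this would show whether the connector jar is visible.
print(diagnose_import("pyspark_cassandra"))
```

If the jar is missing from the returned list under Zeppelin but present under
spark-submit, that would point at the Python path rather than at the JVM
classpath.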

I also tried some other options, like using %dep or setting dependencies in
/conf/spark-defaults.conf, but without any luck. I'm using Zeppelin 0.5.6.


