Posted to dev@zeppelin.apache.org by "Felix Cheung (JIRA)" <ji...@apache.org> on 2015/10/31 05:30:27 UTC

[jira] [Created] (ZEPPELIN-378) Clarify uses of spark.home property vs SPARK_HOME env var

Felix Cheung created ZEPPELIN-378:
-------------------------------------

             Summary: Clarify uses of spark.home property vs SPARK_HOME env var
                 Key: ZEPPELIN-378
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-378
             Project: Zeppelin
          Issue Type: Bug
          Components: Interpreters
    Affects Versions: 0.6.0
            Reporter: Felix Cheung
            Priority: Minor


The interpreter property 'spark.home' is a bit confusing alongside SPARK_HOME.
At the moment, defining SPARK_HOME in conf/zeppelin-env.sh is recommended instead of setting spark.home.
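
For example, a minimal conf/zeppelin-env.sh sketch (the Spark path below is just a placeholder):

    # conf/zeppelin-env.sh
    # Point Zeppelin at an existing Spark installation and leave the
    # interpreter's spark.home property unset, so the two do not conflict.
    export SPARK_HOME=/path/to/spark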

Best,
moon

On Fri, Oct 30, 2015 at 2:44 AM Jeff Steinmetz <je...@gmail.com> wrote:
That’s a good pointer.
The question still stands: how do you load libraries (jars) for %pyspark?

It's clear how to do it for %spark (Scala) via %dep.

Looking for the equivalent of:

./bin/pyspark --master local[2] --jars jars/elasticsearch-hadoop-2.1.0.Beta2.jar
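
One way to get the same effect for %pyspark is to pass the jar to spark-submit through conf/zeppelin-env.sh. A sketch, assuming a Zeppelin build that honors SPARK_SUBMIT_OPTIONS together with SPARK_HOME (the jar path is a placeholder; Zeppelin needs a restart after the change):

    # conf/zeppelin-env.sh
    # Extra options appended to spark-submit when the Spark interpreter starts,
    # so they apply to %spark and %pyspark alike (assumes this Zeppelin build
    # reads SPARK_SUBMIT_OPTIONS).
    export SPARK_SUBMIT_OPTIONS="--jars /path/to/elasticsearch-hadoop-2.1.0.Beta2.jar"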


From: Matt Sochor
Reply-To: <us...@zeppelin.incubator.apache.org>
Date: Thursday, October 29, 2015 at 3:19 PM
To: <us...@zeppelin.incubator.apache.org>
Subject: Re: pyspark with jar

I actually *just* figured it out.  Zeppelin has sqlContext "already created and exposed" (https://zeppelin.incubator.apache.org/docs/interpreter/spark.html).

So when I do "sqlContext = SQLContext(sc)" I overwrite sqlContext.  Then Zeppelin cannot see this new sqlContext.

Anyway, for anyone out there experiencing this problem: do NOT initialize sqlContext yourself and it works fine.

On Thu, Oct 29, 2015 at 6:10 PM Jeff Steinmetz <je...@gmail.com> wrote:
In Zeppelin, what is the equivalent of adding jars in a pyspark call?

Such as running pyspark with the elasticsearch-hadoop jar

./bin/pyspark --master local[2] --jars jars/elasticsearch-hadoop-2.1.0.Beta2.jar

My assumption is that loading something like this inside %dep is pointless, since those dependencies would only live in the %spark Scala world (the Spark JVM). In Zeppelin, pyspark spawns a separate process.

Also, how is the interpreter's “spark.home” property used? How is it different from the “SPARK_HOME” in zeppelin-env.sh?
And finally, how are args used in the interpreter? (What uses them?)

Thank you.
Jeff



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)