Posted to dev@zeppelin.apache.org by "Ruslan Dautkhanov (JIRA)" <ji...@apache.org> on 2016/11/29 20:04:58 UTC

[jira] [Created] (ZEPPELIN-1728) Assigning HiveContext(sc) to a variable 2nd time gives errors

Ruslan Dautkhanov created ZEPPELIN-1728:
-------------------------------------------

             Summary: Assigning HiveContext(sc) to a variable 2nd time gives errors
                 Key: ZEPPELIN-1728
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-1728
             Project: Zeppelin
          Issue Type: Bug
          Components: Core, pySpark, zeppelin-server
    Affects Versions: 0.6.2
         Environment: Spark 1.6 that comes with CDH 5.8.3. 
Zeppelin 0.6.2, downloaded as zeppelin-0.6.2-bin-all.tgz from apache.org

            Reporter: Ruslan Dautkhanov


Assigning HiveContext(sc) to a variable a second time fails with "You must build Spark with Hive. Export 'SPARK_HIVE=true'".

The only fix is restarting Zeppelin.

See the full stack trace in (2) below.
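
For reference, a minimal sketch of the failing sequence as run in Zeppelin pyspark paragraphs (the paragraph split and the variable name `hc` are illustrative; `sc` is the SparkContext that Zeppelin provides):

```python
%pyspark
from pyspark.sql import HiveContext

# First paragraph run: works as expected
hc = HiveContext(sc)
hc.sql("show databases").show()
```

```python
%pyspark
# Re-running the same assignment in a later paragraph (or re-running this
# paragraph) raises:
#   "You must build Spark with Hive. Export 'SPARK_HIVE=true' ..."
hc = HiveContext(sc)
hc.sql("show databases").show()
```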

I'm using the Spark 1.6 that comes with CDH 5.8.3, so it's definitely compiled with Hive. We use Jupyter notebooks in the same environment without problems.

Using Zeppelin 0.6.2, downloaded as zeppelin-0.6.2-bin-all.tgz from apache.org.

Is Zeppelin compiled with Hive too? I guess so.
Not sure what else is missing.

Tried playing with ZEPPELIN_SPARK_USEHIVECONTEXT, but it makes no difference.


(1)
{noformat}
$ cat zeppelin-env.sh
export JAVA_HOME=/usr/java/java7
export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
export SPARK_SUBMIT_OPTIONS="--principal xxxx --keytab yyy --conf spark.driver.memory=7g --conf spark.executor.cores=2 --conf spark.executor.memory=8g"
export SPARK_APP_NAME="Zeppelin notebook"
export HADOOP_CONF_DIR=/etc/hadoop/conf
export HIVE_CONF_DIR=/etc/hive/conf
export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive
export PYSPARK_PYTHON="/opt/cloudera/parcels/Anaconda/bin/python2"
export PYTHONPATH="/opt/cloudera/parcels/CDH/lib/spark/python:/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.9-src.zip"
export MASTER="yarn-client"
export ZEPPELIN_SPARK_USEHIVECONTEXT=true
{noformat}



(2)
{noformat}
You must build Spark with Hive. Export 'SPARK_HIVE=true' and run build/sbt assembly
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-9143637669637506477.py", line 267, in <module>
    raise Exception(traceback.format_exc())
Exception: Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-9143637669637506477.py", line 265, in <module>
    exec(code)
  File "<stdin>", line 9, in <module>
  File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 580, in sql
{noformat}

(3)
Correct symlinks are also in place in zeppelin_home/conf for:
{noformat}
- hive-site.xml
- hdfs-site.xml
- core-site.xml
- yarn-site.xml
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)