Posted to user@spark.apache.org by Chanh Le <gi...@gmail.com> on 2017/06/12 11:14:29 UTC

SPARK environment settings issue when deploying a custom distribution

Hi everyone,

Recently I discovered an issue when processing CSV files in Spark. To fix it,
following https://issues.apache.org/jira/browse/SPARK-21024, I built a custom
distribution for internal use. I built it on my local machine and then
uploaded the distribution to the server.

The server's *~/.bashrc*:

# added by Anaconda2 4.3.1 installer
export PATH="/opt/etl/anaconda/anaconda2/bin:$PATH"
export SPARK_HOME="/opt/etl/spark-2.1.0-bin-hadoop2.7"
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.4-src.zip:$PYTHONPATH

What I did on the server was:
export SPARK_HOME=/home/etladmin/spark-2.2.1-SNAPSHOT-bin-custom

$SPARK_HOME/bin/spark-submit --version
It prints version *2.1.1*, which *is not* the version I built (2.2.1).
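One pattern that would explain this (a guess, not confirmed from the thread): bin/spark-submit runs bin/load-spark-env.sh, which sources conf/spark-env.sh, so a stale SPARK_HOME exported there would silently win over the one exported in the shell. A minimal sketch of that precedence, simulated with a stand-in conf/spark-env.sh:

```shell
# Simulate the override: export the custom SPARK_HOME in the shell,
# then source a spark-env.sh that still carries the old path.
export SPARK_HOME=/home/etladmin/spark-2.2.1-SNAPSHOT-bin-custom
mkdir -p conf
echo 'export SPARK_HOME=/opt/etl/spark-2.1.0-bin-hadoop2.7' > conf/spark-env.sh
. ./conf/spark-env.sh   # what load-spark-env.sh effectively does on launch
echo "$SPARK_HOME"      # the stale path wins, not the custom one
```

So it is worth checking whether $SPARK_HOME/conf/spark-env.sh (or the conf dir of the old 2.1.0 install) exports SPARK_HOME itself.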


I did set *SPARK_HOME* on my local machine (macOS) for this distribution
and it works well, printing version *2.2.1*.

I need a way to find the hidden environment variable that is overriding it.
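One way to hunt for it (a sketch; the file list is an assumption, extend it to match your shell setup) is to grep every startup file that could export SPARK_HOME. Demonstrated here on a stand-in file; on the server, point the loop at ~/.bashrc, ~/.bash_profile, /etc/profile, and $SPARK_HOME/conf/spark-env.sh:

```shell
# Stand-in for the server's ~/.bashrc; replace with the real file list.
echo 'export SPARK_HOME="/opt/etl/spark-2.1.0-bin-hadoop2.7"' > ./stand_in_bashrc
for f in ./stand_in_bashrc; do
  # -H prints the filename, -n the line number of each match
  [ -r "$f" ] && grep -Hn 'SPARK_HOME' "$f"
done
```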

Do you have any suggestions?
Thanks in advance.

Regards,
Chanh


Re: SPARK environment settings issue when deploying a custom distribution

Posted by Chanh Le <gi...@gmail.com>.
Just to add more information about how I built the custom distribution:
I cloned the Spark repo, switched to branch-2.2, then made the distribution
as follows.

λ ~/workspace/big_data/spark/ branch-2.2*
λ ~/workspace/big_data/spark/ ./dev/make-distribution.sh --name custom \
    --tgz -Phadoop-2.7 -Dhadoop.version=2.7.0 -Phive -Phive-thriftserver \
    -Pmesos -Pyarn
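Side note: dev/make-distribution.sh writes a RELEASE file at the top of the distribution recording the exact version it built, so what actually landed on the server can be confirmed without trusting any environment variable (on the server, just run `head -1 $SPARK_HOME/RELEASE`). A sketch, simulated with a stand-in directory:

```shell
# Stand-in for the unpacked distribution directory used in the thread.
DIST=./spark-2.2.1-SNAPSHOT-bin-custom
mkdir -p "$DIST"
# make-distribution.sh writes a version line like this into RELEASE:
echo 'Spark 2.2.1-SNAPSHOT built for Hadoop 2.7.0' > "$DIST/RELEASE"
head -1 "$DIST/RELEASE"
```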



-- 
Regards,
Chanh