Posted to user@spark.apache.org by Taeyun Kim <ta...@innowireless.com> on 2015/03/02 01:05:46 UTC

RE: Is SPARK_CLASSPATH really deprecated?

spark.executor.extraClassPath is especially useful when the output is
written to HBase, since the data nodes on the cluster already have the
HBase library jars.
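To make the setting concrete, here is a minimal spark-submit sketch of this approach. The class name, application jar, and jar path are placeholders, and the key assumption (which the thread below spells out) is that the jars already exist at that path on every node:

```shell
# Sketch only: com.example.HBaseWriter, my-app.jar, and /opt/hbase/lib
# are hypothetical. extraClassPath does not copy anything; the entries
# must already be present at this location on every node.
spark-submit \
  --class com.example.HBaseWriter \
  --conf "spark.executor.extraClassPath=/opt/hbase/lib/*" \
  --conf "spark.driver.extraClassPath=/opt/hbase/lib/*" \
  my-app.jar
```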

-----Original Message-----
From: Patrick Wendell [mailto:pwendell@gmail.com] 
Sent: Friday, February 27, 2015 5:22 PM
To: Kannan Rajah
Cc: Marcelo Vanzin; user@spark.apache.org
Subject: Re: Is SPARK_CLASSPATH really deprecated?

I think we just need to update the docs; they are a bit unclear right now.
At the time, we worded them fairly sternly because we really wanted people
to use --jars when we deprecated SPARK_CLASSPATH. But there are other types
of deployments where there is a legitimate need to augment the classpath of
every executor.

I think it should probably say something more like

"Extra classpath entries to append to the classpath of executors. This is
sometimes used in deployment environments where dependencies of Spark are
present in a specific place on all nodes".

Kannan - if you want to submit a patch I can help review it.

On Thu, Feb 26, 2015 at 8:24 PM, Kannan Rajah <kr...@maprtech.com> wrote:
> Thanks Marcelo. Do you think it would be useful to have
> spark.executor.extraClassPath pick up an environment variable
> that can be set from spark-env.sh? Here is an example.
>
> spark-env.sh
> ------------------
> executor_extra_cp=$(get_hbase_jars_for_cp)
> export executor_extra_cp
>
> spark-defaults.conf
> ---------------------
> spark.executor.extraClassPath = ${executor_extra_cp}
>
> This will let us add logic inside the get_hbase_jars_for_cp function to
> pick the right version of the HBase jars. There could be multiple
> versions installed on the node.
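A minimal sketch of what such a get_hbase_jars_for_cp helper might look like in spark-env.sh. The /opt install layout, version-suffixed directory names, and glob-style classpath entry are all assumptions; the real logic would depend on how HBase is laid out on the nodes:

```shell
# Hypothetical spark-env.sh helper: pick the highest-versioned HBase
# install present on this node and emit a classpath glob for its jars.
# The /opt/hbase-<version> layout is an assumed convention.
get_hbase_jars_for_cp() {
  local base=${1:-/opt}
  local hbase_home
  # Version-sort the candidate install dirs and take the newest one.
  hbase_home=$(ls -d "$base"/hbase-* 2>/dev/null | sort -V | tail -n 1)
  echo "${hbase_home}/lib/*"
}

executor_extra_cp=$(get_hbase_jars_for_cp)
export executor_extra_cp
```

Note that plain `sort` would put 1.10.0 before 1.2.0; `sort -V` (version sort) orders release numbers correctly.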
>
>
>
> --
> Kannan
>
> On Thu, Feb 26, 2015 at 6:08 PM, Marcelo Vanzin <va...@cloudera.com> wrote:
>>
>> On Thu, Feb 26, 2015 at 5:12 PM, Kannan Rajah <kr...@maprtech.com> wrote:
>> > Also, I would like to know if there is a localization overhead when
>> > we use spark.executor.extraClassPath. Again, in the case of HBase,
>> > these jars would typically be available on all nodes, so there is
>> > no need to localize them from the node where the job was submitted.
>> > I am wondering if the SPARK_CLASSPATH approach would skip
>> > localization. That would be an added benefit.
>> > Please clarify.
>>
>> spark.executor.extraClassPath doesn't localize anything. It just 
>> prepends those classpath entries to the usual classpath used to 
>> launch the executor. There's no copying of files or anything, so 
>> they're expected to exist on the nodes.
>>
>> It's basically exactly the same as SPARK_CLASSPATH, but broken down
>> into two options (one for the executors, and one for the driver).
>>
>> --
>> Marcelo
>
>
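In spark-defaults.conf terms, the SPARK_CLASSPATH replacement Marcelo describes is just the two per-role settings; this fragment is a sketch with placeholder paths:

```
# Equivalent of the deprecated SPARK_CLASSPATH, split per role.
# /opt/hbase/lib is a placeholder; entries must already exist on each node.
spark.executor.extraClassPath  /opt/hbase/lib/*
spark.driver.extraClassPath    /opt/hbase/lib/*
```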

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

