Posted to user@spark.apache.org by Akhil Das <ak...@sigmoidanalytics.com> on 2014/11/02 17:26:44 UTC

Re: properties file on a spark cluster

The problem here is that when you run a Spark program in cluster mode, it
will look for the file on the worker machines. The best approach would be
to put the file in HDFS and read it from there instead of a local path.
Another approach would be to create the same file at the same path on all
worker machines, so that each worker picks it up locally.
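For the HDFS route, a minimal PySpark sketch (the HDFS path and property
key below are hypothetical) would be to distribute the file with
SparkContext.addFile and resolve its per-worker location with
SparkFiles.get:

from pyspark import SparkContext, SparkFiles

sc = SparkContext(appName="PropertiesFileExample")

# Distribute the properties file to every worker node; addFile also
# accepts local paths. "/config/app.properties" is a hypothetical path.
sc.addFile("hdfs:///config/app.properties")

def load_properties(path):
    # Minimal parser for "key=value" lines, skipping '#' comments.
    props = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, value = line.split("=", 1)
                props[key.strip()] = value.strip()
    return props

def process(record):
    # SparkFiles.get returns the file's local path on whichever
    # worker this task happens to run.
    props = load_properties(SparkFiles.get("app.properties"))
    return (record, props.get("some.key"))  # "some.key" is hypothetical

print(sc.parallelize(["a", "b"]).map(process).collect())

Shipping the file at submit time with
spark-submit --files /local/path/app.properties my_job.py
should work the same way: the file lands in each executor's working
directory and SparkFiles.get resolves it identically.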

Thanks
Best Regards

On Fri, Oct 31, 2014 at 10:32 PM, Daniel Takabayashi <
takabayashi@scanboo.com.br> wrote:

> Hi Guys,
>
> I'm trying to execute a Spark job using Python, running on a YARN
> cluster (managed by Cloudera Manager). The Python script uses a set of
> Python programs installed on each member of the cluster. This set of
> programs needs a properties file located at a local filesystem path.
>
> My problem is: when this script is submitted with spark-submit, the
> programs can't find the properties file. Running locally as a
> standalone job, there is no problem; the properties file is found.
>
> My questions are:
>
> 1 - What is the problem here?
> 2 - In this scenario (a script running on a Spark YARN cluster that
> uses Python programs sharing the same properties file), what is the
> best approach?
>
>
> Thanks
> taka
>