Posted to dev@spark.apache.org by Keith Chapman <ke...@gmail.com> on 2017/01/20 02:57:42 UTC

Hi,

Is it possible for an executor (or slave) to know when an actual job ends?
I'm running Spark on a cluster (with YARN), and my workers create some
temporary files that I would like to clean up once the job ends. Is there a
way for the worker to detect that a job has finished? I tried doing it in
the JobProgressListener, but it does not seem to work in a cluster: the
event is not triggered on the worker.

Regards,
Keith.

http://keith-chapman.com

Re:

Posted by Naveen <ha...@gmail.com>.
Hi Keith,

Can you try including a clean-up step at the end of the job, before the
driver closes the SparkContext? It could delete the necessary files on all
nodes in your cluster, matching them by a regex pattern or similar. If the
files are not present on some nodes, that should not be a problem, should it?
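
Something along these lines, for example (a rough sketch, not tested; the
temp-file location, the name pattern, and the partition count are all
assumptions to adjust, and Spark gives no hard guarantee that every node
receives a task):

import java.io.File
import org.apache.spark.{SparkConf, SparkContext}

object CleanupAtJobEnd {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("cleanup-example"))

    // ... the real work of the application happens here ...

    // Clean-up step: run one last job whose only side effect is deleting
    // the temp files on whichever nodes its tasks land on. Using many more
    // partitions than executors makes it likely every node runs a task.
    val tmpDir = "/tmp/myapp" // assumed location of the temporary files
    sc.parallelize(1 to 1000, 100).foreachPartition { _ =>
      val files = Option(new File(tmpDir).listFiles()).getOrElse(Array.empty[File])
      files.filter(_.getName.matches("myapp-.*\\.tmp")).foreach(_.delete())
    }

    sc.stop() // the driver leaves the SparkContext only after the clean-up
  }
}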




Re:

Posted by Mark Hamstra <ma...@clearstorydata.com>.
I wouldn't say that Executors are dumb, but there are some pretty clear
divisions of concepts and responsibilities across the different pieces of
the Spark architecture. A Job is a concept that is completely unknown to an
Executor, which deals instead with just the Tasks that it is given.  So you
are correct, Jacek, that any notification of a Job end has to come from the
Driver.
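
For reference, the driver-side notification described here is what a
SparkListener delivers; a minimal sketch (the class name and the println
body are just illustration):

import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd}

// Lives on the driver; executors never receive job-level events.
class JobEndListener extends SparkListener {
  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit = {
    // jobEnd.jobId identifies the finished job, jobEnd.jobResult its
    // outcome; this is where the driver could kick off cleanup.
    println(s"Job ${jobEnd.jobId} finished with result ${jobEnd.jobResult}")
  }
}

// Registered on the driver, e.g.:
//   sc.addSparkListener(new JobEndListener)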


Re:

Posted by Jacek Laskowski <ja...@japila.pl>.
Executors are "dumb", i.e. they execute TaskRunners for tasks and...that's it.

Your logic should be on the driver, which can intercept events
and...trigger cleanup.

I don't think there's another way to do it.
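
That said, at the level executors do understand, tasks, there is a hook: a
task can register a completion callback on its own TaskContext. It fires
per task, not per job, but for per-task temp files it can be enough. A
sketch, assuming one temp file per task attempt (the path and naming
scheme are made up):

import java.io.File
import org.apache.spark.{SparkConf, SparkContext, TaskContext}

object PerTaskCleanup {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("per-task-cleanup"))
    sc.parallelize(1 to 100, 4).mapPartitions { iter =>
      val ctx = TaskContext.get()
      // Hypothetical naming scheme: one temp file per task attempt.
      val tmp = new File(s"/tmp/myapp-task-${ctx.taskAttemptId()}.tmp")
      ctx.addTaskCompletionListener { _: TaskContext =>
        tmp.delete() // runs on the executor when this task finishes
        ()
      }
      // ... the task would write to tmp while processing iter ...
      iter
    }.count()
    sc.stop()
  }
}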

Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski



Re:

Posted by Keith Chapman <ke...@gmail.com>.
Hi Jacek,

I've looked at SparkListener and tried it; I see it getting fired on the
master, but I don't see it getting fired on the workers in a cluster.

Regards,
Keith.

http://keith-chapman.com


Re:

Posted by Jacek Laskowski <ja...@japila.pl>.
Hi,

(redirecting to users as it has nothing to do with Spark project
development)

Monitor jobs and stages using SparkListener and submit cleanup jobs where a
condition holds.
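
Concretely, that could look like the sketch below: a listener that, when a
condition on the finished job holds, submits one more job to do the
cleanup. The class name, the jobId condition, and the file-deleting body
are illustrative; registering through spark.extraListeners requires a
zero-arg constructor:

import org.apache.spark.SparkContext
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd}

// Register with: --conf spark.extraListeners=CleanupOnJobEnd
// (or sc.addSparkListener(new CleanupOnJobEnd) from driver code).
class CleanupOnJobEnd extends SparkListener {
  private val targetJobId = 0 // hypothetical: the job whose end we care about

  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit = {
    // Guarding on jobId also stops the cleanup job from re-triggering
    // itself. Note that listener callbacks run on the listener bus thread,
    // so keep them quick or hand the work off to another thread.
    if (jobEnd.jobId == targetJobId) {
      val sc = SparkContext.getOrCreate()
      sc.parallelize(1 to 1000, 100).foreachPartition { _ =>
        // delete the temp files here, as in the earlier sketch
        ()
      }
    }
  }
}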

Jacek
