Posted to mapreduce-user@hadoop.apache.org by John Lilley <jo...@redpoint.net> on 2013/07/02 01:14:22 UTC

YARN tasks and child processes

Is it possible for a child process of a YARN task to persist after the task is complete?  I am looking at an alternative to a YARN auxiliary process that may be simpler to implement, if I can have a task spawn a process that persists for some time after the task finishes.
Thanks,
John

RE: YARN tasks and child processes

Posted by John Lilley <jo...@redpoint.net>.

Thanks, that answers my question.  I am trying to explore alternatives to a YARN auxiliary service, but apparently this isn’t an option.
John

From: Ravi Prakash [mailto:ravihoo@ymail.com]
Sent: Tuesday, July 02, 2013 9:55 AM
To: user@hadoop.apache.org
Subject: Re: YARN tasks and child processes

Nopes! The node manager kills the entire process tree when the task reports that it is done. Now if you were able to figure out a way for one of the children to break out of the process tree, maybe?

However, your approach is not recommended. The surviving process would be taking resources that YARN expects to have available.

________________________________
From: John Lilley <jo...@redpoint.net>
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Sent: Tuesday, July 2, 2013 10:41 AM
Subject: RE: YARN tasks and child processes

Devaraj,
Thanks, this is also good information.  But I was really asking if a child *process* that was spawned by a task can persist, in addition to the data.
john

From: Devaraj k [mailto:devaraj.k@huawei.com]
Sent: Monday, July 01, 2013 11:50 PM
To: user@hadoop.apache.org
Subject: RE: YARN tasks and child processes

A YARN task can persist data, and you can choose where to store it.
If you persist to HDFS, you need to take care of deleting the data after using it.  If you write to a local directory, you can write into the NodeManager local dirs (i.e. the ‘yarn.nodemanager.local-dirs’ configuration) under the path for the app id and container id, and this will be cleaned up after the application completes.  You need to make use of the persisted data before the application completes.

Thanks
Devaraj k

From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: 02 July 2013 04:44
To: user@hadoop.apache.org
Subject: YARN tasks and child processes

Is it possible for a child process of a YARN task to persist after the task is complete?  I am looking at an alternative to a YARN auxiliary process that may be simpler to implement, if I can have a task spawn a process that persists for some time after the task finishes.
Thanks,
John

Re: YARN tasks and child processes

Posted by Ravi Prakash <ra...@ymail.com>.

Nopes! The node manager kills the entire process tree when the task reports that it is done. Now if you were able to figure out a way for one of the children to break out of the process tree, maybe?

However, your approach is not recommended. The surviving process would be taking resources that YARN expects to have available.
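
To illustrate what killing "the entire process tree" means at the operating-system level, here is a minimal POSIX-only sketch in Python. It is an analogy, not the node manager's actual code: the thread does not say how the node manager finds or signals the processes, and the names TASK_SRC and kill_group_and_check are made up for this example. A stand-in "task" starts a long-lived child, and signalling the task's process group takes the child down with it.

```python
import os
import select
import signal
import subprocess
import sys

# Hypothetical stand-in for a task: starts a long-lived child, then idles.
TASK_SRC = r"""
import subprocess, sys, time
subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
print("ready", flush=True)
time.sleep(60)
"""


def kill_group_and_check(timeout: float = 10.0) -> bool:
    """Start a task with a child, kill the task's process group, and report
    whether every process holding the task's stdout pipe has exited."""
    task = subprocess.Popen(
        [sys.executable, "-c", TASK_SRC],
        stdout=subprocess.PIPE,
        start_new_session=True,  # the task leads its own process group
    )
    # Wait until the task has started its child.
    assert task.stdout.readline().strip() == b"ready"

    # Signal every process in the task's group, not just the task itself.
    os.killpg(task.pid, signal.SIGKILL)
    task.wait()

    # The child inherited the write end of the pipe, so end-of-file is only
    # seen once the child is gone as well.
    readable, _, _ = select.select([task.stdout], [], [], timeout)
    all_gone = bool(readable) and task.stdout.read() == b""
    task.stdout.close()
    return all_gone


if __name__ == "__main__":
    print("child died with the task:", kill_group_and_check())
```

A child that moved itself into a different process group would not receive this particular signal, which may be the kind of "breaking out" meant above. Whether that would actually escape the node manager is not something this sketch can show.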

________________________________
From: John Lilley <jo...@redpoint.net>
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Sent: Tuesday, July 2, 2013 10:41 AM
Subject: RE: YARN tasks and child processes
 


 
Devaraj,
Thanks, this is also good information.  But I was really asking if a child *process* that was spawned by a task can persist, in addition to the data.
john
 
From: Devaraj k [mailto:devaraj.k@huawei.com]
Sent: Monday, July 01, 2013 11:50 PM
To: user@hadoop.apache.org
Subject: RE: YARN tasks and child processes
 
A YARN task can persist data, and you can choose where to store it.
If you persist to HDFS, you need to take care of deleting the data after using it.  If you write to a local directory, you can write into the NodeManager local dirs (i.e. the ‘yarn.nodemanager.local-dirs’ configuration) under the path for the app id and container id, and this will be cleaned up after the application completes.  You need to make use of the persisted data before the application completes.
 
 
Thanks
Devaraj k
 
From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: 02 July 2013 04:44
To: user@hadoop.apache.org
Subject: YARN tasks and child processes
 
Is it possible for a child process of a YARN task to persist after the task is complete?  I am looking at an alternative to a YARN auxiliary process that may be simpler to implement, if I can have a task spawn a process that persists for some time after the task finishes.
Thanks,
John

RE: YARN tasks and child processes

Posted by John Lilley <jo...@redpoint.net>.

Devaraj,
Thanks, this is also good information.  But I was really asking if a child *process* that was spawned by a task can persist, in addition to the data.
john

From: Devaraj k [mailto:devaraj.k@huawei.com]
Sent: Monday, July 01, 2013 11:50 PM
To: user@hadoop.apache.org
Subject: RE: YARN tasks and child processes

A YARN task can persist data, and you can choose where to store it.
If you persist to HDFS, you need to take care of deleting the data after using it.  If you write to a local directory, you can write into the NodeManager local dirs (i.e. the 'yarn.nodemanager.local-dirs' configuration) under the path for the app id and container id, and this will be cleaned up after the application completes.  You need to make use of the persisted data before the application completes.

Thanks
Devaraj k

From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: 02 July 2013 04:44
To: user@hadoop.apache.org
Subject: YARN tasks and child processes

Is it possible for a child process of a YARN task to persist after the task is complete?  I am looking at an alternative to a YARN auxiliary process that may be simpler to implement, if I can have a task spawn a process that persists for some time after the task finishes.
Thanks,
John

RE: YARN tasks and child processes

Posted by Devaraj k <de...@huawei.com>.

A YARN task can persist data, and you can choose where to store it.
If you persist to HDFS, you need to take care of deleting the data after using it.  If you write to a local directory, you can write into the NodeManager local dirs (i.e. the 'yarn.nodemanager.local-dirs' configuration) under the path for the app id and container id, and this will be cleaned up after the application completes.  You need to make use of the persisted data before the application completes.

Thanks
Devaraj k
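
A minimal sketch of the local-directory option, in Python. It assumes the container's environment carries a comma-separated LOCAL_DIRS variable listing the container's local directories (YARN's ApplicationConstants.Environment defines a variable of that name; check that your version exports it). The helper names pick_local_dir and persist are hypothetical, and the fallback to the system temp directory is only there so the sketch runs outside a cluster.

```python
import os
import tempfile


def pick_local_dir(env=None) -> str:
    """Return the first directory listed in LOCAL_DIRS (assumed to be a
    comma-separated list), or the system temp directory if it is unset."""
    env = os.environ if env is None else env
    dirs = [d for d in env.get("LOCAL_DIRS", "").split(",") if d]
    return dirs[0] if dirs else tempfile.gettempdir()


def persist(name: str, data: bytes, env=None) -> str:
    """Write data to a file in the chosen local directory and return its path.

    Per the advice above, anything written under the node manager's local
    dirs should be consumed before the application completes.
    """
    path = os.path.join(pick_local_dir(env), name)
    with open(path, "wb") as f:
        f.write(data)
    return path


if __name__ == "__main__":
    print("wrote", persist("intermediate.bin", b"example payload"))
```

Data written to HDFS instead would not be removed automatically, so the application has to delete it itself once it is no longer needed.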

From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: 02 July 2013 04:44
To: user@hadoop.apache.org
Subject: YARN tasks and child processes

Is it possible for a child process of a YARN task to persist after the task is complete?  I am looking at an alternative to a YARN auxiliary process that may be simpler to implement, if I can have a task spawn a process that persists for some time after the task finishes.
Thanks,
John
