Posted to yarn-issues@hadoop.apache.org by "Chris Douglas (JIRA)" <ji...@apache.org> on 2015/06/29 21:29:05 UTC
[jira] [Commented] (YARN-3784) Indicate preemption timeout along
with the list of containers to AM (preemption message)
[ https://issues.apache.org/jira/browse/YARN-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606205#comment-14606205 ]
Chris Douglas commented on YARN-3784:
-------------------------------------
Minor:
- Docs for timeout don't include units
- Many whitespace changes in {{FiCaSchedulerApp}}
- change nested if to {{&&}} at:
{noformat}
+ if (this.preemptionTimeout != 0) {
+ if (timeout > this.preemptionTimeout) {
{noformat}
- Would it be possible to test more than that the reported timeout is non-zero? If this used a {{Clock}} instead of calling {{System.currentTimeMillis}} directly, the unit test could be easier to write...
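A minimal sketch of the last two suggestions, under stated assumptions. The class and method names ({{PreemptionTimeoutSketch}}, {{markForPreemption}}, {{exceedsRecorded}}, {{remainingMillis}}) are hypothetical and are not taken from the patch. A {{LongSupplier}} stands in for the {{Clock}} so the example stays self-contained. The body of the nested if is not shown in the quoted diff, so the sketch only returns the combined condition.

```java
import java.util.function.LongSupplier;

// Hypothetical stand-in for the patched code; not the actual FiCaSchedulerApp.
class PreemptionTimeoutSketch {
    // Injected time source (assumption: plays the role of a Clock).
    private final LongSupplier nowMillis;
    // Absolute deadline in milliseconds; 0 means "not set".
    private long preemptionTimeout;

    PreemptionTimeoutSketch(LongSupplier nowMillis) {
        this.nowMillis = nowMillis;
    }

    // Records a deadline waitMillis from the injected "now".
    void markForPreemption(long waitMillis) {
        this.preemptionTimeout = nowMillis.getAsLong() + waitMillis;
    }

    // The nested if from the diff, collapsed into a single && condition.
    boolean exceedsRecorded(long timeout) {
        return this.preemptionTimeout != 0 && timeout > this.preemptionTimeout;
    }

    // Milliseconds left until the recorded deadline, never negative.
    long remainingMillis() {
        if (this.preemptionTimeout == 0) {
            return 0L;
        }
        return Math.max(0L, this.preemptionTimeout - nowMillis.getAsLong());
    }
}
```

Production code could pass {{System::currentTimeMillis}}, while a test passes a supplier it controls and can then assert exact remaining times rather than only "non-zero".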
If containers are preempted for multiple causes (e.g., over-capacity, NM decommission), then the time to preempt could vary widely. The ProportionalCPP also limits the preempted capacity per round, so a global timeout will be very pessimistic. Would it make sense to change {{timeout}} to be {{nextkill}}? More general solutions would be significantly more work...
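One possible reading of {{nextkill}}, sketched under assumptions: if each container marked for preemption carried its own kill deadline, the value reported to the AM could be the earliest of them instead of a single global timeout. The {{NextKillSketch}} class and the per-container deadline collection are hypothetical; the patch does not define them.

```java
import java.util.Collection;
import java.util.OptionalLong;

// Hypothetical helper; illustrates "earliest upcoming kill" only.
class NextKillSketch {
    // Returns the earliest kill deadline (in milliseconds),
    // or an empty OptionalLong when nothing is marked for preemption.
    static OptionalLong nextKill(Collection<Long> killDeadlinesMillis) {
        return killDeadlinesMillis.stream()
                .mapToLong(Long::longValue)
                .min();
    }
}
```

With deadlines of 30000, 12000 and 45000 ms, this reports 12000, so the AM learns when the first kill may happen without the RM having to promise one timeout for every cause.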
> Indicate preemption timeout along with the list of containers to AM (preemption message)
> ----------------------------------------------------------------------------------------
>
> Key: YARN-3784
> URL: https://issues.apache.org/jira/browse/YARN-3784
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Sunil G
> Assignee: Sunil G
> Attachments: 0001-YARN-3784.patch
>
>
> Currently, during preemption, the AM is notified with a list of containers that are marked for preemption. This proposes sending a timeout duration along with this container list, so that the AM knows how much time it has to shut down its containers gracefully (assuming a preemption policy is loaded in the AM).
> This will also help in NM decommissioning scenarios, where an NM is decommissioned after a timeout (which also kills the containers on it). The timeout tells the AM that the RM can forcefully kill those containers once it expires.