You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@yunikorn.apache.org by GitBox <gi...@apache.org> on 2020/12/22 08:48:50 UTC

[GitHub] [incubator-yunikorn-core] yangwwei commented on a change in pull request #230: [YUNIKORN-484] Added waiting state timeout

yangwwei commented on a change in pull request #230:
URL: https://github.com/apache/incubator-yunikorn-core/pull/230#discussion_r547137672



##########
File path: pkg/scheduler/objects/application.go
##########
@@ -200,6 +204,17 @@ func (sa *Application) timeOutStarting() {
 	}
 }
 
+func (sa *Application) timeOutWaiting() {
+	if sa.IsWaiting() {
+		log.Logger().Warn("Application in waiting state timed out: marking it as completed",
+			zap.String("applicationID", sa.ApplicationID),
+			zap.String("state", sa.stateMachine.Current()))

Review comment:
       I don't think this is a WARN. We can change this to DEBUG safely

##########
File path: pkg/scheduler/objects/application.go
##########
@@ -42,6 +42,7 @@ import (
 var (
 	reservationDelay = 2 * time.Second
 	startingTimeout  = 5 * time.Minute
+	waitingTimeout   = 5 * time.Minute

Review comment:
       5 minutes is too long, which means a job finishes and the user needs to wait for 5 minutes until they see the job is completed.  I think we should make this more aggressive, e.g 30s.  Otherwise, users will continue to complain about the issue.

##########
File path: pkg/scheduler/objects/application.go
##########
@@ -61,7 +62,8 @@ type Application struct {
 	allocatedResource *resources.Resource    // total allocated resources
 	allocations       map[string]*Allocation // list of all allocations
 	stateMachine      *fsm.FSM               // application state machine
-	stateTimer        *time.Timer            // timer for state time
+	startingTimer     *time.Timer            // timer for state time
+	waitingTimer      *time.Timer            // timer for state time

Review comment:
       Can we use 1 timer? I like the previous stateTimer better.
   since this is a state machine, each time the app must be in one and only one state
   we just need to make sure the timer is set properly when entering the state and reset when it exits from the state. we can set different duration for the different state.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org