You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@yunikorn.apache.org by GitBox <gi...@apache.org> on 2022/03/08 05:40:40 UTC

[GitHub] [incubator-yunikorn-core] wilfred-s commented on a change in pull request #380: [Yunikorn 1093] Track rejected applications

wilfred-s commented on a change in pull request #380:
URL: https://github.com/apache/incubator-yunikorn-core/pull/380#discussion_r821333550



##########
File path: pkg/scheduler/objects/application.go
##########
@@ -1507,3 +1509,22 @@ func (sa *Application) IsAllocationAssignedToApp(alloc *Allocation) bool {
 	_, ok := sa.allocations[alloc.UUID]
 	return ok
 }
+
+func (sa *Application) SetRejectionMessage(rejectionMessage string) {
+	sa.Lock()
+	defer sa.Unlock()
+	sa.rejectionMessage = rejectionMessage
+}
+
+func (sa *Application) GetRejectionMessage() string {
+	sa.RLock()
+	defer sa.RUnlock()
+	return sa.rejectionMessage
+}
+
+// only for rejected application
+func (sa *Application) SetFinishedTime() {
+	sa.Lock()
+	defer sa.Unlock()
+	sa.finishedTime = time.Now()

Review comment:
       Add test coverage in `TestFinishedTime()`

##########
File path: pkg/scheduler/objects/application.go
##########
@@ -1507,3 +1509,22 @@ func (sa *Application) IsAllocationAssignedToApp(alloc *Allocation) bool {
 	_, ok := sa.allocations[alloc.UUID]
 	return ok
 }
+
+func (sa *Application) SetRejectionMessage(rejectionMessage string) {
+	sa.Lock()
+	defer sa.Unlock()
+	sa.rejectionMessage = rejectionMessage
+}
+
+func (sa *Application) GetRejectionMessage() string {
+	sa.RLock()
+	defer sa.RUnlock()
+	return sa.rejectionMessage

Review comment:
       Should add test coverage, specially when setting the message from the state change

##########
File path: pkg/webservice/handlers.go
##########
@@ -129,6 +129,9 @@ func getApplicationsInfo(w http.ResponseWriter, r *http.Request) {
 		for _, app := range partition.GetCompletedApplications() {
 			addDao(app)
 		}
+		for _, app := range partition.GetRejectedApplications() {
+			addDao(app)
+		}

Review comment:
       Can we add a test in the `handler_test` to cover this?

##########
File path: pkg/scheduler/context.go
##########
@@ -503,6 +503,11 @@ func (cc *ClusterContext) handleRMUpdateApplicationEvent(event *rmevent.RMUpdate
 				ApplicationID: app.ApplicationID,
 				Reason:        err.Error(),
 			})
+			rejectedApp := objects.NewApplication(app, ugi, cc.rmEventHandler, request.RmID)
+			stateChangeErr := partition.addRejectedApplication(rejectedApp, err.Error())

Review comment:
       We do not need to store this in a local var, just pass in directly into the call.
   Also see comment below around the signature for `addRejectedApplication`

##########
File path: pkg/scheduler/partition.go
##########
@@ -1445,3 +1490,19 @@ func (pc *PartitionContext) hasUnlimitedNode() bool {
 	}
 	return false
 }
+
+func (pc *PartitionContext) addRejectedApplication(rejectedApplication *objects.Application, rejectionMessage string) error {
+	if err := rejectedApplication.HandleApplicationEvent(objects.RejectApplication); err != nil {
+		log.Logger().Warn("Application state not changed to Rejected",
+			zap.String("currentState", rejectedApplication.CurrentState()),
+			zap.Error(err))
+		return err
+	}

Review comment:
       Why are you returning this error? There is nothing that we can do with the error outside of this function. It also looks like it break processing of the message in the calling function.
   This change should also never fail. Change this to something like `log.Logger().Warn("BUG: Unexpected failure...` and continue processing. Do not return the error.
   
   NOTE: calling the event system will also call `OnStateChange()` as we enter a new state. That triggers an event to the shim to be created. That event has the wrong info in it as we do not want pass it back as an updated application. We need to filter out RejectApplication state changes.

##########
File path: pkg/scheduler/partition.go
##########
@@ -443,6 +444,13 @@ func (pc *PartitionContext) getApplication(appID string) *objects.Application {
 	return pc.applications[appID]
 }
 
+func (pc *PartitionContext) getRejectedApplication(appID string) *objects.Application {
+	pc.RLock()
+	defer pc.RUnlock()
+
+	return pc.rejectedApplications[appID]
+}
+

Review comment:
       This is just used once in a test, in the line above in the test we directly check the map. Accessing the map in the test directly should be OK.

##########
File path: pkg/scheduler/partition.go
##########
@@ -1445,3 +1490,19 @@ func (pc *PartitionContext) hasUnlimitedNode() bool {
 	}
 	return false
 }
+
+func (pc *PartitionContext) addRejectedApplication(rejectedApplication *objects.Application, rejectionMessage string) error {
+	if err := rejectedApplication.HandleApplicationEvent(objects.RejectApplication); err != nil {
+		log.Logger().Warn("Application state not changed to Rejected",
+			zap.String("currentState", rejectedApplication.CurrentState()),
+			zap.Error(err))
+		return err
+	}
+	rejectedApplication.SetFinishedTime()
+	rejectedApplication.SetRejectionMessage(rejectionMessage)

Review comment:
       Why do we not wrap this up in the state change call?
   You already set the state timer in that call why not add these two to that action.
   We also would not need functions to set these values in that case as the application object can not be accessed by anything yet.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org