You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@yunikorn.apache.org by GitBox <gi...@apache.org> on 2021/05/05 22:39:09 UTC

[GitHub] [incubator-yunikorn-k8shim] yuchaoran-apple opened a new pull request #258: Shim side changes for YUNIKORN-657

yuchaoran-apple opened a new pull request #258:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/258


   ### What is this PR for?
   When an application fails or is rejected by the scheduler, the pods currently stay in pending forever without visibility on what happened to them. This PR propagates the scheduling error or failure to the pods so that the user will be able to see the reason of failure. If third-party operators (e.g. Spark K8s Operator) are used, they can also react according to the pod status change and update the status of the corresponding CRD and possibly retry.
   
   ### What type of PR is it?
   * [ ] - Bug Fix
   * [ ] - Improvement
   * [ ] - Feature
   * [ ] - Documentation
   * [ ] - Hot Fix
   * [ ] - Refactoring
   
   ### Todos
   * [ ] - Task
   
   ### What is the Jira issue?
   https://issues.apache.org/jira/browse/YUNIKORN-657
   
   ### How should this be tested?
   Added unit tests to cover this change. When running in a cluster, tested that scheduling failures will be reflected on the pods:
   
   For example, if an application is rejected, its pods will be marked as failed and its `status` field will be populated with the following information:
   ```
   status:
     message: ' application ''spark-<ID>'' rejected, cannot
       create queue ''root.spark'' without placement rules'
     phase: Failed
     reason: ApplicationRejected
   ```
   Another example is that if a gang-scheduled application cannot successfully run all placeholders before they expire, the app pods' `status` field will show:
   ```
   status:
     message: Scheduling has timed out due to insufficient resources
     phase: Failed
     reason: InsufficientResources
   ```
   
   ### Screenshots (if appropriate)
   
   ### Questions:
   * [ ] - The licenses files need update.
   * [ ] - There is breaking changes for older versions.
   * [ ] - It needs documentation.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] yuchaoran-apple commented on a change in pull request #258: Shim side changes for YUNIKORN-657

Posted by GitBox <gi...@apache.org>.
yuchaoran-apple commented on a change in pull request #258:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/258#discussion_r628816727



##########
File path: pkg/shim/scheduler.go
##########
@@ -234,7 +234,17 @@ func (ss *KubernetesShim) GetSchedulerState() string {
 func (ss *KubernetesShim) handle(se events.SchedulerEvent) error {
 	ss.lock.Lock()
 	defer ss.lock.Unlock()
-	err := ss.stateMachine.Event(string(se.GetEvent()))
+	var err error
+	if len(se.GetArgs()) >= 1 {
+		eventArgs := make([]string, 2)
+		if err = events.GetEventArgsAsStrings(eventArgs, se.GetArgs()); err != nil {
+			log.Logger().Error("fail to parse event arg", zap.Error(err))
+		}
+		message := eventArgs[1]
+		err = ss.stateMachine.Event(string(se.GetEvent()), message)
+	} else {
+		err = ss.stateMachine.Event(string(se.GetEvent()))
+	}
 	if err != nil && err.Error() == "no transition" {

Review comment:
       Good point. Updated accordingly. Now it's more generic and cleaner πŸ‘ 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] codecov[bot] commented on pull request #258: Shim side changes for YUNIKORN-657

Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on pull request #258:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/258#issuecomment-833173736


   # [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#258](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (23891c4) into [master](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/commit/c47ed51f075c5af5910f71da40e7e68699a9abae?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c47ed51) will **increase** coverage by `0.51%`.
   > The diff coverage is `64.20%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/graphs/tree.svg?width=650&height=150&src=pr&token=LZImIuvleR&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #258      +/-   ##
   ==========================================
   + Coverage   59.75%   60.26%   +0.51%     
   ==========================================
     Files          35       36       +1     
     Lines        3133     3350     +217     
   ==========================================
   + Hits         1872     2019     +147     
   - Misses       1180     1246      +66     
   - Partials       81       85       +4     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Ξ” | |
   |---|---|---|
   | [pkg/appmgmt/appmgmt\_recovery.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2FwcG1nbXQvYXBwbWdtdF9yZWNvdmVyeS5nbw==) | `67.50% <0.00%> (-8.18%)` | :arrow_down: |
   | [pkg/cache/amprotocol\_mock.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NhY2hlL2FtcHJvdG9jb2xfbW9jay5nbw==) | `0.00% <0.00%> (ΓΈ)` | |
   | [pkg/cache/node.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NhY2hlL25vZGUuZ28=) | `84.55% <ΓΈ> (ΓΈ)` | |
   | [pkg/cache/nodes.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NhY2hlL25vZGVzLmdv) | `79.80% <ΓΈ> (ΓΈ)` | |
   | [pkg/common/events/recorder\_mock.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NvbW1vbi9ldmVudHMvcmVjb3JkZXJfbW9jay5nbw==) | `0.00% <0.00%> (ΓΈ)` | |
   | [pkg/common/resource.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NvbW1vbi9yZXNvdXJjZS5nbw==) | `90.72% <0.00%> (-9.28%)` | :arrow_down: |
   | [pkg/common/utils/gang\_utils.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NvbW1vbi91dGlscy9nYW5nX3V0aWxzLmdv) | `65.43% <0.00%> (-16.11%)` | :arrow_down: |
   | [pkg/controller/application/app\_controller.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NvbnRyb2xsZXIvYXBwbGljYXRpb24vYXBwX2NvbnRyb2xsZXIuZ28=) | `71.05% <ΓΈ> (-0.26%)` | :arrow_down: |
   | [...missioncontrollers/webhook/admission\_controller.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL3BsdWdpbi9hZG1pc3Npb25jb250cm9sbGVycy93ZWJob29rL2FkbWlzc2lvbl9jb250cm9sbGVyLmdv) | `33.74% <0.00%> (+1.00%)` | :arrow_up: |
   | [pkg/shim/main.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL3NoaW0vbWFpbi5nbw==) | `0.00% <ΓΈ> (ΓΈ)` | |
   | ... and [20 more](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Ξ” = absolute <relative> (impact)`, `ΓΈ = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [54c9b31...23891c4](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] codecov[bot] edited a comment on pull request #258: Shim side changes for YUNIKORN-657

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #258:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/258#issuecomment-833173736


   # [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#258](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (23891c4) into [master](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/commit/c47ed51f075c5af5910f71da40e7e68699a9abae?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c47ed51) will **increase** coverage by `0.51%`.
   > The diff coverage is `64.20%`.
   
   > :exclamation: Current head 23891c4 differs from pull request most recent head 5b5c36d. Consider uploading reports for the commit 5b5c36d to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/graphs/tree.svg?width=650&height=150&src=pr&token=LZImIuvleR&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #258      +/-   ##
   ==========================================
   + Coverage   59.75%   60.26%   +0.51%     
   ==========================================
     Files          35       36       +1     
     Lines        3133     3350     +217     
   ==========================================
   + Hits         1872     2019     +147     
   - Misses       1180     1246      +66     
   - Partials       81       85       +4     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Ξ” | |
   |---|---|---|
   | [pkg/appmgmt/appmgmt\_recovery.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2FwcG1nbXQvYXBwbWdtdF9yZWNvdmVyeS5nbw==) | `67.50% <0.00%> (-8.18%)` | :arrow_down: |
   | [pkg/cache/amprotocol\_mock.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NhY2hlL2FtcHJvdG9jb2xfbW9jay5nbw==) | `0.00% <0.00%> (ΓΈ)` | |
   | [pkg/cache/node.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NhY2hlL25vZGUuZ28=) | `84.55% <ΓΈ> (ΓΈ)` | |
   | [pkg/cache/nodes.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NhY2hlL25vZGVzLmdv) | `79.80% <ΓΈ> (ΓΈ)` | |
   | [pkg/common/events/recorder\_mock.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NvbW1vbi9ldmVudHMvcmVjb3JkZXJfbW9jay5nbw==) | `0.00% <0.00%> (ΓΈ)` | |
   | [pkg/common/resource.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NvbW1vbi9yZXNvdXJjZS5nbw==) | `90.72% <0.00%> (-9.28%)` | :arrow_down: |
   | [pkg/common/utils/gang\_utils.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NvbW1vbi91dGlscy9nYW5nX3V0aWxzLmdv) | `65.43% <0.00%> (-16.11%)` | :arrow_down: |
   | [pkg/controller/application/app\_controller.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NvbnRyb2xsZXIvYXBwbGljYXRpb24vYXBwX2NvbnRyb2xsZXIuZ28=) | `71.05% <ΓΈ> (-0.26%)` | :arrow_down: |
   | [...missioncontrollers/webhook/admission\_controller.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL3BsdWdpbi9hZG1pc3Npb25jb250cm9sbGVycy93ZWJob29rL2FkbWlzc2lvbl9jb250cm9sbGVyLmdv) | `33.74% <0.00%> (+1.00%)` | :arrow_up: |
   | [pkg/shim/main.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL3NoaW0vbWFpbi5nbw==) | `0.00% <ΓΈ> (ΓΈ)` | |
   | ... and [20 more](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Ξ” = absolute <relative> (impact)`, `ΓΈ = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [54c9b31...5b5c36d](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on a change in pull request #258: Shim side changes for YUNIKORN-657

Posted by GitBox <gi...@apache.org>.
yangwwei commented on a change in pull request #258:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/258#discussion_r628802489



##########
File path: pkg/shim/scheduler.go
##########
@@ -234,7 +234,17 @@ func (ss *KubernetesShim) GetSchedulerState() string {
 func (ss *KubernetesShim) handle(se events.SchedulerEvent) error {
 	ss.lock.Lock()
 	defer ss.lock.Unlock()
-	err := ss.stateMachine.Event(string(se.GetEvent()))
+	var err error
+	if len(se.GetArgs()) >= 1 {
+		eventArgs := make([]string, 2)
+		if err = events.GetEventArgsAsStrings(eventArgs, se.GetArgs()); err != nil {
+			log.Logger().Error("fail to parse event arg", zap.Error(err))
+		}
+		message := eventArgs[1]
+		err = ss.stateMachine.Event(string(se.GetEvent()), message)
+	} else {
+		err = ss.stateMachine.Event(string(se.GetEvent()))
+	}
 	if err != nil && err.Error() == "no transition" {

Review comment:
       Can we use a more general form? The interface looks like
   
   ```
   type SchedulerEvent interface {
     GetEvent() SchedulerEventType
     GetArgs() []interface{}
   }
   ```
   
   Can we leverage the args here:
   
   ```
   err = ss.stateMachine.Event(string(se.GetEvent()), se.GetArgs())
   ```

##########
File path: pkg/common/utils/utils.go
##########
@@ -58,6 +58,10 @@ func NeedRecovery(pod *v1.Pod) (bool, error) {
 		return true, nil
 	}
 
+	if pod.Status.Phase == v1.PodFailed && pod.Status.Reason == constants.ApplicationRejectedFailure || pod.Status.Reason == constants.ApplicationInsufficientResourcesFailure {
+		return false, nil
+	}

Review comment:
       Can we simply return false here when pod phase is Failed? 
   Do we have to check the reason?
   

##########
File path: pkg/client/kubeclient.go
##########
@@ -118,3 +118,28 @@ func (nc SchedulerKubeClient) Delete(pod *v1.Pod) error {
 	}
 	return nil
 }
+
+func (nc SchedulerKubeClient) Get(podNamespace string, podName string) (*v1.Pod, error) {
+	pod, err := nc.clientSet.CoreV1().Pods(podNamespace).Get(podName, apis.GetOptions{})
+	if err != nil {
+		log.Logger().Warn("failed to get pod",
+			zap.String("namespace", pod.Namespace),
+			zap.String("podName", pod.Name),
+			zap.Error(err))
+		return nil, err
+	}
+	return pod, nil
+}
+
+func (nc SchedulerKubeClient) Update(pod *v1.Pod) (*v1.Pod, error) {
+	var updatedPod *v1.Pod
+	var err error
+	if updatedPod, err = nc.clientSet.CoreV1().Pods(pod.Namespace).UpdateStatus(pod); err != nil {
+		log.Logger().Warn("failed to update pod",
+			zap.String("namespace", pod.Namespace),
+			zap.String("podName", pod.Name),
+			zap.Error(err))
+		return pod, err
+	}
+	return updatedPod, nil
+}

Review comment:
       The function is called "Update" but the actual impl is "UpdateStatus".
   I think we should either re-implement this with "nc.clientSet.CoreV1().Pods(pod.Namespace).Update()" or change this function name to "UpdateStatus" to avoid mis-usage.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] codecov[bot] edited a comment on pull request #258: Shim side changes for YUNIKORN-657

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #258:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/258#issuecomment-833173736


   # [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#258](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5b5c36d) into [master](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/commit/c47ed51f075c5af5910f71da40e7e68699a9abae?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c47ed51) will **increase** coverage by `0.90%`.
   > The diff coverage is `65.15%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/graphs/tree.svg?width=650&height=150&src=pr&token=LZImIuvleR&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #258      +/-   ##
   ==========================================
   + Coverage   59.75%   60.65%   +0.90%     
   ==========================================
     Files          35       36       +1     
     Lines        3133     3342     +209     
   ==========================================
   + Hits         1872     2027     +155     
   - Misses       1180     1232      +52     
   - Partials       81       83       +2     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Ξ” | |
   |---|---|---|
   | [pkg/appmgmt/appmgmt\_recovery.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2FwcG1nbXQvYXBwbWdtdF9yZWNvdmVyeS5nbw==) | `67.50% <0.00%> (-8.18%)` | :arrow_down: |
   | [pkg/cache/amprotocol\_mock.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NhY2hlL2FtcHJvdG9jb2xfbW9jay5nbw==) | `0.00% <0.00%> (ΓΈ)` | |
   | [pkg/cache/node.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NhY2hlL25vZGUuZ28=) | `84.55% <ΓΈ> (ΓΈ)` | |
   | [pkg/cache/nodes.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NhY2hlL25vZGVzLmdv) | `79.80% <ΓΈ> (ΓΈ)` | |
   | [pkg/common/events/recorder\_mock.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NvbW1vbi9ldmVudHMvcmVjb3JkZXJfbW9jay5nbw==) | `0.00% <0.00%> (ΓΈ)` | |
   | [pkg/common/resource.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NvbW1vbi9yZXNvdXJjZS5nbw==) | `90.72% <0.00%> (-9.28%)` | :arrow_down: |
   | [pkg/common/utils/gang\_utils.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NvbW1vbi91dGlscy9nYW5nX3V0aWxzLmdv) | `65.43% <0.00%> (-16.11%)` | :arrow_down: |
   | [pkg/controller/application/app\_controller.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL2NvbnRyb2xsZXIvYXBwbGljYXRpb24vYXBwX2NvbnRyb2xsZXIuZ28=) | `71.05% <ΓΈ> (-0.26%)` | :arrow_down: |
   | [...missioncontrollers/webhook/admission\_controller.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL3BsdWdpbi9hZG1pc3Npb25jb250cm9sbGVycy93ZWJob29rL2FkbWlzc2lvbl9jb250cm9sbGVyLmdv) | `33.74% <0.00%> (+1.00%)` | :arrow_up: |
   | [pkg/shim/main.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL3NoaW0vbWFpbi5nbw==) | `0.00% <ΓΈ> (ΓΈ)` | |
   | ... and [20 more](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Ξ” = absolute <relative> (impact)`, `ΓΈ = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [54c9b31...5b5c36d](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/258?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] yuchaoran-apple commented on a change in pull request #258: Shim side changes for YUNIKORN-657

Posted by GitBox <gi...@apache.org>.
yuchaoran-apple commented on a change in pull request #258:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/258#discussion_r628816618



##########
File path: pkg/client/kubeclient.go
##########
@@ -118,3 +118,28 @@ func (nc SchedulerKubeClient) Delete(pod *v1.Pod) error {
 	}
 	return nil
 }
+
+func (nc SchedulerKubeClient) Get(podNamespace string, podName string) (*v1.Pod, error) {
+	pod, err := nc.clientSet.CoreV1().Pods(podNamespace).Get(podName, apis.GetOptions{})
+	if err != nil {
+		log.Logger().Warn("failed to get pod",
+			zap.String("namespace", pod.Namespace),
+			zap.String("podName", pod.Name),
+			zap.Error(err))
+		return nil, err
+	}
+	return pod, nil
+}
+
+func (nc SchedulerKubeClient) Update(pod *v1.Pod) (*v1.Pod, error) {
+	var updatedPod *v1.Pod
+	var err error
+	if updatedPod, err = nc.clientSet.CoreV1().Pods(pod.Namespace).UpdateStatus(pod); err != nil {
+		log.Logger().Warn("failed to update pod",
+			zap.String("namespace", pod.Namespace),
+			zap.String("podName", pod.Name),
+			zap.Error(err))
+		return pod, err
+	}
+	return updatedPod, nil
+}

Review comment:
       Good catch. Changed to `UpdateStatus`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] yuchaoran-apple commented on a change in pull request #258: Shim side changes for YUNIKORN-657

Posted by GitBox <gi...@apache.org>.
yuchaoran-apple commented on a change in pull request #258:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/258#discussion_r628816656



##########
File path: pkg/common/utils/utils.go
##########
@@ -58,6 +58,10 @@ func NeedRecovery(pod *v1.Pod) (bool, error) {
 		return true, nil
 	}
 
+	if pod.Status.Phase == v1.PodFailed && pod.Status.Reason == constants.ApplicationRejectedFailure || pod.Status.Reason == constants.ApplicationInsufficientResourcesFailure {
+		return false, nil
+	}

Review comment:
       You are right. Removed reason check




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] yangwwei merged pull request #258: [YUNIKORN-657] Expose application failure reasons

Posted by GitBox <gi...@apache.org>.
yangwwei merged pull request #258:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/258


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on pull request #258: Shim side changes for YUNIKORN-657

Posted by GitBox <gi...@apache.org>.
yangwwei commented on pull request #258:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/258#issuecomment-835541512


   Overall looks good, thanks for improving this! Please see some minor comments, thank you @yuchaoran-apple .


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org