You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@yunikorn.apache.org by GitBox <gi...@apache.org> on 2021/03/19 23:41:52 UTC

[GitHub] [incubator-yunikorn-k8shim] yangwwei opened a new pull request #246: [YUNIKORN-584] App recovery is skipped when applicationID is not set in pods' label.

yangwwei opened a new pull request #246:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/246


   Credits to Chaoran who finds this bug and figured out the root cause.
   The symptom of this issue is described in this issue: https://issues.apache.org/jira/browse/YUNIKORN-584.
   The problem is when we list applications, we have a filter that checks label `applicationId` exists in pods, pods without this label will be excluded in this scan because [these lines.](https://github.com/apache/incubator-yunikorn-k8shim/blob/fd69f44d0028d09918715cd222775901400f936d/pkg/appmgmt/general/general.go#L297-L302). When we recover Spark jobs, spark pods do not necessarily have this label, we can recognize them using spark-app-selector. This caused the job to get excluded from the recovery app list, and then subsequent node recovery also failed because it finds the existing allocation for the job but the job was not recovered in the scheduler. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] kingamarton closed pull request #246: [YUNIKORN-584] App recovery is skipped when applicationID is not set in pods' label.

Posted by GitBox <gi...@apache.org>.
kingamarton closed pull request #246:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/246


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] wilfred-s commented on a change in pull request #246: [YUNIKORN-584] App recovery is skipped when applicationID is not set in pods' label.

Posted by GitBox <gi...@apache.org>.
wilfred-s commented on a change in pull request #246:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/246#discussion_r598378697



##########
File path: pkg/appmgmt/general/general_test.go
##########
@@ -526,3 +527,269 @@ func TestGetOwnerReferences(t *testing.T) {
 	assert.Equal(t, returnedOwnerRefs[0].Kind, "Pod", "Unexpected owner reference Kind")
 	assert.Equal(t, returnedOwnerRefs[0].APIVersion, v1.SchemeGroupVersion.String(), "Unexpected owner reference Kind")
 }
+
+// nolint: funlen
+func TestListApplication(t *testing.T) {
+	var app01, app02, app03, app04, app05, app06 = "app00001",
+		"app00002", "app00003", "app00004", "app00005", "app00006"
+	var queue01, queue02 = "root.queue01", "root.queue02"
+	var ns01, ns02 = "namespace01", "namespace02"
+
+	// mock the pod lister for this test
+	mockedAPIProvider := client.NewMockedAPIProvider()
+	mockedPodLister := test.NewPodListerMock()
+	mockedAPIProvider.SetPodLister(mockedPodLister)
+
+	// app01 pods, running in namespace01 and queue01
+	// all pods are having applicationID and queue name specified
+	// 2 pods have assigned nodes, 1 pod is pending for scheduling
+	// allocated pod
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app01pod00001",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app01,
+				constants.LabelQueueName:     queue01,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "allocated-node",
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app01pod00002",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app01,
+				constants.LabelQueueName:     queue01,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "allocated-node",
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app01pod00003",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app01,
+				constants.LabelQueueName:     queue01,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+		},
+	})
+
+	// app02 pods, running in queue02 and namespace02
+	// 2 pods are having applicationID and queue name specified
+	// both 2 pods are pending
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app02pod0001",
+			Namespace: ns02,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app02,
+				constants.LabelQueueName:     queue02,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app02pod0002",
+			Namespace: ns02,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app02,
+				constants.LabelQueueName:     queue02,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+		},
+	})
+
+	// app03 pods, running in queue02 and namespace02
+	// 2 pods do not have label applicationID specified, but have spark-app-selector
+	// both 2 pods are allocated
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app03pod0001",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.SparkLabelAppID: app03,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "some-node",
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app03pod0002",
+			Namespace: ns02,
+			Labels: map[string]string{
+				constants.SparkLabelAppID: app03,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "some-node",
+		},
+	})
+
+	// app04 pods, running in queue01 and namespace01
+	// app04 has 2 pods which only has annotation yunikorn.apache.org/app-id specified
+	// both 2 pods are allocated
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app04pod0001",
+			Namespace: ns01,
+			Annotations: map[string]string{
+				constants.AnnotationApplicationID: app04,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "some-node",
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app04pod0002",
+			Namespace: ns01,
+			Annotations: map[string]string{
+				constants.AnnotationApplicationID: app04,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "some-node",
+		},
+	})
+
+	// app05 pods, running in queue01 and namespace01
+	// app05 pod has no label or annotation specified
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app05pod0001",
+			Namespace: ns01,
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "some-node",
+		},
+	})
+
+	// app06 pods, running in queue01 and namespace01
+	// pod has spark-app-selector set and it is allocated but not scheduled by yunikorn
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app06pod0001",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.SparkLabelAppID: app06,
+			},
+		},
+		Spec: v1.PodSpec{
+			NodeName: "some-node",
+		},
+	})
+
+	// init the app manager and run listApp
+	am := NewManager(cache.NewMockedAMProtocol(), mockedAPIProvider)
+	apps, err := am.ListApplications()
+	assert.NilError(t, err)
+	assert.Equal(t, len(apps), 3)
+	_, exist := apps[app01]
+	assert.Equal(t, exist, true,
+		"app01 should be included in the list because "+
+			"it has applicationID and queue namespace specified in the"+
+			"queue and it has 2 pods allocated.")

Review comment:
       We do not have line length limits so do not split text over multiple lines there is no need.

##########
File path: pkg/appmgmt/general/general.go
##########
@@ -293,15 +292,8 @@ func (os *Manager) deletePod(obj interface{}) {
 }
 
 func (os *Manager) ListApplications() (map[string]interfaces.ApplicationMetadata, error) {
-	// selector: applicationID exist
-	slt := labels.NewSelector()
-	req, err := labels.NewRequirement(constants.LabelApplicationID, selection.Exists, nil)
-	if err != nil {
-		return nil, err
-	}
-	slt = slt.Add(*req)
-
 	// list all pods on this cluster
+	slt := labels.NewSelector()
 	appPods, err := os.apiProvider.GetAPIs().PodInformer.Lister().List(slt)

Review comment:
       With the new behaviour there is no need to store the selector.
   Can be simplified to `List(labels.NewSelector())` as it returns a typed nil.

##########
File path: pkg/appmgmt/general/general_test.go
##########
@@ -526,3 +527,269 @@ func TestGetOwnerReferences(t *testing.T) {
 	assert.Equal(t, returnedOwnerRefs[0].Kind, "Pod", "Unexpected owner reference Kind")
 	assert.Equal(t, returnedOwnerRefs[0].APIVersion, v1.SchemeGroupVersion.String(), "Unexpected owner reference Kind")
 }
+
+// nolint: funlen
+func TestListApplication(t *testing.T) {
+	var app01, app02, app03, app04, app05, app06 = "app00001",
+		"app00002", "app00003", "app00004", "app00005", "app00006"
+	var queue01, queue02 = "root.queue01", "root.queue02"
+	var ns01, ns02 = "namespace01", "namespace02"
+
+	// mock the pod lister for this test
+	mockedAPIProvider := client.NewMockedAPIProvider()
+	mockedPodLister := test.NewPodListerMock()
+	mockedAPIProvider.SetPodLister(mockedPodLister)
+
+	// app01 pods, running in namespace01 and queue01
+	// all pods are having applicationID and queue name specified
+	// 2 pods have assigned nodes, 1 pod is pending for scheduling
+	// allocated pod
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app01pod00001",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app01,
+				constants.LabelQueueName:     queue01,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "allocated-node",
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app01pod00002",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app01,
+				constants.LabelQueueName:     queue01,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "allocated-node",
+		},
+	})

Review comment:
       This is a duplicate from app01pod00001 and does not test anything new, should be removed.

##########
File path: pkg/appmgmt/general/general_test.go
##########
@@ -526,3 +527,269 @@ func TestGetOwnerReferences(t *testing.T) {
 	assert.Equal(t, returnedOwnerRefs[0].Kind, "Pod", "Unexpected owner reference Kind")
 	assert.Equal(t, returnedOwnerRefs[0].APIVersion, v1.SchemeGroupVersion.String(), "Unexpected owner reference Kind")
 }
+
+// nolint: funlen
+func TestListApplication(t *testing.T) {
+	var app01, app02, app03, app04, app05, app06 = "app00001",
+		"app00002", "app00003", "app00004", "app00005", "app00006"
+	var queue01, queue02 = "root.queue01", "root.queue02"
+	var ns01, ns02 = "namespace01", "namespace02"
+
+	// mock the pod lister for this test
+	mockedAPIProvider := client.NewMockedAPIProvider()
+	mockedPodLister := test.NewPodListerMock()
+	mockedAPIProvider.SetPodLister(mockedPodLister)
+
+	// app01 pods, running in namespace01 and queue01
+	// all pods are having applicationID and queue name specified
+	// 2 pods have assigned nodes, 1 pod is pending for scheduling
+	// allocated pod
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app01pod00001",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app01,
+				constants.LabelQueueName:     queue01,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "allocated-node",
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app01pod00002",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app01,
+				constants.LabelQueueName:     queue01,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "allocated-node",
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app01pod00003",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app01,
+				constants.LabelQueueName:     queue01,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+		},
+	})
+
+	// app02 pods, running in queue02 and namespace02
+	// 2 pods are having applicationID and queue name specified
+	// both 2 pods are pending
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app02pod0001",
+			Namespace: ns02,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app02,
+				constants.LabelQueueName:     queue02,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+		},
+	})

Review comment:
       This is a duplicate from app01pod00003 and does not test anything new, should be removed.

##########
File path: pkg/appmgmt/general/general_test.go
##########
@@ -526,3 +527,269 @@ func TestGetOwnerReferences(t *testing.T) {
 	assert.Equal(t, returnedOwnerRefs[0].Kind, "Pod", "Unexpected owner reference Kind")
 	assert.Equal(t, returnedOwnerRefs[0].APIVersion, v1.SchemeGroupVersion.String(), "Unexpected owner reference Kind")
 }
+
+// nolint: funlen
+func TestListApplication(t *testing.T) {
+	var app01, app02, app03, app04, app05, app06 = "app00001",
+		"app00002", "app00003", "app00004", "app00005", "app00006"
+	var queue01, queue02 = "root.queue01", "root.queue02"
+	var ns01, ns02 = "namespace01", "namespace02"
+
+	// mock the pod lister for this test
+	mockedAPIProvider := client.NewMockedAPIProvider()
+	mockedPodLister := test.NewPodListerMock()
+	mockedAPIProvider.SetPodLister(mockedPodLister)
+
+	// app01 pods, running in namespace01 and queue01
+	// all pods are having applicationID and queue name specified
+	// 2 pods have assigned nodes, 1 pod is pending for scheduling
+	// allocated pod
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app01pod00001",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app01,
+				constants.LabelQueueName:     queue01,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "allocated-node",
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app01pod00002",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app01,
+				constants.LabelQueueName:     queue01,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "allocated-node",
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app01pod00003",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app01,
+				constants.LabelQueueName:     queue01,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+		},
+	})
+
+	// app02 pods, running in queue02 and namespace02
+	// 2 pods are having applicationID and queue name specified
+	// both 2 pods are pending
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app02pod0001",
+			Namespace: ns02,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app02,
+				constants.LabelQueueName:     queue02,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app02pod0002",
+			Namespace: ns02,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app02,
+				constants.LabelQueueName:     queue02,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+		},
+	})

Review comment:
       This is a duplicate from app01pod00003 and app02pod0001 and does not test anything new, should be removed.

##########
File path: pkg/appmgmt/general/general_test.go
##########
@@ -526,3 +527,269 @@ func TestGetOwnerReferences(t *testing.T) {
 	assert.Equal(t, returnedOwnerRefs[0].Kind, "Pod", "Unexpected owner reference Kind")
 	assert.Equal(t, returnedOwnerRefs[0].APIVersion, v1.SchemeGroupVersion.String(), "Unexpected owner reference Kind")
 }
+
+// nolint: funlen
+func TestListApplication(t *testing.T) {
+	var app01, app02, app03, app04, app05, app06 = "app00001",
+		"app00002", "app00003", "app00004", "app00005", "app00006"
+	var queue01, queue02 = "root.queue01", "root.queue02"
+	var ns01, ns02 = "namespace01", "namespace02"
+
+	// mock the pod lister for this test
+	mockedAPIProvider := client.NewMockedAPIProvider()
+	mockedPodLister := test.NewPodListerMock()
+	mockedAPIProvider.SetPodLister(mockedPodLister)
+
+	// app01 pods, running in namespace01 and queue01
+	// all pods are having applicationID and queue name specified
+	// 2 pods have assigned nodes, 1 pod is pending for scheduling
+	// allocated pod
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app01pod00001",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app01,
+				constants.LabelQueueName:     queue01,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "allocated-node",
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app01pod00002",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app01,
+				constants.LabelQueueName:     queue01,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "allocated-node",
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app01pod00003",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app01,
+				constants.LabelQueueName:     queue01,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+		},
+	})
+
+	// app02 pods, running in queue02 and namespace02
+	// 2 pods are having applicationID and queue name specified
+	// both 2 pods are pending
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app02pod0001",
+			Namespace: ns02,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app02,
+				constants.LabelQueueName:     queue02,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app02pod0002",
+			Namespace: ns02,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app02,
+				constants.LabelQueueName:     queue02,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+		},
+	})
+
+	// app03 pods, running in queue02 and namespace02
+	// 2 pods do not have label applicationID specified, but have spark-app-selector
+	// both 2 pods are allocated
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app03pod0001",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.SparkLabelAppID: app03,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "some-node",
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app03pod0002",
+			Namespace: ns02,
+			Labels: map[string]string{
+				constants.SparkLabelAppID: app03,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "some-node",
+		},
+	})

Review comment:
       This is a duplicate from app03pod0001 and does not test anything new, should be removed.

##########
File path: pkg/appmgmt/general/general_test.go
##########
@@ -526,3 +527,269 @@ func TestGetOwnerReferences(t *testing.T) {
 	assert.Equal(t, returnedOwnerRefs[0].Kind, "Pod", "Unexpected owner reference Kind")
 	assert.Equal(t, returnedOwnerRefs[0].APIVersion, v1.SchemeGroupVersion.String(), "Unexpected owner reference Kind")
 }
+
+// nolint: funlen
+func TestListApplication(t *testing.T) {
+	var app01, app02, app03, app04, app05, app06 = "app00001",
+		"app00002", "app00003", "app00004", "app00005", "app00006"
+	var queue01, queue02 = "root.queue01", "root.queue02"
+	var ns01, ns02 = "namespace01", "namespace02"
+
+	// mock the pod lister for this test
+	mockedAPIProvider := client.NewMockedAPIProvider()
+	mockedPodLister := test.NewPodListerMock()
+	mockedAPIProvider.SetPodLister(mockedPodLister)
+
+	// app01 pods, running in namespace01 and queue01
+	// all pods are having applicationID and queue name specified
+	// 2 pods have assigned nodes, 1 pod is pending for scheduling
+	// allocated pod
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app01pod00001",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app01,
+				constants.LabelQueueName:     queue01,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "allocated-node",
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app01pod00002",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app01,
+				constants.LabelQueueName:     queue01,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "allocated-node",
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app01pod00003",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app01,
+				constants.LabelQueueName:     queue01,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+		},
+	})
+
+	// app02 pods, running in queue02 and namespace02
+	// 2 pods are having applicationID and queue name specified
+	// both 2 pods are pending
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app02pod0001",
+			Namespace: ns02,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app02,
+				constants.LabelQueueName:     queue02,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app02pod0002",
+			Namespace: ns02,
+			Labels: map[string]string{
+				constants.LabelApplicationID: app02,
+				constants.LabelQueueName:     queue02,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+		},
+	})
+
+	// app03 pods, running in queue02 and namespace02
+	// 2 pods do not have label applicationID specified, but have spark-app-selector
+	// both 2 pods are allocated
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app03pod0001",
+			Namespace: ns01,
+			Labels: map[string]string{
+				constants.SparkLabelAppID: app03,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "some-node",
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app03pod0002",
+			Namespace: ns02,
+			Labels: map[string]string{
+				constants.SparkLabelAppID: app03,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "some-node",
+		},
+	})
+
+	// app04 pods, running in queue01 and namespace01
+	// app04 has 2 pods which only has annotation yunikorn.apache.org/app-id specified
+	// both 2 pods are allocated
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app04pod0001",
+			Namespace: ns01,
+			Annotations: map[string]string{
+				constants.AnnotationApplicationID: app04,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "some-node",
+		},
+	})
+
+	mockedPodLister.AddPod(&v1.Pod{
+		TypeMeta: apis.TypeMeta{
+			Kind:       "Pod",
+			APIVersion: "v1",
+		},
+		ObjectMeta: apis.ObjectMeta{
+			Name:      "app04pod0002",
+			Namespace: ns01,
+			Annotations: map[string]string{
+				constants.AnnotationApplicationID: app04,
+			},
+		},
+		Spec: v1.PodSpec{
+			SchedulerName: constants.SchedulerName,
+			NodeName:      "some-node",
+		},
+	})

Review comment:
       This is a duplicate from app04pod0001 and does not test anything new, should be removed.

##########
File path: pkg/appmgmt/general/general_test.go
##########
@@ -526,3 +527,269 @@ func TestGetOwnerReferences(t *testing.T) {
 	assert.Equal(t, returnedOwnerRefs[0].Kind, "Pod", "Unexpected owner reference Kind")
 	assert.Equal(t, returnedOwnerRefs[0].APIVersion, v1.SchemeGroupVersion.String(), "Unexpected owner reference Kind")
 }
+
+// nolint: funlen
+func TestListApplication(t *testing.T) {

Review comment:
       This should be a table driven test.
   We have 4 things that can change: Labels, Annotations, SchedulerName and NodeName. The outcome is known: app exists or not. Setting up a table with those 4 inputs and the expected output. Create a new pod lister or add a reset function on the mocked pod lister and run the checks in a loop. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] kingamarton commented on a change in pull request #246: [YUNIKORN-584] App recovery is skipped when applicationID is not set in pods' label.

Posted by GitBox <gi...@apache.org>.
kingamarton commented on a change in pull request #246:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/246#discussion_r598683618



##########
File path: pkg/appmgmt/general/general_test.go
##########
@@ -526,3 +527,269 @@ func TestGetOwnerReferences(t *testing.T) {
 	assert.Equal(t, returnedOwnerRefs[0].Kind, "Pod", "Unexpected owner reference Kind")
 	assert.Equal(t, returnedOwnerRefs[0].APIVersion, v1.SchemeGroupVersion.String(), "Unexpected owner reference Kind")
 }
+
+// nolint: funlen
+func TestListApplication(t *testing.T) {

Review comment:
       I agree with @wilfred-s. When possible we should use table driven tests. Is much easier to read, and it will be shorter as well, so we might not need to skip the lint funlen check, if we do, then maybe it worth to split the test instead of just ignoring it. 

##########
File path: pkg/appmgmt/general/general.go
##########
@@ -293,15 +292,8 @@ func (os *Manager) deletePod(obj interface{}) {
 }
 
 func (os *Manager) ListApplications() (map[string]interfaces.ApplicationMetadata, error) {
-	// selector: applicationID exist
-	slt := labels.NewSelector()
-	req, err := labels.NewRequirement(constants.LabelApplicationID, selection.Exists, nil)
-	if err != nil {
-		return nil, err
-	}
-	slt = slt.Add(*req)

Review comment:
       Instead of removing the label selector I think we should modify the selector to check for either appID, or spark-app-selector. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on pull request #246: [YUNIKORN-584] App recovery is skipped when applicationID is not set in pods' label.

Posted by GitBox <gi...@apache.org>.
yangwwei commented on pull request #246:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/246#issuecomment-804182482


   The review comments about the UT could not coverage, it needs some more time to get fully addressed.
   To unblock the 0.10 build, I suggest getting this fix in and I have filed a follow-up JIRA: https://issues.apache.org/jira/browse/YUNIKORN-593.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] kingamarton commented on a change in pull request #246: [YUNIKORN-584] App recovery is skipped when applicationID is not set in pods' label.

Posted by GitBox <gi...@apache.org>.
kingamarton commented on a change in pull request #246:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/246#discussion_r598686935



##########
File path: pkg/appmgmt/general/general.go
##########
@@ -293,15 +292,8 @@ func (os *Manager) deletePod(obj interface{}) {
 }
 
 func (os *Manager) ListApplications() (map[string]interfaces.ApplicationMetadata, error) {
-	// selector: applicationID exist
-	slt := labels.NewSelector()
-	req, err := labels.NewRequirement(constants.LabelApplicationID, selection.Exists, nil)
-	if err != nil {
-		return nil, err
-	}
-	slt = slt.Add(*req)

Review comment:
       Instead of removing the label selector I think we should modify the selector to check for either appID, or spark-app-selector. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] codecov[bot] commented on pull request #246: [YUNIKORN-584] App recovery is skipped when applicationID is not set in pods' label.

Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on pull request #246:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/246#issuecomment-803198706


   # [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246?src=pr&el=h1) Report
   > Merging [#246](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246?src=pr&el=desc) (d0c8de5) into [master](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/commit/c47ed51f075c5af5910f71da40e7e68699a9abae?el=desc) (c47ed51) will **increase** coverage by `0.21%`.
   > The diff coverage is `59.65%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246/graphs/tree.svg?width=650&height=150&src=pr&token=LZImIuvleR)](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #246      +/-   ##
   ==========================================
   + Coverage   59.75%   59.96%   +0.21%     
   ==========================================
     Files          35       35              
     Lines        3133     3232      +99     
   ==========================================
   + Hits         1872     1938      +66     
   - Misses       1180     1212      +32     
   - Partials       81       82       +1     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [pkg/appmgmt/appmgmt\_recovery.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246/diff?src=pr&el=tree#diff-cGtnL2FwcG1nbXQvYXBwbWdtdF9yZWNvdmVyeS5nbw==) | `67.50% <0.00%> (-8.18%)` | :arrow_down: |
   | [pkg/cache/amprotocol\_mock.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246/diff?src=pr&el=tree#diff-cGtnL2NhY2hlL2FtcHJvdG9jb2xfbW9jay5nbw==) | `0.00% <0.00%> (ø)` | |
   | [pkg/cache/task.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246/diff?src=pr&el=tree#diff-cGtnL2NhY2hlL3Rhc2suZ28=) | `74.40% <ø> (ø)` | |
   | [pkg/common/resource.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246/diff?src=pr&el=tree#diff-cGtnL2NvbW1vbi9yZXNvdXJjZS5nbw==) | `90.72% <0.00%> (-9.28%)` | :arrow_down: |
   | [pkg/common/utils/gang\_utils.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246/diff?src=pr&el=tree#diff-cGtnL2NvbW1vbi91dGlscy9nYW5nX3V0aWxzLmdv) | `67.94% <0.00%> (-13.59%)` | :arrow_down: |
   | [pkg/controller/application/app\_controller.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246/diff?src=pr&el=tree#diff-cGtnL2NvbnRyb2xsZXIvYXBwbGljYXRpb24vYXBwX2NvbnRyb2xsZXIuZ28=) | `71.05% <ø> (-0.26%)` | :arrow_down: |
   | [...missioncontrollers/webhook/admission\_controller.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246/diff?src=pr&el=tree#diff-cGtnL3BsdWdpbi9hZG1pc3Npb25jb250cm9sbGVycy93ZWJob29rL2FkbWlzc2lvbl9jb250cm9sbGVyLmdv) | `33.74% <0.00%> (+1.00%)` | :arrow_up: |
   | [pkg/cache/application\_events.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246/diff?src=pr&el=tree#diff-cGtnL2NhY2hlL2FwcGxpY2F0aW9uX2V2ZW50cy5nbw==) | `45.00% <8.33%> (-8.07%)` | :arrow_down: |
   | [pkg/cache/application.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246/diff?src=pr&el=tree#diff-cGtnL2NhY2hlL2FwcGxpY2F0aW9uLmdv) | `74.11% <68.00%> (-2.64%)` | :arrow_down: |
   | [pkg/common/si\_helper.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246/diff?src=pr&el=tree#diff-cGtnL2NvbW1vbi9zaV9oZWxwZXIuZ28=) | `63.15% <80.00%> (ø)` | |
   | ... and [9 more](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246?src=pr&el=footer). Last update [fd69f44...d0c8de5](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/246?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org