You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@yunikorn.apache.org by GitBox <gi...@apache.org> on 2020/10/28 06:59:36 UTC

[GitHub] [incubator-yunikorn-k8shim] yangwwei opened a new pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

yangwwei opened a new pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203


   This PR addresses the following tasks:
   1. Add taskGroups to apps/tasks
   2. Add `Placeholder` and `PlaceholderManager` to manage placeholders
   3. Add `Reserving` state in apps to handle taskGroups
   3. Related UTs


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] codecov[bot] edited a comment on pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203#issuecomment-717784125


   # [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=h1) Report
   > :exclamation: No coverage uploaded for pull request base (`YUNIKORN-2@6eba8fd`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203/graphs/tree.svg?width=650&height=150&src=pr&token=LZImIuvleR)](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             YUNIKORN-2     #203   +/-   ##
   =============================================
     Coverage              ?   58.62%           
   =============================================
     Files                 ?       36           
     Lines                 ?     3031           
     Branches              ?        0           
   =============================================
     Hits                  ?     1777           
     Misses                ?     1181           
     Partials              ?       73           
   ```
   
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=footer). Last update [6eba8fd...19590cb](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on a change in pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
yangwwei commented on a change in pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203#discussion_r514614697



##########
File path: pkg/cache/placeholder_manager.go
##########
@@ -0,0 +1,84 @@
+/*
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/
+
+package cache
+
+import (
+	"sync"
+
+	"go.uber.org/zap"
+
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/client"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/common/utils"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/log"
+)
+
+// placeholder manager is a service to manage the lifecycle of app placeholders
+type PlaceholderManager struct {
+	clients *client.Clients
+	sync.RWMutex
+}
+
+var placeholderMgr *PlaceholderManager
+var once sync.Once
+
+func GetPlaceholderManager() *PlaceholderManager {
+	once.Do(func() {
+		placeholderMgr = &PlaceholderManager{}

Review comment:
       Thanks for pointing this out. That is not included in this patch yet. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on a change in pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
yangwwei commented on a change in pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203#discussion_r514615489



##########
File path: pkg/cache/application_test.go
##########
@@ -181,19 +190,192 @@ func TestGetNonTerminatedTaskAlias(t *testing.T) {
 	assert.Assert(t, is.Contains(res, "/test-00001"))
 	assert.Assert(t, is.Contains(res, "/test-00002"))
 
-	//set two tasks to terminated states
+	// set two tasks to terminated states
 	task1.sm.SetState(events.States().Task.Rejected)
 	task2.sm.SetState(events.States().Task.Rejected)
 	// check the tasks both in terminated states
 	// res should retuen empty
 	res = app.getNonTerminatedTaskAlias()
 	assert.Equal(t, len(res), 0)
 
-	//set two tasks to one is terminated, another is non-terminated
+	// set two tasks to one is terminated, another is non-terminated
 	task1.sm.SetState(events.States().Task.Rejected)
 	task2.sm.SetState(events.States().Task.Allocated)
 	// check the task, should only return task2's alias
 	res = app.getNonTerminatedTaskAlias()
 	assert.Equal(t, len(res), 1)
 	assert.Equal(t, res[0], "/test-00002")
 }
+
+func TestSetTaskGroupsAndSchedulingPolicy(t *testing.T) {
+	app := NewApplication("app01", "root.a", "test-user", map[string]string{}, newMockSchedulerAPI())
+	assert.Assert(t, app.getSchedulingPolicy() == nil)
+	assert.Assert(t, app.getTaskGroups() == nil)
+
+	app.setSchedulingPolicy(&v1alpha1.SchedulingPolicy{
+		Type: v1alpha1.TryReserve,
+		Parameters: map[string]string{
+			"option-1": "value-1",
+			"option-2": "value-2",
+		},
+	})
+
+	assert.Equal(t, app.getSchedulingPolicy().Type, v1alpha1.TryReserve)
+	assert.Equal(t, len(app.getSchedulingPolicy().Parameters), 2)
+	assert.Equal(t, app.getSchedulingPolicy().Parameters["option-1"], "value-1")
+	assert.Equal(t, app.getSchedulingPolicy().Parameters["option-2"], "value-2")

Review comment:
       done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on a change in pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
yangwwei commented on a change in pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203#discussion_r514616281



##########
File path: pkg/cache/application_test.go
##########
@@ -181,19 +190,192 @@ func TestGetNonTerminatedTaskAlias(t *testing.T) {
 	assert.Assert(t, is.Contains(res, "/test-00001"))
 	assert.Assert(t, is.Contains(res, "/test-00002"))
 
-	//set two tasks to terminated states
+	// set two tasks to terminated states
 	task1.sm.SetState(events.States().Task.Rejected)
 	task2.sm.SetState(events.States().Task.Rejected)
 	// check the tasks both in terminated states
 	// res should retuen empty
 	res = app.getNonTerminatedTaskAlias()
 	assert.Equal(t, len(res), 0)
 
-	//set two tasks to one is terminated, another is non-terminated
+	// set two tasks to one is terminated, another is non-terminated
 	task1.sm.SetState(events.States().Task.Rejected)
 	task2.sm.SetState(events.States().Task.Allocated)
 	// check the task, should only return task2's alias
 	res = app.getNonTerminatedTaskAlias()
 	assert.Equal(t, len(res), 1)
 	assert.Equal(t, res[0], "/test-00002")
 }
+
+func TestSetTaskGroupsAndSchedulingPolicy(t *testing.T) {
+	app := NewApplication("app01", "root.a", "test-user", map[string]string{}, newMockSchedulerAPI())
+	assert.Assert(t, app.getSchedulingPolicy() == nil)
+	assert.Assert(t, app.getTaskGroups() == nil)
+
+	app.setSchedulingPolicy(&v1alpha1.SchedulingPolicy{
+		Type: v1alpha1.TryReserve,
+		Parameters: map[string]string{
+			"option-1": "value-1",
+			"option-2": "value-2",
+		},
+	})
+
+	assert.Equal(t, app.getSchedulingPolicy().Type, v1alpha1.TryReserve)
+	assert.Equal(t, len(app.getSchedulingPolicy().Parameters), 2)
+	assert.Equal(t, app.getSchedulingPolicy().Parameters["option-1"], "value-1")
+	assert.Equal(t, app.getSchedulingPolicy().Parameters["option-2"], "value-2")
+
+	duration := int64(3000)
+	app.setTaskGroups([]*v1alpha1.TaskGroup{
+		{
+			Name:      "test-group-1",
+			MinMember: 10,
+			MinResource: map[string]resource.Quantity{
+				"cpu":    resource.MustParse("500m"),
+				"memory": resource.MustParse("500Mi"),
+			},
+		},
+		{
+			Name:      "test-group-2",
+			MinMember: 20,
+			MinResource: map[string]resource.Quantity{
+				"cpu":    resource.MustParse("1000m"),
+				"memory": resource.MustParse("1000Mi"),

Review comment:
       I've changed to `v1.ResourceMemory.String()`.
   This is because in yunikorn we are calling "vcore", not "cpu".
   that is slight different that K8s.
   And here, since the annotation is user-facing, we should follow the K8s convention.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] codecov[bot] commented on pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203#issuecomment-717784125


   # [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=h1) Report
   > :exclamation: No coverage uploaded for pull request base (`YUNIKORN-2@6eba8fd`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203/graphs/tree.svg?width=650&height=150&src=pr&token=LZImIuvleR)](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             YUNIKORN-2     #203   +/-   ##
   =============================================
     Coverage              ?   58.62%           
   =============================================
     Files                 ?       36           
     Lines                 ?     3031           
     Branches              ?        0           
   =============================================
     Hits                  ?     1777           
     Misses                ?     1181           
     Partials              ?       73           
   ```
   
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=footer). Last update [6eba8fd...19590cb](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on a change in pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
yangwwei commented on a change in pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203#discussion_r514615234



##########
File path: pkg/cache/placeholder_manager.go
##########
@@ -0,0 +1,84 @@
+/*
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/
+
+package cache
+
+import (
+	"sync"
+
+	"go.uber.org/zap"
+
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/client"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/common/utils"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/log"
+)
+
+// placeholder manager is a service to manage the lifecycle of app placeholders
+type PlaceholderManager struct {
+	clients *client.Clients
+	sync.RWMutex
+}
+
+var placeholderMgr *PlaceholderManager
+var once sync.Once
+
+func GetPlaceholderManager() *PlaceholderManager {
+	once.Do(func() {
+		placeholderMgr = &PlaceholderManager{}
+	})
+	return placeholderMgr
+}
+
+func (mgr *PlaceholderManager) createAppPlaceholders(app *Application) error {
+	mgr.Lock()
+	defer mgr.Unlock()
+
+	// iterate all task groups, create placeholders for all the min members
+	for _, tg := range app.getTaskGroups() {
+		for i := int32(0); i < tg.MinMember; i++ {
+			placeholderName := utils.GeneratePlaceholderName(tg.Name, app.GetApplicationID(), i)
+			placeholder := newPlaceholder(placeholderName, app, tg)
+			// create the placeholder on K8s
+			_, err := mgr.clients.KubeClient.Create(placeholder.pod)
+			if err != nil {
+				// if failed to create the place holder pod
+				// caller should handle this error
+				log.Logger().Error("failed to create placeholder pod",
+					zap.Error(err))
+				return err
+			}
+			log.Logger().Info("placeholder created",
+				zap.String("placeholder", placeholder.String()))
+		}
+	}
+
+	return nil
+}
+
+// recycle all the placeholders for an application
+func (mgr *PlaceholderManager) Recycle(appID string) {
+	log.Logger().Info("start to recycle app placeholders",
+		zap.String("appID", appID))

Review comment:
       yes




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] codecov[bot] edited a comment on pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203#issuecomment-717784125


   # [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=h1) Report
   > :exclamation: No coverage uploaded for pull request base (`YUNIKORN-2@6eba8fd`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203/graphs/tree.svg?width=650&height=150&src=pr&token=LZImIuvleR)](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             YUNIKORN-2     #203   +/-   ##
   =============================================
     Coverage              ?   58.95%           
   =============================================
     Files                 ?       36           
     Lines                 ?     3089           
     Branches              ?        0           
   =============================================
     Hits                  ?     1821           
     Misses                ?     1190           
     Partials              ?       78           
   ```
   
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=footer). Last update [6eba8fd...4b1b658](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] yangwwei merged pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
yangwwei merged pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] codecov[bot] edited a comment on pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203#issuecomment-717784125


   # [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=h1) Report
   > :exclamation: No coverage uploaded for pull request base (`YUNIKORN-2@6eba8fd`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203/graphs/tree.svg?width=650&height=150&src=pr&token=LZImIuvleR)](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             YUNIKORN-2     #203   +/-   ##
   =============================================
     Coverage              ?   58.62%           
   =============================================
     Files                 ?       36           
     Lines                 ?     3031           
     Branches              ?        0           
   =============================================
     Hits                  ?     1777           
     Misses                ?     1181           
     Partials              ?       73           
   ```
   
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=footer). Last update [6eba8fd...4b1b658](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] codecov[bot] edited a comment on pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203#issuecomment-717784125


   # [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=h1) Report
   > :exclamation: No coverage uploaded for pull request base (`YUNIKORN-2@6eba8fd`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203/graphs/tree.svg?width=650&height=150&src=pr&token=LZImIuvleR)](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             YUNIKORN-2     #203   +/-   ##
   =============================================
     Coverage              ?   59.33%           
   =============================================
     Files                 ?       36           
     Lines                 ?     3089           
     Branches              ?        0           
   =============================================
     Hits                  ?     1833           
     Misses                ?     1179           
     Partials              ?       77           
   ```
   
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=footer). Last update [6eba8fd...24c13ab](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] codecov[bot] edited a comment on pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203#issuecomment-717784125


   # [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=h1) Report
   > :exclamation: No coverage uploaded for pull request base (`YUNIKORN-2@6eba8fd`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203/graphs/tree.svg?width=650&height=150&src=pr&token=LZImIuvleR)](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             YUNIKORN-2     #203   +/-   ##
   =============================================
     Coverage              ?   58.95%           
   =============================================
     Files                 ?       36           
     Lines                 ?     3089           
     Branches              ?        0           
   =============================================
     Hits                  ?     1821           
     Misses                ?     1190           
     Partials              ?       78           
   ```
   
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=footer). Last update [6eba8fd...24c13ab](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] kingamarton commented on a change in pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
kingamarton commented on a change in pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203#discussion_r514282487



##########
File path: pkg/cache/application.go
##########
@@ -359,6 +402,50 @@ func (app *Application) handleRecoverApplicationEvent(event *fsm.Event) {
 	}
 }
 
+func (app *Application) postAppAccepted(event *fsm.Event) {
+	// if app has taskGroups defined, it goes to the Reserving state before getting to Running
+	var ev events.SchedulingEvent
+	if app.taskGroups != nil {
+		ev = NewSimpleApplicationEvent(app.applicationID, events.TryReserve)
+		dispatcher.Dispatch(ev)
+	} else {
+		ev = NewRunApplicationEvent(app.applicationID)
+	}
+	dispatcher.Dispatch(ev)
+}
+
+func (app *Application) onReserving(event *fsm.Event) {
+	go func() {
+		// while doing reserving
+		if err := GetPlaceholderManager().createAppPlaceholders(app); err != nil {
+			// creating placeholder failed
+			// put the app into recycling queue and turn the app to running state
+			GetPlaceholderManager().Recycle(app.applicationID)
+			ev := NewRunApplicationEvent(app.applicationID)

Review comment:
       If the placeholder creation failed, why do we transit the application into running state? 

##########
File path: pkg/cache/placeholder_manager.go
##########
@@ -0,0 +1,84 @@
+/*
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/
+
+package cache
+
+import (
+	"sync"
+
+	"go.uber.org/zap"
+
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/client"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/common/utils"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/log"
+)
+
+// placeholder manager is a service to manage the lifecycle of app placeholders
+type PlaceholderManager struct {
+	clients *client.Clients
+	sync.RWMutex
+}
+
+var placeholderMgr *PlaceholderManager
+var once sync.Once
+
+func GetPlaceholderManager() *PlaceholderManager {
+	once.Do(func() {
+		placeholderMgr = &PlaceholderManager{}

Review comment:
       I can't see where the clients got its value

##########
File path: pkg/cache/application_test.go
##########
@@ -181,19 +190,192 @@ func TestGetNonTerminatedTaskAlias(t *testing.T) {
 	assert.Assert(t, is.Contains(res, "/test-00001"))
 	assert.Assert(t, is.Contains(res, "/test-00002"))
 
-	//set two tasks to terminated states
+	// set two tasks to terminated states
 	task1.sm.SetState(events.States().Task.Rejected)
 	task2.sm.SetState(events.States().Task.Rejected)
 	// check the tasks both in terminated states
 	// res should retuen empty
 	res = app.getNonTerminatedTaskAlias()
 	assert.Equal(t, len(res), 0)
 
-	//set two tasks to one is terminated, another is non-terminated
+	// set two tasks to one is terminated, another is non-terminated
 	task1.sm.SetState(events.States().Task.Rejected)
 	task2.sm.SetState(events.States().Task.Allocated)
 	// check the task, should only return task2's alias
 	res = app.getNonTerminatedTaskAlias()
 	assert.Equal(t, len(res), 1)
 	assert.Equal(t, res[0], "/test-00002")
 }
+
+func TestSetTaskGroupsAndSchedulingPolicy(t *testing.T) {
+	app := NewApplication("app01", "root.a", "test-user", map[string]string{}, newMockSchedulerAPI())
+	assert.Assert(t, app.getSchedulingPolicy() == nil)
+	assert.Assert(t, app.getTaskGroups() == nil)
+
+	app.setSchedulingPolicy(&v1alpha1.SchedulingPolicy{
+		Type: v1alpha1.TryReserve,
+		Parameters: map[string]string{
+			"option-1": "value-1",
+			"option-2": "value-2",
+		},
+	})
+
+	assert.Equal(t, app.getSchedulingPolicy().Type, v1alpha1.TryReserve)
+	assert.Equal(t, len(app.getSchedulingPolicy().Parameters), 2)
+	assert.Equal(t, app.getSchedulingPolicy().Parameters["option-1"], "value-1")
+	assert.Equal(t, app.getSchedulingPolicy().Parameters["option-2"], "value-2")
+
+	duration := int64(3000)
+	app.setTaskGroups([]*v1alpha1.TaskGroup{
+		{
+			Name:      "test-group-1",
+			MinMember: 10,
+			MinResource: map[string]resource.Quantity{
+				"cpu":    resource.MustParse("500m"),
+				"memory": resource.MustParse("500Mi"),

Review comment:
       use constants.CPU and constants.Memory

##########
File path: pkg/cache/placeholder_manager_test.go
##########
@@ -0,0 +1,88 @@
+/*
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/
+
+package cache
+
+import (
+	"fmt"
+	"sync"
+	"testing"
+
+	"gotest.tools/assert"
+	v1 "k8s.io/api/core/v1"
+	"k8s.io/apimachinery/pkg/api/resource"
+
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/apis/yunikorn.apache.org/v1alpha1"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/client"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/common/constants"
+)
+
+func TestCreateAppPlaceholders(t *testing.T) {
+	const (
+		appID     = "app01"
+		queue     = "root.default"
+		namespace = "test"
+	)
+	mockedSchedulerAPI := newMockSchedulerAPI()
+	app := NewApplication(appID, queue,
+		"bob", map[string]string{constants.AppTagNamespace: namespace}, mockedSchedulerAPI)
+	app.setTaskGroups([]*v1alpha1.TaskGroup{
+		{
+			Name:      "test-group-1",
+			MinMember: 10,
+			MinResource: map[string]resource.Quantity{
+				"cpu":    resource.MustParse("500m"),
+				"memory": resource.MustParse("1024M"),

Review comment:
       Use constants

##########
File path: pkg/cache/placeholder_test.go
##########
@@ -0,0 +1,67 @@
+/*
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/
+
+package cache
+
+import (
+	"testing"
+
+	"gotest.tools/assert"
+	"k8s.io/apimachinery/pkg/api/resource"
+
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/apis/yunikorn.apache.org/v1alpha1"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/common"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/common/constants"
+)
+
+func TestNewPlaceholder(t *testing.T) {
+	const (
+		appID     = "app01"
+		queue     = "root.default"
+		namespace = "test"
+	)
+	mockedSchedulerAPI := newMockSchedulerAPI()
+	app := NewApplication(appID, queue,
+		"bob", map[string]string{constants.AppTagNamespace: namespace}, mockedSchedulerAPI)
+	app.setTaskGroups([]*v1alpha1.TaskGroup{
+		{
+			Name:      "test-group-1",
+			MinMember: 10,
+			MinResource: map[string]resource.Quantity{
+				"cpu":    resource.MustParse("500m"),
+				"memory": resource.MustParse("1024M"),

Review comment:
       use constants

##########
File path: pkg/cache/application.go
##########
@@ -359,6 +402,50 @@ func (app *Application) handleRecoverApplicationEvent(event *fsm.Event) {
 	}
 }
 
+func (app *Application) postAppAccepted(event *fsm.Event) {
+	// if app has taskGroups defined, it goes to the Reserving state before getting to Running
+	var ev events.SchedulingEvent
+	if app.taskGroups != nil {

Review comment:
       isn't it better to use len(app.taskGroups) == 0? This way the empty task group will be covered as well.

##########
File path: pkg/cache/placeholder_manager.go
##########
@@ -0,0 +1,84 @@
+/*
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/
+
+package cache
+
+import (
+	"sync"
+
+	"go.uber.org/zap"
+
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/client"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/common/utils"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/log"
+)
+
+// placeholder manager is a service to manage the lifecycle of app placeholders
+type PlaceholderManager struct {
+	clients *client.Clients
+	sync.RWMutex
+}
+
+var placeholderMgr *PlaceholderManager
+var once sync.Once
+
+func GetPlaceholderManager() *PlaceholderManager {
+	once.Do(func() {
+		placeholderMgr = &PlaceholderManager{}
+	})
+	return placeholderMgr
+}
+
+func (mgr *PlaceholderManager) createAppPlaceholders(app *Application) error {
+	mgr.Lock()
+	defer mgr.Unlock()
+
+	// iterate all task groups, create placeholders for all the min members
+	for _, tg := range app.getTaskGroups() {
+		for i := int32(0); i < tg.MinMember; i++ {
+			placeholderName := utils.GeneratePlaceholderName(tg.Name, app.GetApplicationID(), i)
+			placeholder := newPlaceholder(placeholderName, app, tg)
+			// create the placeholder on K8s
+			_, err := mgr.clients.KubeClient.Create(placeholder.pod)
+			if err != nil {
+				// if failed to create the place holder pod
+				// caller should handle this error
+				log.Logger().Error("failed to create placeholder pod",
+					zap.Error(err))
+				return err
+			}
+			log.Logger().Info("placeholder created",
+				zap.String("placeholder", placeholder.String()))
+		}
+	}
+
+	return nil
+}
+
+// recycle all the placeholders for an application
+func (mgr *PlaceholderManager) Recycle(appID string) {
+	log.Logger().Info("start to recycle app placeholders",
+		zap.String("appID", appID))

Review comment:
       Here will come the implementation in some of the next subtasks, right?

##########
File path: pkg/cache/application.go
##########
@@ -28,25 +28,29 @@ import (
 	v1 "k8s.io/api/core/v1"
 
 	"github.com/apache/incubator-yunikorn-core/pkg/api"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/apis/yunikorn.apache.org/v1alpha1"
 	"github.com/apache/incubator-yunikorn-k8shim/pkg/appmgmt/interfaces"
 	"github.com/apache/incubator-yunikorn-k8shim/pkg/common/constants"
 	"github.com/apache/incubator-yunikorn-k8shim/pkg/common/events"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/common/utils"
 	"github.com/apache/incubator-yunikorn-k8shim/pkg/conf"
 	"github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher"
 	"github.com/apache/incubator-yunikorn-k8shim/pkg/log"
 	"github.com/apache/incubator-yunikorn-scheduler-interface/lib/go/si"
 )
 
 type Application struct {
-	applicationID string
-	queue         string
-	partition     string
-	user          string
-	taskMap       map[string]*Task
-	tags          map[string]string
-	sm            *fsm.FSM
-	lock          *sync.RWMutex
-	schedulerAPI  api.SchedulerAPI
+	applicationID    string
+	queue            string
+	partition        string
+	user             string
+	taskMap          map[string]*Task
+	tags             map[string]string
+	schedulingPolicy *v1alpha1.SchedulingPolicy
+	taskGroups       []*v1alpha1.TaskGroup

Review comment:
       This 2 fields are from the Application CRD. I cannot decide whether it is good to reuse them in a general code. It might work,  but I would rather create a new structure for it here, even if it will be a duplicated code, but at least we will not depend on the CRD structure. What do you think @yangwwei?

##########
File path: pkg/cache/placeholder.go
##########
@@ -0,0 +1,85 @@
+/*
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/
+
+package cache
+
+import (
+	"fmt"
+
+	v1 "k8s.io/api/core/v1"
+	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/apis/yunikorn.apache.org/v1alpha1"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/common/constants"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/common/utils"
+)
+
+type Placeholder struct {
+	appID         string
+	taskGroupName string
+	pod           *v1.Pod
+	stage         PlaceholderState
+}
+
+type PlaceholderState string
+
+const (
+	Acquiring PlaceholderState = "Acquiring"
+	Acquired  PlaceholderState = "Acquired"
+)
+
+func newPlaceholder(placeholderName string, app *Application, taskGroup *v1alpha1.TaskGroup) *Placeholder {
+	placeholderPod := &v1.Pod{

Review comment:
       On K8s side, how we can differentiate a placeholder from a real pod? Shouldn't we add a label or something to differentiate it from the real pod?

##########
File path: pkg/cache/application_test.go
##########
@@ -181,19 +190,192 @@ func TestGetNonTerminatedTaskAlias(t *testing.T) {
 	assert.Assert(t, is.Contains(res, "/test-00001"))
 	assert.Assert(t, is.Contains(res, "/test-00002"))
 
-	//set two tasks to terminated states
+	// set two tasks to terminated states
 	task1.sm.SetState(events.States().Task.Rejected)
 	task2.sm.SetState(events.States().Task.Rejected)
 	// check the tasks both in terminated states
 	// res should retuen empty
 	res = app.getNonTerminatedTaskAlias()
 	assert.Equal(t, len(res), 0)
 
-	//set two tasks to one is terminated, another is non-terminated
+	// set two tasks to one is terminated, another is non-terminated
 	task1.sm.SetState(events.States().Task.Rejected)
 	task2.sm.SetState(events.States().Task.Allocated)
 	// check the task, should only return task2's alias
 	res = app.getNonTerminatedTaskAlias()
 	assert.Equal(t, len(res), 1)
 	assert.Equal(t, res[0], "/test-00002")
 }
+
+func TestSetTaskGroupsAndSchedulingPolicy(t *testing.T) {
+	app := NewApplication("app01", "root.a", "test-user", map[string]string{}, newMockSchedulerAPI())
+	assert.Assert(t, app.getSchedulingPolicy() == nil)
+	assert.Assert(t, app.getTaskGroups() == nil)
+
+	app.setSchedulingPolicy(&v1alpha1.SchedulingPolicy{
+		Type: v1alpha1.TryReserve,
+		Parameters: map[string]string{
+			"option-1": "value-1",
+			"option-2": "value-2",
+		},
+	})
+
+	assert.Equal(t, app.getSchedulingPolicy().Type, v1alpha1.TryReserve)
+	assert.Equal(t, len(app.getSchedulingPolicy().Parameters), 2)
+	assert.Equal(t, app.getSchedulingPolicy().Parameters["option-1"], "value-1")
+	assert.Equal(t, app.getSchedulingPolicy().Parameters["option-2"], "value-2")
+
+	duration := int64(3000)
+	app.setTaskGroups([]*v1alpha1.TaskGroup{
+		{
+			Name:      "test-group-1",
+			MinMember: 10,
+			MinResource: map[string]resource.Quantity{
+				"cpu":    resource.MustParse("500m"),
+				"memory": resource.MustParse("500Mi"),
+			},
+		},
+		{
+			Name:      "test-group-2",
+			MinMember: 20,
+			MinResource: map[string]resource.Quantity{
+				"cpu":    resource.MustParse("1000m"),
+				"memory": resource.MustParse("1000Mi"),

Review comment:
       use constants.CPU and constants.Memory

##########
File path: pkg/cache/application_test.go
##########
@@ -181,19 +190,192 @@ func TestGetNonTerminatedTaskAlias(t *testing.T) {
 	assert.Assert(t, is.Contains(res, "/test-00001"))
 	assert.Assert(t, is.Contains(res, "/test-00002"))
 
-	//set two tasks to terminated states
+	// set two tasks to terminated states
 	task1.sm.SetState(events.States().Task.Rejected)
 	task2.sm.SetState(events.States().Task.Rejected)
 	// check the tasks both in terminated states
 	// res should retuen empty
 	res = app.getNonTerminatedTaskAlias()
 	assert.Equal(t, len(res), 0)
 
-	//set two tasks to one is terminated, another is non-terminated
+	// set two tasks to one is terminated, another is non-terminated
 	task1.sm.SetState(events.States().Task.Rejected)
 	task2.sm.SetState(events.States().Task.Allocated)
 	// check the task, should only return task2's alias
 	res = app.getNonTerminatedTaskAlias()
 	assert.Equal(t, len(res), 1)
 	assert.Equal(t, res[0], "/test-00002")
 }
+
+func TestSetTaskGroupsAndSchedulingPolicy(t *testing.T) {
+	app := NewApplication("app01", "root.a", "test-user", map[string]string{}, newMockSchedulerAPI())
+	assert.Assert(t, app.getSchedulingPolicy() == nil)
+	assert.Assert(t, app.getTaskGroups() == nil)
+
+	app.setSchedulingPolicy(&v1alpha1.SchedulingPolicy{
+		Type: v1alpha1.TryReserve,
+		Parameters: map[string]string{
+			"option-1": "value-1",
+			"option-2": "value-2",
+		},
+	})
+
+	assert.Equal(t, app.getSchedulingPolicy().Type, v1alpha1.TryReserve)
+	assert.Equal(t, len(app.getSchedulingPolicy().Parameters), 2)
+	assert.Equal(t, app.getSchedulingPolicy().Parameters["option-1"], "value-1")
+	assert.Equal(t, app.getSchedulingPolicy().Parameters["option-2"], "value-2")
+
+	duration := int64(3000)
+	app.setTaskGroups([]*v1alpha1.TaskGroup{
+		{
+			Name:      "test-group-1",
+			MinMember: 10,
+			MinResource: map[string]resource.Quantity{
+				"cpu":    resource.MustParse("500m"),
+				"memory": resource.MustParse("500Mi"),
+			},
+		},
+		{
+			Name:      "test-group-2",
+			MinMember: 20,
+			MinResource: map[string]resource.Quantity{
+				"cpu":    resource.MustParse("1000m"),
+				"memory": resource.MustParse("1000Mi"),
+			},
+			NodeSelector: metav1.LabelSelector{
+				MatchExpressions: []metav1.LabelSelectorRequirement{
+					{
+						Key:      "security",
+						Operator: metav1.LabelSelectorOpIn,
+						Values:   []string{"S1", "value2"},
+					},
+				},
+			},
+			Tolerations: []v1.Toleration{
+				{
+					Key:               "nodeType",
+					Operator:          v1.TolerationOpEqual,
+					Value:             "infra",
+					Effect:            v1.TaintEffectNoSchedule,
+					TolerationSeconds: &duration,
+				},
+			},
+		},
+	})
+
+	assert.Assert(t, app.getTaskGroups() != nil)
+	assert.Equal(t, len(app.getTaskGroups()), 2)
+
+	// sort the slice to give us a stable order
+	sort.Slice(app.getTaskGroups(), func(i, j int) bool {
+		return strings.Compare(app.getTaskGroups()[i].Name, app.getTaskGroups()[j].Name) < 0
+	})
+
+	tg1 := app.getTaskGroups()[0]
+	assert.Equal(t, tg1.Name, "test-group-1")
+	assert.Equal(t, tg1.MinMember, int32(10))
+	assert.Equal(t, tg1.MinResource["cpu"], resource.MustParse("500m"))
+	assert.Equal(t, tg1.MinResource["memory"], resource.MustParse("500Mi"))

Review comment:
       use constants.CPU and constants.Memory

##########
File path: pkg/cache/placeholder_manager_test.go
##########
@@ -0,0 +1,88 @@
+/*
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/
+
+package cache
+
+import (
+	"fmt"
+	"sync"
+	"testing"
+
+	"gotest.tools/assert"
+	v1 "k8s.io/api/core/v1"
+	"k8s.io/apimachinery/pkg/api/resource"
+
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/apis/yunikorn.apache.org/v1alpha1"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/client"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/common/constants"
+)
+
+func TestCreateAppPlaceholders(t *testing.T) {
+	const (
+		appID     = "app01"
+		queue     = "root.default"
+		namespace = "test"
+	)
+	mockedSchedulerAPI := newMockSchedulerAPI()
+	app := NewApplication(appID, queue,
+		"bob", map[string]string{constants.AppTagNamespace: namespace}, mockedSchedulerAPI)
+	app.setTaskGroups([]*v1alpha1.TaskGroup{
+		{
+			Name:      "test-group-1",
+			MinMember: 10,
+			MinResource: map[string]resource.Quantity{
+				"cpu":    resource.MustParse("500m"),
+				"memory": resource.MustParse("1024M"),
+			},
+		},
+		{
+			Name:      "test-group-2",
+			MinMember: 20,
+			MinResource: map[string]resource.Quantity{
+				"cpu":    resource.MustParse("1000m"),
+				"memory": resource.MustParse("2048M"),

Review comment:
       Use constants
   

##########
File path: pkg/cache/application_test.go
##########
@@ -181,19 +190,192 @@ func TestGetNonTerminatedTaskAlias(t *testing.T) {
 	assert.Assert(t, is.Contains(res, "/test-00001"))
 	assert.Assert(t, is.Contains(res, "/test-00002"))
 
-	//set two tasks to terminated states
+	// set two tasks to terminated states
 	task1.sm.SetState(events.States().Task.Rejected)
 	task2.sm.SetState(events.States().Task.Rejected)
 	// check the tasks both in terminated states
 	// res should retuen empty
 	res = app.getNonTerminatedTaskAlias()
 	assert.Equal(t, len(res), 0)
 
-	//set two tasks to one is terminated, another is non-terminated
+	// set two tasks to one is terminated, another is non-terminated
 	task1.sm.SetState(events.States().Task.Rejected)
 	task2.sm.SetState(events.States().Task.Allocated)
 	// check the task, should only return task2's alias
 	res = app.getNonTerminatedTaskAlias()
 	assert.Equal(t, len(res), 1)
 	assert.Equal(t, res[0], "/test-00002")
 }
+
+func TestSetTaskGroupsAndSchedulingPolicy(t *testing.T) {
+	app := NewApplication("app01", "root.a", "test-user", map[string]string{}, newMockSchedulerAPI())
+	assert.Assert(t, app.getSchedulingPolicy() == nil)
+	assert.Assert(t, app.getTaskGroups() == nil)
+
+	app.setSchedulingPolicy(&v1alpha1.SchedulingPolicy{
+		Type: v1alpha1.TryReserve,
+		Parameters: map[string]string{
+			"option-1": "value-1",
+			"option-2": "value-2",
+		},
+	})
+
+	assert.Equal(t, app.getSchedulingPolicy().Type, v1alpha1.TryReserve)
+	assert.Equal(t, len(app.getSchedulingPolicy().Parameters), 2)
+	assert.Equal(t, app.getSchedulingPolicy().Parameters["option-1"], "value-1")
+	assert.Equal(t, app.getSchedulingPolicy().Parameters["option-2"], "value-2")

Review comment:
       nit: add assertion message 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on a change in pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
yangwwei commented on a change in pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203#discussion_r514610959



##########
File path: pkg/cache/application.go
##########
@@ -359,6 +402,50 @@ func (app *Application) handleRecoverApplicationEvent(event *fsm.Event) {
 	}
 }
 
+func (app *Application) postAppAccepted(event *fsm.Event) {
+	// if app has taskGroups defined, it goes to the Reserving state before getting to Running
+	var ev events.SchedulingEvent
+	if app.taskGroups != nil {

Review comment:
       yes, that is better. Let me change that way. Thanks!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] codecov[bot] edited a comment on pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203#issuecomment-717784125


   # [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=h1) Report
   > :exclamation: No coverage uploaded for pull request base (`YUNIKORN-2@6eba8fd`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203/graphs/tree.svg?width=650&height=150&src=pr&token=LZImIuvleR)](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             YUNIKORN-2     #203   +/-   ##
   =============================================
     Coverage              ?   59.33%           
   =============================================
     Files                 ?       36           
     Lines                 ?     3089           
     Branches              ?        0           
   =============================================
     Hits                  ?     1833           
     Misses                ?     1179           
     Partials              ?       77           
   ```
   
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=footer). Last update [6eba8fd...24c13ab](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on a change in pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
yangwwei commented on a change in pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203#discussion_r514614533



##########
File path: pkg/cache/application.go
##########
@@ -359,6 +402,50 @@ func (app *Application) handleRecoverApplicationEvent(event *fsm.Event) {
 	}
 }
 
+func (app *Application) postAppAccepted(event *fsm.Event) {
+	// if app has taskGroups defined, it goes to the Reserving state before getting to Running
+	var ev events.SchedulingEvent
+	if app.taskGroups != nil {
+		ev = NewSimpleApplicationEvent(app.applicationID, events.TryReserve)
+		dispatcher.Dispatch(ev)
+	} else {
+		ev = NewRunApplicationEvent(app.applicationID)
+	}
+	dispatcher.Dispatch(ev)
+}
+
+func (app *Application) onReserving(event *fsm.Event) {
+	go func() {
+		// while doing reserving
+		if err := GetPlaceholderManager().createAppPlaceholders(app); err != nil {
+			// creating placeholder failed
+			// put the app into recycling queue and turn the app to running state
+			GetPlaceholderManager().Recycle(app.applicationID)
+			ev := NewRunApplicationEvent(app.applicationID)

Review comment:
       If the placeholder creation failed, that means something goes wrong on the cluster. An error handling for this scenario is to delete all existing placeholders and move the app to the next stage. In the Running state, we run app's normal pods, try to schedule them without creating any placeholders. This way, we won't be **stuck** at the Reserving state forever, which is wasting cluster resources.
   
   A possible improvement for this is to do some **retry** while creating placeholders. Retries for some time before completely giving up.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on a change in pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
yangwwei commented on a change in pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203#discussion_r514615124



##########
File path: pkg/cache/placeholder.go
##########
@@ -0,0 +1,85 @@
+/*
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/
+
+package cache
+
+import (
+	"fmt"
+
+	v1 "k8s.io/api/core/v1"
+	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/apis/yunikorn.apache.org/v1alpha1"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/common/constants"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/common/utils"
+)
+
+type Placeholder struct {
+	appID         string
+	taskGroupName string
+	pod           *v1.Pod
+	stage         PlaceholderState
+}
+
+type PlaceholderState string
+
+const (
+	Acquiring PlaceholderState = "Acquiring"
+	Acquired  PlaceholderState = "Acquired"
+)
+
+func newPlaceholder(placeholderName string, app *Application, taskGroup *v1alpha1.TaskGroup) *Placeholder {
+	placeholderPod := &v1.Pod{

Review comment:
       Yes, the idea is to have an **annotation** field in the placeholder pod to differentiate from the normal ones. The annotation is "yunikorn.apache.org/task-group-name". We use this to check if a pod is a placeholder, and which task group it belongs to.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on a change in pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
yangwwei commented on a change in pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203#discussion_r514610276



##########
File path: pkg/cache/application.go
##########
@@ -28,25 +28,29 @@ import (
 	v1 "k8s.io/api/core/v1"
 
 	"github.com/apache/incubator-yunikorn-core/pkg/api"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/apis/yunikorn.apache.org/v1alpha1"
 	"github.com/apache/incubator-yunikorn-k8shim/pkg/appmgmt/interfaces"
 	"github.com/apache/incubator-yunikorn-k8shim/pkg/common/constants"
 	"github.com/apache/incubator-yunikorn-k8shim/pkg/common/events"
+	"github.com/apache/incubator-yunikorn-k8shim/pkg/common/utils"
 	"github.com/apache/incubator-yunikorn-k8shim/pkg/conf"
 	"github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher"
 	"github.com/apache/incubator-yunikorn-k8shim/pkg/log"
 	"github.com/apache/incubator-yunikorn-scheduler-interface/lib/go/si"
 )
 
 type Application struct {
-	applicationID string
-	queue         string
-	partition     string
-	user          string
-	taskMap       map[string]*Task
-	tags          map[string]string
-	sm            *fsm.FSM
-	lock          *sync.RWMutex
-	schedulerAPI  api.SchedulerAPI
+	applicationID    string
+	queue            string
+	partition        string
+	user             string
+	taskMap          map[string]*Task
+	tags             map[string]string
+	schedulingPolicy *v1alpha1.SchedulingPolicy
+	taskGroups       []*v1alpha1.TaskGroup

Review comment:
       Hi @kingamarton I think we should depend on the CRD types. The CRD types are defined in the API package, they are supposed to represent some high-level common fields for YK. TaskGroups is one of them. Having intermittent structs for them seems not necessary for me.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-k8shim] codecov[bot] edited a comment on pull request #203: [YUNIKORN-453] add taskGroup info to apps/tasks

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #203:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/203#issuecomment-717784125


   # [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=h1) Report
   > :exclamation: No coverage uploaded for pull request base (`YUNIKORN-2@6eba8fd`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203/graphs/tree.svg?width=650&height=150&src=pr&token=LZImIuvleR)](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             YUNIKORN-2     #203   +/-   ##
   =============================================
     Coverage              ?   58.95%           
   =============================================
     Files                 ?       36           
     Lines                 ?     3089           
     Branches              ?        0           
   =============================================
     Hits                  ?     1821           
     Misses                ?     1190           
     Partials              ?       78           
   ```
   
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=footer). Last update [6eba8fd...4b1b658](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/203?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org