You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@openwhisk.apache.org by GitBox <gi...@apache.org> on 2022/11/02 12:50:58 UTC

[GitHub] [openwhisk] style95 opened a new pull request, #5348: Cleanup container data

style95 opened a new pull request, #5348:
URL: https://github.com/apache/openwhisk/pull/5348

   <!--- Provide a concise summary of your changes in the Title -->
   
   ## Description
   This is to fix the regression introduced from https://github.com/apache/openwhisk/pull/5338
   
   ## Related issue and scope
   <!--- Please include a link to a related issue if there is one. -->
   - [ ] I opened an issue to propose and discuss this change (#????)
   
   ## My changes affect the following components
   <!--- Select below all system components are affected by your change. -->
   <!--- Enter an `x` in all applicable boxes. -->
   - [ ] API
   - [ ] Controller
   - [ ] Message Bus (e.g., Kafka)
   - [ ] Loadbalancer
   - [ ] Scheduler
   - [x] Invoker
   - [ ] Intrinsic actions (e.g., sequences, conductors)
   - [ ] Data stores (e.g., CouchDB)
   - [ ] Tests
   - [x] Deployment
   - [ ] CLI
   - [ ] General tooling
   - [ ] Documentation
   
   ## Types of changes
   <!--- What types of changes does your code introduce? Use `x` in all the boxes that apply: -->
   - [x] Bug fix (generally a non-breaking change which closes an issue).
   - [ ] Enhancement or new feature (adds new functionality).
   - [ ] Breaking change (a bug fix or enhancement which changes existing behavior).
   
   ## Checklist:
   <!--- Please review the points below which help you make sure you've covered all aspects of the change you're making. -->
   
   - [x] I signed an [Apache CLA](https://github.com/apache/openwhisk/blob/master/CONTRIBUTING.md).
   - [x] I reviewed the [style guides](https://github.com/apache/openwhisk/blob/master/CONTRIBUTING.md#coding-standards) and followed the recommendations (Travis CI will check :).
   - [ ] I added tests to cover my changes.
   - [ ] My changes require further changes to the documentation.
   - [ ] I updated the documentation where necessary.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] style95 commented on a diff in pull request #5348: Handle container cleanup from ActivationClient shutdown gracefully

Posted by GitBox <gi...@apache.org>.
style95 commented on code in PR #5348:
URL: https://github.com/apache/openwhisk/pull/5348#discussion_r1012518939


##########
core/invoker/src/main/scala/org/apache/openwhisk/core/containerpool/v2/ActivationClientProxy.scala:
##########
@@ -164,9 +166,12 @@ class ActivationClientProxy(
           stay()
 
         case _: ActionMismatch =>
-          logging.error(this, s"[${containerId.asString}] action version does not match: $action")
+          val errorMsg = s"[${containerId.asString}] action version does not match: $action"
+          logging.error(this, errorMsg)
           c.activationClient.close().andThen {
-            case _ => self ! ClientClosed
+            case _ =>

Review Comment:
   I think so.
   As I stated [here](https://github.com/apache/openwhisk/pull/5348/files/997f8a93eaadf32815e42c93252b70dfdaddd834#r1011720630), when the ActivationClientProxy initiates the clean-up process, it didn't let the ContainerProxy know. The ETCD data clean-up is handled by ContainerProxy, accordingly, the data was not removed.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] bdoyle0182 commented on pull request #5348: Handle container cleanup from ActivationClient shutdown gracefully

Posted by GitBox <gi...@apache.org>.
bdoyle0182 commented on PR #5348:
URL: https://github.com/apache/openwhisk/pull/5348#issuecomment-1302816982

   LGTM


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] bdoyle0182 commented on a diff in pull request #5348: Fix regression

Posted by GitBox <gi...@apache.org>.
bdoyle0182 commented on code in PR #5348:
URL: https://github.com/apache/openwhisk/pull/5348#discussion_r1012215339


##########
core/invoker/src/main/scala/org/apache/openwhisk/core/containerpool/v2/FunctionPullingContainerProxy.scala:
##########
@@ -518,6 +505,8 @@ class FunctionPullingContainerProxy(
         data.action.fullyQualifiedName(withVersion = true),
         data.action.rev,
         Some(data.clientProxy))
+
+    case x: Event if x.event != PingCache => delay

Review Comment:
   My concern with having a stash message for all events and then unstash it on state transition, is that it was the root cause of the edge case with the orphaned etcd data for pausing / unpausing containers where a generic `FailureMessage` would be stashed and then when it transitioned to Running it would unstash the `FailureMessage` leading to the bug / unexpected behavior.
   
   I just want to make sure that we're 100% sure a catch all here will not have any unknown side effects.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] codecov-commenter commented on pull request #5348: Fix regression

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #5348:
URL: https://github.com/apache/openwhisk/pull/5348#issuecomment-1300600658

   # [Codecov](https://codecov.io/gh/apache/openwhisk/pull/5348?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#5348](https://codecov.io/gh/apache/openwhisk/pull/5348?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (997f8a9) into [master](https://codecov.io/gh/apache/openwhisk/commit/077fb6d24f0132e7755ea47d7ee9b35f0966daf3?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (077fb6d) will **decrease** coverage by `50.57%`.
   > The diff coverage is `0.00%`.
   
   ```diff
   @@             Coverage Diff             @@
   ##           master    #5348       +/-   ##
   ===========================================
   - Coverage   81.02%   30.45%   -50.58%     
   ===========================================
     Files         240      240               
     Lines       14391    14393        +2     
     Branches      605      603        -2     
   ===========================================
   - Hits        11660     4383     -7277     
   - Misses       2731    10010     +7279     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/openwhisk/pull/5348?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../core/containerpool/v2/ActivationClientProxy.scala](https://codecov.io/gh/apache/openwhisk/pull/5348/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29yZS9pbnZva2VyL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvb3BlbndoaXNrL2NvcmUvY29udGFpbmVycG9vbC92Mi9BY3RpdmF0aW9uQ2xpZW50UHJveHkuc2NhbGE=) | `0.00% <0.00%> (-78.09%)` | :arrow_down: |
   | [...ntainerpool/v2/FunctionPullingContainerProxy.scala](https://codecov.io/gh/apache/openwhisk/pull/5348/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29yZS9pbnZva2VyL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvb3BlbndoaXNrL2NvcmUvY29udGFpbmVycG9vbC92Mi9GdW5jdGlvblB1bGxpbmdDb250YWluZXJQcm94eS5zY2FsYQ==) | `0.00% <0.00%> (-78.68%)` | :arrow_down: |
   | [...k/core/containerpool/v2/InvokerHealthManager.scala](https://codecov.io/gh/apache/openwhisk/pull/5348/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29yZS9pbnZva2VyL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvb3BlbndoaXNrL2NvcmUvY29udGFpbmVycG9vbC92Mi9JbnZva2VySGVhbHRoTWFuYWdlci5zY2FsYQ==) | `0.00% <0.00%> (-75.76%)` | :arrow_down: |
   | [...core/scheduler/queue/SchedulingDecisionMaker.scala](https://codecov.io/gh/apache/openwhisk/pull/5348/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29yZS9zY2hlZHVsZXIvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9vcGVud2hpc2svY29yZS9zY2hlZHVsZXIvcXVldWUvU2NoZWR1bGluZ0RlY2lzaW9uTWFrZXIuc2NhbGE=) | `0.00% <0.00%> (-97.90%)` | :arrow_down: |
   | [.../main/scala/org/apache/openwhisk/core/WarmUp.scala](https://codecov.io/gh/apache/openwhisk/pull/5348/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NjYWxhL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvb3BlbndoaXNrL2NvcmUvV2FybVVwLnNjYWxh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...n/scala/org/apache/openwhisk/utils/JsHelpers.scala](https://codecov.io/gh/apache/openwhisk/pull/5348/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NjYWxhL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvb3BlbndoaXNrL3V0aWxzL0pzSGVscGVycy5zY2FsYQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...scala/org/apache/openwhisk/common/time/Clock.scala](https://codecov.io/gh/apache/openwhisk/pull/5348/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NjYWxhL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvb3BlbndoaXNrL2NvbW1vbi90aW1lL0Nsb2NrLnNjYWxh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/openwhisk/common/ConfigMapValue.scala](https://codecov.io/gh/apache/openwhisk/pull/5348/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NjYWxhL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvb3BlbndoaXNrL2NvbW1vbi9Db25maWdNYXBWYWx1ZS5zY2FsYQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...g/apache/openwhisk/common/ResizableSemaphore.scala](https://codecov.io/gh/apache/openwhisk/pull/5348/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NjYWxhL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvb3BlbndoaXNrL2NvbW1vbi9SZXNpemFibGVTZW1hcGhvcmUuc2NhbGE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...pache/openwhisk/http/LenientSprayJsonSupport.scala](https://codecov.io/gh/apache/openwhisk/pull/5348/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NjYWxhL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvb3BlbndoaXNrL2h0dHAvTGVuaWVudFNwcmF5SnNvblN1cHBvcnQuc2NhbGE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [179 more](https://codecov.io/gh/apache/openwhisk/pull/5348/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] style95 commented on a diff in pull request #5348: Fix regression

Posted by GitBox <gi...@apache.org>.
style95 commented on code in PR #5348:
URL: https://github.com/apache/openwhisk/pull/5348#discussion_r1011720630


##########
core/invoker/src/main/scala/org/apache/openwhisk/core/containerpool/v2/ActivationClientProxy.scala:
##########
@@ -164,9 +166,12 @@ class ActivationClientProxy(
           stay()
 
         case _: ActionMismatch =>
-          logging.error(this, s"[${containerId.asString}] action version does not match: $action")
+          val errorMsg = s"[${containerId.asString}] action version does not match: $action"
+          logging.error(this, errorMsg)
           c.activationClient.close().andThen {
-            case _ => self ! ClientClosed
+            case _ =>
+              context.parent ! FailureMessage(new RuntimeException(errorMsg))

Review Comment:
   As `GracefulShutdown` is introduced in the ContainerProxy layer, generally a clean-up process is initiated by ContainerProxy. ContainerProxy cleans up ETCD data first and sends GracefulShutdown to ActivationClientPRoxy and waits for the `ClientClosed` message.
   But in some cases like this, there is a need for ActivationClientProxy to initiate the clean-up process.
   In these cases, ActivationClientProxy should let ContainerProxy(parent) know the situation and let it start the clean-up process immediately while it also shut down.
   
   So now, we need to send a `FailureMessage` to the parent, then the parent will clean up ETCD data and remove containers immediately by this kind of logic.
   https://github.com/apache/openwhisk/pull/5348/files#diff-23fc1c1634cd8a2e99b4cfbf342527248a53f5987911af80ab5d910ce7864d70R368
   



##########
core/invoker/src/main/scala/org/apache/openwhisk/core/containerpool/v2/FunctionPullingContainerProxy.scala:
##########
@@ -518,6 +505,8 @@ class FunctionPullingContainerProxy(
         data.action.fullyQualifiedName(withVersion = true),
         data.action.rev,
         Some(data.clientProxy))
+
+    case x: Event if x.event != PingCache => delay

Review Comment:
   This will make sure `GracefulShutdown` is properly handled in all states.
   



##########
core/invoker/src/main/scala/org/apache/openwhisk/core/containerpool/v2/FunctionPullingContainerProxy.scala:
##########
@@ -272,8 +272,7 @@ class FunctionPullingContainerProxy(
               job.rpcPort,
               container.containerId)) match {
             case Success(clientProxy) =>
-              clientProxy ! StartClient

Review Comment:
   Previously, this chaining of the future causes the timing issue.
   As a result, we needed this case.
   ```scala
       case Event(ClientCreationCompleted(proxy), _: NonexistentData) =>
         akka.pattern.after(3.milliseconds, actorSystem.scheduler) {
           self ! ClientCreationCompleted(proxy.orElse(Some(sender())))
           Future.successful({})
         }
   ```
   https://github.com/apache/openwhisk/pull/5348/files#diff-23fc1c1634cd8a2e99b4cfbf342527248a53f5987911af80ab5d910ce7864d70L341
   
   It had made the logic deterministic and less efficient, so I refactored it.
   



##########
core/invoker/src/main/scala/org/apache/openwhisk/core/containerpool/v2/FunctionPullingContainerProxy.scala:
##########
@@ -334,41 +335,27 @@ class FunctionPullingContainerProxy(
 
   when(CreatingClient) {
     // wait for client creation when cold start
-    case Event(job: ContainerCreatedData, _: NonexistentData) =>
-      stay() using job
+    case Event(job: InitializedData, _) =>

Review Comment:
   Now, it sends the `StartClient` to the ActivationClientProxy only after it receives the `InitializedData`.
   So there would be no timing issue.



##########
core/invoker/src/main/scala/org/apache/openwhisk/core/containerpool/v2/ActivationClientProxy.scala:
##########
@@ -36,7 +36,7 @@ import scala.concurrent.Future
 import scala.util.{Success, Try}
 
 // Event send by the actor
-case class ClientCreationCompleted(client: Option[ActorRef] = None)
+case object ClientCreationCompleted

Review Comment:
   As we store the reference of a client proxy after creation, we don't need to pass/receive the client-proxy reference via this message. So it became an object.
   
   https://github.com/apache/openwhisk/pull/5348/files#diff-23fc1c1634cd8a2e99b4cfbf342527248a53f5987911af80ab5d910ce7864d70R315



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] bdoyle0182 commented on pull request #5348: Fix regression

Posted by GitBox <gi...@apache.org>.
bdoyle0182 commented on PR #5348:
URL: https://github.com/apache/openwhisk/pull/5348#issuecomment-1301345025

   I ran a test with the changes here and can no longer reproduce the issue!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] bdoyle0182 commented on pull request #5348: Fix regression

Posted by GitBox <gi...@apache.org>.
bdoyle0182 commented on PR #5348:
URL: https://github.com/apache/openwhisk/pull/5348#issuecomment-1301119234

   Just have one comment to address, otherwise I think it LGTM!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] style95 commented on a diff in pull request #5348: Handle container cleanup from ActivationClient shutdown gracefully

Posted by GitBox <gi...@apache.org>.
style95 commented on code in PR #5348:
URL: https://github.com/apache/openwhisk/pull/5348#discussion_r1012522201


##########
core/invoker/src/main/scala/org/apache/openwhisk/core/containerpool/v2/FunctionPullingContainerProxy.scala:
##########
@@ -518,6 +505,8 @@ class FunctionPullingContainerProxy(
         data.action.fullyQualifiedName(withVersion = true),
         data.action.rev,
         Some(data.clientProxy))
+
+    case x: Event if x.event != PingCache => delay

Review Comment:
   This state is a temporary state before removing containers and the proxy itself.
   So the next state is always `Removing` as all cases in this state lead to it and both cases clean up etcd data properly.
   
   In the `Removing` state, there is no case of moving back to other states.
   It `stays` or `stops`.
   
   ```scala
     when(Removing, unusedTimeout) {
       // only if ClientProxy is closed, ContainerProxy stops. So it is important for ClientProxy to send ClientClosed.
       case Event(ClientClosed, _) =>
         stop()
   
       // even if any error occurs, it still waits for ClientClosed event in order to be stopped after the client is closed.
       case Event(t: FailureMessage, _) =>
         logging.error(this, s"unable to delete a container due to ${t}")
   
         stay
   
       case Event(StateTimeout, _) =>
         logging.error(this, s"could not receive ClientClosed for ${unusedTimeout}, so just stop the container proxy.")
   
         stop()
   
       case Event(Remove | GracefulShutdown, _) =>
         stay()
   
       case Event(DetermineKeepContainer(_), _) =>
         stay()
     }
   ```
   
   So I suppose it would be OK to delay.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] bdoyle0182 commented on pull request #5348: Fix regression

Posted by GitBox <gi...@apache.org>.
bdoyle0182 commented on PR #5348:
URL: https://github.com/apache/openwhisk/pull/5348#issuecomment-1301152307

   nit: can we rename the pr to `handle container cleanup from ActivationClient shutdown gracefully"


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] style95 commented on a diff in pull request #5348: Handle container cleanup from ActivationClient shutdown gracefully

Posted by GitBox <gi...@apache.org>.
style95 commented on code in PR #5348:
URL: https://github.com/apache/openwhisk/pull/5348#discussion_r1011725976


##########
core/invoker/src/main/scala/org/apache/openwhisk/core/containerpool/v2/FunctionPullingContainerProxy.scala:
##########
@@ -272,8 +272,7 @@ class FunctionPullingContainerProxy(
               job.rpcPort,
               container.containerId)) match {
             case Success(clientProxy) =>
-              clientProxy ! StartClient

Review Comment:
   Previously, this chaining of the future causes the timing issue.
   As a result, we needed this case.
   ```scala
       case Event(ClientCreationCompleted(proxy), _: NonexistentData) =>
         akka.pattern.after(3.milliseconds, actorSystem.scheduler) {
           self ! ClientCreationCompleted(proxy.orElse(Some(sender())))
           Future.successful({})
         }
   ```
   https://github.com/apache/openwhisk/pull/5348/files#diff-23fc1c1634cd8a2e99b4cfbf342527248a53f5987911af80ab5d910ce7864d70L341
   
   It had made the logic indeterministic and less efficient, so I refactored it.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] bdoyle0182 commented on a diff in pull request #5348: Fix regression

Posted by GitBox <gi...@apache.org>.
bdoyle0182 commented on code in PR #5348:
URL: https://github.com/apache/openwhisk/pull/5348#discussion_r1012208898


##########
core/invoker/src/main/scala/org/apache/openwhisk/core/containerpool/v2/ActivationClientProxy.scala:
##########
@@ -164,9 +166,12 @@ class ActivationClientProxy(
           stay()
 
         case _: ActionMismatch =>
-          logging.error(this, s"[${containerId.asString}] action version does not match: $action")
+          val errorMsg = s"[${containerId.asString}] action version does not match: $action"
+          logging.error(this, errorMsg)
           c.activationClient.close().andThen {
-            case _ => self ! ClientClosed
+            case _ =>

Review Comment:
   I was also seeing the issue of orphaned data for a test action that I frequently update every minute outside of the scheduler deployments. Does it stand to reason this was really the same issue and this will also fix that issue?
   
   The issue started happening I want to say around the same time of the October 13th commit that introduced the regression so it seems likely to me.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] bdoyle0182 commented on a diff in pull request #5348: Fix regression

Posted by GitBox <gi...@apache.org>.
bdoyle0182 commented on code in PR #5348:
URL: https://github.com/apache/openwhisk/pull/5348#discussion_r1012208898


##########
core/invoker/src/main/scala/org/apache/openwhisk/core/containerpool/v2/ActivationClientProxy.scala:
##########
@@ -164,9 +166,12 @@ class ActivationClientProxy(
           stay()
 
         case _: ActionMismatch =>
-          logging.error(this, s"[${containerId.asString}] action version does not match: $action")
+          val errorMsg = s"[${containerId.asString}] action version does not match: $action"
+          logging.error(this, errorMsg)
           c.activationClient.close().andThen {
-            case _ => self ! ClientClosed
+            case _ =>

Review Comment:
   I was also seeing the issue for a test action that I frequently update every minute outside of the scheduler deployments. Does it stand to reason this was really the same issue and this will also fix that issue?
   
   The issue started happening I want to say around the same time of the October 13th commit that introduced the regression so it seems likely to me.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] style95 merged pull request #5348: Handle container cleanup from ActivationClient shutdown gracefully

Posted by GitBox <gi...@apache.org>.
style95 merged PR #5348:
URL: https://github.com/apache/openwhisk/pull/5348


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] style95 commented on pull request #5348: Handle container cleanup from ActivationClient shutdown gracefully

Posted by GitBox <gi...@apache.org>.
style95 commented on PR #5348:
URL: https://github.com/apache/openwhisk/pull/5348#issuecomment-1301848131

   I confirmed it supports zero downtime deployment.
   <img width="1222" alt="image" src="https://user-images.githubusercontent.com/3447251/199688819-740211ef-84a5-4d53-9e05-2182aa7bd99d.png">
   
   No activation is failed during the deployment.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] style95 commented on a diff in pull request #5348: Fix regression

Posted by GitBox <gi...@apache.org>.
style95 commented on code in PR #5348:
URL: https://github.com/apache/openwhisk/pull/5348#discussion_r1011733702


##########
core/scheduler/src/main/scala/org/apache/openwhisk/core/scheduler/queue/SchedulingDecisionMaker.scala:
##########
@@ -96,7 +96,9 @@ class SchedulingDecisionMaker(
               this,
               s"there is no capacity activations will be dropped or throttled, (availableMsg: $availableMsg totalContainers: $totalContainers, limit: $limit, namespaceContainers: ${existingContainerCountInNs}, namespaceInProgressContainer: ${inProgressContainerCountInNs}) [$invocationNamespace:$action]")
             Future.successful(DecisionResults(EnableNamespaceThrottling(dropMsg = totalContainers == 0), 0))
-          case NamespaceThrottled if schedulingConfig.allowOverProvisionBeforeThrottle && ceiling(limit * schedulingConfig.namespaceOverProvisionBeforeThrottleRatio) - existingContainerCountInNs - inProgressContainerCountInNs > 0 =>
+          case NamespaceThrottled
+              if schedulingConfig.allowOverProvisionBeforeThrottle && ceiling(

Review Comment:
   It seems our CI tests do not run `checkScalaFmtAll`.
   Some codes already included in the master branch are incorrectly formatted.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] bdoyle0182 commented on a diff in pull request #5348: Fix regression

Posted by GitBox <gi...@apache.org>.
bdoyle0182 commented on code in PR #5348:
URL: https://github.com/apache/openwhisk/pull/5348#discussion_r1012210041


##########
core/invoker/src/main/scala/org/apache/openwhisk/core/containerpool/v2/FunctionPullingContainerProxy.scala:
##########
@@ -272,8 +272,7 @@ class FunctionPullingContainerProxy(
               job.rpcPort,
               container.containerId)) match {
             case Success(clientProxy) =>
-              clientProxy ! StartClient

Review Comment:
   great



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org