Posted to reviews@yunikorn.apache.org by GitBox <gi...@apache.org> on 2022/03/29 11:00:53 UTC

[GitHub] [yunikorn-k8shim] anuraagnalluri commented on pull request #369: [YUNIKORN-1040] add e2e test that re-starts the scheduler pod

anuraagnalluri commented on pull request #369:
URL: https://github.com/apache/yunikorn-k8shim/pull/369#issuecomment-1081728125


   @yangwwei Done, and I changed the necessary imports. Thanks for getting another pair of eyes on this. I was able to reproduce the error locally a couple of times in plugin mode, but I am still unsure why the allocations list is empty.
   
   When I ran into the same failure that we see in the CI checks, I verified that the applicationID of the sleepjob pod belongs to the newly added `recovery_and_restart` suite and _not_ `basic_scheduling_test`. My initial thought was that a "completed" sleepjob with 0 allocations from a previous test could have been picked up, but this is not the case (that test also tears down the namespace in cleanup). I could see that the sleepjob pod was in the "Running" state, and ultimately I could not identify any metadata differences between the failing case and the passing test runs.
   
   Is it possible that plugin-mode logic could specifically affect this behavior in a way normal mode cannot?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org