You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@yunikorn.apache.org by GitBox <gi...@apache.org> on 2021/03/02 05:21:19 UTC

[GitHub] [incubator-yunikorn-core] wilfred-s opened a new pull request #253: [YUNIKORN-551] Deadlock when removing node

wilfred-s opened a new pull request #253:
URL: https://github.com/apache/incubator-yunikorn-core/pull/253


   When a node gets removed the node alocations get removed as part of the
   cleanup phase. The node removal acquires the partition lock and does not
   release it until all actions are done. This could cause a deadlock with
   scheduling if the node being removed contains allocations for the same
   application that is scheduled.
   The node removal should only lock while updating the partition and
   release before updating the apps and queues. Since node removal is
   called as part of the cleanup of a failed add of a node the add flow is
   changed inline with the removal to minimise lock time.
   
   The partition lock was also held while placing applications. This is not
   required as placing an application does not modify the partition until
   everything is done. The lock should only be acquired at that point.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-core] wilfred-s commented on pull request #253: [YUNIKORN-551] Deadlock when removing node

Posted by GitBox <gi...@apache.org>.
wilfred-s commented on pull request #253:
URL: https://github.com/apache/incubator-yunikorn-core/pull/253#issuecomment-789422204


   > do you mean we are good to get this merged, this change won't cause any conflicts to PR #250?
   
   Yes that is what I meant. There is one set of lines that overlaps between the two changes which is easily fixed even on commit. The renamed termination type is the reason for the overlap.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-core] codecov[bot] commented on pull request #253: [YUNIKORN-551] Deadlock when removing node

Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on pull request #253:
URL: https://github.com/apache/incubator-yunikorn-core/pull/253#issuecomment-788603709


   # [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253?src=pr&el=h1) Report
   > Merging [#253](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253?src=pr&el=desc) (3e183d2) into [master](https://codecov.io/gh/apache/incubator-yunikorn-core/commit/5a1c19e280f66d6cb4fc206a675116260d6f2ff1?el=desc) (5a1c19e) will **increase** coverage by `0.81%`.
   > The diff coverage is `62.67%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/graphs/tree.svg?width=650&height=150&src=pr&token=SB9NrIi3Hy)](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #253      +/-   ##
   ==========================================
   + Coverage   63.46%   64.27%   +0.81%     
   ==========================================
     Files          60       60              
     Lines        5220     5344     +124     
   ==========================================
   + Hits         3313     3435     +122     
   - Misses       1747     1750       +3     
   + Partials      160      159       -1     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [pkg/scheduler/context.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL3NjaGVkdWxlci9jb250ZXh0Lmdv) | `5.78% <0.00%> (-0.49%)` | :arrow_down: |
   | [pkg/scheduler/nodes\_usage\_monitor.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL3NjaGVkdWxlci9ub2Rlc191c2FnZV9tb25pdG9yLmdv) | `0.00% <0.00%> (ø)` | |
   | [pkg/scheduler/objects/allocation.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL3NjaGVkdWxlci9vYmplY3RzL2FsbG9jYXRpb24uZ28=) | `100.00% <ø> (ø)` | |
   | [pkg/scheduler/objects/application.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL3NjaGVkdWxlci9vYmplY3RzL2FwcGxpY2F0aW9uLmdv) | `53.19% <57.69%> (+4.28%)` | :arrow_up: |
   | [pkg/scheduler/objects/node.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL3NjaGVkdWxlci9vYmplY3RzL25vZGUuZ28=) | `81.98% <75.00%> (+0.16%)` | :arrow_up: |
   | [pkg/scheduler/partition.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL3NjaGVkdWxlci9wYXJ0aXRpb24uZ28=) | `69.68% <76.00%> (+3.63%)` | :arrow_up: |
   | [pkg/webservice/handlers.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL3dlYnNlcnZpY2UvaGFuZGxlcnMuZ28=) | `57.00% <92.00%> (+2.86%)` | :arrow_up: |
   | [pkg/common/resources/resources.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL2NvbW1vbi9yZXNvdXJjZXMvcmVzb3VyY2VzLmdv) | `96.22% <94.28%> (+0.19%)` | :arrow_up: |
   | [pkg/common/configs/configvalidator.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL2NvbW1vbi9jb25maWdzL2NvbmZpZ3ZhbGlkYXRvci5nbw==) | `90.00% <100.00%> (+2.99%)` | :arrow_up: |
   | [pkg/scheduler/objects/allocation\_ask.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL3NjaGVkdWxlci9vYmplY3RzL2FsbG9jYXRpb25fYXNrLmdv) | `93.87% <100.00%> (ø)` | |
   | ... and [4 more](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253?src=pr&el=footer). Last update [11af257...3e183d2](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-core] yangwwei commented on pull request #253: [YUNIKORN-551] Deadlock when removing node

Posted by GitBox <gi...@apache.org>.
yangwwei commented on pull request #253:
URL: https://github.com/apache/incubator-yunikorn-core/pull/253#issuecomment-789367399


   hi @wilfred-s regarding your comment
   
   > I checked this change also on top of the changes for YUNIKORN-460 (PR #250) and the overlap is not a problem the code touched by the 2 changes is limited to a couple of lines with simple to fix merge clashes
   
   do you mean we are good to get this merged, this change won't cause any conflicts to PR #250? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-core] yangwwei merged pull request #253: [YUNIKORN-551] Deadlock when removing node

Posted by GitBox <gi...@apache.org>.
yangwwei merged pull request #253:
URL: https://github.com/apache/incubator-yunikorn-core/pull/253


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-core] wilfred-s commented on pull request #253: [YUNIKORN-551] Deadlock when removing node

Posted by GitBox <gi...@apache.org>.
wilfred-s commented on pull request #253:
URL: https://github.com/apache/incubator-yunikorn-core/pull/253#issuecomment-788606906


   I checked this change also on top of the changes for YUNIKORN-460 (PR #250) and the overlap is not a problem the code touched by the 2 changes is limited to a couple of lines with simple to fix merge clashes


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-yunikorn-core] yangwwei commented on pull request #253: [YUNIKORN-551] Deadlock when removing node

Posted by GitBox <gi...@apache.org>.
yangwwei commented on pull request #253:
URL: https://github.com/apache/incubator-yunikorn-core/pull/253#issuecomment-789443627


   Thanks for the clarification, merged the PR.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org