You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@yunikorn.apache.org by GitBox <gi...@apache.org> on 2021/03/02 05:21:19 UTC
[GitHub] [incubator-yunikorn-core] wilfred-s opened a new pull request #253: [YUNIKORN-551] Deadlock when removing node
wilfred-s opened a new pull request #253:
URL: https://github.com/apache/incubator-yunikorn-core/pull/253
When a node gets removed the node alocations get removed as part of the
cleanup phase. The node removal acquires the partition lock and does not
release it until all actions are done. This could cause a deadlock with
scheduling if the node being removed contains allocations for the same
application that is scheduled.
The node removal should only lock while updating the partition and
release before updating the apps and queues. Since node removal is
called as part of the cleanup of a failed add of a node the add flow is
changed inline with the removal to minimise lock time.
The partition lock was also held while placing applications. This is not
required as placing an application does not modify the partition until
everything is done. The lock should only be acquired at that point.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-core] wilfred-s commented on pull request #253: [YUNIKORN-551] Deadlock when removing node
Posted by GitBox <gi...@apache.org>.
wilfred-s commented on pull request #253:
URL: https://github.com/apache/incubator-yunikorn-core/pull/253#issuecomment-789422204
> do you mean we are good to get this merged, this change won't cause any conflicts to PR #250?
Yes that is what I meant. There is one set of lines that overlaps between the two changes which is easily fixed even on commit. The renamed termination type is the reason for the overlap.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-core] codecov[bot] commented on pull request #253: [YUNIKORN-551] Deadlock when removing node
Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on pull request #253:
URL: https://github.com/apache/incubator-yunikorn-core/pull/253#issuecomment-788603709
# [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253?src=pr&el=h1) Report
> Merging [#253](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253?src=pr&el=desc) (3e183d2) into [master](https://codecov.io/gh/apache/incubator-yunikorn-core/commit/5a1c19e280f66d6cb4fc206a675116260d6f2ff1?el=desc) (5a1c19e) will **increase** coverage by `0.81%`.
> The diff coverage is `62.67%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/graphs/tree.svg?width=650&height=150&src=pr&token=SB9NrIi3Hy)](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #253 +/- ##
==========================================
+ Coverage 63.46% 64.27% +0.81%
==========================================
Files 60 60
Lines 5220 5344 +124
==========================================
+ Hits 3313 3435 +122
- Misses 1747 1750 +3
+ Partials 160 159 -1
```
| [Impacted Files](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [pkg/scheduler/context.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL3NjaGVkdWxlci9jb250ZXh0Lmdv) | `5.78% <0.00%> (-0.49%)` | :arrow_down: |
| [pkg/scheduler/nodes\_usage\_monitor.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL3NjaGVkdWxlci9ub2Rlc191c2FnZV9tb25pdG9yLmdv) | `0.00% <0.00%> (ø)` | |
| [pkg/scheduler/objects/allocation.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL3NjaGVkdWxlci9vYmplY3RzL2FsbG9jYXRpb24uZ28=) | `100.00% <ø> (ø)` | |
| [pkg/scheduler/objects/application.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL3NjaGVkdWxlci9vYmplY3RzL2FwcGxpY2F0aW9uLmdv) | `53.19% <57.69%> (+4.28%)` | :arrow_up: |
| [pkg/scheduler/objects/node.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL3NjaGVkdWxlci9vYmplY3RzL25vZGUuZ28=) | `81.98% <75.00%> (+0.16%)` | :arrow_up: |
| [pkg/scheduler/partition.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL3NjaGVkdWxlci9wYXJ0aXRpb24uZ28=) | `69.68% <76.00%> (+3.63%)` | :arrow_up: |
| [pkg/webservice/handlers.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL3dlYnNlcnZpY2UvaGFuZGxlcnMuZ28=) | `57.00% <92.00%> (+2.86%)` | :arrow_up: |
| [pkg/common/resources/resources.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL2NvbW1vbi9yZXNvdXJjZXMvcmVzb3VyY2VzLmdv) | `96.22% <94.28%> (+0.19%)` | :arrow_up: |
| [pkg/common/configs/configvalidator.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL2NvbW1vbi9jb25maWdzL2NvbmZpZ3ZhbGlkYXRvci5nbw==) | `90.00% <100.00%> (+2.99%)` | :arrow_up: |
| [pkg/scheduler/objects/allocation\_ask.go](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree#diff-cGtnL3NjaGVkdWxlci9vYmplY3RzL2FsbG9jYXRpb25fYXNrLmdv) | `93.87% <100.00%> (ø)` | |
| ... and [4 more](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253?src=pr&el=footer). Last update [11af257...3e183d2](https://codecov.io/gh/apache/incubator-yunikorn-core/pull/253?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-core] yangwwei commented on pull request #253: [YUNIKORN-551] Deadlock when removing node
Posted by GitBox <gi...@apache.org>.
yangwwei commented on pull request #253:
URL: https://github.com/apache/incubator-yunikorn-core/pull/253#issuecomment-789367399
hi @wilfred-s regarding your comment
> I checked this change also on top of the changes for YUNIKORN-460 (PR #250) and the overlap is not a problem the code touched by the 2 changes is limited to a couple of lines with simple to fix merge clashes
do you mean we are good to get this merged, this change won't cause any conflicts to PR #250?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-core] yangwwei merged pull request #253: [YUNIKORN-551] Deadlock when removing node
Posted by GitBox <gi...@apache.org>.
yangwwei merged pull request #253:
URL: https://github.com/apache/incubator-yunikorn-core/pull/253
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-core] wilfred-s commented on pull request #253: [YUNIKORN-551] Deadlock when removing node
Posted by GitBox <gi...@apache.org>.
wilfred-s commented on pull request #253:
URL: https://github.com/apache/incubator-yunikorn-core/pull/253#issuecomment-788606906
I checked this change also on top of the changes for YUNIKORN-460 (PR #250) and the overlap is not a problem the code touched by the 2 changes is limited to a couple of lines with simple to fix merge clashes
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-core] yangwwei commented on pull request #253: [YUNIKORN-551] Deadlock when removing node
Posted by GitBox <gi...@apache.org>.
yangwwei commented on pull request #253:
URL: https://github.com/apache/incubator-yunikorn-core/pull/253#issuecomment-789443627
Thanks for the clarification, merged the PR.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org