Posted to reviews@yunikorn.apache.org by "pbacsko (via GitHub)" <gi...@apache.org> on 2023/02/14 17:27:56 UTC
[GitHub] [yunikorn-core] pbacsko opened a new pull request, #505: [YUNIKORN-1567] Do not attempt to schedule tasks if partition update is in progress
pbacsko opened a new pull request, #505:
URL: https://github.com/apache/yunikorn-core/pull/505
### What is this PR for?
We should block config updates while we're scheduling allocation asks. If the queue hierarchy changes during a scheduling cycle, the result can be incorrect, nondeterministic behavior.
`ClusterContext.Lock()` is too coarse, and `PartitionContext.Lock()` is also problematic for this purpose.
For each partition, a mutex is created which is locked during scheduling and config update.
### What type of PR is it?
* [ ] - Bug Fix
* [x] - Improvement
* [ ] - Feature
* [ ] - Documentation
* [ ] - Hot Fix
* [ ] - Refactoring
### Todos
* [ ] - Task
### What is the Jira issue?
https://issues.apache.org/jira/browse/YUNIKORN-1567
### How should this be tested?
### Screenshots (if appropriate)
### Questions:
* [ ] - The license files need an update.
* [ ] - There are breaking changes for older versions.
* [ ] - It needs documentation.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@yunikorn.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] [YUNIKORN-1567] Do not attempt to schedule tasks if partition update is in progress [yunikorn-core]
Posted by "pbacsko (via GitHub)" <gi...@apache.org>.
pbacsko closed pull request #505: [YUNIKORN-1567] Do not attempt to schedule tasks if partition update is in progress
URL: https://github.com/apache/yunikorn-core/pull/505
[GitHub] [yunikorn-core] pbacsko commented on pull request #505: [YUNIKORN-1567] Do not attempt to schedule tasks if partition update is in progress
Posted by "pbacsko (via GitHub)" <gi...@apache.org>.
pbacsko commented on PR #505:
URL: https://github.com/apache/yunikorn-core/pull/505#issuecomment-1433196163
> This looks overly complex. What is the problem with using a partition lock? At the very least, the lock should live in the partition itself and not in a separate map...
Sorry, I forgot to answer this. The partition lock is not held during scheduling, but there are various calls where it is acquired and then released for a change. I think scheduling is designed to be "no-lock".
This solution might also raise some eyebrows, but as I went through various codepaths, it looks safe as long as we maintain certain invariants.
[GitHub] [yunikorn-core] craigcondit commented on pull request #505: [YUNIKORN-1567] Do not attempt to schedule tasks if partition update is in progress
Posted by "craigcondit (via GitHub)" <gi...@apache.org>.
craigcondit commented on PR #505:
URL: https://github.com/apache/yunikorn-core/pull/505#issuecomment-1433269059
> Sorry, I forgot to answer this. The partition lock is not held during scheduling, but there are various calls where it is acquired and then released for a change. I think scheduling is designed to be "no-lock".
I still think we can take the partition context lock (for reads) during the scheduling cycle, rather than introducing a new one. This still lets us (eventually) process partitions in parallel without worrying about structural changes happening during scheduling. @wilfred-s what do you think?
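To illustrate the alternative, here is a rough sketch assuming the partition embeds a `sync.RWMutex`: the scheduling cycle holds the read lock, and structural changes take the write lock. The `partition` type, its `queues` field, and the method names are hypothetical simplifications, not the real `PartitionContext`.

```go
package main

import "sync"

// partition is a hypothetical stand-in for a partition context. The lock
// lives in the partition itself rather than in a separate map.
type partition struct {
	sync.RWMutex
	queues []string // simplified stand-in for the queue hierarchy
}

// scheduleOnce holds the read lock for the whole cycle, so the queue
// hierarchy cannot change while queues are being visited.
func (p *partition) scheduleOnce(visit func(queue string)) {
	p.RLock()
	defer p.RUnlock()
	for _, q := range p.queues {
		visit(q)
	}
}

// updateQueues takes the write lock, so it waits for any running
// scheduling cycle to finish before changing the structure.
func (p *partition) updateQueues(queues []string) {
	p.Lock()
	defer p.Unlock()
	p.queues = append([]string(nil), queues...) // copy to avoid aliasing
}
```

One caveat with this approach: Go's `sync.RWMutex` does not support recursive read locking, and a goroutine holding the read lock must not try to take the write lock. Any call made inside the scheduling cycle that acquires the partition lock itself would therefore need to be reworked.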
[GitHub] [yunikorn-core] codecov[bot] commented on pull request #505: [YUNIKORN-1567] Do not attempt to schedule tasks if partition update is in progress
Posted by "codecov[bot] (via GitHub)" <gi...@apache.org>.
codecov[bot] commented on PR #505:
URL: https://github.com/apache/yunikorn-core/pull/505#issuecomment-1430121487
# [Codecov](https://codecov.io/gh/apache/yunikorn-core/pull/505?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#505](https://codecov.io/gh/apache/yunikorn-core/pull/505?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (a69be54) into [master](https://codecov.io/gh/apache/yunikorn-core/commit/d695ea5dbd0f5b41066fbfcac8fdb5b9269c6497?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (d695ea5) will **decrease** coverage by `0.26%`.
> The diff coverage is `4.34%`.
```diff
@@            Coverage Diff             @@
##           master     #505      +/-   ##
==========================================
- Coverage   73.45%   73.19%   -0.26%
==========================================
  Files          69       69
  Lines       10449    10490      +41
==========================================
+ Hits         7675     7678       +3
- Misses       2527     2565      +38
  Partials      247      247
```
| [Impacted Files](https://codecov.io/gh/apache/yunikorn-core/pull/505?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [pkg/scheduler/context.go](https://codecov.io/gh/apache/yunikorn-core/pull/505?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL3NjaGVkdWxlci9jb250ZXh0Lmdv) | `29.00% <4.34%> (-1.32%)` | :arrow_down: |
[GitHub] [yunikorn-core] craigcondit commented on pull request #505: [YUNIKORN-1567] Do not attempt to schedule tasks if partition update is in progress
Posted by "craigcondit (via GitHub)" <gi...@apache.org>.
craigcondit commented on PR #505:
URL: https://github.com/apache/yunikorn-core/pull/505#issuecomment-1430128604
This looks overly complex. What is the problem with using a partition lock? At the very least, the lock should live in the partition itself and not in a separate map...