Posted to reviews@yunikorn.apache.org by GitBox <gi...@apache.org> on 2022/05/18 21:21:13 UTC
[GitHub] [yunikorn-core] yangwwei opened a new pull request, #411: [YUNIKORN-1218] Scheduler crashed with concurrent map access error in…
yangwwei opened a new pull request, #411:
URL: https://github.com/apache/yunikorn-core/pull/411
### What is this PR for?
With the latest 1.0 version, we observed that the scheduler occasionally crashed and restarted. The cause is that the health checker accesses the application map of the partition context without holding a lock, so a concurrent map read/write error can occur when an app is added to or removed from the partition at the same time. The health checker needs to hold a read lock while accessing the map.
### What type of PR is it?
* [x] - Bug Fix
* [ ] - Improvement
* [ ] - Feature
* [ ] - Documentation
* [ ] - Hot Fix
* [ ] - Refactoring
### What is the Jira issue?
* https://issues.apache.org/jira/browse/YUNIKORN-1218
### How should this be tested?
A patch that reproduces this issue is attached to the JIRA: https://issues.apache.org/jira/browse/YUNIKORN-1218. Without this fix, the scheduler panics after running for a while; with this fix, the crash no longer happens.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@yunikorn.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [yunikorn-core] codecov[bot] commented on pull request #411: [YUNIKORN-1218] Scheduler crashed with concurrent map access error in…
Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on PR #411:
URL: https://github.com/apache/yunikorn-core/pull/411#issuecomment-1130566554
# [Codecov](https://codecov.io/gh/apache/yunikorn-core/pull/411?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#411](https://codecov.io/gh/apache/yunikorn-core/pull/411?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (bbe9c4d) into [master](https://codecov.io/gh/apache/yunikorn-core/commit/9784ce0ea03b64b601dfcd520c6b8fd3d1c8fc1c?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9784ce0) will **not change** coverage.
> The diff coverage is `100.00%`.
```diff
@@           Coverage Diff           @@
##           master     #411   +/-   ##
=======================================
  Coverage   69.09%   69.09%
=======================================
  Files          67       67
  Lines        9654     9654
=======================================
  Hits         6670     6670
  Misses       2740     2740
  Partials      244      244
```
| [Impacted Files](https://codecov.io/gh/apache/yunikorn-core/pull/411?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [pkg/scheduler/health\_checker.go](https://codecov.io/gh/apache/yunikorn-core/pull/411/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGtnL3NjaGVkdWxlci9oZWFsdGhfY2hlY2tlci5nbw==) | `85.20% <100.00%> (ø)` | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/yunikorn-core/pull/411?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/yunikorn-core/pull/411?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9784ce0...bbe9c4d](https://codecov.io/gh/apache/yunikorn-core/pull/411?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [yunikorn-core] craigcondit closed pull request #411: [YUNIKORN-1218] Scheduler crashed with concurrent map access error in…
Posted by GitBox <gi...@apache.org>.
craigcondit closed pull request #411: [YUNIKORN-1218] Scheduler crashed with concurrent map access error in…
URL: https://github.com/apache/yunikorn-core/pull/411