Posted to issues@celeborn.apache.org by "AngersZhuuuu (via GitHub)" <gi...@apache.org> on 2023/02/21 02:48:32 UTC
[GitHub] [incubator-celeborn] AngersZhuuuu opened a new pull request, #1253: [CELEBORN-316] Wrap Celeborn exception with CelebornIOException
AngersZhuuuu opened a new pull request, #1253:
URL: https://github.com/apache/incubator-celeborn/pull/1253
### What changes were proposed in this pull request?
When Spark's blacklist mechanism is enabled, many executors are excluded from the current stage because of RSS exceptions. RSS exceptions should not cause an executor to be excluded, though, and the exclusions leave fewer and fewer executors to run tasks, which slows the job down.
We should wrap RSS exceptions in CelebornIOException so that they can be handled on the Spark side.
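A minimal sketch of the wrapping idea, not the code from this PR. Only the `(message, cause)` constructor is taken from the diff; the nested Java class is a stand-in for the Scala one, and `sendToWorker`, `pushData`, and `isShuffleServiceFailure` are hypothetical names used purely for illustration.

```java
import java.io.IOException;

class WrapSketch {
    // Stand-in for the Scala class added by this PR (assumption: same
    // constructor shape, extends IOException as discussed in the review).
    static class CelebornIOException extends IOException {
        CelebornIOException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical low-level call that fails with an unrelated exception type.
    static void sendToWorker(byte[] data) {
        throw new IllegalStateException("connection to worker lost");
    }

    // Wrap any failure so callers see one Celeborn-specific exception type.
    static void pushData(byte[] data) throws IOException {
        try {
            sendToWorker(data);
        } catch (Exception e) {
            throw new CelebornIOException("push data failed", e);
        }
    }

    // Hypothetical caller-side check: walk the cause chain looking for the wrapper.
    static boolean isShuffleServiceFailure(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof CelebornIOException) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        try {
            pushData(new byte[0]);
        } catch (IOException e) {
            System.out.println(isShuffleServiceFailure(e));
        }
    }
}
```

How Spark would actually use such a check (for example, to skip executor exclusion) is not described in this thread, so the sketch stops at detecting the wrapper.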
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@celeborn.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-celeborn] AngersZhuuuu commented on a diff in pull request #1253: [CELEBORN-316] Wrap Celeborn exception with CelebornIOException
Posted by "AngersZhuuuu (via GitHub)" <gi...@apache.org>.
AngersZhuuuu commented on code in PR #1253:
URL: https://github.com/apache/incubator-celeborn/pull/1253#discussion_r1113925744
##########
common/src/main/scala/org/apache/celeborn/common/exception/CelebornIOException.scala:
##########
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.celeborn.common.exception
+
+import java.io.IOException
+
+class CelebornIOException(message: String, cause: Throwable)
Review Comment:
> How about CelebornException? There might be something like register failed or sendRpc Exception. If you wanna handle celeborn exceptions in spark, I think there is no reason to assume that all exceptions are IO exceptions.
It should extend IOException.
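One way to read this reply, sketched under assumptions: if the new type extends `IOException`, existing `catch (IOException e)` blocks keep working while newer code can single out the Celeborn case. The handler names below are hypothetical and the nested class is a Java stand-in for the Scala class in the diff.

```java
import java.io.IOException;

class CatchOrderSketch {
    // Stand-in for the class under review (assumption: extends IOException).
    static class CelebornIOException extends IOException {
        CelebornIOException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical handler written before the new type existed.
    static String legacyHandler(IOException toThrow) {
        try {
            throw toThrow;
        } catch (IOException e) {
            return "io";
        }
    }

    // Hypothetical newer handler: the more specific catch must come first.
    static String newHandler(IOException toThrow) {
        try {
            throw toThrow;
        } catch (CelebornIOException e) {
            return "celeborn";
        } catch (IOException e) {
            return "io";
        }
    }

    public static void main(String[] args) {
        IOException wrapped =
            new CelebornIOException("fetch failed", new RuntimeException("rpc timeout"));
        System.out.println(legacyHandler(wrapped));
        System.out.println(newHandler(wrapped));
        System.out.println(newHandler(new IOException("disk full")));
    }
}
```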
[GitHub] [incubator-celeborn] FMX commented on a diff in pull request #1253: [CELEBORN-316] Wrap Celeborn exception with CelebornIOException
Posted by "FMX (via GitHub)" <gi...@apache.org>.
FMX commented on code in PR #1253:
URL: https://github.com/apache/incubator-celeborn/pull/1253#discussion_r1113886838
##########
common/src/main/scala/org/apache/celeborn/common/exception/CelebornIOException.scala:
##########
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.celeborn.common.exception
+
+import java.io.IOException
+
+class CelebornIOException(message: String, cause: Throwable)
Review Comment:
How about CelebornException? There might be something like register failed or sendRpc Exception. If you wanna handle celeborn exceptions in spark, I think there is no reason to assume that all exceptions are IO exceptions.
[GitHub] [incubator-celeborn] FMX merged pull request #1253: [CELEBORN-316] Wrap Celeborn exception with CelebornIOException
Posted by "FMX (via GitHub)" <gi...@apache.org>.
FMX merged PR #1253:
URL: https://github.com/apache/incubator-celeborn/pull/1253
[GitHub] [incubator-celeborn] RexXiong commented on pull request #1253: [CELEBORN-316] Wrap Celeborn exception with CelebornIOException
Posted by "RexXiong (via GitHub)" <gi...@apache.org>.
RexXiong commented on PR #1253:
URL: https://github.com/apache/incubator-celeborn/pull/1253#issuecomment-1437810375
Can we also change the related method signatures from IOException to CelebornIOException?
[GitHub] [incubator-celeborn] AngersZhuuuu commented on pull request #1253: [CELEBORN-316] Wrap Celeborn exception with CelebornIOException
Posted by "AngersZhuuuu (via GitHub)" <gi...@apache.org>.
AngersZhuuuu commented on PR #1253:
URL: https://github.com/apache/incubator-celeborn/pull/1253#issuecomment-1437823954
> Can we also change the related method signatures from IOException to CelebornIOException?
I think it's unnecessary; some of those methods come from Spark's API.
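A small sketch of why the signatures can stay as they are, assuming the new type extends `IOException`. The thread does not say which Spark APIs are meant, so `java.io.InputStream` is used here as a stand-in for an inherited method declared with `throws IOException`; `FailingShuffleStream` is a hypothetical class.

```java
import java.io.IOException;
import java.io.InputStream;

class SignatureSketch {
    // Stand-in for the class added by this PR (assumption: extends IOException).
    static class CelebornIOException extends IOException {
        CelebornIOException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical stream; InputStream.read() is declared as `throws IOException`.
    static class FailingShuffleStream extends InputStream {
        @Override
        public int read() throws IOException {
            // The inherited signature is unchanged, yet the subtype is thrown.
            throw new CelebornIOException(
                "fetch chunk failed", new RuntimeException("worker lost"));
        }
    }

    public static void main(String[] args) {
        try (InputStream in = new FailingShuffleStream()) {
            in.read();
        } catch (IOException e) {
            System.out.println(e instanceof CelebornIOException);
        }
    }
}
```

Narrowing the declared exception type on an override is also legal in Java, but callers that go through the inherited type would still see `IOException` in the signature.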
[GitHub] [incubator-celeborn] codecov[bot] commented on pull request #1253: [CELEBORN-316] Wrap Celeborn exception with CelebornIOException
Posted by "codecov[bot] (via GitHub)" <gi...@apache.org>.
codecov[bot] commented on PR #1253:
URL: https://github.com/apache/incubator-celeborn/pull/1253#issuecomment-1437791764
# [Codecov](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#1253](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2e689b1) into [main](https://codecov.io/gh/apache/incubator-celeborn/commit/b09b85521a3492eba7eac5b2a4c14bbecd3b8b46?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b09b855) will **decrease** coverage by `0.01%`.
> The diff coverage is `7.70%`.
```diff
@@             Coverage Diff              @@
##               main    #1253      +/-   ##
============================================
- Coverage     27.17%   27.15%   -0.01%
- Complexity      811      813       +2
============================================
  Files           214      215       +1
  Lines         18315    18336      +21
  Branches       1988     1994       +6
============================================
+ Hits           4975     4978       +3
- Misses        13014    13032      +18
  Partials        326      326
```
| [Impacted Files](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...rg/apache/celeborn/client/read/RssInputStream.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y2xpZW50L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9jbGllbnQvcmVhZC9Sc3NJbnB1dFN0cmVhbS5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...he/celeborn/client/read/WorkerPartitionReader.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y2xpZW50L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9jbGllbnQvcmVhZC9Xb3JrZXJQYXJ0aXRpb25SZWFkZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...a/org/apache/celeborn/client/write/DataPusher.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y2xpZW50L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9jbGllbnQvd3JpdGUvRGF0YVB1c2hlci5qYXZh) | `0.00% <0.00%> (ø)` | |
| [.../celeborn/common/write/InFlightRequestTracker.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9jb21tb24vd3JpdGUvSW5GbGlnaHRSZXF1ZXN0VHJhY2tlci5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...e/celeborn/common/write/SlowStartPushStrategy.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9jb21tb24vd3JpdGUvU2xvd1N0YXJ0UHVzaFN0cmF0ZWd5LmphdmE=) | `77.42% <0.00%> (ø)` | |
| [...eleborn/common/exception/CelebornIOException.scala](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvY2VsZWJvcm4vY29tbW9uL2V4Y2VwdGlvbi9DZWxlYm9ybklPRXhjZXB0aW9uLnNjYWxh) | `0.00% <0.00%> (ø)` | |
| [.../org/apache/celeborn/client/ShuffleClientImpl.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y2xpZW50L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9jbGllbnQvU2h1ZmZsZUNsaWVudEltcGwuamF2YQ==) | `18.21% <50.00%> (-0.06%)` | :arrow_down: |
| [...deploy/master/clustermeta/AbstractMetaManager.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-bWFzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9zZXJ2aWNlL2RlcGxveS9tYXN0ZXIvY2x1c3Rlcm1ldGEvQWJzdHJhY3RNZXRhTWFuYWdlci5qYXZh) | `85.64% <0.00%> (-2.43%)` | :arrow_down: |
| [...born/common/protocol/message/ControlMessages.scala](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvY2VsZWJvcm4vY29tbW9uL3Byb3RvY29sL21lc3NhZ2UvQ29udHJvbE1lc3NhZ2VzLnNjYWxh) | `0.14% <0.00%> (-<0.01%)` | :arrow_down: |
| ... and [5 more](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |