Posted to issues@celeborn.apache.org by "AngersZhuuuu (via GitHub)" <gi...@apache.org> on 2023/02/21 02:48:32 UTC

[GitHub] [incubator-celeborn] AngersZhuuuu opened a new pull request, #1253: [CELEBORN-316] Wrap Celeborn exception with CelebornIOException

AngersZhuuuu opened a new pull request, #1253:
URL: https://github.com/apache/incubator-celeborn/pull/1253

   ### What changes were proposed in this pull request?
   When Spark's blacklist mechanism is enabled, many executors are excluded from the current stage because of RSS exceptions. Executors should not be excluded for RSS exceptions: with this mechanism, fewer and fewer executors are left to run tasks, which slows the job down.
   
   We should wrap RSS exceptions in CelebornIOException so that they can be identified and handled on the Spark side.
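
A minimal Java sketch of the idea described above, not the PR's actual code: a failure inside the shuffle client is rethrown as a dedicated IOException subclass, so the caller can recognize it by type. The nested `CelebornIOException` is only a stand-in for the Scala class added by the PR (its two-argument constructor is the one visible in the diff). `WrapDemo`, `fetchChunk`, and `causedByCeleborn` are hypothetical names, and how Spark's exclusion logic would consume the type is not shown in this thread.

```java
import java.io.IOException;

class WrapDemo {
    // Stand-in for the class added by this PR (the real one is written in Scala).
    static class CelebornIOException extends IOException {
        CelebornIOException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical client call: a low-level failure is rethrown as CelebornIOException.
    static byte[] fetchChunk(boolean fail) throws IOException {
        try {
            if (fail) {
                throw new IllegalStateException("connection to worker lost");
            }
            return new byte[] {1, 2, 3};
        } catch (RuntimeException e) {
            // Keep the original failure as the cause so no diagnostic detail is lost.
            throw new CelebornIOException("Fetch chunk failed", e);
        }
    }

    // Hypothetical caller-side check: walk the cause chain looking for the wrapper type.
    static boolean causedByCeleborn(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof CelebornIOException) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        try {
            fetchChunk(true);
        } catch (IOException e) {
            System.out.println("caused by Celeborn: " + causedByCeleborn(e));
        }
    }
}
```

Walking the cause chain is one possible design choice here: it still finds the wrapper if another layer wraps the exception again before it reaches the caller.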
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@celeborn.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-celeborn] AngersZhuuuu commented on a diff in pull request #1253: [CELEBORN-316] Wrap Celeborn exception with CelebornIOException

Posted by "AngersZhuuuu (via GitHub)" <gi...@apache.org>.
AngersZhuuuu commented on code in PR #1253:
URL: https://github.com/apache/incubator-celeborn/pull/1253#discussion_r1113925744


##########
common/src/main/scala/org/apache/celeborn/common/exception/CelebornIOException.scala:
##########
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.celeborn.common.exception
+
+import java.io.IOException
+
+class CelebornIOException(message: String, cause: Throwable)

Review Comment:
   > How about CelebornException? There might be failures such as a failed registration or a sendRpc exception. If you want to handle Celeborn exceptions in Spark, I think there is no reason to assume that all of them are IO exceptions.
   
   It should extend IOException.
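
A small Java sketch of what extending IOException buys, under the assumption that existing callers already catch IOException: the subclass can be matched first when it needs special treatment, and it still falls through to any plain `catch (IOException e)` handler otherwise. The nested `CelebornIOException` is a stand-in for the PR's Scala class; `CatchOrderDemo` and `classify` are hypothetical names.

```java
import java.io.IOException;

class CatchOrderDemo {
    // Stand-in for the PR's Scala class; only the constructor shape is taken from the diff.
    static class CelebornIOException extends IOException {
        CelebornIOException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical handler: the specific type is matched first, plain IOException second.
    static String classify(IOException e) {
        try {
            throw e;
        } catch (CelebornIOException ce) {
            return "celeborn";
        } catch (IOException other) {
            return "io";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(new CelebornIOException("push failed", null))); // prints "celeborn"
        System.out.println(classify(new IOException("disk error")));               // prints "io"
    }
}
```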





[GitHub] [incubator-celeborn] FMX commented on a diff in pull request #1253: [CELEBORN-316] Wrap Celeborn exception with CelebornIOException

Posted by "FMX (via GitHub)" <gi...@apache.org>.
FMX commented on code in PR #1253:
URL: https://github.com/apache/incubator-celeborn/pull/1253#discussion_r1113886838


##########
common/src/main/scala/org/apache/celeborn/common/exception/CelebornIOException.scala:
##########
@@ -0,0 +1,28 @@
+
+package org.apache.celeborn.common.exception
+
+import java.io.IOException
+
+class CelebornIOException(message: String, cause: Throwable)

Review Comment:
   How about CelebornException? There might be failures such as a failed registration or a sendRpc exception. If you want to handle Celeborn exceptions in Spark, I think there is no reason to assume that all of them are IO exceptions.





[GitHub] [incubator-celeborn] FMX merged pull request #1253: [CELEBORN-316] Wrap Celeborn exception with CelebornIOException

Posted by "FMX (via GitHub)" <gi...@apache.org>.
FMX merged PR #1253:
URL: https://github.com/apache/incubator-celeborn/pull/1253




[GitHub] [incubator-celeborn] RexXiong commented on pull request #1253: [CELEBORN-316] Wrap Celeborn exception with CelebornIOException

Posted by "RexXiong (via GitHub)" <gi...@apache.org>.
RexXiong commented on PR #1253:
URL: https://github.com/apache/incubator-celeborn/pull/1253#issuecomment-1437810375

   Can we also change the related method signatures from IOException to CelebornIOException?




[GitHub] [incubator-celeborn] AngersZhuuuu commented on pull request #1253: [CELEBORN-316] Wrap Celeborn exception with CelebornIOException

Posted by "AngersZhuuuu (via GitHub)" <gi...@apache.org>.
AngersZhuuuu commented on PR #1253:
URL: https://github.com/apache/incubator-celeborn/pull/1253#issuecomment-1437823954

   > Can we also change the related method signatures from IOException to CelebornIOException?
   
   I think it's unnecessary; some of these methods extend Spark's API.
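
To illustrate the point about unchanged signatures, here is a hedged Java sketch: a method that keeps an inherited `throws IOException` clause can still throw the more specific subclass, and callers see the concrete type at runtime. `ExternalReader` is a hypothetical stand-in for an externally defined API (it is not a real Spark interface), `CelebornIOException` is a stand-in for the PR's Scala class, and the remaining names are illustrative only.

```java
import java.io.IOException;

class SignatureDemo {
    // Stand-in for the PR's Scala class; only the constructor shape is taken from the diff.
    static class CelebornIOException extends IOException {
        CelebornIOException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical externally defined API whose signature the implementer does not control.
    interface ExternalReader {
        int read() throws IOException;
    }

    // The override keeps `throws IOException` yet throws the subclass.
    static class FailingReader implements ExternalReader {
        @Override
        public int read() throws IOException {
            throw new CelebornIOException("read failed", new RuntimeException("worker unavailable"));
        }
    }

    // The caller only knows about IOException but can still inspect the runtime type.
    static String describe(ExternalReader reader) {
        try {
            return "read " + reader.read();
        } catch (IOException e) {
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new FailingReader())); // prints "CelebornIOException: read failed"
    }
}
```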




[GitHub] [incubator-celeborn] codecov[bot] commented on pull request #1253: [CELEBORN-316] Wrap Celeborn exception with CelebornIOException

Posted by "codecov[bot] (via GitHub)" <gi...@apache.org>.
codecov[bot] commented on PR #1253:
URL: https://github.com/apache/incubator-celeborn/pull/1253#issuecomment-1437791764

   # [Codecov](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#1253](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2e689b1) into [main](https://codecov.io/gh/apache/incubator-celeborn/commit/b09b85521a3492eba7eac5b2a4c14bbecd3b8b46?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b09b855) will **decrease** coverage by `0.01%`.
   > The diff coverage is `7.70%`.
   
   ```diff
   @@             Coverage Diff              @@
   ##               main    #1253      +/-   ##
   ============================================
   - Coverage     27.17%   27.15%   -0.01%     
   - Complexity      811      813       +2     
   ============================================
     Files           214      215       +1     
     Lines         18315    18336      +21     
     Branches       1988     1994       +6     
   ============================================
   + Hits           4975     4978       +3     
   - Misses        13014    13032      +18     
     Partials        326      326              
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...rg/apache/celeborn/client/read/RssInputStream.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y2xpZW50L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9jbGllbnQvcmVhZC9Sc3NJbnB1dFN0cmVhbS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...he/celeborn/client/read/WorkerPartitionReader.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y2xpZW50L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9jbGllbnQvcmVhZC9Xb3JrZXJQYXJ0aXRpb25SZWFkZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...a/org/apache/celeborn/client/write/DataPusher.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y2xpZW50L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9jbGllbnQvd3JpdGUvRGF0YVB1c2hlci5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [.../celeborn/common/write/InFlightRequestTracker.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9jb21tb24vd3JpdGUvSW5GbGlnaHRSZXF1ZXN0VHJhY2tlci5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...e/celeborn/common/write/SlowStartPushStrategy.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9jb21tb24vd3JpdGUvU2xvd1N0YXJ0UHVzaFN0cmF0ZWd5LmphdmE=) | `77.42% <0.00%> (ø)` | |
   | [...eleborn/common/exception/CelebornIOException.scala](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvY2VsZWJvcm4vY29tbW9uL2V4Y2VwdGlvbi9DZWxlYm9ybklPRXhjZXB0aW9uLnNjYWxh) | `0.00% <0.00%> (ø)` | |
   | [.../org/apache/celeborn/client/ShuffleClientImpl.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y2xpZW50L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9jbGllbnQvU2h1ZmZsZUNsaWVudEltcGwuamF2YQ==) | `18.21% <50.00%> (-0.06%)` | :arrow_down: |
   | [...deploy/master/clustermeta/AbstractMetaManager.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-bWFzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9zZXJ2aWNlL2RlcGxveS9tYXN0ZXIvY2x1c3Rlcm1ldGEvQWJzdHJhY3RNZXRhTWFuYWdlci5qYXZh) | `85.64% <0.00%> (-2.43%)` | :arrow_down: |
   | [...born/common/protocol/message/ControlMessages.scala](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvY2VsZWJvcm4vY29tbW9uL3Byb3RvY29sL21lc3NhZ2UvQ29udHJvbE1lc3NhZ2VzLnNjYWxh) | `0.14% <0.00%> (-<0.01%)` | :arrow_down: |
   | ... and [5 more](https://codecov.io/gh/apache/incubator-celeborn/pull/1253?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   

