Posted to dev@uniffle.apache.org by GitBox <gi...@apache.org> on 2023/01/12 08:33:01 UTC

[GitHub] [incubator-uniffle] zjf2012 opened a new issue, #476: [FEATURE] Respect "spark.shuffle.compress" configuration in Uniffle

zjf2012 opened a new issue, #476:
URL: https://github.com/apache/incubator-uniffle/issues/476

   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   
   
   ### Search before asking
   
   - [X] I have searched in the [issues](https://github.com/apache/incubator-uniffle/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Describe the feature
   
   Spark has a configuration, "spark.shuffle.compress", that controls whether shuffled data is compressed. It defaults to true. Uniffle, however, always compresses shuffled data and ignores this configuration. Uniffle should respect it to align better with vanilla Spark.
   
   ### Motivation
   
   We should let users decide whether compression is applied: some workloads do not compress well, and skipping compression for them avoids wasted CPU cycles.
   
   ### Describe the solution
   
   We can check the "spark.shuffle.compress" configuration in WriteBufferManager before compressing and in RssShuffleDataIterator before decompressing.
   
   ### Additional context
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
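
   The check proposed under "Describe the solution" could look roughly like the sketch below. This is an illustration under stated assumptions, not Uniffle's actual code: a plain Map stands in for SparkConf, java.util.zip stands in for Uniffle's codec, and the names ShuffleCompressSketch, shuffleCompressEnabled, encodeBlock and decodeBlock are hypothetical. The write path would correspond loosely to WriteBufferManager and the read path to RssShuffleDataIterator.

```java
import java.io.ByteArrayOutputStream;
import java.util.Map;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: gate compression on "spark.shuffle.compress".
final class ShuffleCompressSketch {
    static final String SHUFFLE_COMPRESS_KEY = "spark.shuffle.compress";

    // The issue states that this setting defaults to true in Spark.
    // Simplification: Boolean.parseBoolean treats any value other than
    // "true" (case-insensitive) as false instead of rejecting it.
    static boolean shuffleCompressEnabled(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(SHUFFLE_COMPRESS_KEY, "true"));
    }

    // Write path: compress only when the flag is on, otherwise pass through.
    static byte[] encodeBlock(byte[] raw, boolean compress) {
        if (!compress) {
            return raw;
        }
        Deflater deflater = new Deflater(); // stand-in for the real codec
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Read path: must use the same flag value as the writer did.
    static byte[] decodeBlock(byte[] stored, boolean compressed) {
        if (!compressed) {
            return stored;
        }
        Inflater inflater = new Inflater();
        inflater.setInput(stored);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && inflater.needsInput()) {
                    throw new IllegalStateException("truncated compressed block");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("block is not valid compressed data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```

   The important design point is that writer and reader read the same flag: a block written uncompressed must not be passed to the decompressor, and vice versa.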


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] jerqi commented on issue #476: [FEATURE] Respect "spark.shuffle.compress" configuration in Uniffle

Posted by "jerqi (via GitHub)" <gi...@apache.org>.
jerqi commented on issue #476:
URL: https://github.com/apache/incubator-uniffle/issues/476#issuecomment-1422083559

   closed by #495 




[GitHub] [incubator-uniffle] jerqi commented on issue #476: [FEATURE] Respect "spark.shuffle.compress" configuration in Uniffle

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #476:
URL: https://github.com/apache/incubator-uniffle/issues/476#issuecomment-1380147177

   > > Some of the compression algorithms that Uniffle supports may differ from Spark's.
   > 
   > This is not about Spark's compression algorithm, since the RSS shuffle manager is separate from Spark's shuffle manager. It is about whether to give users the choice to compress or not. There also seems to be a delegationShuffleManager that delegates to Spark's default shuffle manager in some cases, which may cause the shuffle managers to handle the compression configuration inconsistently.
   
   Ok, I get your point.




[GitHub] [incubator-uniffle] jerqi commented on issue #476: [FEATURE] Respect "spark.shuffle.compress" configuration in Uniffle

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #476:
URL: https://github.com/apache/incubator-uniffle/issues/476#issuecomment-1380031918

   Some of the compression algorithms that Uniffle supports may differ from Spark's.




[GitHub] [incubator-uniffle] zuston commented on issue #476: [FEATURE] Respect "spark.shuffle.compress" configuration in Uniffle

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #476:
URL: https://github.com/apache/incubator-uniffle/issues/476#issuecomment-1380200912

   > I believe Uniffle uses org.apache.uniffle.common.compression.Codec.NOOP to disable compression. But it has never been used in production and may have hidden issues. cc @zuston.
   
   NOOP is only used for tests right now.




[GitHub] [incubator-uniffle] jerqi closed issue #476: [FEATURE] Respect "spark.shuffle.compress" configuration in Uniffle

Posted by "jerqi (via GitHub)" <gi...@apache.org>.
jerqi closed issue #476: [FEATURE] Respect "spark.shuffle.compress" configuration in Uniffle
URL: https://github.com/apache/incubator-uniffle/issues/476




[GitHub] [incubator-uniffle] zjf2012 commented on issue #476: [FEATURE] Respect "spark.shuffle.compress" configuration in Uniffle

Posted by GitBox <gi...@apache.org>.
zjf2012 commented on issue #476:
URL: https://github.com/apache/incubator-uniffle/issues/476#issuecomment-1380051963

   > Some of the compression algorithms that Uniffle supports may differ from Spark's.
   
   This is not about Spark's compression algorithm, since the RSS shuffle manager is separate from Spark's shuffle manager. It is about whether to give users the choice to compress or not. There also seems to be a delegationShuffleManager that delegates to Spark's default shuffle manager in some cases, which may cause the shuffle managers to handle the compression configuration inconsistently.




[GitHub] [incubator-uniffle] advancedxy commented on issue #476: [FEATURE] Respect "spark.shuffle.compress" configuration in Uniffle

Posted by GitBox <gi...@apache.org>.
advancedxy commented on issue #476:
URL: https://github.com/apache/incubator-uniffle/issues/476#issuecomment-1380199120

   I believe Uniffle uses `org.apache.uniffle.common.compression.Codec.NOOP` to disable compression. But it has never been used in production and may have hidden issues. cc @zuston.
   
   I'm also in favor of respecting the compute engine's shuffle options. I took a quick look at how the codec is used, and it seems to be used only on the client side. Maybe we should reconsider this part and simply let the client (MR or Spark) delegate its compression codec to Spark's or MR's.

