Posted to issues@paimon.apache.org by "wForget (via GitHub)" <gi...@apache.org> on 2024/04/16 07:01:03 UTC

[I] [Feature] Default parallelism in SparkWriter also needs to consider numShufflePartitions [paimon]

wForget opened a new issue, #3217:
URL: https://github.com/apache/paimon/issues/3217

   ### Search before asking
   
   - [X] I searched in the [issues](https://github.com/apache/paimon/issues) and found nothing similar.
   
   
   ### Motivation
   
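   `PaimonSparkWriter` uses `sparkSession.sparkContext.defaultParallelism` as its default parallelism. When dynamic resource allocation (DRA) is enabled, this value can be small, because it follows the executors currently available rather than the size the job may scale to, so the write runs with low parallelism. We could also take `numShufflePartitions` into account when choosing the default.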
   
   https://github.com/apache/paimon/blob/0bd955cb30c015d918e886f2fa61eb70ae697da8/paimon-spark/paimon-spark-common/src/main/scala/org/apache/paimon/spark/commands/PaimonSparkWriter.scala#L133
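   
   A minimal sketch of the idea, not the actual Paimon change: take the larger of the two values so that a small `defaultParallelism` under DRA does not cap the write. The object name `WriteParallelism` and the method `choose` are hypothetical, and using `max` is an assumption about how the two settings would be combined. The inputs are plain integers so the snippet runs without a Spark dependency.
   
   ```scala
   // Hypothetical helper, for illustration only (not part of Paimon).
   object WriteParallelism {
   
     // Assumption: prefer the larger value, so a cluster that has not yet
     // scaled up under dynamic allocation still gets a reasonable parallelism.
     def choose(defaultParallelism: Int, numShufflePartitions: Int): Int =
       math.max(defaultParallelism, numShufflePartitions)
   
     def main(args: Array[String]): Unit = {
       // Few executors registered so far, shuffle partitions left at 200.
       println(choose(defaultParallelism = 2, numShufflePartitions = 200)) // prints 200
       // Large static cluster: defaultParallelism already dominates.
       println(choose(defaultParallelism = 400, numShufflePartitions = 200)) // prints 400
     }
   }
   ```
   
   In the writer, the first argument would be `sparkSession.sparkContext.defaultParallelism`, and the second would be read from the session's SQL configuration (`spark.sql.shuffle.partitions`).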
   
   ### Solution
   
   _No response_
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [X] I'm willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] [Feature] Default parallelism in SparkWriter also needs to consider numShufflePartitions [paimon]

Posted by "wForget (via GitHub)" <gi...@apache.org>.
wForget commented on issue #3217:
URL: https://github.com/apache/paimon/issues/3217#issuecomment-2058377830

   Similar to https://github.com/apache/iceberg/pull/8327




Re: [I] [Feature] Default parallelism in SparkWriter also needs to consider numShufflePartitions [paimon]

Posted by "YannByron (via GitHub)" <gi...@apache.org>.
YannByron closed issue #3217: [Feature] Default parallelism in SparkWriter also needs to consider numShufflePartitions
URL: https://github.com/apache/paimon/issues/3217

