Posted to issues@paimon.apache.org by "JingsongLi (via GitHub)" <gi...@apache.org> on 2023/03/29 03:09:25 UTC

[GitHub] [incubator-paimon] JingsongLi opened a new issue, #749: [Feature] Optimize batch multiple partitions inserting

JingsongLi opened a new issue, #749:
URL: https://github.com/apache/incubator-paimon/issues/749

   ### Search before asking
   
   - [X] I searched in the [issues](https://github.com/apache/incubator-paimon/issues) and found nothing similar.
   
   
   ### Motivation
   
   By default, the batch sink should sort its input by partition and `sequence.field` to avoid generating a large number of small files. Too many small files cause poor performance, especially on object storage.
   
   We cannot implement `SupportsPartitioning.requiresPartitionGrouping`, because we also need `sequence.field` for sorting; otherwise we cannot determine which record is the last one.
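
To illustrate the motivation, here is a minimal sketch in Python (not Paimon code). It assumes a toy writer that keeps a single file open and rolls to a new file whenever the partition of the incoming record changes; the record layout and the names `count_file_opens`, `sort_for_batch_write`, and `last_record_per_key` are hypothetical and exist only for this example. It shows that sorting by partition and then by the sequence field reduces the number of files opened, and that the last record seen per key is only the latest one when the input is ordered by the sequence field.

```python
from operator import itemgetter

# Hypothetical input: two partitions, interleaved and out of sequence order.
records = [
    {"partition": "2023-03-01", "key": "a", "seq": 2, "value": "a2"},
    {"partition": "2023-03-02", "key": "b", "seq": 1, "value": "b1"},
    {"partition": "2023-03-01", "key": "a", "seq": 1, "value": "a1"},
    {"partition": "2023-03-02", "key": "b", "seq": 2, "value": "b2"},
]


def count_file_opens(rows):
    """Count files opened by a toy writer that rolls on every partition change.

    Assumption: the writer holds one open file at a time. This is a
    simplification for illustration, not how the real sink works.
    """
    opens = 0
    current = object()  # sentinel that is not equal to any partition value
    for row in rows:
        if row["partition"] != current:
            opens += 1
            current = row["partition"]
    return opens


def sort_for_batch_write(rows):
    """Sort by partition first, then by the sequence field within a partition."""
    return sorted(rows, key=itemgetter("partition", "seq"))


def last_record_per_key(rows):
    """Keep the last record seen for each (partition, key) pair.

    The result is the latest record only if rows arrive in sequence order.
    """
    latest = {}
    for row in rows:
        latest[(row["partition"], row["key"])] = row
    return latest


print(count_file_opens(records))                        # 4
print(count_file_opens(sort_for_batch_write(records)))  # 2
```

With the unsorted input, `last_record_per_key(records)` keeps `a1` for key `a`, even though `a2` has the higher sequence number; after `sort_for_batch_write`, it keeps `a2`. Grouping by partition alone would fix the file count but not this ordering problem.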
   
   ### Solution
   
   _No response_
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] I'm willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] felixYyu commented on issue #749: [Feature] Optimize batch multiple partitions inserting

Posted by "felixYyu (via GitHub)" <gi...@apache.org>.
felixYyu commented on issue #749:
URL: https://github.com/apache/incubator-paimon/issues/749#issuecomment-1489725155

   I want to finish it.

