You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by "RussellSpitzer (via GitHub)" <gi...@apache.org> on 2023/05/30 19:09:19 UTC

[GitHub] [iceberg] RussellSpitzer commented on pull request #7688: Add adaptive split size

RussellSpitzer commented on PR #7688:
URL: https://github.com/apache/iceberg/pull/7688#issuecomment-1568940063

   I think this is mentioned above, but it does feel like we are targeting this at the wrong place. If we have a min parallelism I think the controls should probably be centered around task coalescing. Currently for files with offsets we always break them into the maximal amount of offset tasks before recombining. The only real issue is for files without offsets correct? That's the only reason we may want to control the split size since they are cut up based on that property rather than actual offsets?
   
   I wonder if it might be clearer to just have a "Offset" codepath that just works during recombination and a special codepath for non-offset filetypes?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org