Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/07/02 20:53:06 UTC

[GitHub] [iceberg] nvitucci commented on pull request #2780: Add partition files to SparkBatchScan description

nvitucci commented on pull request #2780:
URL: https://github.com/apache/iceberg/pull/2780#issuecomment-873251120


   > This is adding every file touched to the description which is probably too much (since this could be in the thousands of files). One of the big issues here is this description will sit on the Spark Driver for a while so it's a pretty large chunk of memory. Maybe just a summary would be sufficient? Reading Z Manifests - X files from Y partitions?
   
   I agree. My aim with this change is to have a way to show that the partition filters are actually being used. I'll try to work out the minimal amount of information that is still useful in this respect.
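
A minimal sketch of what such a summary could look like, following the "Reading Z Manifests - X files from Y partitions" wording suggested above. `PlannedFile` and `ScanSummary` are hypothetical stand-ins invented for illustration; they are not Iceberg or Spark classes, and the real scan would derive these counts from its own planning results.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for one planned data file; not an Iceberg class.
final class PlannedFile {
  final String manifest;   // manifest the file was read from (assumed field)
  final String partition;  // partition the file belongs to (assumed field)
  final String path;       // data file location (assumed field)

  PlannedFile(String manifest, String partition, String path) {
    this.manifest = manifest;
    this.partition = partition;
    this.path = path;
  }
}

// Hypothetical helper that builds a constant-size summary instead of listing every file.
final class ScanSummary {
  static String describe(List<PlannedFile> files) {
    Set<String> manifests = new HashSet<>();
    Set<String> partitions = new HashSet<>();
    for (PlannedFile file : files) {
      manifests.add(file.manifest);
      partitions.add(file.partition);
    }
    // Only three counts are kept, so the string stays small even for thousands of files.
    return "Reading " + manifests.size() + " manifests - "
        + files.size() + " files from " + partitions.size() + " partitions";
  }
}
```

The string has the same size whether the scan touches three files or thousands, which addresses the driver memory concern, and a lower partition count than the table's total would still indicate that the partition filters were applied.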
   
   > I think it's a little dangerous to run planning in description since that may be called even when the plan isn't executed and since it becomes caching in this implementation it may have issues if a user runs "Explain" then adds more clauses or something. For a long planned query that also means calling "explain" ends up being a rather expensive operation.
   
   Yes, this is my main doubt too. I've changed the implementation to use the lazy `tasks` variable in the extending classes. I'm not sure this will make much of a difference, but is it going in the right direction?
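
One way to picture the concern is a lazily planned, memoized task list whose description never triggers planning by itself. This is only a sketch under assumptions: `LazyTasks` and its methods are hypothetical names, not the actual `SparkBatchScan` code, and the planner is passed in as a plain `Supplier`.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical holder for lazily planned tasks; not the actual Iceberg implementation.
final class LazyTasks<T> {
  private final Supplier<List<T>> planner; // the expensive planning step (assumed)
  private List<T> tasks;                   // null until planning has run once

  LazyTasks(Supplier<List<T>> planner) {
    this.planner = planner;
  }

  // Plans on first access only; later calls reuse the cached result.
  synchronized List<T> get() {
    if (tasks == null) {
      tasks = planner.get();
    }
    return tasks;
  }

  // Reads the cached state without planning, so describing the scan stays cheap.
  synchronized String description() {
    return tasks == null ? "tasks not planned yet" : tasks.size() + " tasks planned";
  }
}
```

With this shape, calling the description before execution costs nothing, at the price of showing no file information until the tasks have been planned for another reason. Whether that trade-off is acceptable is the open question here.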
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
