Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2022/01/04 12:26:04 UTC

[GitHub] [hive] marton-bod opened a new pull request #2917: HIVE-25843: Add flag to disable Iceberg FileIO config serialization

marton-bod opened a new pull request #2917:
URL: https://github.com/apache/hive/pull/2917


   Cross-porting https://github.com/apache/iceberg/pull/3752 from Iceberg (commit da712eaf60744c933c08fe1cab7a00cdcb2f4829)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] marton-bod merged pull request #2917: HIVE-25843: Add flag to disable Iceberg FileIO config serialization

Posted by GitBox <gi...@apache.org>.
marton-bod merged pull request #2917:
URL: https://github.com/apache/hive/pull/2917


   




[GitHub] [hive] marton-bod commented on pull request #2917: HIVE-25843: Add flag to disable Iceberg FileIO config serialization

Posted by GitBox <gi...@apache.org>.
marton-bod commented on pull request #2917:
URL: https://github.com/apache/hive/pull/2917#issuecomment-1008749526


   @pvary when working with metadata queries, the IO can be part of the Task itself, e.g.: https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/DataFilesTable.java#L139
   If we removed the Hadoop config from this IO to make the splits smaller, we would not be able to inject the config back later on the Tez worker side, since the Task API does not provide any option for that.

   So I've disabled the serialization skipping for metadata queries. We can look into making this work for metadata queries as well in the future if the need arises, but I think that might require some Iceberg-side API changes too.
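
The trade-off described in the comment above can be illustrated with a minimal, self-contained sketch. This is not the code from the PR or from Iceberg: `SketchFileIO`, `shouldSkip`, `sampleConf` and the property names are hypothetical stand-ins, and a plain `Map` plays the role of the Hadoop config. The sketch only shows the general idea: the config is left out of the serialized payload when the flag is on, the worker side injects its own config afterwards, and skipping is turned off for metadata queries, where no such injection point is assumed to exist.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class ConfigSerializationSketch {

  /** Hypothetical stand-in for a FileIO that carries a Hadoop-style config. */
  static class SketchFileIO implements Serializable {
    private static final long serialVersionUID = 1L;
    private final boolean skipConfSerialization;
    // transient: default Java serialization never writes this field
    private transient Map<String, String> conf;

    SketchFileIO(Map<String, String> conf, boolean skipConfSerialization) {
      this.conf = new HashMap<>(conf);
      this.skipConfSerialization = skipConfSerialization;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
      out.defaultWriteObject();
      if (!skipConfSerialization) {
        out.writeObject(new HashMap<>(conf)); // config travels with the split
      }
    }

    @SuppressWarnings("unchecked")
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
      in.defaultReadObject();
      if (!skipConfSerialization) {
        conf = (Map<String, String>) in.readObject();
      }
      // otherwise conf stays null until the worker calls setConf
    }

    /** Worker-side injection of the locally available config. */
    void setConf(Map<String, String> workerConf) {
      this.conf = workerConf;
    }

    boolean hasConf() {
      return conf != null;
    }

    String get(String key) {
      if (conf == null) {
        throw new IllegalStateException("config was not serialized and has not been injected");
      }
      return conf.get(key);
    }
  }

  /** Assumption: skipping is only safe when the worker can inject the config back. */
  static boolean shouldSkip(boolean flagEnabled, boolean metadataQuery) {
    return flagEnabled && !metadataQuery;
  }

  static byte[] serialize(Serializable obj) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
      out.writeObject(obj);
    }
    return buffer.toByteArray();
  }

  static SketchFileIO deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
      return (SketchFileIO) in.readObject();
    }
  }

  /** Builds a made-up config with a few hundred entries to make the size gap visible. */
  static Map<String, String> sampleConf() {
    Map<String, String> conf = new HashMap<>();
    for (int i = 0; i < 500; i++) {
      conf.put("sketch.property." + i, "value-" + i);
    }
    return conf;
  }

  public static void main(String[] args) throws Exception {
    Map<String, String> conf = sampleConf();
    byte[] full = serialize(new SketchFileIO(conf, shouldSkip(false, false)));
    byte[] slim = serialize(new SketchFileIO(conf, shouldSkip(true, false)));
    System.out.println("with config: " + full.length + " bytes");
    System.out.println("without config: " + slim.length + " bytes");

    SketchFileIO onWorker = deserialize(slim);
    onWorker.setConf(conf); // the worker supplies its own copy of the config
    System.out.println(onWorker.get("sketch.property.7"));
  }
}
```

In this sketch, an IO deserialized without its config is unusable until `setConf` is called. That mirrors the concern raised for metadata queries: if the IO is embedded somewhere the worker cannot reach, the config has to travel with it.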




[GitHub] [hive] marton-bod removed a comment on pull request #2917: HIVE-25843: Add flag to disable Iceberg FileIO config serialization

Posted by GitBox <gi...@apache.org>.
marton-bod removed a comment on pull request #2917:
URL: https://github.com/apache/hive/pull/2917#issuecomment-1005740910






[GitHub] [hive] marton-bod commented on pull request #2917: HIVE-25843: Add flag to disable Iceberg FileIO config serialization

Posted by GitBox <gi...@apache.org>.
marton-bod commented on pull request #2917:
URL: https://github.com/apache/hive/pull/2917#issuecomment-1005740910


   @pvary Can you please take an initial look? I'm still thinking about the best way to do this, but currently I think using a validation method on the storage handler is the best way to go. Not entirely comfortable with tying this new method to the FileSinkDesc (ideally I'd like to make it a bit more generic) but so far that was the only thing that worked out well. 

