Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/07/10 00:03:06 UTC

[GitHub] [iceberg] shardulm94 commented on issue #1021: Add _file and _pos metadata columns to ORC readers

shardulm94 commented on issue #1021:
URL: https://github.com/apache/iceberg/issues/1021#issuecomment-656408465


   I was planning to work on this, but I am unsure of some of the details.
   - How does the ORC reader know that it has to project metadata columns? Will they be part of the expected Iceberg schema provided to the readers? If so, will these columns have reserved field ids that help the readers identify them, or will they be identified by specific names?
   - I was thinking of passing the starting position of the first row in each VectorizedRowBatch through the OrcValueReader interface, and then creating a new OrcValueReader that returns baseOffset + currentRowIndex for every row. Let me know if this sounds okay to you.
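
   To make the second point concrete, here is a rough sketch of the idea. It does not use the real Iceberg or ORC classes: `PositionAwareReader`, `setBatchContext`, `RowPositionReader`, and `RowPositionDemo` are hypothetical names made up for illustration, and the batch is simulated by a plain row count rather than a VectorizedRowBatch. It only shows how a reader could produce baseOffset + currentRowIndex if the batch's starting position were handed to it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a per-column value reader.
// This is NOT the actual Iceberg OrcValueReader interface.
interface PositionAwareReader<T> {
  // Assumed hook: called once per batch with the file-level
  // position of the first row in that batch.
  void setBatchContext(long batchOffsetInFile);

  // Returns the value for the given row index within the current batch.
  T read(int rowInBatch);
}

// Produces the row position: batch base offset plus the row's index in the batch.
class RowPositionReader implements PositionAwareReader<Long> {
  private long batchOffsetInFile = 0L;

  @Override
  public void setBatchContext(long batchOffsetInFile) {
    this.batchOffsetInFile = batchOffsetInFile;
  }

  @Override
  public Long read(int rowInBatch) {
    return batchOffsetInFile + rowInBatch;
  }
}

class RowPositionDemo {
  // Simulates reading consecutive batches of the given sizes,
  // where the first batch starts at row firstRow of the file.
  static List<Long> positions(long firstRow, int... batchSizes) {
    RowPositionReader reader = new RowPositionReader();
    List<Long> result = new ArrayList<>();
    long offset = firstRow;
    for (int size : batchSizes) {
      reader.setBatchContext(offset);
      for (int row = 0; row < size; row++) {
        result.add(reader.read(row));
      }
      offset += size; // the next batch starts right after this one
    }
    return result;
  }
}
```

   In this sketch the caller is responsible for tracking the running offset across batches, so the reader itself stays stateless apart from the current base offset. Whether the real interface should carry that offset the same way is exactly the open question above.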

