Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2019/06/22 02:23:14 UTC

[GitHub] [incubator-iceberg] rdsr edited a comment on issue #227: ORC column map fix

rdsr edited a comment on issue #227: ORC column map fix
URL: https://github.com/apache/incubator-iceberg/pull/227#issuecomment-504619029
 
 
   @rdblue, @edgarRd: I was thinking about this and wanted to get your input here.
   
   I was wondering whether the implementation would become simpler and cleaner if we kept the column ids close to the ORC schema. For instance, we could do something similar to the Avro schema, where the column ids are maintained as metadata properties tied to each schema element. This, of course, is not possible today with ORC's `TypeDescription`. But let's say we exposed a `RichTypeDescription` schema for ORC instead of its standard `TypeDescription`. The `RichTypeDescription` would maintain metadata properties for each schema field, similar to Avro, and these properties would hold the column ids. This could get rid of the separate map that we have to maintain, and could also simplify schema evolution [by id] during read, where we'd have to use visitors, similar to Avro.
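
To make the idea concrete, here is a rough sketch of what such a `RichTypeDescription` might look like. It is purely illustrative and assumes nothing about ORC's real classes: the class body, the `field-id` property key, and the `withId` / `idToPath` helpers are all made-up names, and the type category is just a string rather than ORC's category enum. The point is only that the ids travel with the schema nodes, so the id map can be derived by walking the schema instead of being maintained separately.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: NOT part of ORC or Iceberg. It only illustrates
// attaching per-field metadata properties (including column ids) to a schema.
final class RichTypeDescription {
  // Assumed property key for the column id; the real key would need agreement.
  static final String ID_PROP = "field-id";

  private final String category; // e.g. "struct", "long", "string"
  private final List<String> fieldNames = new ArrayList<>();
  private final List<RichTypeDescription> children = new ArrayList<>();
  private final Map<String, String> properties = new LinkedHashMap<>();

  RichTypeDescription(String category) {
    this.category = category;
  }

  String category() {
    return category;
  }

  // Attach an arbitrary metadata property to this schema node.
  RichTypeDescription withProperty(String key, String value) {
    properties.put(key, value);
    return this;
  }

  // Convenience helper: store the column id as a metadata property.
  RichTypeDescription withId(int id) {
    return withProperty(ID_PROP, Integer.toString(id));
  }

  // Add a named child field (only meaningful for struct-like nodes).
  RichTypeDescription addField(String name, RichTypeDescription child) {
    fieldNames.add(name);
    children.add(child);
    return this;
  }

  // Returns the column id, or null if this node carries none.
  Integer id() {
    String value = properties.get(ID_PROP);
    return value == null ? null : Integer.valueOf(value);
  }

  // Derive the id -> dotted field path map by walking the schema depth-first,
  // rather than maintaining it as a separate data structure.
  static Map<Integer, String> idToPath(RichTypeDescription root) {
    Map<Integer, String> result = new LinkedHashMap<>();
    collect(root, "", result);
    return result;
  }

  private static void collect(RichTypeDescription type, String path, Map<Integer, String> out) {
    Integer id = type.id();
    if (id != null) {
      out.put(id, path);
    }
    for (int i = 0; i < type.children.size(); i++) {
      String name = type.fieldNames.get(i);
      String childPath = path.isEmpty() ? name : path + "." + name;
      collect(type.children.get(i), childPath, out);
    }
  }
}
```

With something like this, a struct built as `new RichTypeDescription("struct").addField("id", new RichTypeDescription("long").withId(1))` carries its ids itself, and a read-side visitor could resolve columns by id from the schema alone.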
   
   The big benefit I see is that we don't have to expose two data structures or spread the schema information across two classes.  
   
   Another potential benefit of this approach is that, if we can get the ORC community to adopt this per-field schema metadata patch, we can completely get rid of our own `RichTypeDescription` without changing our code drastically.
