You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/04/13 04:52:27 UTC

[GitHub] [incubator-iceberg] rdsr opened a new issue #920: Implement Schema visitors for ORC

rdsr opened a new issue #920: Implement Schema visitors for ORC 
URL: https://github.com/apache/incubator-iceberg/issues/920
 
 
   I wanted to see if I could implement #897 for ORC.  Given how ORC value readers are implemented for Iceberg-generics and Iceberg-spark, we have to do extra work. We will have to traverse Iceberg schema in `SparkOrcReader` just like we currently do in `GenericOrcReader` This traversal with Iceberg schema is duplicate work which can probably be abstracted in a visitor just as we do for `AvroSchemaWithTypeVisitor` .
   
   This is a big refactoring change, but overall will improve readability and remove duplicate traversal code.  But we don't need to absolutely do it, we can traverse the Iceberg schema in SparkOrcReader and have an abstract StructReader which both Spark and data Orc readers share.   
   
   I'm +1 for doing it, but before doing wanted to see if folks are ok with it.  @rdblue, @shardulm94, what do u guys think?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [incubator-iceberg] rdblue commented on issue #920: Implement Schema visitors for ORC value readers

Posted by GitBox <gi...@apache.org>.
rdblue commented on issue #920: Implement Schema visitors for ORC value readers
URL: https://github.com/apache/incubator-iceberg/issues/920#issuecomment-612969043
 
 
   I think it's a good idea to add ORC schema visitors. I've found them helpful for Avro and Parquet because the recursive traversal is abstracted from the logic of each visitor. It's been easy to build new visitors. So I'm +1.
   
   I also want to point out that this doesn't have to be one big refactor. We can keep the current code as it is and just use the visitors for new things, and slowly convert over to the visitors in places where it makes sense. A minimal start is to define the visitor interface and visit method, then use it for what you need to, like the read builder in this case.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] edgarRd commented on issue #920: Implement Schema visitors for ORC value readers

Posted by GitBox <gi...@apache.org>.
edgarRd commented on issue #920:
URL: https://github.com/apache/iceberg/issues/920#issuecomment-636101655


   @rdsr we can probably close this issue since the visitors are implemented in https://github.com/apache/iceberg/pull/989 - right?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [incubator-iceberg] rdsr edited a comment on issue #920: Implement Schema visitors for ORC value readers

Posted by GitBox <gi...@apache.org>.
rdsr edited a comment on issue #920:
URL: https://github.com/apache/incubator-iceberg/issues/920#issuecomment-616336196


   Thanks @rdblue . I'm taking this up to support #897 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdsr closed issue #920: Implement Schema visitors for ORC value readers

Posted by GitBox <gi...@apache.org>.
rdsr closed issue #920:
URL: https://github.com/apache/iceberg/issues/920


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdsr commented on issue #920: Implement Schema visitors for ORC value readers

Posted by GitBox <gi...@apache.org>.
rdsr commented on issue #920:
URL: https://github.com/apache/iceberg/issues/920#issuecomment-636217401


   Yea. I think we can close this. There are a few more visitors we could implement, but we'd take that up when required


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [incubator-iceberg] rdsr commented on issue #920: Implement Schema visitors for ORC value readers

Posted by GitBox <gi...@apache.org>.
rdsr commented on issue #920:
URL: https://github.com/apache/incubator-iceberg/issues/920#issuecomment-616336196


   Thanks @rdblue . I'm taking this up as part of #897 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [incubator-iceberg] rdsr commented on issue #920: Implement Schema visitors for ORC value readers

Posted by GitBox <gi...@apache.org>.
rdsr commented on issue #920:
URL: https://github.com/apache/incubator-iceberg/issues/920#issuecomment-620776035


   I plan to work on this this week together with #897 .


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org