Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/07/01 17:28:47 UTC

[GitHub] [iceberg] rdsr opened a new issue #1155: Support HiveCatalog for Iceberg StorageHandler

rdsr opened a new issue #1155:
URL: https://github.com/apache/iceberg/issues/1155


   @massdosage, @rdblue: Is anyone working on this? I wanted to take this up, and we are only using HiveCatalog.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] guilload commented on issue #1155: Support HiveCatalog for Iceberg StorageHandler

Posted by GitBox <gi...@apache.org>.
guilload commented on issue #1155:
URL: https://github.com/apache/iceberg/issues/1155#issuecomment-652598657


   I have some thoughts regarding this. Now that the SerDe has been merged, there are two pieces of code that do essentially the same thing:
   - [IcebergInputFormat.findTable(...)](https://github.com/apache/iceberg/blob/master/mr/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java#L357)
   - [TableResolver.resolveTableFromConfiguration(...)](https://github.com/apache/iceberg/blob/master/mr/src/main/java/org/apache/iceberg/mr/mapred/TableResolver.java#L45)
   
   1. Those two classes should probably be consolidated into one.
   2. The ability to define a custom catalog loader is neat. We actually use our own catalog at Airbnb and that feature comes in handy.
   3. At the same time, most users are going to use either the default Hadoop or Hive catalogs, so let's make it easy for them.
   4. I'm personally not a fan of using a single "table path" configuration property to pass either a file path or a table identifier, and then having to look for the `/` character to know which use is intended.
   
   Roughly, I have something like this in mind:
   
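   ```python
   def find_table(conf):
     # Pseudocode: HadoopTables, HadoopCatalog and HiveCatalog refer to the Iceberg
     # Java classes; load_class is a placeholder for loading a class by name.

     # A table is addressed by path or by identifier, never both; likewise for
     # a built-in catalog name versus a custom catalog loader class.
     assert conf.get('table.identifier') is None or conf.get('table.path') is None
     assert conf.get('catalog') is None or conf.get('catalog.loader.class') is None
   
     # 1. Explicit file path: load the table directly.
     if conf.get('table.path'):
       return HadoopTables.load(conf.get('table.path'))
   
     # 2. Custom catalog loader supplied by the user.
     if conf.get('catalog.loader.class'):
       loader = load_class(conf.get('catalog.loader.class'))
       return loader.load(conf)
   
     # 3. Built-in catalogs, selected by name.
     identifier = conf.get('table.identifier')
   
     if conf.get('catalog') == 'hadoop':
       return HadoopCatalog.load(identifier)
     elif conf.get('catalog') == 'hive':
       return HiveCatalog.load(identifier)
     else:
       raise ...
   ```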
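
To illustrate point 4 in the comment above, here is a small runnable sketch contrasting the `/` heuristic with two explicit properties. It is only an illustration, not Iceberg code: the function names `classify_by_slash` and `classify_explicit` are hypothetical, and the property names `table.path` and `table.identifier` are taken from the pseudocode above.

```python
from typing import Mapping, Tuple


def classify_by_slash(value: str) -> Tuple[str, str]:
    """Heuristic from point 4: guess what the value means from its content."""
    return ("path", value) if "/" in value else ("identifier", value)


def classify_explicit(conf: Mapping[str, str]) -> Tuple[str, str]:
    """Alternative: one property per meaning; setting both is an error."""
    path = conf.get("table.path")
    identifier = conf.get("table.identifier")
    if path is not None and identifier is not None:
        raise ValueError("set only one of 'table.path' and 'table.identifier'")
    if path is not None:
        return ("path", path)
    if identifier is not None:
        return ("identifier", identifier)
    raise ValueError("set one of 'table.path' or 'table.identifier'")


# A location that happens to contain no slash is misread by the heuristic,
# while the explicit property leaves no room for guessing.
print(classify_by_slash("warehouse_table"))
print(classify_explicit({"table.path": "warehouse_table"}))
```

With explicit properties, a conflicting configuration fails immediately instead of silently picking one interpretation.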




[GitHub] [iceberg] guilload commented on issue #1155: Support custom catalog for Iceberg StorageHandler

Posted by GitBox <gi...@apache.org>.
guilload commented on issue #1155:
URL: https://github.com/apache/iceberg/issues/1155#issuecomment-662778856


   Ok, I'll take a stab at it!




[GitHub] [iceberg] rdsr commented on issue #1155: Support HiveCatalog for Iceberg StorageHandler

Posted by GitBox <gi...@apache.org>.
rdsr commented on issue #1155:
URL: https://github.com/apache/iceberg/issues/1155#issuecomment-652601748


   I had something similar in mind.




[GitHub] [iceberg] massdosage commented on issue #1155: Support custom catalog for Iceberg StorageHandler

Posted by GitBox <gi...@apache.org>.
massdosage commented on issue #1155:
URL: https://github.com/apache/iceberg/issues/1155#issuecomment-652844652


   This is on our TODO list, but we were only going to start looking at it properly after the `mapred` InputFormat is merged. The suggestion above to merge the code, and the logic outlined there, both look good. If someone else wants to start working on this now, that's fine by me.




[GitHub] [iceberg] rdsr commented on issue #1155: Support custom catalog for Iceberg StorageHandler

Posted by GitBox <gi...@apache.org>.
rdsr commented on issue #1155:
URL: https://github.com/apache/iceberg/issues/1155#issuecomment-662717784


   @guilload I might not be able to get to this this week, so I'm cool if you'd like to finish this up. Happy to review the code!




[GitHub] [iceberg] rdsr commented on issue #1155: Support custom catalog for Iceberg StorageHandler

Posted by GitBox <gi...@apache.org>.
rdsr commented on issue #1155:
URL: https://github.com/apache/iceberg/issues/1155#issuecomment-771004466


   I think so too. I haven't been following the code changes, but I do see in `hive.md` that we can use `HiveCatalog`.




[GitHub] [iceberg] guilload commented on issue #1155: Support custom catalog for Iceberg StorageHandler

Posted by GitBox <gi...@apache.org>.
guilload commented on issue #1155:
URL: https://github.com/apache/iceberg/issues/1155#issuecomment-662714548


   Has anybody started working on this issue? Happy to get this going otherwise.




[GitHub] [iceberg] pvary commented on issue #1155: Support custom catalog for Iceberg StorageHandler

Posted by GitBox <gi...@apache.org>.
pvary commented on issue #1155:
URL: https://github.com/apache/iceberg/issues/1155#issuecomment-770886069


   I think we can close this issue, as the catalog configuration is already there.
   What do you think @rdsr?
   
   Thanks,
   Peter




[GitHub] [iceberg] rdsr closed issue #1155: Support custom catalog for Iceberg StorageHandler

Posted by GitBox <gi...@apache.org>.
rdsr closed issue #1155:
URL: https://github.com/apache/iceberg/issues/1155


   




[GitHub] [iceberg] rdsr edited a comment on issue #1155: Support custom catalog for Iceberg StorageHandler

Posted by GitBox <gi...@apache.org>.
rdsr edited a comment on issue #1155:
URL: https://github.com/apache/iceberg/issues/1155#issuecomment-662717784





