Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/05/31 12:24:15 UTC

[GitHub] [iceberg] renshangtao opened a new pull request, #4916: Hive: Fix an error when create external table in hive catalog

renshangtao opened a new pull request, #4916:
URL: https://github.com/apache/iceberg/pull/4916

   `hive> CREATE EXTERNAL TABLE iceberg_db.test_eet(id int, chengshi string)  
             STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler' 
             LOCATION 'hdfs://xx.x.xx.xx:9009/user/hive/warehouse/iceberg_db.db/test_eet'
             TBLPROPERTIES ('iceberg.catalog'='hive_catalog');
   FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.iceberg.exceptions.AlreadyExistsException: Table already exists: iceberg_db.test_eet
   `
   
   To access an Iceberg table from CDH Hive 2.3.8, we need to create an external table in CDH Hive whose database name and table name match those in Iceberg. If they do not match, no data can be queried. Creating the external table fails with a "Table already exists" error, because when MapReduce creates the table and the catalog is a Hive catalog, it does not check whether the table exists.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] renshangtao closed pull request #4916: Hive: Fix an error when create external table in hive catalog

Posted by GitBox <gi...@apache.org>.
renshangtao closed pull request #4916: Hive: Fix an error when create external table in hive catalog
URL: https://github.com/apache/iceberg/pull/4916




[GitHub] [iceberg] pvary commented on pull request #4916: Hive: Fix an error when create external table in hive catalog

Posted by GitBox <gi...@apache.org>.
pvary commented on PR #4916:
URL: https://github.com/apache/iceberg/pull/4916#issuecomment-1144875534

   @renshangtao: Thanks for the detailed answer. This helps a lot.
   1. I remember issues using Iceberg with Hive 2.1.x versions before - they have popped up several times, so you can expect further issues down the line. Maybe column pruning? Maybe writes to the Iceberg table through Hive... TBH I do not remember exactly, but you might want to search previous issues on GitHub to save some headaches later.
   2. I think the issue is that you are trying to connect 2 HMS instances and you would like to create a cross-HMS HiveCatalog for the Iceberg tables. This is not something that is supported yet in Iceberg.
   
   I see 2 options:
   - Could you use a different catalog for the ingest? Maybe simple HadoopTables, or HadoopCatalog? In this case you can read the table from Hive by creating a simple Iceberg table using the specific catalog.
   - Could you use the same HMS DB for your 2 Hive instances? In this case you will have the same tables, and no issues.
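   
   A minimal sketch of the first option, assuming the ingest side writes the table through a HadoopCatalog. The catalog name `hadoop_cat` and the warehouse path are hypothetical placeholders. The `iceberg.catalog.<name>.type` and `iceberg.catalog.<name>.warehouse` settings and the `iceberg.catalog` table property follow the catalog configuration described in the Iceberg Hive documentation; this has not been checked against Hive 2.1.x:

   ```sql
   -- Register a Hadoop catalog for this Hive session.
   -- 'hadoop_cat' and the warehouse path are placeholders, not values from this thread.
   SET iceberg.catalog.hadoop_cat.type=hadoop;
   SET iceberg.catalog.hadoop_cat.warehouse=hdfs://<namenode>/<warehouse-path>;

   -- Overlay a Hive table on the Iceberg table that already exists in that catalog.
   -- The database and table names must match the identifier used in the Hadoop catalog.
   CREATE EXTERNAL TABLE iceberg_db.test_eet
   STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
   TBLPROPERTIES ('iceberg.catalog'='hadoop_cat');
   ```

   For a table written with HadoopTables instead, the documented equivalent is to give the table's own `LOCATION` and set `'iceberg.catalog'='location_based_table'`.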




[GitHub] [iceberg] pvary commented on pull request #4916: Hive: Fix an error when create external table in hive catalog

Posted by GitBox <gi...@apache.org>.
pvary commented on PR #4916:
URL: https://github.com/apache/iceberg/pull/4916#issuecomment-1143649450

   @renshangtao: No version of CDH has Hive 2.3.8. I think CDH 5 has Hive 1.1.x, CDH 6 has Hive 2.1.x, CDP 7 has Hive 3.1.x, so there must be some miscommunication here. Could you please elaborate?
   
   What is the error when you try to create the external table the first time (without creating the Iceberg table first)?
   
   Also, could you please describe what you are trying to achieve here?
   
   Thanks,
   Peter




[GitHub] [iceberg] pvary commented on pull request #4916: Hive: Fix an error when create external table in hive catalog

Posted by GitBox <gi...@apache.org>.
pvary commented on PR #4916:
URL: https://github.com/apache/iceberg/pull/4916#issuecomment-1147100986

   @renshangtao: Registering a table to multiple catalogs could cause serious trouble later, so it is not supported. See: https://github.com/apache/iceberg/pull/4946#issuecomment-1146857925
   
   If the same HMS cannot be used, then it might be time to develop the feature where a non-default HMS can be used for a catalog when accessing the table:
   - HiveCatalog should use the extra HMS URL if one is provided.
   - If the non-default HMS is used, then the table creation logic should be changed, as it is done in this PR.
   
   This way the linked table will reflect the changes on the original table, and there will be no conflict between the cleanup processes.
   
   






[GitHub] [iceberg] renshangtao commented on pull request #4916: Hive: Fix an error when create external table in hive catalog

Posted by GitBox <gi...@apache.org>.
renshangtao commented on PR #4916:
URL: https://github.com/apache/iceberg/pull/4916#issuecomment-1146950819

   @pvary Thank you.
   We can't use the same HMS because the customer has a lot of business on CDH, and they just want to read the Iceberg table for analysis through CDH Hive.
   
   The PR I submitted only aims to fix the error message. I think it should not return the "already exists" error,
   because when different HMS instances are used, the only way to read an Iceberg table from CDH Hive
   is to create a table with the same database name and table name; otherwise you cannot read any data.




[GitHub] [iceberg] renshangtao commented on pull request #4916: Hive: Fix an error when create external table in hive catalog

Posted by GitBox <gi...@apache.org>.
renshangtao commented on PR #4916:
URL: https://github.com/apache/iceberg/pull/4916#issuecomment-1144554377

   @pvary Thank you for your reply.
   
   We have a customer who previously used CDH 6.0 to process big data and has now deployed a Flink + Iceberg environment to process streaming data. They want to use CDH Hive 2.1.1 to read the Iceberg table created by Flink, because they have a lot of SQL written for Hive.
   
   So I created an EXTERNAL table in CDH Hive 2.1.1 to access the Iceberg table that was created by Flink. The table name and database name must match those in Iceberg; otherwise I can't read any data from the table.
   
   When I create a table with the same name in Hive, it returns an error like this:
   FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.iceberg.exceptions.AlreadyExistsException: Table already exists: iceberg_db.test_eet
   
   These two environments use different HMS instances.
   
   Thank you




[GitHub] [iceberg] renshangtao commented on pull request #4916: Hive: Fix an error when create external table in hive catalog

Posted by GitBox <gi...@apache.org>.
renshangtao commented on PR #4916:
URL: https://github.com/apache/iceberg/pull/4916#issuecomment-1148319767

   @pvary OK, thank you.

