Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/12/21 08:09:10 UTC

[GitHub] [iceberg] liangyouze opened a new issue #3783: Cannot set TBLPROPERTIES when using HiveCatalog

liangyouze opened a new issue #3783:
URL: https://github.com/apache/iceberg/issues/3783


   I'm trying to create a Hive catalog table with Spark and set custom table properties:
   ```
   CREATE TABLE hive_prod.db.t1 (
       id bigint,
       data string,
       category string,
       ts timestamp)
   USING iceberg
   PARTITIONED BY (category)
   TBLPROPERTIES('k1'='v1')
   ```
   But the table properties are not stored in the Hive metastore. When I execute the SQL statement `show tblproperties hive_prod.db.t1`, I can't see my property `k1=v1`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] liangyouze commented on issue #3783: Cannot set TBLPROPERTIES when using HiveCatalog

Posted by GitBox <gi...@apache.org>.
liangyouze commented on issue #3783:
URL: https://github.com/apache/iceberg/issues/3783#issuecomment-999410029


   Oh, it looks good, thanks. We are using version 0.11.x, where this feature is not yet available. Maybe we can upgrade to 0.12.




[GitHub] [iceberg] flyrain commented on issue #3783: Cannot set TBLPROPERTIES when using HiveCatalog

Posted by GitBox <gi...@apache.org>.
flyrain commented on issue #3783:
URL: https://github.com/apache/iceberg/issues/3783#issuecomment-1005273950


   @liangyouze, FYI: keeping table properties in sync between HMS and the metadata.json files is going to be tricky. Besides the "alter table" command, other actions can change the table properties; for example, the table properties need to be updated if you time travel to a previous version.




[GitHub] [iceberg] nssalian commented on issue #3783: Cannot set TBLPROPERTIES when using HiveCatalog

Posted by GitBox <gi...@apache.org>.
nssalian commented on issue #3783:
URL: https://github.com/apache/iceberg/issues/3783#issuecomment-1007606883


   Thanks for the response, @pvary. I think identity partitions might suffice for my use case.
   I should have added this in the initial note, but I wanted to check whether using
   ```hiveCatalog.createTable(tableIdentifier, icebergSchema, icebergPSpec)``` to create a table makes an entry in PARTITION_KEY. I checked the Metastore DB and couldn't find one. Do you think we need to change the implementation to add partition information for this API call? I think it adds all the columns as regular columns.




[GitHub] [iceberg] hongyonggan commented on issue #3783: Cannot set TBLPROPERTIES when using HiveCatalog

Posted by GitBox <gi...@apache.org>.
hongyonggan commented on issue #3783:
URL: https://github.com/apache/iceberg/issues/3783#issuecomment-998573667


   I encountered the same problem too. Is anyone looking into this?




[GitHub] [iceberg] nssalian commented on issue #3783: Cannot set TBLPROPERTIES when using HiveCatalog

Posted by GitBox <gi...@apache.org>.
nssalian commented on issue #3783:
URL: https://github.com/apache/iceberg/issues/3783#issuecomment-1006216178


   @pvary do you know if there's a clean way to set the PartitionSpec into the Metastore while creating a table using the Hive Catalog? I see your code setting properties but would setting partition spec be 
   For the Hive Catalog, I see that the partition spec column names (`PKEY_NAME`) aren’t stored in the Hive Metastore in the `PARTITION_KEYS` table. I don’t see the code doing anything for the partition spec. Ideally, the partition spec column would be a partition key so that it can be represented in the schema.






[GitHub] [iceberg] pvary commented on issue #3783: Cannot set TBLPROPERTIES when using HiveCatalog

Posted by GitBox <gi...@apache.org>.
pvary commented on issue #3783:
URL: https://github.com/apache/iceberg/issues/3783#issuecomment-1007896178


   @nssalian: Hive tables will be unpartitioned by design. See: https://iceberg.apache.org/#hive/
   
   > to Hive, the table appears to be unpartitioned although the underlying Iceberg table is partitioned.




[GitHub] [iceberg] hongyonggan removed a comment on issue #3783: Cannot set TBLPROPERTIES when using HiveCatalog

Posted by GitBox <gi...@apache.org>.
hongyonggan removed a comment on issue #3783:
URL: https://github.com/apache/iceberg/issues/3783#issuecomment-998573667


   I encountered the same problem too. Is anyone looking into this?




[GitHub] [iceberg] ajantha-bhat commented on issue #3783: Cannot set TBLPROPERTIES when using HiveCatalog

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on issue #3783:
URL: https://github.com/apache/iceberg/issues/3783#issuecomment-999285522


   Interesting use case. I think it should be possible to support storing table properties in the Hive metastore as well. Let's see what others think.




[GitHub] [iceberg] ajantha-bhat commented on issue #3783: Cannot set TBLPROPERTIES when using HiveCatalog

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on issue #3783:
URL: https://github.com/apache/iceberg/issues/3783#issuecomment-998735019


   I think the table properties are stored in the table metadata file [https://iceberg.apache.org/#spec/#table-metadata-fields], and only a reference pointer (the location) to the table metadata is stored in the Hive metastore.
   
   Can you explain what problem you are facing because of this?




[GitHub] [iceberg] pvary commented on issue #3783: Cannot set TBLPROPERTIES when using HiveCatalog

Posted by GitBox <gi...@apache.org>.
pvary commented on issue #3783:
URL: https://github.com/apache/iceberg/issues/3783#issuecomment-1006352825


   @nssalian: With Hive 2.x and 3.x you can choose from the following possibilities:
   - For identity partitions:
   ```
   CREATE TABLE database_a.table_a (
     id bigint, name string
   ) PARTITIONED BY (
     dept string
   ) STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler';
   ```
   - For other partitioning schemas, one can use the JSON-formatted schema and partition specification. This is very hard to use, and we do not really recommend it:
   ```
   CREATE EXTERNAL TABLE customers STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
   LOCATION '.../default/customers'
   TBLPROPERTIES (
      'iceberg.mr.table.schema'='{"type":"struct","schema-id":0,"fields":[{"id":1,"name":"customer_id","required":false,"type":"long"},{"id":2,"name":"first_name","required":false,"type":"string","doc":"This is first name"},{"id":3,"name":"last_name","required":false,"type":"string","doc":"This is last name"}]}',
      'iceberg.mr.table.partition.spec'='{"spec-id":0,"fields":[]}')
   ```
   
   In Hive master, @lcspinter created a PR that changes the Hive syntax so that one can create tables with Iceberg partitioning schemas (https://issues.apache.org/jira/browse/HIVE-25179):
   ```
   CREATE TABLE ... PARTITIONED BY SPEC( year(year_field), month(month_field), day(day_field), hour(hour_field), truncate(3, truncate_field), bucket(5, bucket_field), identity_field ) STORED BY ICEBERG;
   ```
   
   I hope this helps.
   
   Thanks,
   Peter




[GitHub] [iceberg] pvary commented on issue #3783: Cannot set TBLPROPERTIES when using HiveCatalog

Posted by GitBox <gi...@apache.org>.
pvary commented on issue #3783:
URL: https://github.com/apache/iceberg/issues/3783#issuecomment-999365892


   When creating a table, we synchronise the Iceberg table properties with the HMS table properties.
   See:
   https://github.com/apache/iceberg/blob/7fcc71da65a47ca3c9f6eb6e862a238389b8bdc5/hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java#L354-L397
   
   In Hive 2 and Hive 3 there is no way to synchronise the Hive changes back to the Iceberg tables, but we have an upstream Hive patch for synchronising the changes in the other direction too (Hive->Iceberg): [HIVE-25065](https://issues.apache.org/jira/browse/HIVE-25065).
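
The one-way (Iceberg -> HMS) direction described above can be modelled roughly as follows. This is only a simplified sketch using plain Java maps, not the actual `HiveTableOperations` code linked above: the class name `HmsPropertySyncSketch`, the method `syncToHms`, and the example paths and keys are hypothetical and exist only for illustration. It assumes the HMS parameter that points at the current metadata file is named `metadata_location`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Simplified model of one-way (Iceberg -> HMS) property sync. Hypothetical, for illustration only. */
public class HmsPropertySyncSketch {

    // Assumed name of the HMS parameter that points at the current Iceberg metadata file.
    static final String METADATA_LOCATION = "metadata_location";

    /**
     * Returns the HMS parameters after a commit: keys dropped from the Iceberg
     * metadata are removed, current Iceberg properties are copied over, and the
     * metadata pointer is moved to the new metadata file.
     */
    static Map<String, String> syncToHms(
            Map<String, String> hmsParams,
            Map<String, String> icebergProps,
            Set<String> removedKeys,
            String newMetadataLocation) {
        Map<String, String> result = new HashMap<>(hmsParams);
        removedKeys.forEach(result::remove);                 // drop properties deleted on the Iceberg side
        result.putAll(icebergProps);                         // copy the current Iceberg properties
        result.put(METADATA_LOCATION, newMetadataLocation);  // move the pointer to the new metadata file
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> hmsParams = new HashMap<>();
        hmsParams.put("owner", "spark");                     // unrelated HMS parameter, left untouched
        hmsParams.put("stale", "x");                         // property that was removed from the Iceberg table

        Map<String, String> synced = syncToHms(
                hmsParams,
                Map.of("k1", "v1"),                          // the custom property from the issue
                Set.of("stale"),
                "/warehouse/db/t1/metadata/v2.metadata.json"); // hypothetical path

        System.out.println(synced);
    }
}
```

Nothing in this model flows back from HMS to Iceberg: a change made only to the HMS parameters would not reach the metadata file, which is the gap HIVE-25065 addresses.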




[GitHub] [iceberg] liangyouze commented on issue #3783: Cannot set TBLPROPERTIES when using HiveCatalog

Posted by GitBox <gi...@apache.org>.
liangyouze commented on issue #3783:
URL: https://github.com/apache/iceberg/issues/3783#issuecomment-999252754


   In our custom Hive metastore, we have made some routing rules for metadata according to tblproperties.



