Posted to issues@kudu.apache.org by "Khazar Mammadli (Jira)" <ji...@apache.org> on 2022/10/25 17:18:00 UTC

[jira] [Resolved] (KUDU-3401) Unable to query Kudu tables from Hive with Kudu HMS Integration enabled

     [ https://issues.apache.org/jira/browse/KUDU-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Khazar Mammadli resolved KUDU-3401.
-----------------------------------
    Fix Version/s: 1.15.0
       Resolution: Fixed

> Unable to query Kudu tables from Hive with Kudu HMS Integration enabled
> -----------------------------------------------------------------------
>
>                 Key: KUDU-3401
>                 URL: https://issues.apache.org/jira/browse/KUDU-3401
>             Project: Kudu
>          Issue Type: Bug
>          Components: hms
>            Reporter: Khazar Mammadli
>            Assignee: Khazar Mammadli
>            Priority: Major
>             Fix For: 1.15.0
>
>
> When Kudu HMS integration is enabled, several storage descriptor fields are left unset when a table is created in Impala with a "stored as kudu" query. Querying the table from Hive afterwards then fails with a ClassNotFoundException:
>  
> {code:java}
> ERROR : Failed
> org.apache.hadoop.hive.metastore.api.MetaException: java.lang.ClassNotFoundException Class not found
> at org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:98) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
> at org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:77) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
> at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:331) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141] {code}
>  
> When the following sample query is run in Impala to create a Kudu table with Kudu HMS integration enabled, the table is created with the InputFormat, OutputFormat and SerDe Library fields missing:
>  
> {code:java}
> create table default.kudu_test (
> col1 string comment 'col1',
> col2 string comment 'col2',
> primary key (col1)
> )
> comment 'kudu_test'
> stored as kudu;{code}
>  
> |SerDe Library:| |NULL|
> |InputFormat:| |NULL|
> |OutputFormat:| |NULL|
> Hive Metastore log for the table creation:
> INFO  org.apache.hadoop.hive.metastore.HiveMetaStore: [pool-5-thread-124]: 134: source:172.25.35.0 create_table: Table(tableName:kudu_test, dbName:default, owner:root, createTime:0, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:col1, type:string, comment:col1), FieldSchema(name:col2, type:string, comment:col2)], location:, inputFormat:, outputFormat:, compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:, serializationLib:, parameters:{}), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[], parameters:{kudu.table_name=default.kudu_test, kudu.table_id=5ac46856863f402fb69941ce7b967945, comment=, kudu.master_addresses=c3549-node2.coelab.cloudera.com:7051, storage_handler=org.apache.hadoop.hive.kudu.KuduStorageHandler, kudu.cluster_id=65c8dfbc8b75485db1328ab42f55fa07}, viewOriginalText:, viewExpandedText:, tableType:MANAGED_TABLE, temporary:false, ownerType:USER)
> Running the same query in Impala with Kudu HMS integration disabled, on the other hand, populates these fields when the table is created:
> |SerDe Library:|org.apache.hadoop.hive.kudu.KuduSerDe|NULL|
> |InputFormat:|org.apache.hadoop.hive.kudu.KuduInputFormat|NULL|
> |OutputFormat:|org.apache.hadoop.hive.kudu.KuduOutputFormat|NULL|
> Hive Metastore log for table creation:
> INFO  org.apache.hadoop.hive.metastore.HiveMetaStore: [pool-5-thread-173]: 183: source:172.25.35.0 create_table_req: Table(tableName:kudu_test, dbName:default, owner:root, createTime:0, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:col1, type:string, comment:col1), FieldSchema(name:col2, type:string, comment:col2)], location:null, inputFormat:org.apache.hadoop.hive.kudu.KuduInputFormat, outputFormat:org.apache.hadoop.hive.kudu.KuduOutputFormat, compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.kudu.KuduSerDe, parameters:{}), bucketCols:[], sortCols:[], parameters:null), partitionKeys:[], parameters:{comment=kudu_test_lbodor_no_hms_integration, kudu.master_addresses=c3549-node2.coelab.cloudera.com, storage_handler=org.apache.hadoop.hive.kudu.KuduStorageHandler, kudu.table_name=impala::default.kudu_test}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, catName:hive, ownerType:USER, accessType:8)
> --------------------------------
> Code path for table creation when Kudu HMS integration is enabled (Kudu code path):
> Quick recap of the steps when creating a Kudu table:
> HMSCatalog::CreateTable() -> hive::Table declared and passed to PopulateTable(… , &table) -> Thrift client Execute call -> HMSClient::CreateTable(Table (the one that was just populated), envcontext (default)) -> hms_client.create_table_with_environment_context(table, envcontext).
> CreateTable
> [https://github.com/apache/kudu/blob/master/src/kudu/hms/hms_catalog.cc#L146] ->
> Populate the fields of table
> [https://github.com/apache/kudu/blob/master/src/kudu/hms/hms_catalog.cc#L367]
> Hms client call
> [https://github.com/apache/kudu/blob/master/src/kudu/hms/hms_client.cc#L280]
> ----------------------------- 
> Code path for table creation when Kudu HMS integration is disabled (Impala code path):
> CreateTable -> CreateMetaStoreTable
> [https://github.com/apache/impala/blob/da3d6fc7f7c656b118bb3570cedf7d7c3158bd0b/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L3191]
> -> line 3248: tbl.setSd(createSd(params));
> CreateSd
> [https://github.com/apache/impala/blob/da3d6fc7f7c656b118bb3570cedf7d7c3158bd0b/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L3260|https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/fe/src/main/java/org/apache/impala/catalog/HiveStorageDescriptorFactory.java#L36]
>  
> Comparing the two code paths shows that the missing fields are filled with default values by CreateSd when the table is created without Kudu HMS integration (through Impala).
> These fields are left untouched when the table is created with Kudu HMS integration enabled (Kudu code path).
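
The gap described above can be illustrated with a minimal, self-contained sketch. It is not the actual patch and does not use the Hive Metastore Thrift classes: SdFields, KuduSdDefaultsSketch and fillKuduDefaults are hypothetical names invented for this illustration. The three class names and the storage_handler value are the ones shown in the metastore output for the table created with HMS integration disabled. The idea is only that a table whose storage_handler is the Kudu handler should end up with non-empty InputFormat, OutputFormat and SerDe Library values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the three storage descriptor fields discussed above.
// This is NOT the Hive Metastore Thrift StorageDescriptor class.
final class SdFields {
    String inputFormat = "";
    String outputFormat = "";
    String serializationLib = "";
}

// Hypothetical helper; the names here are assumptions made for illustration.
final class KuduSdDefaultsSketch {
    // Values copied from the table created with HMS integration disabled.
    static final String STORAGE_HANDLER = "org.apache.hadoop.hive.kudu.KuduStorageHandler";
    static final String INPUT_FORMAT = "org.apache.hadoop.hive.kudu.KuduInputFormat";
    static final String OUTPUT_FORMAT = "org.apache.hadoop.hive.kudu.KuduOutputFormat";
    static final String SERDE = "org.apache.hadoop.hive.kudu.KuduSerDe";

    static boolean isBlank(String s) {
        return s == null || s.isEmpty();
    }

    // Fills only the blank fields, and only when the table parameters name the
    // Kudu storage handler. Returns true if at least one field was changed.
    static boolean fillKuduDefaults(Map<String, String> tableParams, SdFields sd) {
        if (!STORAGE_HANDLER.equals(tableParams.get("storage_handler"))) {
            return false;
        }
        boolean changed = false;
        if (isBlank(sd.inputFormat)) {
            sd.inputFormat = INPUT_FORMAT;
            changed = true;
        }
        if (isBlank(sd.outputFormat)) {
            sd.outputFormat = OUTPUT_FORMAT;
            changed = true;
        }
        if (isBlank(sd.serializationLib)) {
            sd.serializationLib = SERDE;
            changed = true;
        }
        return changed;
    }

    public static void main(String[] args) {
        // Mirrors the create_table call logged with HMS integration enabled:
        // storage_handler is set, but the three fields are empty.
        Map<String, String> params = new HashMap<>();
        params.put("storage_handler", STORAGE_HANDLER);
        SdFields sd = new SdFields();

        fillKuduDefaults(params, sd);
        System.out.println("InputFormat:   " + sd.inputFormat);
        System.out.println("OutputFormat:  " + sd.outputFormat);
        System.out.println("SerDe Library: " + sd.serializationLib);
    }
}
```

In the Kudu code path the equivalent change would presumably belong where the hive::Table is populated before the Thrift call, but that placement is an assumption based on the recap above and not something this message confirms.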



--
This message was sent by Atlassian Jira
(v8.20.10#820010)