Posted to reviews@impala.apache.org by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org> on 2021/04/05 22:35:07 UTC

[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog

Vihang Karajgaonkar has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/17244 )

Change subject: IMPALA-10613: Standup HMS thrift server in Catalog
......................................................................

IMPALA-10613: Standup HMS thrift server in Catalog

This change adds the basic infrastructure to start the HMS server in
Catalog. It introduces a new configuration (--start_hms_server) along with a
config for the port and starts an HMS thrift server in the CatalogServiceCatalog
instance. Currently, all the HMS APIs are "pass-through" to the backing HMS
service, except for the following 3 HMS APIs, which can be used to request
a table and its partitions:

1. get_table_req
2. get_partitions_by_expr
3. get_partitions_by_names

Additionally, there is another flag (--enable_catalogd_hms_cache) which
can be used to disable the usage of catalogd for providing the table
and partition metadata. This contribution was done by Kishen Das.

In case of get_partitions_by_expr, the hive-exec jar must be present in
the classpath, since the API needs to load the PartitionExpressionProxy
to push down the partition predicates to the HMS database. In case of
get_table_req, if column statistics are requested, we return the
table-level statistics.

Additionally, this patch adds a new configuration,
fallback_to_hms_on_errors, for the catalog. It determines whether the
Catalog falls back to the HMS service in case of errors while
executing the API. This is useful for testing purposes.
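
As a rough illustration of the fallback behavior, assuming the
catalogd-served path and the HMS path can each be modeled as a callable,
the logic might look like the sketch below. call_with_fallback and the two
helper functions are hypothetical names for the example, not the patch's
actual code.

```python
# Hypothetical sketch of the fallback_to_hms_on_errors behavior.
def call_with_fallback(catalog_call, hms_call, request,
                       fallback_to_hms_on_errors):
    """Try the catalogd-served path first; optionally fall back to HMS."""
    try:
        return catalog_call(request)
    except Exception:
        if not fallback_to_hms_on_errors:
            # Fallback disabled: surface the catalogd error to the caller.
            raise
        # Fallback enabled: retry the same request against the HMS service.
        return hms_call(request)


def failing_catalog_call(request):
    raise RuntimeError("catalogd could not serve " + request)


def hms_call(request):
    return "hms:" + request


result = call_with_fallback(failing_catalog_call, hms_call,
                            "get_table_req", True)
```

With the flag set to False, the RuntimeError raised by the catalogd path
propagates to the caller instead of being masked by the fallback.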

In order to expose the file-metadata for the tables and partitions,
HMS API changes were made to add the filemetadata fields to tables
and partitions. In case of transactional tables, the returned
file-metadata is consistent with the ValidWriteIdList provided
in the API call.

There are a few TODOs which will be done in follow-up tasks:
1. Add SASL support.
2. Pin the hive_metastore.thrift in the code so that any changes to HMS APIs
in the hive branch don't break Catalog's HMS service.

Testing:
1. Added a new end-to-end test which starts the HMS service in Catalog and runs
some basic HMS APIs against it.
2. Ran a modification of TestRemoteHiveMetastore in the Hive code base and
confirmed most tests are working. There were some test failures but they are
unrelated since the test assumes an empty warehouse whereas we run against the
actual HMS service running in the mini-cluster.

Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5
---
M be/src/catalog/catalog-server.cc
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
A fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java
A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
A fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java
A fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
A fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/common/impala_test_suite.py
A tests/custom_cluster/test_metastore_service.py
23 files changed, 5,313 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/17244/4
-- 
To view, visit http://gerrit.cloudera.org:8080/17244

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5
Gerrit-Change-Number: 17244
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>