Posted to reviews@impala.apache.org by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org> on 2022/07/13 16:12:55 UTC

[Impala-ASF-CR] IMPALA-886: Support displaying HBase cols in the order from HMS

Hello Quanlong Huang, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18635

to look at the new patch set (#5).

Change subject: IMPALA-886: Support displaying HBase cols in the order from HMS
......................................................................

IMPALA-886: Support displaying HBase cols in the order from HMS

Before this patch, catalogd always ordered HBase columns
lexicographically by family/qualifier. This is incompatible with other
table formats and the way Hive handles HBase tables, where the order
comes from HMS as defined during CREATE TABLE.

I don't know of any valid reason behind this old behavior. It probably
just made the implementation a bit easier by doing the ordering in FE
instead of BE. The BE actually needs this ordering during scanning,
as the HBase API returns results in this order, but this should have
no effect on other parts of Impala.
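
To illustrate why the BE still needs the family/qualifier order even when
the table keeps the HMS order, here is a minimal Python sketch. It is not
the code in hbase-scan-node.cc; the function names and the sample columns
are hypothetical. It computes the family/qualifier order as a list of HMS
positions and uses it to put values that arrive in scan order back into
HMS order.

```python
from typing import List, Sequence, Tuple

# A column is identified here by (family, qualifier); this is an assumption
# made for the sketch, not Impala's actual column descriptor.
Column = Tuple[str, str]

def scan_order_indices(hms_cols: Sequence[Column]) -> List[int]:
    """Return the HMS positions of the columns, sorted by family/qualifier."""
    return sorted(range(len(hms_cols)), key=lambda i: hms_cols[i])

def to_hms_order(scanned_values: Sequence[str], indices: Sequence[int]) -> List[str]:
    """Place values that arrive in family/qualifier order into HMS positions."""
    row = [None] * len(indices)
    for value, hms_pos in zip(scanned_values, indices):
        row[hms_pos] = value
    return row

# Columns in the order they were declared in CREATE TABLE (hypothetical).
hms_cols = [("d", "price"), ("d", "name"), ("a", "tag")]
indices = scan_order_indices(hms_cols)

# Values as a scan would deliver them: a:tag, d:name, d:price.
row = to_hms_order(["t", "n", "p"], indices)
```

Python compares strings by code point, whereas HBase compares raw bytes,
so the sort above is only an approximation for non-ASCII names.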

Added flag use_hms_column_order_for_hbase_tables (used by catalogd)
to decide whether to do this reordering:
- true: keep HMS order
- false: reorder by family/qualifier [default]
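
The effect of the two settings can be sketched in Python as follows. This
is only an illustration, not the FE code: HBaseCol, order_columns and the
sample columns are hypothetical, and row-key handling is left out.

```python
from typing import List, NamedTuple, Sequence

class HBaseCol(NamedTuple):
    # Hypothetical stand-in for a column mapped to an HBase family/qualifier.
    name: str
    family: str
    qualifier: str

def order_columns(hms_cols: Sequence[HBaseCol], use_hms_order: bool) -> List[HBaseCol]:
    """Return columns in HMS order, or sorted by family/qualifier."""
    if use_hms_order:
        # true: keep the order defined during CREATE TABLE
        return list(hms_cols)
    # false (default): reorder lexicographically by family, then qualifier
    return sorted(hms_cols, key=lambda c: (c.family, c.qualifier))

# Columns as declared in CREATE TABLE (hypothetical example).
hms_cols = [
    HBaseCol("price", "d", "price"),
    HBaseCol("name", "d", "name"),
    HBaseCol("tag", "a", "tag"),
]

hms_names = [c.name for c in order_columns(hms_cols, use_hms_order=True)]
sorted_names = [c.name for c in order_columns(hms_cols, use_hms_order=False)]
```

With the flag set to true, the columns stay as price, name, tag; with the
default they come back as tag, name, price.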

The old way is kept as default to avoid breaking existing workloads,
but it would make sense to change it in the next major release.

Note that a query option would be more convenient to use, but it
would be much harder to implement, as the order is decided during
loading in catalogd.

Testing:
- added custom cluster test for
  use_hms_column_order_for_hbase_tables = true

Change-Id: Ibc5df8b803f2ae3b93951765326cdaea706e3563
---
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-scan-node.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/FeHBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HBaseColumn.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A testdata/workloads/functional-query/queries/QueryTest/hbase-hms-column-order.test
A tests/custom_cluster/test_hbase_hms_column_order.py
10 files changed, 156 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/18635/5
-- 
To view, visit http://gerrit.cloudera.org:8080/18635
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibc5df8b803f2ae3b93951765326cdaea706e3563
Gerrit-Change-Number: 18635
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>