You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Zoltan Borok-Nagy (Code Review)" <ge...@cloudera.org> on 2021/02/09 15:24:12 UTC

[Impala-ASF-CR] IMPALA-10482, IMPALA-10493: Fix bugs in full ACID collection query rewrites

Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17038

to look at the new patch set (#3).

Change subject: IMPALA-10482, IMPALA-10493: Fix bugs in full ACID collection query rewrites
......................................................................

IMPALA-10482, IMPALA-10493: Fix bugs in full ACID collection query rewrites

IMPALA-10482: SELECT * query on unrelative collection column of
transactional ORC table will hit IllegalStateException.

The AcidRewriter will rewrite queries like
"select item from my_complex_orc.int_array" to
"select item from my_complex_orc t, t.int_array"

This cause troubles in star expansion. Because the original query
"select * from my_complex_orc.int_array" is analyzed as
"select item from my_complex_orc.int_array"

But the rewritten query "select * from my_complex_orc t, t.int_array" is
analyzed as "select id, item from my_complex_orc t, t.int_array".

Hidden table refs can also cause issues during regular column
resolution. E.g. when the table has top-level 'pos'/'item'/'key'/'value'
columns.

The workaround is to keep track of the automatically added table refs
during query rewrite. So when we analyze the rewritten query we can
ignore these auxiliary table refs.

IMPALA-10493: Using JOIN ON syntax to join two full ACID collections
produces wrong results.

When AcidRewriter.splitCollectionRef() creates a new collection ref
it doesn't copy every information needed to correctly execute the
query. E.g. it dropped the ON clause, turning INNER joins to CROSS
joins.

Testing:
 * added e2e tests

Change-Id: I8fc758d3c1e75c7066936d590aec8bff8d2b00b0
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/QueryTest/nested-types-scanner-basic.test
7 files changed, 231 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/17038/3
-- 
To view, visit http://gerrit.cloudera.org:8080/17038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8fc758d3c1e75c7066936d590aec8bff8d2b00b0
Gerrit-Change-Number: 17038
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>