Posted to reviews@impala.apache.org by "Tim Armstrong (Code Review)" <ge...@cloudera.org> on 2018/10/30 15:29:11 UTC

[Impala-ASF-CR] IMPALA-7586: fix predicate pushdown of escaped strings

Tim Armstrong has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/11814 )

Change subject: IMPALA-7586: fix predicate pushdown of escaped strings
......................................................................

IMPALA-7586: fix predicate pushdown of escaped strings

This fixes a class of bugs where the planner incorrectly uses the raw
string from the parser, with its escape sequences still in place, instead
of the unescaped string. This occurs in several places that push
predicates down to the storage layer:
* Kudu scans
* HBase scans
* Data source scans
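
The difference between the two strings can be illustrated with a minimal
sketch. It is not the Impala implementation: unescape() and scan() are
hypothetical helpers, and the escape rules shown are a simplified
assumption rather than Impala's exact rules.

```python
def unescape(raw: str) -> str:
    """Resolve backslash escapes in the body of a string literal.

    Simplified, assumed rules: \\n, \\t, \\r and \\0 map to control
    characters; a backslash before any other character yields that character.
    """
    simple = {"n": "\n", "t": "\t", "r": "\r", "0": "\0"}
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw):
            nxt = raw[i + 1]
            out.append(simple.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def scan(rows, pushed_value):
    """Hypothetical stand-in for a storage-layer scan with an equality predicate."""
    return [r for r in rows if r == pushed_value]


# Values as stored in the table.
rows = ['say "hi"', "back\\slash", "plain"]

# Text between the quotes as the parser sees it for: WHERE s = "say \"hi\""
raw_literal = 'say \\"hi\\"'

buggy = scan(rows, raw_literal)            # pushes the raw parser text
fixed = scan(rows, unescape(raw_literal))  # pushes the unescaped value
```

In this sketch the raw literal still contains the backslashes, so the
pushed-down comparison matches no stored row, while the unescaped value
matches the intended row.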

There are some more complex issues with escapes and the LIKE predicate
that are tracked separately by IMPALA-2422.

This also uncovered a separate issue with RCFiles, tracked by
IMPALA-7778. The tests added here work around it.

To make bugs like this more obvious in the future, I renamed
getValue() to getValueWithOriginalEscapes().
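
A rough sketch of why the explicit name helps, written in Python rather
than the Java of the actual change. The class and both method names below
are hypothetical stand-ins, and the one-line unescaping rule is a
simplified assumption.

```python
import re


class StringLiteral:
    """Hypothetical literal node that keeps the raw parser text."""

    def __init__(self, raw: str):
        self._raw = raw

    def value_with_original_escapes(self) -> str:
        # The name states what the caller gets: escapes are NOT resolved.
        return self._raw

    def unescaped_value(self) -> str:
        # Simplified assumption: a backslash before any character yields
        # that character.
        return re.sub(r"\\(.)", r"\1", self._raw, flags=re.DOTALL)


lit = StringLiteral("it\\'s")
raw_text = lit.value_with_original_escapes()  # still contains the backslash
real_value = lit.unescaped_value()            # the value a scan should compare against
```

With a neutral name such as getValue(), a call site that pushes the raw
text to a scan looks harmless; with the longer name, the same call site
reads as suspicious during review.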

Testing:
Added a regression test that covers handling of backslash escapes on all
file formats. I did not add a regression test for the data source bug
because it seems to require major modification of the data source test
infrastructure.

Change-Id: I53d6e20dd48ab6837ddd325db8a9d49ee04fed28
---
M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/ExtractFromExpr.java
M fe/src/main/java/org/apache/impala/analysis/LikePredicate.java
M fe/src/main/java/org/apache/impala/analysis/StringLiteral.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M testdata/data/README
A testdata/data/strings_with_quotes.csv
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
A testdata/workloads/functional-query/queries/QueryTest/string-escaping-rcfile-bug.test
A testdata/workloads/functional-query/queries/QueryTest/string-escaping.test
M tests/query_test/test_scanners.py
15 files changed, 193 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/11814/5
-- 
To view, visit http://gerrit.cloudera.org:8080/11814

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I53d6e20dd48ab6837ddd325db8a9d49ee04fed28
Gerrit-Change-Number: 11814
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>