Posted to reviews@impala.apache.org by "Andrew Sherman (Code Review)" <ge...@cloudera.org> on 2021/08/20 16:58:51 UTC

[Impala-ASF-CR] IMPALA-10849: Ignore escaped wildcards that terminate like predicates.

Andrew Sherman has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17798 )


Change subject: IMPALA-10849: Ignore escaped wildcards that terminate like predicates.
......................................................................

IMPALA-10849: Ignore escaped wildcards that terminate like predicates.

A like predicate is generally evaluated by converting it into a regex
that is evaluated at execution time. If the predicate of a like clause
is a constant (which is the common case when you say "row
like 'start%'") then there are optimizations where some cases that are
simpler than a regex are spotted, and a simpler function than a regex
evaluator is used. One example is that a predicate such as 'start%' is
evaluated by looking for strings that begin with "start". Amusingly the
code that spots the potential optimizations uses regexes to look for
patterns in the like predicate. The code that looks for the
optimization where a simple prefix can be searched for does not deal
with the case where the '%' wildcard at the end of the predicate is
escaped. To fix this we add a test that deals with the case where the
predicate ends in an escaped '%'.
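
As a rough illustration of the paragraph above, here is a minimal Python
sketch of the prefix optimization and of the escaped-'%' case. It is not
Impala's C++ code from like-predicate.cc: the function names, both regexes,
and the choice of backslash as the escape character are assumptions made
for this example only.

```python
import re

# Naive detector (hypothetical): literal characters followed by trailing '%'.
# It does not notice that the final '%' may be escaped.
_NAIVE_PREFIX_RE = re.compile(r"([^%_]*)%+")

# Guarded detector (hypothetical): additionally refuses any backslash in the
# prefix, so a pattern ending in an escaped '%' falls through to the regex path.
_PREFIX_RE = re.compile(r"([^%_\\]*)%+")


def naive_prefix_of(pattern):
    """Return the prefix the naive detector would search for, or None."""
    m = _NAIVE_PREFIX_RE.fullmatch(pattern)
    return m.group(1) if m else None


def prefix_of(pattern):
    """Return the literal prefix of a plain 'prefix%' pattern, or None."""
    m = _PREFIX_RE.fullmatch(pattern)
    return m.group(1) if m else None


def like_to_regex(pattern, escape="\\"):
    """Translate a LIKE pattern to a regex body, honoring the escape character."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            # An escaped character always matches itself literally.
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            out.append(".*")
        elif ch == "_":
            out.append(".")
        else:
            out.append(re.escape(ch))
        i += 1
    return "".join(out)


def like(value, pattern):
    """Evaluate value LIKE pattern, using the prefix fast path when it is safe."""
    prefix = prefix_of(pattern)
    if prefix is not None:
        return value.startswith(prefix)
    return re.fullmatch(like_to_regex(pattern), value, re.DOTALL) is not None
```

With the naive detector, the pattern 'start\%' is treated as a prefix search
for 'start\', which is the kind of misclassification described above. The
guarded detector rejects it, so the pattern goes through the general regex
path and matches only the literal string 'start%'.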

There are some other problems with escaped wildcards discussed in
IMPALA-2422. This change does not fix these problems, which are hard.

New tests for escaped wildcards are added to exprs.test - note that
these tests cannot be part of the LikeTbl tests as the like predicate
optimizations are only applied when the like predicate is a string
literal.

Exhaustive tests ran clean.

Change-Id: I30356c19f4f169d99f7cc6268937653af6b41b70
---
M be/src/exprs/like-predicate.cc
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
2 files changed, 63 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/17798/1
-- 
To view, visit http://gerrit.cloudera.org:8080/17798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I30356c19f4f169d99f7cc6268937653af6b41b70
Gerrit-Change-Number: 17798
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>

[Impala-ASF-CR] IMPALA-10849: Ignore escaped wildcards that terminate like predicates.

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17798 )

Change subject: IMPALA-10849: Ignore escaped wildcards that terminate like predicates.
......................................................................


Patch Set 2: Verified+1



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30356c19f4f169d99f7cc6268937653af6b41b70
Gerrit-Change-Number: 17798
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Wed, 25 Aug 2021 05:31:41 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10849: Ignore escaped wildcards that terminate like predicates.

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/17798 )

Change subject: IMPALA-10849: Ignore escaped wildcards that terminate like predicates.
......................................................................


Patch Set 1: Code-Review+1

LGTM



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30356c19f4f169d99f7cc6268937653af6b41b70
Gerrit-Change-Number: 17798
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 24 Aug 2021 00:10:12 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10849: Ignore escaped wildcards that terminate like predicates.

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17798 )

Change subject: IMPALA-10849: Ignore escaped wildcards that terminate like predicates.
......................................................................


Patch Set 2: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30356c19f4f169d99f7cc6268937653af6b41b70
Gerrit-Change-Number: 17798
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 24 Aug 2021 23:13:51 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10849: Ignore escaped wildcards that terminate like predicates.

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/17798 )

Change subject: IMPALA-10849: Ignore escaped wildcards that terminate like predicates.
......................................................................


Patch Set 1: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30356c19f4f169d99f7cc6268937653af6b41b70
Gerrit-Change-Number: 17798
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 24 Aug 2021 23:04:14 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10849: Ignore escaped wildcards that terminate like predicates.

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17798 )

Change subject: IMPALA-10849: Ignore escaped wildcards that terminate like predicates.
......................................................................


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7416/ DRY_RUN=false



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30356c19f4f169d99f7cc6268937653af6b41b70
Gerrit-Change-Number: 17798
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 24 Aug 2021 23:13:52 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10849: Ignore escaped wildcards that terminate like predicates.

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17798 )

Change subject: IMPALA-10849: Ignore escaped wildcards that terminate like predicates.
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9339/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30356c19f4f169d99f7cc6268937653af6b41b70
Gerrit-Change-Number: 17798
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 20 Aug 2021 17:22:19 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10849: Ignore escaped wildcards that terminate like predicates.

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17798 )

Change subject: IMPALA-10849: Ignore escaped wildcards that terminate like predicates.
......................................................................

IMPALA-10849: Ignore escaped wildcards that terminate like predicates.

A like predicate is generally evaluated by converting it into a regex
that is evaluated at execution time. If the predicate of a like clause
is a constant (which is the common case when you say "row
like 'start%'") then there are optimizations where some cases that are
simpler than a regex are spotted, and a simpler function than a regex
evaluator is used. One example is that a predicate such as 'start%' is
evaluated by looking for strings that begin with "start". Amusingly the
code that spots the potential optimizations uses regexes to look for
patterns in the like predicate. The code that looks for the
optimization where a simple prefix can be searched for does not deal
with the case where the '%' wildcard at the end of the predicate is
escaped. To fix this we add a test that deals with the case where the
predicate ends in an escaped '%'.

There are some other problems with escaped wildcards discussed in
IMPALA-2422. This change does not fix these problems, which are hard.

New tests for escaped wildcards are added to exprs.test - note that
these tests cannot be part of the LikeTbl tests as the like predicate
optimizations are only applied when the like predicate is a string
literal.

Exhaustive tests ran clean.

Change-Id: I30356c19f4f169d99f7cc6268937653af6b41b70
Reviewed-on: http://gerrit.cloudera.org:8080/17798
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/exprs/like-predicate.cc
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
2 files changed, 63 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I30356c19f4f169d99f7cc6268937653af6b41b70
Gerrit-Change-Number: 17798
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>