Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/08/25 15:38:00 UTC

[jira] [Commented] (IMPALA-10849) A LIKE predicate that ends in an escaped wildcard is incorrectly evaluated

    [ https://issues.apache.org/jira/browse/IMPALA-10849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17404558#comment-17404558 ] 

ASF subversion and git services commented on IMPALA-10849:
----------------------------------------------------------

Commit b54d0c35ffd354ee1ac9bc781a8ff36a64125676 in impala's branch refs/heads/master from Andrew Sherman
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b54d0c3 ]

IMPALA-10849: Ignore escaped wildcards that terminate like predicates.

A like predicate is generally evaluated by converting it into a regex
that is evaluated at execution time. If the predicate of a like clause
is a constant (which is the common case when you say "row
like 'start%'") then there are optimizations where some cases that are
simpler than a regex are spotted, and a simpler function than a regex
evaluator is used. One example is that a predicate such as 'start%' is
evaluated by looking for strings that begin with "start". Amusingly, the
code that spots the potential optimizations uses regexes to look for
patterns in the like predicate. The code that looks for the
optimization where a simple prefix can be searched for does not deal
with the case where the '%' wildcard at the end of the predicate is
escaped. To fix this we add a test that deals with the case where the
predicate ends in an escaped '%'.
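
The behaviour described above can be sketched roughly as follows. This is an
illustrative Python model, not Impala's backend code: the names plan_like,
like_to_regex, NAIVE_PREFIX_RE and ENDS_IN_ESCAPED_PERCENT_RE are
hypothetical, the regexes are assumptions rather than the ones in the
commit, and backslash is assumed to be the escape character.

```python
import re

# Hypothetical fast-path detector: literal text with no wildcards,
# followed by one or more '%'. It knows nothing about escapes.
NAIVE_PREFIX_RE = re.compile(r"([^%_]*)%+", re.DOTALL)

# Hypothetical guard modelled on the fix: does the pattern end in an
# escaped '%'? (Backslash as the escape character is an assumption.)
ENDS_IN_ESCAPED_PERCENT_RE = re.compile(r".*\\%", re.DOTALL)


def like_to_regex(pattern, escape="\\"):
    """Slow path: translate a LIKE pattern into a Python regex."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            # An escaped character always matches itself literally.
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            out.append(".*")   # any run of characters
        elif ch == "_":
            out.append(".")    # exactly one character
        else:
            out.append(re.escape(ch))
        i += 1
    return re.compile("".join(out), re.DOTALL)


def plan_like(pattern, guard_escaped_tail=True):
    """Return (strategy_name, matcher) for a constant LIKE pattern.

    With guard_escaped_tail=False this mimics the behaviour before the
    fix: a trailing escaped '%' is mistaken for a prefix search.
    Escapes elsewhere in the pattern are out of scope for this sketch.
    """
    m = NAIVE_PREFIX_RE.fullmatch(pattern)
    escaped_tail = ENDS_IN_ESCAPED_PERCENT_RE.fullmatch(pattern) is not None
    if m and not (guard_escaped_tail and escaped_tail):
        prefix = m.group(1)
        return "starts_with", lambda s: s.startswith(prefix)
    rx = like_to_regex(pattern)
    return "regex", lambda s: rx.fullmatch(s) is not None
```

In this model, plan_like("start%") takes the prefix fast path, while
plan_like("foo\\%") (the LIKE pattern foo\%) falls back to the regex path
and matches only the literal string foo%. Calling it with
guard_escaped_tail=False reproduces the incorrect prefix search for the
same pattern.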

There are some other problems with escaped wildcards discussed in
IMPALA-2422. This change does not fix these problems, which are hard.

New tests for escaped wildcards are added to exprs.test - note that
these tests cannot be part of the LikeTbl tests as the like predicate
optimizations are only applied when the like predicate is a string
literal.

Exhaustive tests ran clean.

Change-Id: I30356c19f4f169d99f7cc6268937653af6b41b70
Reviewed-on: http://gerrit.cloudera.org:8080/17798
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> A LIKE predicate that ends in an escaped wildcard is incorrectly evaluated
> --------------------------------------------------------------------------
>
>                 Key: IMPALA-10849
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10849
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Andrew Sherman
>            Assignee: Andrew Sherman
>            Priority: Critical
>
> If the last character of a LIKE predicate is an escaped wildcard (e.g. LIKE foo\%) then it is incorrectly evaluated. This is because the fast path optimizations in LikePrepareInternal treat the predicate as being a search for a string with a fixed prefix. If the fast path optimizations are commented out then the LIKE is evaluated correctly.
> A possible fix would be to make the fast path optimizations recognize that escaped wildcards cannot be evaluated by the fixed prefix search.
> This is a simpler bug than that discussed in IMPALA-2422 which is to do with ambiguities in the logic of unescaping string literals (which is more tricky to fix).


