Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/02/10 05:19:55 UTC

[GitHub] [spark] cloud-fan commented on a change in pull request #35465: [SPARK-38168][SQL] LikeSimplification rule handles escape characters

cloud-fan commented on a change in pull request #35465:
URL: https://github.com/apache/spark/pull/35465#discussion_r803309117



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -680,42 +680,50 @@ object PushFoldableIntoBranches extends Rule[LogicalPlan] with PredicateHelper {
  * pattern.
  */
 object LikeSimplification extends Rule[LogicalPlan] {
-  // if guards below protect from escapes on trailing %.
-  // Cases like "something\%" are not optimized, but this does not affect correctness.
-  private val startsWith = "([^_%]+)%".r
-  private val endsWith = "%([^_%]+)".r
-  private val startsAndEndsWith = "([^_%]+)%([^_%]+)".r
-  private val contains = "%([^_%]+)%".r
-  private val equalTo = "([^_%]*)".r
-
   private def simplifyLike(
       input: Expression, pattern: String, escapeChar: Char = '\\'): Option[Expression] = {
-    if (pattern.contains(escapeChar)) {
-      // There are three different situations when pattern containing escapeChar:
-      // 1. pattern contains invalid escape sequence, e.g. 'm\aca'
-      // 2. pattern contains escaped wildcard character, e.g. 'ma\%ca'
-      // 3. pattern contains escaped escape character, e.g. 'ma\\ca'
-      // Although there are patterns can be optimized if we handle the escape first, we just
-      // skip this rule if pattern contains any escapeChar for simplicity.

Review comment:
       This is a trade-off between code simplicity and performance. The assumption is that escape characters are rare in LIKE patterns, so we shouldn't add a complicated implementation just to optimize them.
   
   Besides, Spark now provides functions such as `startswith` and `endswith`, and people can use them directly.
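
   To make the trade-off concrete, here is a minimal standalone sketch of the "skip on any escape character" approach, reusing the regexes from the removed lines in the diff above. It is not Spark's actual rule: the `LikeSketch` object and the `Simplified` case classes are hypothetical stand-ins for Catalyst expressions, and the `if` guards from the original rule are left out.

```scala
// Standalone sketch; NOT Spark's LikeSimplification rule.
// `LikeSketch` and the `Simplified` hierarchy are hypothetical names
// standing in for Catalyst expressions such as StartsWith or Contains.
object LikeSketch {
  sealed trait Simplified
  final case class StartsWith(prefix: String) extends Simplified
  final case class EndsWith(suffix: String) extends Simplified
  final case class StartsAndEndsWith(prefix: String, suffix: String) extends Simplified
  final case class Contains(infix: String) extends Simplified
  final case class EqualTo(value: String) extends Simplified

  // Same regexes as the lines removed in the diff.
  private val startsWith = "([^_%]+)%".r
  private val endsWith = "%([^_%]+)".r
  private val startsAndEndsWith = "([^_%]+)%([^_%]+)".r
  private val contains = "%([^_%]+)%".r
  private val equalTo = "([^_%]*)".r

  def simplify(pattern: String, escapeChar: Char = '\\'): Option[Simplified] = {
    if (pattern.contains(escapeChar)) {
      // Simplicity over performance: any escape character disables the
      // optimization, even when the pattern could be simplified.
      None
    } else {
      // Regex extractors only succeed when the whole pattern matches.
      pattern match {
        case startsWith(prefix) => Some(StartsWith(prefix))
        case endsWith(suffix) => Some(EndsWith(suffix))
        case startsAndEndsWith(prefix, suffix) => Some(StartsAndEndsWith(prefix, suffix))
        case contains(infix) => Some(Contains(infix))
        case equalTo(value) => Some(EqualTo(value))
        case _ => None // e.g. patterns containing `_`
      }
    }
  }
}
```

   With this sketch, `LikeSketch.simplify("abc%")` yields a `StartsWith`, while a pattern such as `ab\%` is left untouched because it contains the escape character. That is the performance being given up in exchange for a simpler rule.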




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


