Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/08/29 13:44:49 UTC

[GitHub] [spark] wangyum commented on a diff in pull request #37672: [SPARK-40228][SQL] Do not simplify multiLike if child is not attribute

wangyum commented on code in PR #37672:
URL: https://github.com/apache/spark/pull/37672#discussion_r957357331


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala:
##########
@@ -771,6 +771,9 @@ object LikeSimplification extends Rule[LogicalPlan] {
     }
   }
 
+  private def isSimplifyMultiLike(child: Expression): Boolean =
+    child.isInstanceOf[Attribute] || child.foldable

Review Comment:
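   A multi-like expression may also be simplified into `StartsWith`, `EndsWith` and `Contains` predicates, which are then pushed down to the data source as `StringStartsWith`, `StringEndsWith` and `StringContains`. For example: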
   ```sql
   select * from t1 where id like all('%a', 'b%', '%c%', '%a%b%', '%a%b%c%')
   ```
   ```
   == Physical Plan ==
   *(1) Filter ((((isnotnull(id#7) AND EndsWith(id#7, a)) AND StartsWith(id#7, b)) AND Contains(id#7, c)) AND likeall(id#7, %a%b%, %a%b%c%))
   +- *(1) ColumnarToRow
      +- FileScan parquet spark_catalog.default.t1[id#7] Batched: true, DataFilters: [isnotnull(id#7), EndsWith(id#7, a), StartsWith(id#7, b), Contains(id#7, c), likeall(id#7, %a%b%,..., Format: Parquet, PartitionFilters: [], PushedFilters: [IsNotNull(id), StringEndsWith(id,a), StringStartsWith(id,b), StringContains(id,c)], ReadSchema: struct<id:string>
   
   ```
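
The split visible in the plan above can be sketched without any Spark dependency. The following is a minimal illustration, not the actual `LikeSimplification` code: the object `MultiLikeSketch`, its case classes, and the `classify` and `split` helpers are hypothetical names, and the regexes only approximate the idea of recognizing prefix, suffix, and infix patterns. Escape characters and the `_` wildcard are deliberately left unhandled, so such patterns simply stay in the remaining `likeall` list.

```scala
object MultiLikeSketch {
  // Hypothetical result types; these are NOT Spark's Catalyst expressions.
  sealed trait Simplified
  final case class StartsWith(prefix: String) extends Simplified
  final case class EndsWith(suffix: String) extends Simplified
  final case class Contains(infix: String) extends Simplified
  final case class Remaining(pattern: String) extends Simplified

  // A literal run is any non-empty sequence without the LIKE wildcards % and _.
  private val startsWith = "([^_%]+)%".r
  private val endsWith = "%([^_%]+)".r
  private val contains = "%([^_%]+)%".r

  // Scala regex extractors match the whole string, so "%a%b%" falls through.
  def classify(pattern: String): Simplified = pattern match {
    case startsWith(prefix) => StartsWith(prefix)
    case endsWith(suffix)   => EndsWith(suffix)
    case contains(infix)    => Contains(infix)
    case other              => Remaining(other)
  }

  // Separate the patterns that became simple predicates from the ones
  // that would still need a multi-like evaluation.
  def split(patterns: Seq[String]): (Seq[Simplified], Seq[String]) = {
    val classified = patterns.map(classify)
    val remaining = classified.collect { case Remaining(p) => p }
    val simple = classified.filterNot(_.isInstanceOf[Remaining])
    (simple, remaining)
  }
}
```

Calling `MultiLikeSketch.split(Seq("%a", "b%", "%c%", "%a%b%", "%a%b%c%"))` yields three simple predicates (an `EndsWith`, a `StartsWith`, and a `Contains`) and leaves `%a%b%` and `%a%b%c%` as remaining patterns, which mirrors the shape of the filter in the plan above.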



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
