Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/10/10 10:26:01 UTC

[GitHub] [spark] beliefer opened a new pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

beliefer opened a new pull request #29999:
URL: https://github.com/apache/spark/pull/29999


   ### What changes were proposed in this pull request?
   Spark already supports the `LIKE ALL` syntax, but it throws `StackOverflowError` when the pattern list has many elements (more than 14378). We should implement a built-in function for `LIKE ALL` to fix this issue.
   
   
   ### Why are the changes needed?
   1. Fix the `StackOverflowError` issue.
   2. Support built-in function `like_all`.
   
   
   ### Does this PR introduce _any_ user-facing change?
   'No'.
   
   
   ### How was this patch tested?
   Jenkins test.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] wangyum commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
wangyum commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-707634498


   @maropu We can reproduce the `java.lang.StackOverflowError` in this way:
   ```scala
   spark.sql("create table SPARK_33045(id string) using parquet")
   val values = Range(1, 10000)
   spark.sql(s"select * from SPARK_33045 where id like all (${values.mkString(", ")})").show
   ```
   This is because we rewrite `LIKE ALL`/`LIKE ANY` into a chain of `LIKE` predicates:
   ```scala
   spark.sql(s"select * from SPARK_33045 where ${values.map(i => s"id like $i").mkString(" and ")}").show
   ```
   
   The `In` predicate, by contrast, does not throw `java.lang.StackOverflowError`:
   ```scala
   spark.sql(s"select * from SPARK_33045 where id in (${values.mkString(", ")})").show
   ```
   
   So I think we can implement built-in `LIKE ANY` and `LIKE ALL` UDFs, similar to the `In` predicate, to fix this issue.
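
The difference between the chained rewrite and a flat `In`-style node can be illustrated with a toy expression tree. This is only a sketch: `Expr`, `Like`, `And`, `LikeAll`, and the helper names are hypothetical stand-ins, not Spark's Catalyst classes. It shows that folding one `Like` per pattern with `And` produces a tree whose depth grows with the number of patterns, which is the shape that recursive tree traversals struggle with.

```scala
// Toy expression tree; these types are illustrative and are NOT Spark's Catalyst classes.
object LikeTreeDepthSketch {
  sealed trait Expr
  final case class Like(column: String, pattern: String) extends Expr
  final case class And(left: Expr, right: Expr) extends Expr
  // Hypothetical n-ary node that keeps all patterns in one flat list, as `In` does.
  final case class LikeAll(column: String, patterns: Seq[String]) extends Expr

  // Depth of the tree, computed iteratively so the measurement itself cannot overflow.
  def depth(root: Expr): Int = {
    var maxDepth = 0
    var stack: List[(Expr, Int)] = List((root, 1))
    while (stack.nonEmpty) {
      val (node, d) = stack.head
      stack = stack.tail
      if (d > maxDepth) maxDepth = d
      node match {
        case And(l, r) => stack = (l, d + 1) :: (r, d + 1) :: stack
        case _ => ()
      }
    }
    maxDepth
  }

  // Mirrors the rewrite described above: one Like per pattern, folded with And.
  def chained(column: String, patterns: Seq[String]): Expr = {
    val likes: Seq[Expr] = patterns.map(p => Like(column, p))
    likes.reduceLeft[Expr]((acc, next) => And(acc, next))
  }

  // The flat alternative: a single node holding every pattern.
  def flat(column: String, patterns: Seq[String]): Expr = LikeAll(column, patterns)
}
```

In this model the chained form is as deep as the pattern list is long, while the flat node always has depth 1, so any recursive pass over the tree needs one stack frame per pattern only in the chained case.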




[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r505197069



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -176,6 +177,125 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+abstract class LikeAllBase extends Expression with ImplicitCastInputTypes with NullIntolerant {
+  def value: Expression = children.head
+  def list: Seq[Expression] = children.tail
+  def isNot: Boolean
+
+  override def inputTypes: Seq[AbstractDataType] = {
+    StringType +: Seq.fill(children.size - 1)(StringType)
+  }
+
+  override def dataType: DataType = BooleanType
+
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  override def nullable: Boolean = true
+
+  def matches(regex: Pattern, str: String): Boolean = regex.matcher(str).matches()
+
+  override def eval(input: InternalRow): Any = {
+    val evaluatedValue = value.eval(input)
+    if (evaluatedValue == null) {
+      null
+    } else {
+      var hasNull = false
+      var match = true
+      list.foreach { e =>
+        val str = e.eval(input)
+        if (str == null) {
+          hasNull = true
+        } else {
+          val regex =
+            Pattern.compile(StringUtils.escapeLikeRegex(str.asInstanceOf[UTF8String].toString, '\\'))
+          if ((isNot && matches(regex, evaluatedValue.asInstanceOf[UTF8String].toString)) ||
+            !(isNot || matches(regex, evaluatedValue.asInstanceOf[UTF8String].toString)) {

Review comment:
       Can we store the result of `matches` in a local variable to shorten the code here?
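
One way to read this suggestion, sketched outside Spark: evaluate the match once, keep it in a local, and note that `(isNot && m) || !(isNot || m)` reduces to `isNot == m`. The object and method names below are hypothetical, plain `String` regexes stand in for the escaped LIKE patterns, and `Option[Boolean]` models SQL NULL as `None`. The NULL handling copies the diff above (any NULL pattern makes the result NULL).

```scala
import java.util.regex.Pattern

object LikeAllEvalSketch {
  // Returns None for SQL NULL, otherwise Some(result).
  // `regexes` stands in for already-escaped LIKE patterns; a null entry models a NULL pattern.
  def evalAll(value: String, regexes: Seq[String], isNot: Boolean): Option[Boolean] = {
    if (value == null) {
      None
    } else {
      var hasNull = false
      var allMatch = true
      regexes.foreach { r =>
        if (r == null) {
          hasNull = true
        } else {
          // Evaluate the match once and keep it in a local variable.
          val matched = Pattern.compile(r).matcher(value).matches()
          // (isNot && matched) || !(isNot || matched) simplifies to isNot == matched.
          if (isNot == matched) allMatch = false
        }
      }
      if (hasNull) None else Some(allMatch)
    }
  }
}
```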






[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r505199510



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -176,6 +177,125 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+abstract class LikeAllBase extends Expression with ImplicitCastInputTypes with NullIntolerant {
+  def value: Expression = children.head
+  def list: Seq[Expression] = children.tail
+  def isNot: Boolean
+
+  override def inputTypes: Seq[AbstractDataType] = {
+    StringType +: Seq.fill(children.size - 1)(StringType)
+  }
+
+  override def dataType: DataType = BooleanType
+
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  override def nullable: Boolean = true
+
+  def matches(regex: Pattern, str: String): Boolean = regex.matcher(str).matches()
+
+  override def eval(input: InternalRow): Any = {
+    val evaluatedValue = value.eval(input)
+    if (evaluatedValue == null) {
+      null
+    } else {
+      var hasNull = false
+      var match = true
+      list.foreach { e =>
+        val str = e.eval(input)
+        if (str == null) {
+          hasNull = true
+        } else {
+          val regex =
+            Pattern.compile(StringUtils.escapeLikeRegex(str.asInstanceOf[UTF8String].toString, '\\'))
+          if ((isNot && matches(regex, evaluatedValue.asInstanceOf[UTF8String].toString)) ||
+            !(isNot || matches(regex, evaluatedValue.asInstanceOf[UTF8String].toString)) {
+            match = false
+          }
+        }
+      }
+      if (hasNull) {
+        null
+      } else {
+        match
+      }
+    }
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val patternClass = classOf[Pattern].getName
+    val escapeFunc = StringUtils.getClass.getName.stripSuffix("$") + ".escapeLikeRegex"
+    val javaDataType = CodeGenerator.javaType(value.dataType)
+    val valueGen = value.genCode(ctx)
+    val listGen = list.map(_.genCode(ctx))
+    val pattern = ctx.freshName("pattern")
+    val rightStr = ctx.freshName("rightStr")
+    val escapedEscapeChar = StringEscapeUtils.escapeJava("\\")
+    val hasNull = ctx.freshName("hasNull")
+    val matched = ctx.freshName("matched")
+    val valueArg = ctx.freshName("valueArg")
+    val listCode = listGen.map(x =>
+      s"""
+         |${x.code}
+         |if (${x.isNull}) {
+         |  $hasNull = true; // ${ev.isNull} = true;
+         |} else if (!$hasNull && $matched) {
+         |  String $rightStr = ${x.value}.toString();
+         |  $patternClass $pattern =
+         |    $patternClass.compile($escapeFunc($rightStr, '$escapedEscapeChar'));

Review comment:
       This might cause a performance regression.
   
   In the `Like` expression, we build the regex only once if the regex string is foldable.
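
A minimal sketch of the idea behind this comment, assuming all patterns are literals known up front: compile every regex once per expression instance and reuse the compiled patterns for each row. The names here are hypothetical, and `likeToRegex` is a deliberately simplified stand-in for Spark's `StringUtils.escapeLikeRegex` that ignores escape characters.

```scala
import java.util.regex.Pattern

object PrecompiledLikeAllSketch {
  // Simplified LIKE-to-regex translation: '%' becomes ".*", '_' becomes ".",
  // and every other character is quoted. Escape characters are not handled.
  def likeToRegex(pattern: String): String = {
    val sb = new StringBuilder
    pattern.foreach {
      case '%' => sb.append(".*")
      case '_' => sb.append(".")
      case c => sb.append(Pattern.quote(c.toString))
    }
    sb.toString
  }

  final class LikeAllLiterals(patterns: Seq[String]) {
    // Compiled lazily on first use, then reused for every row.
    private lazy val compiled: Seq[Pattern] =
      patterns.map(p => Pattern.compile(likeToRegex(p)))

    def matchesAll(value: String): Boolean =
      compiled.forall(_.matcher(value).matches())
  }
}
```

Calling `matchesAll` repeatedly on the same `LikeAllLiterals` instance reuses the compiled patterns, whereas the generated code in the diff calls `Pattern.compile` for every pattern on every row.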






[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725256865


   **[Test build #130900 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130900/testReport)** for PR 29999 at commit [`15bac5b`](https://github.com/apache/spark/commit/15bac5bfecb209ba7b6963d83423b659fbc5086d).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-712150902








[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-730203655


   **[Test build #131331 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131331/testReport)** for PR 29999 at commit [`001eb38`](https://github.com/apache/spark/commit/001eb38f603267c6a6f4e1c25430b8900644f5b7).




[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706527357


   **[Test build #129623 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129623/testReport)** for PR 29999 at commit [`4163382`](https://github.com/apache/spark/commit/41633827583d6f0d91e0e48b781c25c95ec06765).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708916521


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/129802/
   Test FAILed.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706863638


   Merged build finished. Test FAILed.




[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r524852125



##########
File path: sql/core/src/test/resources/sql-functions/sql-expression-schema.md
##########
@@ -346,4 +346,4 @@
 | org.apache.spark.sql.catalyst.expressions.xml.XPathList | xpath | SELECT xpath('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>','a/b/text()') | struct<xpath(<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>, a/b/text()):array<string>> |
 | org.apache.spark.sql.catalyst.expressions.xml.XPathLong | xpath_long | SELECT xpath_long('<a><b>1</b><b>2</b></a>', 'sum(a/b)') | struct<xpath_long(<a><b>1</b><b>2</b></a>, sum(a/b)):bigint> |
 | org.apache.spark.sql.catalyst.expressions.xml.XPathShort | xpath_short | SELECT xpath_short('<a><b>1</b><b>2</b></a>', 'sum(a/b)') | struct<xpath_short(<a><b>1</b><b>2</b></a>, sum(a/b)):smallint> |
-| org.apache.spark.sql.catalyst.expressions.xml.XPathString | xpath_string | SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c') | struct<xpath_string(<a><b>b</b><c>cc</c></a>, a/c):string> |
\ No newline at end of file
+| org.apache.spark.sql.catalyst.expressions.xml.XPathString | xpath_string | SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c') | struct<xpath_string(<a><b>b</b><c>cc</c></a>, a/c):string> |

Review comment:
       I reverted it.






[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706841427


   **[Test build #129659 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129659/testReport)** for PR 29999 at commit [`a7cd416`](https://github.com/apache/spark/commit/a7cd416f40308cfc841fb0c7210728e69ba4ac1e).




[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r521160412



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##########
@@ -1408,7 +1408,20 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
           case Some(SqlBaseParser.ANY) | Some(SqlBaseParser.SOME) =>
             getLikeQuantifierExprs(ctx.expression).reduceLeft(Or)
           case Some(SqlBaseParser.ALL) =>
-            getLikeQuantifierExprs(ctx.expression).reduceLeft(And)
+            validate(!ctx.expression.isEmpty, "Expected something between '(' and ')'.", ctx)
+            val expressions = ctx.expression.asScala.map(expression)
+            if (expressions.size > 200 && expressions.forall(_.foldable)) {

Review comment:
       OK
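
The diff above is cut off right after the size and foldability check, so what each branch does is not visible here. The sketch below is therefore a guess at the overall shape, using a toy model rather than Spark's parser: it assumes the flat node is chosen for long, all-foldable pattern lists and the `And` chain is kept otherwise. All type and method names are hypothetical; only the threshold of 200 and the two conditions come from the diff.

```scala
// Toy model of the dispatch; none of these types are Spark's Catalyst classes.
object LikeAllDispatchSketch {
  sealed trait Expr { def foldable: Boolean }
  final case class Literal(value: String) extends Expr { def foldable: Boolean = true }
  final case class Column(name: String) extends Expr { def foldable: Boolean = false }
  final case class Like(child: Expr, pattern: Expr) extends Expr {
    def foldable: Boolean = child.foldable && pattern.foldable
  }
  final case class And(left: Expr, right: Expr) extends Expr {
    def foldable: Boolean = left.foldable && right.foldable
  }
  final case class LikeAll(child: Expr, patterns: Seq[Expr]) extends Expr {
    def foldable: Boolean = child.foldable && patterns.forall(_.foldable)
  }

  // Threshold taken from the diff; the branch bodies are assumptions.
  val Threshold = 200

  def buildLikeAll(child: Expr, patterns: Seq[Expr]): Expr = {
    require(patterns.nonEmpty, "Expected something between '(' and ')'.")
    if (patterns.size > Threshold && patterns.forall(_.foldable)) {
      // Long list of constant patterns: keep them in one flat node.
      LikeAll(child, patterns)
    } else {
      // Short list, or some non-constant pattern: fall back to the And chain.
      val likes: Seq[Expr] = patterns.map(p => Like(child, p))
      likes.reduceLeft[Expr]((acc, next) => And(acc, next))
    }
  }
}
```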






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-726785967








[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-709702627


   **[Test build #129872 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129872/testReport)** for PR 29999 at commit [`be5eb8a`](https://github.com/apache/spark/commit/be5eb8a1f092e15c941d39d517284aed67de72c9).




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-709988169


   **[Test build #129894 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129894/testReport)** for PR 29999 at commit [`f657ff0`](https://github.com/apache/spark/commit/f657ff0372f1cac48ea008a08c1cc7011f934d98).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706919391


   Merged build finished. Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708868898


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34387/
   




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-715204127


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34798/
   




[GitHub] [spark] dongjoon-hyun commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706836333


   I have the same impression as @maropu's first comment. Could you answer his question, @beliefer?




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706919945








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714227951


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34730/
   




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708878815


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34389/
   




[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-711955492


   **[Test build #129999 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129999/testReport)** for PR 29999 at commit [`ad4d2d9`](https://github.com/apache/spark/commit/ad4d2d9cde81beff27c9eaadae77a132d59599cc).




[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r522752648



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +180,90 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant {
+
+  protected def patterns: Seq[Any]

Review comment:
       OK






[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728745887








[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714215550


   **[Test build #130125 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130125/testReport)** for PR 29999 at commit [`55465b8`](https://github.com/apache/spark/commit/55465b8fcd5dbde93c23eae99d94fb877e9cb5f3).




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714487188


   **[Test build #130139 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130139/testReport)** for PR 29999 at commit [`55465b8`](https://github.com/apache/spark/commit/55465b8fcd5dbde93c23eae99d94fb877e9cb5f3).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `abstract class LikeAllBase extends Expression with ImplicitCastInputTypes `




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708930779








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708867172


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34385/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-709984475


   Merged build finished. Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728671247


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35793/
   




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-730201079








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708935701


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/34411/
   Test FAILed.




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-724815285








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-709984040


   **[Test build #129884 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129884/testReport)** for PR 29999 at commit [`f657ff0`](https://github.com/apache/spark/commit/f657ff0372f1cac48ea008a08c1cc7011f934d98).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728911196


   **[Test build #131211 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131211/testReport)** for PR 29999 at commit [`f0e3de1`](https://github.com/apache/spark/commit/f0e3de1718e99c887833f230c77c17c3851f9fc7).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706869300


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34265/
   




[GitHub] [spark] beliefer commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728768216


   retest this please




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708692143


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34363/
   




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725271482








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-712227544


   **[Test build #129999 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129999/testReport)** for PR 29999 at commit [`ad4d2d9`](https://github.com/apache/spark/commit/ad4d2d9cde81beff27c9eaadae77a132d59599cc).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-730203655


   **[Test build #131331 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131331/testReport)** for PR 29999 at commit [`001eb38`](https://github.com/apache/spark/commit/001eb38f603267c6a6f4e1c25430b8900644f5b7).




[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r522786176



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##########
@@ -1408,7 +1408,20 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
           case Some(SqlBaseParser.ANY) | Some(SqlBaseParser.SOME) =>
             getLikeQuantifierExprs(ctx.expression).reduceLeft(Or)
           case Some(SqlBaseParser.ALL) =>
-            getLikeQuantifierExprs(ctx.expression).reduceLeft(And)
+            validate(!ctx.expression.isEmpty, "Expected something between '(' and ')'.", ctx)
+            val expressions = ctx.expression.asScala.map(expression)
+            if (expressions.size > SQLConf.get.optimizerLikeAllConversionThreshold &&
+              expressions.forall(_.foldable)) {

Review comment:
       Yes.
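
The branch above can be illustrated with a small stand-alone model. This is only a sketch: the `Expr`, `Like`, `And`, and `LikeAll` types and the `build` helper below are hypothetical stand-ins, not Catalyst classes, and the threshold is passed as a plain argument rather than read from `SQLConf`. It shows why `reduceLeft(And)` yields a tree whose depth grows with the number of patterns, while a single node holding all patterns stays flat.

```scala
object LikeAllShape {
  // Toy expression tree; these are NOT the Catalyst classes.
  sealed trait Expr
  final case class Like(pattern: String) extends Expr
  final case class And(left: Expr, right: Expr) extends Expr
  final case class LikeAll(patterns: Seq[String]) extends Expr

  // Nesting depth of the tree. Recursive on purpose: any recursive traversal
  // of a deep tree consumes one stack frame per level.
  def depth(e: Expr): Int = e match {
    case And(l, r) => 1 + math.max(depth(l), depth(r))
    case _         => 1
  }

  // Hypothetical stand-in for the parser branch: above the threshold, keep all
  // patterns in one flat node; otherwise fold them into a left-deep And chain.
  def build(patterns: Seq[String], threshold: Int): Expr = {
    require(patterns.nonEmpty, "Expected something between '(' and ')'.")
    if (patterns.size > threshold) {
      LikeAll(patterns)
    } else {
      patterns.map(p => Like(p): Expr).reduceLeft[Expr](And(_, _))
    }
  }
}
```

With 100 patterns and a threshold of 200, `build` returns a chain of depth 100; with a threshold of 50 it returns a single `LikeAll` node of depth 1.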






[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728769754


   **[Test build #131211 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131211/testReport)** for PR 29999 at commit [`f0e3de1`](https://github.com/apache/spark/commit/f0e3de1718e99c887833f230c77c17c3851f9fc7).




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-712149566


   **[Test build #129997 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129997/testReport)** for PR 29999 at commit [`8df5231`](https://github.com/apache/spark/commit/8df52316a1bb4bbeab427dd165b23addfaa3b859).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728660048


   **[Test build #131191 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131191/testReport)** for PR 29999 at commit [`f0e3de1`](https://github.com/apache/spark/commit/f0e3de1718e99c887833f230c77c17c3851f9fc7).




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706858224


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34263/
   




[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728649075


   **[Test build #131189 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131189/testReport)** for PR 29999 at commit [`97c1c73`](https://github.com/apache/spark/commit/97c1c7389e537f0d38f1b6a17bbe9ba70c9bc6ea).




[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r526618529



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +180,89 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant {
+
+  protected def patterns: Seq[UTF8String]
+
+  protected def isNotLikeAll: Boolean
+
+  override def inputTypes: Seq[DataType] = StringType :: Nil
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = true
+
+  private lazy val hasNull: Boolean = patterns.contains(null)
+
+  private lazy val cache = patterns.filterNot(_ == null)
+    .map(s => Pattern.compile(StringUtils.escapeLikeRegex(s.toString, '\\')))
+
+  override def eval(input: InternalRow): Any = {
+    val exprValue = child.eval(input)
+    if (exprValue == null) {
+      null
+    } else {
+      val allMatched = if (isNotLikeAll) {
+        !cache.exists(p => p.matcher(exprValue.toString).matches())
+      } else {
+        cache.forall(p => p.matcher(exprValue.toString).matches())
+      }
+      if (allMatched && hasNull) {
+        null
+      } else {
+        allMatched
+      }
+    }
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val eval = child.genCode(ctx)
+    val patternClass = classOf[Pattern].getName
+    val javaDataType = CodeGenerator.javaType(child.dataType)
+    val pattern = ctx.freshName("pattern")
+    val allMatched = ctx.freshName("allMatched")
+    val valueIsNull = ctx.freshName("valueIsNull")
+    val valueArg = ctx.freshName("valueArg")
+    val patternCache = ctx.addReferenceObj("patternCache", cache.asJava)
+
+    val matchCode = if (isNotLikeAll) {
+      s"$pattern.matcher($valueArg.toString()).matches()"
+    } else {
+      s"!$pattern.matcher($valueArg.toString()).matches()"
+    }
+
+    ev.copy(code =
+      code"""
+            |${eval.code}
+            |boolean $allMatched = true;
+            |boolean $valueIsNull = false;
+            |if (${eval.isNull}) {
+            |  $valueIsNull = true;
+            |} else {
+            |  $javaDataType $valueArg = ${eval.value};
+            |  for ($patternClass $pattern: $patternCache) {

Review comment:
       Yeah! Thanks!






[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714333774


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34746/
   




[GitHub] [spark] wangyum commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
wangyum commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r521135433



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +180,86 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant {
+
+  protected def patterns: Seq[Any]
+
+  protected def isNotDefined: Boolean
+
+  override def inputTypes: Seq[DataType] = StringType :: Nil
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = true
+
+  private lazy val hasNull: Boolean = patterns.contains(null)
+
+  private lazy val cache = patterns.filterNot(_ == null)
+    .map(s => Pattern.compile(StringUtils.escapeLikeRegex(s.toString, '\\')))
+
+  override def eval(input: InternalRow): Any = {
+    if (hasNull) {
+      null

Review comment:
       ```sql
   spark-sql> select 'a' like all ('%a%', null);
   NULL
   spark-sql> select 'a' not like all ('%a%', null);
   false
   ```
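
The two results above follow from SQL three-valued logic, assuming `LIKE ALL` is the AND of each `LIKE` and `NOT LIKE ALL` is the AND of each `NOT LIKE`. The sketch below models that outside Spark: `likeRegex`, `and3`, and `likeAll` are hypothetical helpers, `None` stands for SQL NULL, the value is assumed non-null, and escape characters are not handled.

```scala
import java.util.regex.Pattern

object LikeAllNulls {
  // Minimal LIKE-to-regex translation: '%' -> ".*", '_' -> ".",
  // every other character is matched literally. No escape handling.
  def likeRegex(pattern: String): Pattern = {
    val sb = new StringBuilder
    pattern.foreach {
      case '%' => sb.append(".*")
      case '_' => sb.append(".")
      case c   => sb.append(Pattern.quote(c.toString))
    }
    Pattern.compile(sb.toString, Pattern.DOTALL)
  }

  // SQL three-valued AND: false dominates, then NULL (None), then true.
  def and3(a: Option[Boolean], b: Option[Boolean]): Option[Boolean] = (a, b) match {
    case (Some(false), _) | (_, Some(false)) => Some(false)
    case (None, _) | (_, None)               => None
    case _                                   => Some(true)
  }

  // AND over (value LIKE p), or over NOT (value LIKE p) when negate is set.
  // A NULL pattern contributes NULL to the conjunction.
  def likeAll(value: String, patterns: Seq[Option[String]], negate: Boolean): Option[Boolean] =
    patterns.map { p =>
      p.map { s =>
        val matched = likeRegex(s).matcher(value).matches()
        if (negate) !matched else matched
      }
    }.foldLeft(Option(true))(and3)
}
```

Under these assumptions, `likeAll("a", Seq(Some("%a%"), None), negate = false)` gives `None` (NULL), and the same call with `negate = true` gives `Some(false)`, because `'a' NOT LIKE '%a%'` is already false and false AND NULL is false.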






[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-710019726


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34499/
   




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-726638718


   **[Test build #131050 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131050/testReport)** for PR 29999 at commit [`7af8ffe`](https://github.com/apache/spark/commit/7af8ffe49fc02765a80a85faccaa7209fe8b9c57).




[GitHub] [spark] wangyum commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
wangyum commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r503832375



##########
File path: sql/core/src/test/resources/sql-tests/inputs/regexp-functions.sql
##########
@@ -31,3 +31,13 @@ SELECT regexp_extract_all('1a 2b 14m', '(\\d+)([a-z]+)', 3);
 SELECT regexp_extract_all('1a 2b 14m', '(\\d+)([a-z]+)', -1);
 SELECT regexp_extract_all('1a 2b 14m', '(\\d+)?([a-z]+)', 1);
 SELECT regexp_extract_all('a 2b 14m', '(\\d+)?([a-z]+)', 1);
+
+-- like_all
+SELECT like_all('foo', '%foo%', '%oo');

Review comment:
       We already have a test file: https://github.com/apache/spark/blob/b10263b8e5106409467e0115968bbaf0b9141cd1/sql/core/src/test/resources/sql-tests/inputs/like-all.sql






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-712053647








[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706919702








[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725157090








[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r526612173



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +180,89 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant {
+
+  protected def patterns: Seq[UTF8String]
+
+  protected def isNotLikeAll: Boolean
+
+  override def inputTypes: Seq[DataType] = StringType :: Nil
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = true
+
+  private lazy val hasNull: Boolean = patterns.contains(null)
+
+  private lazy val cache = patterns.filterNot(_ == null)
+    .map(s => Pattern.compile(StringUtils.escapeLikeRegex(s.toString, '\\')))
+
+  override def eval(input: InternalRow): Any = {
+    val exprValue = child.eval(input)
+    if (exprValue == null) {
+      null
+    } else {
+      val allMatched = if (isNotLikeAll) {
+        !cache.exists(p => p.matcher(exprValue.toString).matches())
+      } else {
+        cache.forall(p => p.matcher(exprValue.toString).matches())
+      }
+      if (allMatched && hasNull) {
+        null
+      } else {
+        allMatched
+      }
+    }
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val eval = child.genCode(ctx)
+    val patternClass = classOf[Pattern].getName
+    val javaDataType = CodeGenerator.javaType(child.dataType)
+    val pattern = ctx.freshName("pattern")
+    val allMatched = ctx.freshName("allMatched")
+    val valueIsNull = ctx.freshName("valueIsNull")
+    val valueArg = ctx.freshName("valueArg")
+    val patternCache = ctx.addReferenceObj("patternCache", cache.asJava)
+
+    val matchCode = if (isNotLikeAll) {
+      s"$pattern.matcher($valueArg.toString()).matches()"
+    } else {
+      s"!$pattern.matcher($valueArg.toString()).matches()"
+    }
+
+    ev.copy(code =
+      code"""
+            |${eval.code}
+            |boolean $allMatched = true;

Review comment:
       `notMatched`
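
For context, the null handling of the interpreted `eval` in the hunk above can be sketched on its own. This is a minimal illustration under stated assumptions, not the PR's code: `LikeAllSketch`, `likeToRegex`, and `likeAll` are hypothetical names, `likeToRegex` is a simplified stand-in for `StringUtils.escapeLikeRegex` that ignores escape characters, and SQL NULL is modelled as `None`.

```scala
import java.util.regex.Pattern

object LikeAllSketch {
  // Hypothetical, simplified stand-in for StringUtils.escapeLikeRegex:
  // '%' matches any sequence, '_' matches one character, and every other
  // character is treated literally. Escape characters are not handled.
  def likeToRegex(pattern: String): Pattern = {
    val sb = new StringBuilder
    pattern.foreach {
      case '%' => sb.append(".*")
      case '_' => sb.append(".")
      case c   => sb.append(Pattern.quote(c.toString))
    }
    Pattern.compile(sb.toString, Pattern.DOTALL)
  }

  // Three-valued result: Some(true), Some(false), or None for SQL NULL.
  def likeAll(
      value: Option[String],
      patterns: Seq[Option[String]],
      negated: Boolean = false): Option[Boolean] =
    value match {
      case None => None // NULL input yields NULL
      case Some(v) =>
        // NULL patterns are skipped when matching, as in the hunk's `cache`.
        val compiled = patterns.flatten.map(likeToRegex)
        val allMatched =
          if (negated) !compiled.exists(_.matcher(v).matches())
          else compiled.forall(_.matcher(v).matches())
        // A NULL pattern turns a would-be TRUE into NULL; FALSE stays FALSE.
        if (allMatched && patterns.contains(None)) None else Some(allMatched)
    }
}
```

With this sketch, a NULL among the patterns only matters when every non-null pattern is satisfied, which mirrors the `allMatched && hasNull` branch in the hunk.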






[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708926914








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-726679865


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35656/
   




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-709734770








[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706841427








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706545946


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/129623/
   Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-709717092


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34477/
   




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714215550


   **[Test build #130125 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130125/testReport)** for PR 29999 at commit [`55465b8`](https://github.com/apache/spark/commit/55465b8fcd5dbde93c23eae99d94fb877e9cb5f3).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714237121








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706545942


   Merged build finished. Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-730200947


   **[Test build #131326 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131326/testReport)** for PR 29999 at commit [`001eb38`](https://github.com/apache/spark/commit/001eb38f603267c6a6f4e1c25430b8900644f5b7).
    * This patch **fails due to an unknown error code, -9**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725157121


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/35506/
   Test FAILed.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708874795








[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r520337999



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +179,142 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+abstract class LikeAllBase extends Expression with ImplicitCastInputTypes {

Review comment:
       The current implementation requires every expression in the list to be foldable, which includes literals. My earliest implementation of `nullSafeEval` also cached each compiled pattern, but after an offline discussion with @cloud-fan we concluded that this is unnecessary: when all patterns are literals, the current `doGenCode` already achieves the same effect.
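
The caching idea mentioned here can be sketched in isolation. This is only an illustration of compiling patterns once per expression instance instead of once per row; `CachedMatcher`, `matchesAll`, and `compileCount` are hypothetical names that do not appear in the PR, and the inputs are plain Java regexes rather than LIKE patterns.

```scala
import java.util.regex.Pattern

// Hypothetical illustration: the patterns are known up front (as with
// literal patterns), so they are compiled on first use and then reused
// for every subsequent row.
final class CachedMatcher(regexes: Seq[String]) {
  // Counts Pattern.compile calls so the reuse can be observed.
  var compileCount = 0

  private lazy val cache: Vector[Pattern] = regexes.toVector.map { r =>
    compileCount += 1
    Pattern.compile(r)
  }

  // Evaluates one row against all cached patterns.
  def matchesAll(row: String): Boolean =
    cache.forall(_.matcher(row).matches())
}
```

Evaluating many rows with one `CachedMatcher` compiles each regex exactly once, which is the effect the comment attributes to the all-literal case.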






[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706869315








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725360400


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35529/
   




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706858233








[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714437145








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728677455


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35793/
   




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725274237


   **[Test build #130924 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130924/testReport)** for PR 29999 at commit [`d039c33`](https://github.com/apache/spark/commit/d039c33de33ea4bab4cea3170925c0c4f92ca771).




[GitHub] [spark] beliefer commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714287382


   retest this please




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706855750


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34261/
   




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706849694


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34261/
   




[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-724622736


   **[Test build #130860 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130860/testReport)** for PR 29999 at commit [`1fc5214`](https://github.com/apache/spark/commit/1fc5214964a3a522f3cc0a1daf91ced342bb1b51).




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708874781


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34387/
   




[GitHub] [spark] cloud-fan closed pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #29999:
URL: https://github.com/apache/spark/pull/29999


   




[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708938150


   **[Test build #129810 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129810/testReport)** for PR 29999 at commit [`b770f92`](https://github.com/apache/spark/commit/b770f929594dd551a544fb6b0e5f9d4f2ddff7d4).




[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r524213551



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +180,90 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant {
+
+  protected def patterns: Seq[UTF8String]
+
+  protected def isNotDefined: Boolean

Review comment:
       nit: `isNotLikeAll`






[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708911515








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708787077


   Merged build finished. Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708912448


   **[Test build #129800 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129800/testReport)** for PR 29999 at commit [`60f01f4`](https://github.com/apache/spark/commit/60f01f4edfbd112ea085e118c6a50f024c8c4dff).




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708910768


   **[Test build #129780 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129780/testReport)** for PR 29999 at commit [`de65829`](https://github.com/apache/spark/commit/de658290b417645d4dd8b91bc1f2febb747e1f3b).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708843179








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-724674796


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35472/
   




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-709925953








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706946167


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34276/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-709984478


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/129884/
   Test FAILed.




[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r520305996



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +179,142 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+abstract class LikeAllBase extends Expression with ImplicitCastInputTypes {

Review comment:
       Sounds reasonable. If there are more than 14378 elements, most likely they are literals.
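
   To illustrate why the literal-only case matters, here is a minimal sketch, not the code in this PR. It assumes every pattern is a string literal, so each regex can be compiled once and checked in a flat loop rather than through a deeply nested expression tree. `LikeAllSketch`, `likeToRegex`, and `Matcher` are hypothetical names used only for illustration; escape characters and null handling are deliberately left out.

   ```scala
   import java.util.regex.Pattern

   object LikeAllSketch {
     // Hypothetical helper, not Spark's API: translate a SQL LIKE pattern
     // into a Java regex. Only the `%` and `_` wildcards are handled.
     def likeToRegex(like: String): String =
       like.map {
         case '%' => ".*"
         case '_' => "."
         case c   => Pattern.quote(c.toString)
       }.mkString

     // Compiles every literal pattern once, up front.
     final class Matcher(patterns: Seq[String], isNotLikeAll: Boolean) {
       private val compiled: Seq[Pattern] =
         patterns.map(p => Pattern.compile(likeToRegex(p), Pattern.DOTALL))

       // LIKE ALL: every pattern must match.
       // NOT LIKE ALL: no pattern may match.
       def matches(value: String): Boolean =
         compiled.forall(p => p.matcher(value).matches() != isNotLikeAll)
     }
   }
   ```

   Because the check is a single pass over a sequence, the call depth in this sketch stays constant no matter how many patterns are supplied.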






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728758630


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/131191/
   Test FAILed.




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706855768








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708930785


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/34409/
   Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708858723


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34385/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-707065749








[GitHub] [spark] beliefer commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725272877


   retest this please




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725271482


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-730202933


   Merged build finished. Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-730190816


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35930/
   




[GitHub] [spark] beliefer commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706840400


   > I have the same impression as @maropu's first comment. Could you answer his question, please, @beliefer?
   
   Thanks for the reminder.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708939230


   **[Test build #129810 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129810/testReport)** for PR 29999 at commit [`b770f92`](https://github.com/apache/spark/commit/b770f929594dd551a544fb6b0e5f9d4f2ddff7d4).
    * This patch **fails Scala style tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-715174110


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34798/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-712150902








[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-709864778








[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r505202752



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -176,6 +177,125 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+abstract class LikeAllBase extends Expression with ImplicitCastInputTypes with NullIntolerant {
+  def value: Expression = children.head
+  def list: Seq[Expression] = children.tail
+  def isNot: Boolean

Review comment:
       OK






[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-709988169


   **[Test build #129894 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129894/testReport)** for PR 29999 at commit [`f657ff0`](https://github.com/apache/spark/commit/f657ff0372f1cac48ea008a08c1cc7011f934d98).




[GitHub] [spark] beliefer commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725081708


   cc @cloud-fan @wangyum 




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-724685139








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725157049


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35506/
   




[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-724637618


   **[Test build #130864 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130864/testReport)** for PR 29999 at commit [`53406d3`](https://github.com/apache/spark/commit/53406d349a46dad7edf61e5eb2e27b11e92e508a).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-724685139








[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r504681811



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -176,6 +177,195 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+abstract class LikeAllBase extends Expression with ImplicitCastInputTypes with NullIntolerant {
+  def value: Expression = children.head
+  def list: Seq[Expression] = children.tail
+  def isNot: Boolean
+
+  override def inputTypes: Seq[AbstractDataType] = {
+    val arrayOrStr = TypeCollection(ArrayType(StringType), StringType)
+    StringType +: Seq.fill(children.size - 1)(arrayOrStr)
+  }
+
+  override def dataType: DataType = BooleanType
+
+  override def foldable: Boolean = value.foldable && list.forall(_.foldable)
+
+  override def nullable: Boolean = true
+
+  def escape(v: String): String = StringUtils.escapeLikeRegex(v, '\\')
+
+  def matches(regex: Pattern, str: String): Boolean = regex.matcher(str).matches()
+
+  override def eval(input: InternalRow): Any = {
+    val evaluatedValue = value.eval(input)
+    if (evaluatedValue == null) {
+      null
+    } else {
+      list.foreach { e =>
+        val str = e.eval(input)
+        if (str == null) {
+          return null
+        }
+        val regex = Pattern.compile(escape(str.asInstanceOf[UTF8String].toString))
+        if(regex == null) {
+          return null
+        } else if (isNot && matches(regex, evaluatedValue.asInstanceOf[UTF8String].toString)) {
+          return false
+        } else if (!isNot && !matches(regex, evaluatedValue.asInstanceOf[UTF8String].toString)) {
+          return false
+        }
+      }
+      return true
+    }
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val patternClass = classOf[Pattern].getName
+    val escapeFunc = StringUtils.getClass.getName.stripSuffix("$") + ".escapeLikeRegex"
+    val javaDataType = CodeGenerator.javaType(value.dataType)
+    val valueGen = value.genCode(ctx)
+    val listGen = list.map(_.genCode(ctx))
+    val pattern = ctx.freshName("pattern")
+    val rightStr = ctx.freshName("rightStr")
+    val escapedEscapeChar = StringEscapeUtils.escapeJava("\\")
+    val hasNull = ctx.freshName("hasNull")
+    val matched = ctx.freshName("matched")
+    val valueArg = ctx.freshName("valueArg")
+    val listCode = listGen.map(x =>
+      s"""
+         |${x.code}
+         |if (${x.isNull}) {
+         |  $hasNull = true; // ${ev.isNull} = true;
+         |} else if (!$hasNull && $matched) {
+         |  String $rightStr = ${x.value}.toString();
+         |  $patternClass $pattern =
+         |    $patternClass.compile($escapeFunc($rightStr, '$escapedEscapeChar'));
+         |  if ($isNot && $pattern.matcher($valueArg.toString()).matches()) {
+         |    $matched = false;
+         |  } else if (!$isNot && !$pattern.matcher($valueArg.toString()).matches()) {
+         |    $matched = false;
+         |  }
+         |}
+       """.stripMargin)
+
+    val resultType = CodeGenerator.javaType(dataType)
+    val codes = ctx.splitExpressionsWithCurrentInputs(
+      expressions = listCode,
+      funcName = "likeAll",
+      extraArguments = (javaDataType, valueArg) :: (CodeGenerator.JAVA_BOOLEAN, hasNull) ::
+        (resultType, matched) :: Nil,
+      returnType = resultType,
+      makeSplitFunction = body =>
+        s"""
+           |if (!$hasNull && $matched) {
+           |  $body;
+           |}
+         """.stripMargin,
+      foldFunctions = _.map { funcCall =>
+        s"""
+           |if (!$hasNull && $matched) {
+           |  $funcCall;
+           |}
+         """.stripMargin
+      }.mkString("\n"))
+    ev.copy(code =
+      code"""
+            |${valueGen.code}
+            |boolean $hasNull = false;
+            |boolean $matched = true;
+            |if (${valueGen.isNull}) {
+            |  $hasNull = true;
+            |} else {
+            |  $javaDataType $valueArg = ${valueGen.value};
+            |  $codes
+            |}
+            |final boolean ${ev.isNull} = ($hasNull == true);

Review comment:
       `hasNull` is already a boolean






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714352819


   Merged build finished. Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708938150


   **[Test build #129810 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129810/testReport)** for PR 29999 at commit [`b770f92`](https://github.com/apache/spark/commit/b770f929594dd551a544fb6b0e5f9d4f2ddff7d4).




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725411085


   **[Test build #130924 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130924/testReport)** for PR 29999 at commit [`d039c33`](https://github.com/apache/spark/commit/d039c33de33ea4bab4cea3170925c0c4f92ca771).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-724800934


   **[Test build #130860 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130860/testReport)** for PR 29999 at commit [`1fc5214`](https://github.com/apache/spark/commit/1fc5214964a3a522f3cc0a1daf91ced342bb1b51).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant `
     * `case class LikeAll(child: Expression, patterns: Seq[Any]) extends LikeAllBase `
     * `case class NotLikeAll(child: Expression, patterns: Seq[Any]) extends LikeAllBase `




[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r504677077



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -176,6 +177,195 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+abstract class LikeAllBase extends Expression with ImplicitCastInputTypes with NullIntolerant {
+  def value: Expression = children.head
+  def list: Seq[Expression] = children.tail
+  def isNot: Boolean
+
+  override def inputTypes: Seq[AbstractDataType] = {
+    val arrayOrStr = TypeCollection(ArrayType(StringType), StringType)
+    StringType +: Seq.fill(children.size - 1)(arrayOrStr)
+  }
+
+  override def dataType: DataType = BooleanType
+
+  override def foldable: Boolean = value.foldable && list.forall(_.foldable)
+
+  override def nullable: Boolean = true
+
+  def escape(v: String): String = StringUtils.escapeLikeRegex(v, '\\')
+
+  def matches(regex: Pattern, str: String): Boolean = regex.matcher(str).matches()
+
+  override def eval(input: InternalRow): Any = {
+    val evaluatedValue = value.eval(input)
+    if (evaluatedValue == null) {
+      null
+    } else {
+      list.foreach { e =>
+        val str = e.eval(input)
+        if (str == null) {
+          return null
+        }
+        val regex = Pattern.compile(escape(str.asInstanceOf[UTF8String].toString))
+        if(regex == null) {

Review comment:
       `SELECT company FROM like_all_table WHERE company LIKE ALL ('%oo%', null);`
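The query above has a `null` in the pattern list, so the result follows SQL three-valued logic. A minimal sketch of that behaviour, assuming the semantics of the later revision quoted in this thread (a failed match wins over a null pattern; otherwise a null pattern makes the result null). `likeToRegex` is a hypothetical, simplified stand-in for Spark's `StringUtils.escapeLikeRegex` and ignores escape characters; `None` models SQL NULL.

```scala
import java.util.regex.Pattern

object LikeAllNullSketch {
  // Hypothetical helper: translates only '%' and '_' and quotes everything else.
  // It does NOT handle escape characters the way Spark's escapeLikeRegex does.
  def likeToRegex(like: String): String =
    like.map {
      case '%' => ".*"
      case '_' => "."
      case c   => Pattern.quote(c.toString)
    }.mkString("(?s)", "", "")

  // None models SQL NULL; Some(b) models a definite boolean result.
  def likeAll(value: Option[String], patterns: Seq[Option[String]]): Option[Boolean] =
    value.flatMap { v =>
      // Only the non-null patterns are actually matched.
      val allMatched = patterns.flatten.forall { p =>
        Pattern.compile(likeToRegex(p)).matcher(v).matches()
      }
      // A null pattern only matters when every other pattern matched.
      if (allMatched && patterns.contains(None)) None else Some(allMatched)
    }
}
```

Under these assumptions, a row whose `company` does not contain `oo` evaluates to false, while a row that does contain `oo` evaluates to NULL and is therefore filtered out by the `WHERE` clause as well.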






[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r504671005



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -176,6 +177,195 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+abstract class LikeAllBase extends Expression with ImplicitCastInputTypes with NullIntolerant {
+  def value: Expression = children.head
+  def list: Seq[Expression] = children.tail
+  def isNot: Boolean
+
+  override def inputTypes: Seq[AbstractDataType] = {
+    val arrayOrStr = TypeCollection(ArrayType(StringType), StringType)
+    StringType +: Seq.fill(children.size - 1)(arrayOrStr)
+  }
+
+  override def dataType: DataType = BooleanType
+
+  override def foldable: Boolean = value.foldable && list.forall(_.foldable)
+
+  override def nullable: Boolean = true
+
+  def escape(v: String): String = StringUtils.escapeLikeRegex(v, '\\')
+
+  def matches(regex: Pattern, str: String): Boolean = regex.matcher(str).matches()
+
+  override def eval(input: InternalRow): Any = {
+    val evaluatedValue = value.eval(input)
+    if (evaluatedValue == null) {
+      null
+    } else {
+      list.foreach { e =>
+        val str = e.eval(input)
+        if (str == null) {
+          return null
+        }
+        val regex = Pattern.compile(escape(str.asInstanceOf[UTF8String].toString))
+        if(regex == null) {
+          return null
+        } else if (isNot && matches(regex, evaluatedValue.asInstanceOf[UTF8String].toString)) {
+          return false
+        } else if (!isNot && !matches(regex, evaluatedValue.asInstanceOf[UTF8String].toString)) {
+          return false
+        }
+      }
+      return true
+    }
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val patternClass = classOf[Pattern].getName
+    val escapeFunc = StringUtils.getClass.getName.stripSuffix("$") + ".escapeLikeRegex"
+    val javaDataType = CodeGenerator.javaType(value.dataType)
+    val valueGen = value.genCode(ctx)
+    val listGen = list.map(_.genCode(ctx))
+    val pattern = ctx.freshName("pattern")
+    val rightStr = ctx.freshName("rightStr")
+    val escapedEscapeChar = StringEscapeUtils.escapeJava("\\")
+    val hasNull = ctx.freshName("hasNull")
+    val matched = ctx.freshName("matched")
+    val valueArg = ctx.freshName("valueArg")
+    val listCode = listGen.map(x =>
+      s"""
+         |${x.code}
+         |if (${x.isNull}) {
+         |  $hasNull = true; // ${ev.isNull} = true;
+         |} else if (!$hasNull && $matched) {
+         |  String $rightStr = ${x.value}.toString();
+         |  $patternClass $pattern =
+         |    $patternClass.compile($escapeFunc($rightStr, '$escapedEscapeChar'));
+         |  if ($isNot && $pattern.matcher($valueArg.toString()).matches()) {
+         |    $matched = false;
+         |  } else if (!$isNot && !$pattern.matcher($valueArg.toString()).matches()) {
+         |    $matched = false;
+         |  }
+         |}
+       """.stripMargin)
+
+    val resultType = CodeGenerator.javaType(dataType)
+    val codes = ctx.splitExpressionsWithCurrentInputs(
+      expressions = listCode,
+      funcName = "likeAll",
+      extraArguments = (javaDataType, valueArg) :: (CodeGenerator.JAVA_BOOLEAN, hasNull) ::
+        (resultType, matched) :: Nil,
+      returnType = resultType,
+      makeSplitFunction = body =>
+        s"""
+           |if (!$hasNull && $matched) {
+           |  $body;
+           |}
+         """.stripMargin,
+      foldFunctions = _.map { funcCall =>
+        s"""
+           |if (!$hasNull && $matched) {
+           |  $funcCall;
+           |}
+         """.stripMargin
+      }.mkString("\n"))
+    ev.copy(code =
+      code"""
+            |${valueGen.code}
+            |boolean $hasNull = false;
+            |boolean $matched = true;
+            |if (${valueGen.isNull}) {
+            |  $hasNull = true;
+            |} else {
+            |  $javaDataType $valueArg = ${valueGen.value};
+            |  $codes
+            |}
+            |final boolean ${ev.isNull} = ($hasNull == true);
+            |final boolean ${ev.value} = ($matched == true);
+      """.stripMargin)
+  }
+}
+
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(str, pattern1, pattern2, ...) - Returns true if `str` matches all the pattern string, " +

Review comment:
       The doc is not needed, since we don't register the function.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728677472


   Merged build finished. Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728744819


   **[Test build #131189 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131189/testReport)** for PR 29999 at commit [`97c1c73`](https://github.com/apache/spark/commit/97c1c7389e537f0d38f1b6a17bbe9ba70c9bc6ea).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r521164688



##########
File path: sql/core/src/test/resources/sql-functions/sql-expression-schema.md
##########
@@ -346,4 +346,4 @@
 | org.apache.spark.sql.catalyst.expressions.xml.XPathList | xpath | SELECT xpath('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>','a/b/text()') | struct<xpath(<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>, a/b/text()):array<string>> |
 | org.apache.spark.sql.catalyst.expressions.xml.XPathLong | xpath_long | SELECT xpath_long('<a><b>1</b><b>2</b></a>', 'sum(a/b)') | struct<xpath_long(<a><b>1</b><b>2</b></a>, sum(a/b)):bigint> |
 | org.apache.spark.sql.catalyst.expressions.xml.XPathShort | xpath_short | SELECT xpath_short('<a><b>1</b><b>2</b></a>', 'sum(a/b)') | struct<xpath_short(<a><b>1</b><b>2</b></a>, sum(a/b)):smallint> |
-| org.apache.spark.sql.catalyst.expressions.xml.XPathString | xpath_string | SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c') | struct<xpath_string(<a><b>b</b><c>cc</c></a>, a/c):string> |
\ No newline at end of file
+| org.apache.spark.sql.catalyst.expressions.xml.XPathString | xpath_string | SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c') | struct<xpath_string(<a><b>b</b><c>cc</c></a>, a/c):string> |

Review comment:
       I tried to revert it.






[GitHub] [spark] wangyum commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
wangyum commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r521109327



##########
File path: sql/core/src/test/resources/sql-functions/sql-expression-schema.md
##########
@@ -346,4 +346,4 @@
 | org.apache.spark.sql.catalyst.expressions.xml.XPathList | xpath | SELECT xpath('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>','a/b/text()') | struct<xpath(<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>, a/b/text()):array<string>> |
 | org.apache.spark.sql.catalyst.expressions.xml.XPathLong | xpath_long | SELECT xpath_long('<a><b>1</b><b>2</b></a>', 'sum(a/b)') | struct<xpath_long(<a><b>1</b><b>2</b></a>, sum(a/b)):bigint> |
 | org.apache.spark.sql.catalyst.expressions.xml.XPathShort | xpath_short | SELECT xpath_short('<a><b>1</b><b>2</b></a>', 'sum(a/b)') | struct<xpath_short(<a><b>1</b><b>2</b></a>, sum(a/b)):smallint> |
-| org.apache.spark.sql.catalyst.expressions.xml.XPathString | xpath_string | SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c') | struct<xpath_string(<a><b>b</b><c>cc</c></a>, a/c):string> |
\ No newline at end of file
+| org.apache.spark.sql.catalyst.expressions.xml.XPathString | xpath_string | SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c') | struct<xpath_string(<a><b>b</b><c>cc</c></a>, a/c):string> |

Review comment:
       Revert this change?






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706869315








[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r504673341



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -176,6 +177,195 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+abstract class LikeAllBase extends Expression with ImplicitCastInputTypes with NullIntolerant {
+  def value: Expression = children.head
+  def list: Seq[Expression] = children.tail
+  def isNot: Boolean
+
+  override def inputTypes: Seq[AbstractDataType] = {
+    val arrayOrStr = TypeCollection(ArrayType(StringType), StringType)
+    StringType +: Seq.fill(children.size - 1)(arrayOrStr)
+  }
+
+  override def dataType: DataType = BooleanType
+
+  override def foldable: Boolean = value.foldable && list.forall(_.foldable)
+
+  override def nullable: Boolean = true
+
+  def escape(v: String): String = StringUtils.escapeLikeRegex(v, '\\')
+
+  def matches(regex: Pattern, str: String): Boolean = regex.matcher(str).matches()
+
+  override def eval(input: InternalRow): Any = {
+    val evaluatedValue = value.eval(input)
+    if (evaluatedValue == null) {
+      null
+    } else {
+      list.foreach { e =>
+        val str = e.eval(input)
+        if (str == null) {
+          return null
+        }
+        val regex = Pattern.compile(escape(str.asInstanceOf[UTF8String].toString))
+        if(regex == null) {

Review comment:
       Can this happen?
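For context on the question: `java.util.regex.Pattern.compile` either returns a non-null `Pattern` or throws `PatternSyntaxException`, so a `regex == null` check after it should be unreachable. A small sketch illustrating this; `compileOrError` is a hypothetical helper written for this note, not part of the patch.

```scala
import java.util.regex.{Pattern, PatternSyntaxException}
import scala.util.{Failure, Success, Try}

object PatternCompileSketch {
  // Hypothetical helper: Right(pattern) on success, Left(description) when the
  // regex is syntactically invalid. There is no "null pattern" outcome.
  def compileOrError(regex: String): Either[String, Pattern] =
    Try(Pattern.compile(regex)) match {
      case Success(p)                         => Right(p)
      case Failure(e: PatternSyntaxException) => Left(e.getDescription)
      case Failure(other)                     => throw other
    }
}
```

A valid regex such as `a.*b` yields a `Right`, and an invalid one such as `(unclosed` yields a `Left`; neither path produces `null`.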






[GitHub] [spark] maropu commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r502782869



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##########
@@ -1408,7 +1408,13 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
           case Some(SqlBaseParser.ANY) | Some(SqlBaseParser.SOME) =>
             getLikeQuantifierExprs(ctx.expression).reduceLeft(Or)
           case Some(SqlBaseParser.ALL) =>
-            getLikeQuantifierExprs(ctx.expression).reduceLeft(And)
+            if (ctx.expression.isEmpty) {
+              throw new ParseException("Expected something between '(' and ')'.", ctx)
+            }
+            ctx.NOT match {
+              case null => LikeAll(e, ctx.expression.asScala.map(expression))

Review comment:
       Does this change disable the datasource pushdown for LIKE (e.g., StartsWith, EndsWith)? If so, I think we could see a performance regression when reading from datasources.
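To make the concern concrete: a plain `LIKE` with a simple literal pattern can be recognised as a prefix, suffix, or substring test, which is the kind of predicate a data source can evaluate itself. The sketch below only classifies pattern shapes; the `Shape` types and `classify` are hypothetical names for illustration and are not Spark's optimizer or data source filter classes. It treats any pattern containing `_` or `\` as general.

```scala
object LikeShapeSketch {
  // Hypothetical result types, defined here only for illustration.
  sealed trait Shape
  final case class StartsWith(prefix: String) extends Shape
  final case class EndsWith(suffix: String) extends Shape
  final case class Contains(infix: String) extends Shape
  final case class EqualTo(literal: String) extends Shape
  case object General extends Shape

  // True when the text has no LIKE wildcards and no escape character.
  private def plain(s: String): Boolean =
    !s.exists(c => c == '%' || c == '_' || c == '\\')

  def classify(p: String): Shape =
    if (plain(p)) EqualTo(p)
    else if (p.length >= 2 && p.endsWith("%") && plain(p.dropRight(1)))
      StartsWith(p.dropRight(1))
    else if (p.length >= 2 && p.startsWith("%") && plain(p.drop(1)))
      EndsWith(p.drop(1))
    else if (p.length >= 3 && p.startsWith("%") && p.endsWith("%") &&
        plain(p.substring(1, p.length - 1)))
      Contains(p.substring(1, p.length - 1))
    else General
}
```

If `LIKE ALL` is parsed directly into a single combined expression, each pattern no longer appears as its own `Like` node, so a shape-based rewrite like this would have nothing to match unless it is taught about the new expression.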






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-710125763








[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r526617230



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +180,89 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant {
+
+  protected def patterns: Seq[UTF8String]
+
+  protected def isNotLikeAll: Boolean
+
+  override def inputTypes: Seq[DataType] = StringType :: Nil
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = true
+
+  private lazy val hasNull: Boolean = patterns.contains(null)
+
+  private lazy val cache = patterns.filterNot(_ == null)
+    .map(s => Pattern.compile(StringUtils.escapeLikeRegex(s.toString, '\\')))
+
+  override def eval(input: InternalRow): Any = {
+    val exprValue = child.eval(input)
+    if (exprValue == null) {
+      null
+    } else {
+      val allMatched = if (isNotLikeAll) {
+        !cache.exists(p => p.matcher(exprValue.toString).matches())
+      } else {
+        cache.forall(p => p.matcher(exprValue.toString).matches())
+      }
+      if (allMatched && hasNull) {
+        null
+      } else {
+        allMatched
+      }
+    }
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val eval = child.genCode(ctx)
+    val patternClass = classOf[Pattern].getName
+    val javaDataType = CodeGenerator.javaType(child.dataType)
+    val pattern = ctx.freshName("pattern")
+    val allMatched = ctx.freshName("allMatched")
+    val valueIsNull = ctx.freshName("valueIsNull")
+    val valueArg = ctx.freshName("valueArg")
+    val patternCache = ctx.addReferenceObj("patternCache", cache.asJava)
+
+    val matchCode = if (isNotLikeAll) {

Review comment:
       OK






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714280358








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-709890593


   **[Test build #129884 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129884/testReport)** for PR 29999 at commit [`f657ff0`](https://github.com/apache/spark/commit/f657ff0372f1cac48ea008a08c1cc7011f934d98).




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708884791








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728669429


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35791/
   




[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r524214644



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +180,90 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant {
+
+  protected def patterns: Seq[UTF8String]
+
+  protected def isNotDefined: Boolean
+
+  override def inputTypes: Seq[DataType] = StringType :: Nil
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = true
+
+  private lazy val hasNull: Boolean = patterns.contains(null)
+
+  private lazy val cache = patterns.filterNot(_ == null)
+    .map(s => Pattern.compile(StringUtils.escapeLikeRegex(s.toString, '\\')))
+
+  override def eval(input: InternalRow): Any = {
+    val exprValue = child.eval(input)
+    if (exprValue == null) {
+      null
+    } else {
+      val allMatched = if (isNotDefined) {
+        !cache.exists(p => p.matcher(exprValue.toString).matches())
+      } else {
+        cache.forall(p => p.matcher(exprValue.toString).matches())
+      }
+      if (allMatched && hasNull) {
+        null
+      } else {
+        allMatched
+      }
+    }
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val eval = child.genCode(ctx)
+    val patternClass = classOf[Pattern].getName
+    val javaDataType = CodeGenerator.javaType(child.dataType)
+    val pattern = ctx.freshName("pattern")
+    val allMatched = ctx.freshName("allMatched")
+    val valueIsNull = ctx.freshName("valueIsNull")
+    val valueArg = ctx.freshName("valueArg")
+    val patternHasNull = ctx.addReferenceObj("hasNull", hasNull)

Review comment:
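       `hasNull` is a boolean constant that is already known at code-generation time. We can change the generated code based on it instead of passing it in as a reference object.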
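
A minimal sketch of that idea, which assumes nothing about Spark's codegen API: because the flag is known when the code is generated, the generator can branch in Scala and emit a different Java snippet for each case. The object `HasNullCodegenSketch`, the method `resultCode`, and the variable names passed to it are hypothetical and only illustrate the shape of the change.

```scala
// Hypothetical, standalone illustration; this is not Spark's CodegenContext API.
object HasNullCodegenSketch {
  // Returns the Java statements that set the result's null flag and value
  // after the generated loop has computed `allMatched`.
  def resultCode(hasNull: Boolean, allMatched: String, isNull: String, value: String): String = {
    if (hasNull) {
      // A null pattern exists: a full match becomes NULL, a mismatch stays false.
      s"""$isNull = $allMatched;
         |$value = false;""".stripMargin
    } else {
      // No null pattern: the pattern list never makes the result NULL.
      s"""$isNull = false;
         |$value = $allMatched;""".stripMargin
    }
  }
}
```

Under these assumptions the decision is made once while generating code, so the emitted Java contains no runtime check of the flag.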






[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708926088








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708916497


   **[Test build #129802 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129802/testReport)** for PR 29999 at commit [`c32f89b`](https://github.com/apache/spark/commit/c32f89b8ba34b7b689ea5d2712f55824c99ba6f0).
    * This patch **fails Scala style tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708912448


   **[Test build #129800 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129800/testReport)** for PR 29999 at commit [`60f01f4`](https://github.com/apache/spark/commit/60f01f4edfbd112ea085e118c6a50f024c8c4dff).




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728669440








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706527357


   **[Test build #129623 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129623/testReport)** for PR 29999 at commit [`4163382`](https://github.com/apache/spark/commit/41633827583d6f0d91e0e48b781c25c95ec06765).




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-707064823


   **[Test build #129672 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129672/testReport)** for PR 29999 at commit [`3e41cff`](https://github.com/apache/spark/commit/3e41cffb800e8e3f5a485021706f38a4fc73e07c).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `case class LikeAll(children: Seq[Expression]) extends LikeAllBase `
     * `case class NotLikeAll(children: Seq[Expression]) extends LikeAllBase `




[GitHub] [spark] beliefer commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706919752


   retest this please




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708787105


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/129778/
   Test FAILed.




[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r504660905



##########
File path: sql/core/src/test/resources/sql-tests/inputs/regexp-functions.sql
##########
@@ -31,3 +31,13 @@ SELECT regexp_extract_all('1a 2b 14m', '(\\d+)([a-z]+)', 3);
 SELECT regexp_extract_all('1a 2b 14m', '(\\d+)([a-z]+)', -1);
 SELECT regexp_extract_all('1a 2b 14m', '(\\d+)?([a-z]+)', 1);
 SELECT regexp_extract_all('a 2b 14m', '(\\d+)?([a-z]+)', 1);
+
+-- like_all
+SELECT like_all('foo', '%foo%', '%oo');

Review comment:
       I have deleted this change.






[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r505202129



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -176,6 +177,125 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+abstract class LikeAllBase extends Expression with ImplicitCastInputTypes with NullIntolerant {
+  def value: Expression = children.head
+  def list: Seq[Expression] = children.tail
+  def isNot: Boolean
+
+  override def inputTypes: Seq[AbstractDataType] = {
+    StringType +: Seq.fill(children.size - 1)(StringType)
+  }
+
+  override def dataType: DataType = BooleanType
+
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  override def nullable: Boolean = true
+
+  def matches(regex: Pattern, str: String): Boolean = regex.matcher(str).matches()

Review comment:
       OK






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728669440








[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-715204141








[GitHub] [spark] wangyum commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
wangyum commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r519773515



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +179,142 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+abstract class LikeAllBase extends Expression with ImplicitCastInputTypes {

Review comment:
       Could we make it only support `Literal`, for example:
   ```scala
   case class LikeAll(child: Expression, isNotDefined: Boolean, seq: mutable.Buffer[Any])
     extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant {
   
     override def dataType: DataType = BooleanType
   
     override def inputTypes: Seq[DataType] = StringType :: Nil
   
     @transient private[this] lazy val hasNull: Boolean = seq.contains(null)
   
     @transient private lazy val cachedPattern = seq.filterNot(_ == null)
       .map(s => Pattern.compile(StringUtils.escapeLikeRegex(s.toString, '\\')))
   
     override protected def nullSafeEval(input1: Any): Any = {
       if (hasNull) {
         false
       } else {
         val str = input1.asInstanceOf[UTF8String].toString
         if (isNotDefined) {
           !cachedPattern.exists(p => p.matcher(str).matches())
         } else {
           cachedPattern.forall(p => p.matcher(str).matches())
         }
       }
     }
   
    // TODO: codegen
   }
   ```
   
   ```scala
   val exps = ctx.expression.asScala.map(expression)
   validate(exps.nonEmpty, "Expected something between '(' and ')'.", ctx)
   if (exps.size > 10 && exps.forall(_.foldable)) {
     LikeAll(e, isNotDefined, exps.map(_.eval(EmptyRow)))
   } else {
     exps.map(p => invertIfNotDefined(Like(e, p))).reduceLeft(And)
   }
   ```
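
To illustrate why the shape of the expression matters, here is a rough sketch that does not use Spark's classes. Combining the patterns with `reduceLeft(And)` builds a left-nested tree whose depth grows with the number of patterns, and recursive traversal of such a deep tree is presumably what exhausts the stack. A single expression that holds all the patterns can instead be evaluated with a flat loop. `Expr`, `Leaf`, `And`, `nested`, `depth`, and `flatLikeAll` are hypothetical names, and the patterns here are plain Java regexes rather than LIKE patterns.

```scala
import java.util.regex.Pattern

// Hypothetical stand-ins for Catalyst expressions; not Spark's API.
object LikeAllShapeSketch {
  sealed trait Expr
  final case class Leaf(pattern: String) extends Expr
  final case class And(left: Expr, right: Expr) extends Expr

  // Shape produced by reduceLeft: And(And(And(p1, p2), p3), p4) ...
  def nested(patterns: Seq[String]): Expr = {
    val leaves: Seq[Expr] = patterns.map(p => Leaf(p))
    leaves.reduceLeft[Expr]((l, r) => And(l, r))
  }

  // Depth of the tree, computed iteratively so the sketch itself cannot overflow.
  def depth(e: Expr): Int = {
    var maxDepth = 0
    var stack: List[(Expr, Int)] = List((e, 1))
    while (stack.nonEmpty) {
      val (node, d) = stack.head
      stack = stack.tail
      maxDepth = math.max(maxDepth, d)
      node match {
        case And(l, r) => stack = (l, d + 1) :: (r, d + 1) :: stack
        case Leaf(_) =>
      }
    }
    maxDepth
  }

  // Flat alternative: Some(true) or Some(false), or None to stand for SQL NULL.
  def flatLikeAll(value: String, patterns: Seq[String]): Option[Boolean] = {
    val hasNull = patterns.contains(null)
    val compiled = patterns.filter(_ != null).map(p => Pattern.compile(p))
    val allMatched = compiled.forall(_.matcher(value).matches())
    if (allMatched && hasNull) None else Some(allMatched)
  }
}
```

In this sketch, `depth(nested(patterns))` equals the number of patterns once there are at least two, while `flatLikeAll` makes one pass over the list regardless of its size.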






[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-724685114


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35472/
   




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-730400395








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-730245197


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35936/
   




[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-726638718


   **[Test build #131050 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131050/testReport)** for PR 29999 at commit [`7af8ffe`](https://github.com/apache/spark/commit/7af8ffe49fc02765a80a85faccaa7209fe8b9c57).




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706919391








[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-730257842








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725372003


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/35529/
   Test FAILed.




[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r521133744



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala
##########
@@ -102,6 +102,8 @@ package object dsl {
     def like(other: Expression, escapeChar: Char = '\\'): Expression =
       Like(expr, other, escapeChar)
     def rlike(other: Expression): Expression = RLike(expr, other)
+    def likeAll(others: Literal*): Expression = LikeAll(expr, others.map(_.eval(EmptyRow)))

Review comment:
       OK






[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714437108


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34758/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708944484








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-715204141








[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r524848920



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +180,90 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant {
+
+  protected def patterns: Seq[UTF8String]
+
+  protected def isNotDefined: Boolean
+
+  override def inputTypes: Seq[DataType] = StringType :: Nil
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = true
+
+  private lazy val hasNull: Boolean = patterns.contains(null)
+
+  private lazy val cache = patterns.filterNot(_ == null)
+    .map(s => Pattern.compile(StringUtils.escapeLikeRegex(s.toString, '\\')))
+
+  override def eval(input: InternalRow): Any = {
+    val exprValue = child.eval(input)
+    if (exprValue == null) {
+      null
+    } else {
+      val allMatched = if (isNotDefined) {
+        !cache.exists(p => p.matcher(exprValue.toString).matches())
+      } else {
+        cache.forall(p => p.matcher(exprValue.toString).matches())
+      }
+      if (allMatched && hasNull) {
+        null
+      } else {
+        allMatched
+      }
+    }
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val eval = child.genCode(ctx)
+    val patternClass = classOf[Pattern].getName
+    val javaDataType = CodeGenerator.javaType(child.dataType)
+    val pattern = ctx.freshName("pattern")
+    val allMatched = ctx.freshName("allMatched")
+    val valueIsNull = ctx.freshName("valueIsNull")
+    val valueArg = ctx.freshName("valueArg")
+    val patternHasNull = ctx.addReferenceObj("hasNull", hasNull)

Review comment:
       OK
   






[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r524217980



##########
File path: sql/core/src/test/resources/sql-functions/sql-expression-schema.md
##########
@@ -346,4 +346,4 @@
 | org.apache.spark.sql.catalyst.expressions.xml.XPathList | xpath | SELECT xpath('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>','a/b/text()') | struct<xpath(<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>, a/b/text()):array<string>> |
 | org.apache.spark.sql.catalyst.expressions.xml.XPathLong | xpath_long | SELECT xpath_long('<a><b>1</b><b>2</b></a>', 'sum(a/b)') | struct<xpath_long(<a><b>1</b><b>2</b></a>, sum(a/b)):bigint> |
 | org.apache.spark.sql.catalyst.expressions.xml.XPathShort | xpath_short | SELECT xpath_short('<a><b>1</b><b>2</b></a>', 'sum(a/b)') | struct<xpath_short(<a><b>1</b><b>2</b></a>, sum(a/b)):smallint> |
-| org.apache.spark.sql.catalyst.expressions.xml.XPathString | xpath_string | SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c') | struct<xpath_string(<a><b>b</b><c>cc</c></a>, a/c):string> |
\ No newline at end of file
+| org.apache.spark.sql.catalyst.expressions.xml.XPathString | xpath_string | SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c') | struct<xpath_string(<a><b>b</b><c>cc</c></a>, a/c):string> |

Review comment:
       What gets changed here?






[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728680562


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35794/
   




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728677472








[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun edited a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706836333


   I have the same impression as @maropu's first comment. Could you answer his question, please, @beliefer?




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-730398742


   **[Test build #131331 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131331/testReport)** for PR 29999 at commit [`001eb38`](https://github.com/apache/spark/commit/001eb38f603267c6a6f4e1c25430b8900644f5b7).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-724656881


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35467/
   




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728661500


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35791/
   




[GitHub] [spark] maropu commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706538267


   In the PR description, could you describe why the stack overflow can happen in the current approach and why the fix in this PR can avoid the error?
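
   As background for this question, here is a minimal sketch of one plausible mechanism. It is an assumption, not something confirmed in this thread: if `LIKE ALL (p1, ..., pn)` is rewritten into a left-nested chain of `AND` predicates, the expression tree is n levels deep, and any recursive traversal of it needs roughly n stack frames. A single expression that holds all patterns in a sequence stays at depth 1. The classes `Like`, `And` and `LikeAllFlat` below are hypothetical miniatures, not Spark's Catalyst classes.

   ```scala
   object NestedVsFlat {
     // Hypothetical miniature expression tree (illustration only).
     sealed trait Expr
     final case class Like(pattern: String) extends Expr
     final case class And(left: Expr, right: Expr) extends Expr
     final case class LikeAllFlat(patterns: Seq[String]) extends Expr

     // Rewrites p1, ..., pn as ((p1 AND p2) AND p3) ... AND pn.
     // Requires a non-empty sequence, because reduceLeft throws on empty input.
     def nested(patterns: Seq[String]): Expr =
       patterns.map(p => Like(p): Expr).reduceLeft((l, r) => And(l, r))

     // Tree depth, computed with an explicit worklist so that the
     // measurement itself cannot overflow the call stack.
     def depth(root: Expr): Int = {
       var maxDepth = 0
       var work = List((root, 1))
       while (work.nonEmpty) {
         val (e, d) = work.head
         work = work.tail
         maxDepth = math.max(maxDepth, d)
         e match {
           case And(l, r) => work = (l, d + 1) :: (r, d + 1) :: work
           case Like(_) => ()
           case LikeAllFlat(_) => ()
         }
       }
       maxDepth
     }
   }
   ```

   With 10000 patterns, `NestedVsFlat.depth(NestedVsFlat.nested(...))` grows with the number of patterns, while the flat form keeps depth 1 regardless of how many patterns it carries.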




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-726785967








[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r526619223



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +180,89 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant {
+
+  protected def patterns: Seq[UTF8String]
+
+  protected def isNotLikeAll: Boolean
+
+  override def inputTypes: Seq[DataType] = StringType :: Nil
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = true
+
+  private lazy val hasNull: Boolean = patterns.contains(null)
+
+  private lazy val cache = patterns.filterNot(_ == null)
+    .map(s => Pattern.compile(StringUtils.escapeLikeRegex(s.toString, '\\')))
+
+  override def eval(input: InternalRow): Any = {
+    val exprValue = child.eval(input)
+    if (exprValue == null) {
+      null
+    } else {
+      val allMatched = if (isNotLikeAll) {
+        !cache.exists(p => p.matcher(exprValue.toString).matches())
+      } else {
+        cache.forall(p => p.matcher(exprValue.toString).matches())
+      }
+      if (allMatched && hasNull) {
+        null
+      } else {
+        allMatched
+      }
+    }
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val eval = child.genCode(ctx)
+    val patternClass = classOf[Pattern].getName
+    val javaDataType = CodeGenerator.javaType(child.dataType)
+    val pattern = ctx.freshName("pattern")
+    val allMatched = ctx.freshName("allMatched")
+    val valueIsNull = ctx.freshName("valueIsNull")
+    val valueArg = ctx.freshName("valueArg")
+    val patternCache = ctx.addReferenceObj("patternCache", cache.asJava)
+
+    val matchCode = if (isNotLikeAll) {
+      s"$pattern.matcher($valueArg.toString()).matches()"
+    } else {
+      s"!$pattern.matcher($valueArg.toString()).matches()"
+    }
+
+    ev.copy(code =
+      code"""
+            |${eval.code}
+            |boolean $allMatched = true;

Review comment:
       The code flow can be:
   ```
   boolean ${ev.isNull} = false;
   boolean ${ev.value} = true;
   if (${eval.isNull}) {
     ${ev.isNull} = true;
   } else {
     $javaDataType $valueArg = ${eval.value};
     for ... {
       if (notMatched) {
         ${ev.value} = false;
         break;
       }
     }
     if (${ev.value} && hasNull) ${ev.isNull} = true;
   }
   ```
   ```






[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728769754


   **[Test build #131211 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131211/testReport)** for PR 29999 at commit [`f0e3de1`](https://github.com/apache/spark/commit/f0e3de1718e99c887833f230c77c17c3851f9fc7).




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-711955225








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706545886


   **[Test build #129623 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129623/testReport)** for PR 29999 at commit [`4163382`](https://github.com/apache/spark/commit/41633827583d6f0d91e0e48b781c25c95ec06765).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `abstract class LikeAllBase extends Expression with ImplicitCastInputTypes with NullIntolerant `
     * `case class LikeAll(value: Expression, list: Seq[Expression]) extends LikeAllBase `
     * `case class NotLikeAll(value: Expression, list: Seq[Expression]) extends LikeAllBase `




[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r522716611



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +180,90 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant {
+
+  protected def patterns: Seq[Any]

Review comment:
       This should be `Seq[UTF8String]`.






[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714572132


   **[Test build #130151 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130151/testReport)** for PR 29999 at commit [`f160c64`](https://github.com/apache/spark/commit/f160c64b4c2bf8f07aaba09cffddb51fd727401c).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714488547








[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r503949480



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
##########
@@ -344,6 +344,8 @@ object FunctionRegistry {
     expression[Length]("length"),
     expression[Levenshtein]("levenshtein"),
     expression[Like]("like"),
+    expression[LikeAll]("like_all"),
+    expression[NotLikeAll]("not_like_all"),

Review comment:
       I'd prefer not, unless they are common in other databases.






[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-715097050


   **[Test build #130196 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130196/testReport)** for PR 29999 at commit [`7b7120f`](https://github.com/apache/spark/commit/7b7120faaa0dcfd5e152cab135d1790a550f5fa9).




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714425678


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34758/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706919955


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/129657/
   Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725094540


   **[Test build #130900 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130900/testReport)** for PR 29999 at commit [`15bac5b`](https://github.com/apache/spark/commit/15bac5bfecb209ba7b6963d83423b659fbc5086d).




[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r545253629



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -216,6 +216,18 @@ object SQLConf {
         "for using switch statements in InSet must be non-negative and less than or equal to 600")
       .createWithDefault(400)
 
+  val OPTIMIZER_LIKE_ALL_CONVERSION_THRESHOLD =
+    buildConf("spark.sql.optimizer.likeAllConversionThreshold")
+      .internal()
+      .doc("Configure the maximum size of the pattern sequence in like all. Spark will convert " +
+        "the logical combination of like to avoid StackOverflowError. 200 is an empirical value " +
+        "that will not cause StackOverflowError.")
+      .version("3.1.0")
+      .intConf
+      .checkValue(threshold => threshold >= 0, "The maximum size of pattern sequence " +
+        "in like all must be non-negative")
+      .createWithDefault(200)

Review comment:
       We have removed this config: https://github.com/beliefer/spark/commit/9273d4250ddd5e011487a5a942c1b4d0f0412f78#diff-13c5b65678b327277c68d17910ae93629801af00117a0e3da007afd95b6c6764L219
   
   We will always use the new expression for LIKE ALL.






[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706545942








[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725274237


   **[Test build #130924 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130924/testReport)** for PR 29999 at commit [`d039c33`](https://github.com/apache/spark/commit/d039c33de33ea4bab4cea3170925c0c4f92ca771).




[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r504682711



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -176,6 +177,195 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+abstract class LikeAllBase extends Expression with ImplicitCastInputTypes with NullIntolerant {
+  def value: Expression = children.head
+  def list: Seq[Expression] = children.tail
+  def isNot: Boolean
+
+  override def inputTypes: Seq[AbstractDataType] = {
+    val arrayOrStr = TypeCollection(ArrayType(StringType), StringType)
+    StringType +: Seq.fill(children.size - 1)(arrayOrStr)
+  }
+
+  override def dataType: DataType = BooleanType
+
+  override def foldable: Boolean = value.foldable && list.forall(_.foldable)
+
+  override def nullable: Boolean = true
+
+  def escape(v: String): String = StringUtils.escapeLikeRegex(v, '\\')
+
+  def matches(regex: Pattern, str: String): Boolean = regex.matcher(str).matches()
+
+  override def eval(input: InternalRow): Any = {
+    val evaluatedValue = value.eval(input)
+    if (evaluatedValue == null) {
+      null
+    } else {
+      list.foreach { e =>
+        val str = e.eval(input)
+        if (str == null) {
+          return null
+        }
+        val regex = Pattern.compile(escape(str.asInstanceOf[UTF8String].toString))
+        if(regex == null) {
+          return null
+        } else if (isNot && matches(regex, evaluatedValue.asInstanceOf[UTF8String].toString)) {
+          return false
+        } else if (!isNot && !matches(regex, evaluatedValue.asInstanceOf[UTF8String].toString)) {
+          return false
+        }
+      }
+      return true
+    }
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val patternClass = classOf[Pattern].getName
+    val escapeFunc = StringUtils.getClass.getName.stripSuffix("$") + ".escapeLikeRegex"
+    val javaDataType = CodeGenerator.javaType(value.dataType)
+    val valueGen = value.genCode(ctx)
+    val listGen = list.map(_.genCode(ctx))
+    val pattern = ctx.freshName("pattern")
+    val rightStr = ctx.freshName("rightStr")
+    val escapedEscapeChar = StringEscapeUtils.escapeJava("\\")
+    val hasNull = ctx.freshName("hasNull")
+    val matched = ctx.freshName("matched")
+    val valueArg = ctx.freshName("valueArg")
+    val listCode = listGen.map(x =>
+      s"""
+         |${x.code}
+         |if (${x.isNull}) {
+         |  $hasNull = true; // ${ev.isNull} = true;
+         |} else if (!$hasNull && $matched) {
+         |  String $rightStr = ${x.value}.toString();
+         |  $patternClass $pattern =
+         |    $patternClass.compile($escapeFunc($rightStr, '$escapedEscapeChar'));
+         |  if ($isNot && $pattern.matcher($valueArg.toString()).matches()) {
+         |    $matched = false;
+         |  } else if (!$isNot && !$pattern.matcher($valueArg.toString()).matches()) {
+         |    $matched = false;
+         |  }
+         |}
+       """.stripMargin)
+
+    val resultType = CodeGenerator.javaType(dataType)
+    val codes = ctx.splitExpressionsWithCurrentInputs(
+      expressions = listCode,
+      funcName = "likeAll",
+      extraArguments = (javaDataType, valueArg) :: (CodeGenerator.JAVA_BOOLEAN, hasNull) ::
+        (resultType, matched) :: Nil,
+      returnType = resultType,
+      makeSplitFunction = body =>
+        s"""
+           |if (!$hasNull && $matched) {
+           |  $body;
+           |}
+         """.stripMargin,
+      foldFunctions = _.map { funcCall =>
+        s"""
+           |if (!$hasNull && $matched) {
+           |  $funcCall;
+           |}
+         """.stripMargin
+      }.mkString("\n"))
+    ev.copy(code =
+      code"""
+            |${valueGen.code}
+            |boolean $hasNull = false;
+            |boolean $matched = true;
+            |if (${valueGen.isNull}) {
+            |  $hasNull = true;
+            |} else {
+            |  $javaDataType $valueArg = ${valueGen.value};
+            |  $codes
+            |}
+            |final boolean ${ev.isNull} = ($hasNull == true);
+            |final boolean ${ev.value} = ($matched == true);

Review comment:
       Can we make the interpreted code path (`eval`) follow the same structure as the codegen path? Keeping the two in a similar style would make this PR easier to review.






[GitHub] [spark] wangyum commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
wangyum commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r521107867



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +180,86 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant {
+
+  protected def patterns: Seq[Any]
+
+  protected def isNotDefined: Boolean
+
+  override def inputTypes: Seq[DataType] = StringType :: Nil
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = true
+
+  private lazy val hasNull: Boolean = patterns.contains(null)
+
+  private lazy val cache = patterns.filterNot(_ == null)
+    .map(s => Pattern.compile(StringUtils.escapeLikeRegex(s.toString, '\\')))
+
+  override def eval(input: InternalRow): Any = {
+    if (hasNull) {
+      null

Review comment:
       `null` -> `false`?
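
   For context, a minimal sketch of the three-valued logic behind this question, assuming `LIKE ALL` is meant to behave like an `AND` over the individual `LIKE` predicates. `Option[Boolean]` stands in for a nullable SQL boolean, and `and3` and `likeAllResult` are hypothetical names used only for this illustration:

   ```scala
   object LikeAllNullSemantics {
     // Three-valued (Kleene) AND, with None standing in for SQL NULL.
     def and3(a: Option[Boolean], b: Option[Boolean]): Option[Boolean] = (a, b) match {
       case (Some(false), _) | (_, Some(false)) => Some(false) // FALSE dominates
       case (Some(true), Some(true))            => Some(true)
       case _                                   => None        // otherwise unknown
     }

     // Hypothetical helper: fold per-pattern LIKE results into one LIKE ALL result.
     def likeAllResult(perPattern: Seq[Option[Boolean]]): Option[Boolean] =
       perPattern.foldLeft(Option(true))(and3)
   }
   ```

   Under that assumption, a NULL pattern yields NULL only when every other pattern matches; a single non-matching pattern already makes the result false, so neither an unconditional `null` nor an unconditional `false` covers both cases.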






[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714370504


   **[Test build #130151 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130151/testReport)** for PR 29999 at commit [`f160c64`](https://github.com/apache/spark/commit/f160c64b4c2bf8f07aaba09cffddb51fd727401c).




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-712053647








[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708914866








[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708787077








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725157090


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708935688








[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708945114








[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708867194








[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-712229098








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708867201


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/34385/




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708944491


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/34415/




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-710019753








[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714573850








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-730169706


   **[Test build #131326 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131326/testReport)** for PR 29999 at commit [`001eb38`](https://github.com/apache/spark/commit/001eb38f603267c6a6f4e1c25430b8900644f5b7).




[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r526611514



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +180,89 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant {
+
+  protected def patterns: Seq[UTF8String]
+
+  protected def isNotLikeAll: Boolean
+
+  override def inputTypes: Seq[DataType] = StringType :: Nil
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = true
+
+  private lazy val hasNull: Boolean = patterns.contains(null)
+
+  private lazy val cache = patterns.filterNot(_ == null)
+    .map(s => Pattern.compile(StringUtils.escapeLikeRegex(s.toString, '\\')))
+
+  override def eval(input: InternalRow): Any = {
+    val exprValue = child.eval(input)
+    if (exprValue == null) {
+      null
+    } else {
+      val allMatched = if (isNotLikeAll) {

Review comment:
       To improve readability:
   ```
   val matchFunc: Pattern => Boolean = if (isNotLikeAll) {
     p => !p.matcher(exprValue.toString).matches()
   } else {
     p => p.matcher(exprValue.toString).matches()
   }
   if (cache.forall(matchFunc)) {
     if (hasNull) null else true
   } else {
     false
   }
   ```
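
   A self-contained sketch of the same idea outside Spark, for readers who want to try it. It assumes a simplified LIKE-to-regex translation (`likeToRegex`, a hypothetical helper with no escape-character support, not Spark's `StringUtils.escapeLikeRegex`) and uses `Option[Boolean]` with `None` in place of a SQL NULL result:

   ```scala
   import java.util.regex.Pattern

   object LikeAllSketch {
     // Simplified translation: % matches any sequence, _ matches one character,
     // everything else is quoted literally. Escape characters are not handled.
     def likeToRegex(like: String): Pattern = {
       val sb = new StringBuilder
       like.foreach {
         case '%' => sb.append(".*")
         case '_' => sb.append(".")
         case c   => sb.append(Pattern.quote(c.toString))
       }
       // DOTALL so that % and _ also match newline characters.
       Pattern.compile(sb.toString, Pattern.DOTALL)
     }

     // null entries in `patterns` model NULL literals; None models a NULL result.
     def likeAll(value: String, patterns: Seq[String], isNot: Boolean): Option[Boolean] = {
       if (value == null) {
         None
       } else {
         val hasNull = patterns.contains(null)
         val cache = patterns.filterNot(_ == null).map(likeToRegex)
         val matchFunc: Pattern => Boolean = if (isNot) {
           p => !p.matcher(value).matches()
         } else {
           p => p.matcher(value).matches()
         }
         if (cache.forall(matchFunc)) {
           if (hasNull) None else Some(true)
         } else {
           Some(false)
         }
       }
     }
   }
   ```

   For example, `LikeAllSketch.likeAll("foobar", Seq("foo%", "%bar"), isNot = false)` evaluates both patterns through the single `matchFunc`, and adding a `null` pattern only changes the outcome when all remaining patterns match.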






[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708787034


   **[Test build #129778 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129778/testReport)** for PR 29999 at commit [`369959f`](https://github.com/apache/spark/commit/369959f6c627004c99206fc6c9e252c9676b82a7).
    * This patch **fails Scala style tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706532052


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34227/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708913180


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706863638








[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708913180








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-711955276








[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r526619223



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +180,89 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant {
+
+  protected def patterns: Seq[UTF8String]
+
+  protected def isNotLikeAll: Boolean
+
+  override def inputTypes: Seq[DataType] = StringType :: Nil
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = true
+
+  private lazy val hasNull: Boolean = patterns.contains(null)
+
+  private lazy val cache = patterns.filterNot(_ == null)
+    .map(s => Pattern.compile(StringUtils.escapeLikeRegex(s.toString, '\\')))
+
+  override def eval(input: InternalRow): Any = {
+    val exprValue = child.eval(input)
+    if (exprValue == null) {
+      null
+    } else {
+      val allMatched = if (isNotLikeAll) {
+        !cache.exists(p => p.matcher(exprValue.toString).matches())
+      } else {
+        cache.forall(p => p.matcher(exprValue.toString).matches())
+      }
+      if (allMatched && hasNull) {
+        null
+      } else {
+        allMatched
+      }
+    }
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val eval = child.genCode(ctx)
+    val patternClass = classOf[Pattern].getName
+    val javaDataType = CodeGenerator.javaType(child.dataType)
+    val pattern = ctx.freshName("pattern")
+    val allMatched = ctx.freshName("allMatched")
+    val valueIsNull = ctx.freshName("valueIsNull")
+    val valueArg = ctx.freshName("valueArg")
+    val patternCache = ctx.addReferenceObj("patternCache", cache.asJava)
+
+    val matchCode = if (isNotLikeAll) {
+      s"$pattern.matcher($valueArg.toString()).matches()"
+    } else {
+      s"!$pattern.matcher($valueArg.toString()).matches()"
+    }
+
+    ev.copy(code =
+      code"""
+            |${eval.code}
+            |boolean $allMatched = true;

Review comment:
       The code flow can be:
   ```
   boolean ${ev.isNull} = false;
   boolean ${ev.value} = true;
   if (${eval.isNull}) {
     ${ev.isNull} = true;
   } else {
     $javaDataType $valueArg = ${eval.value};
     for ... {
       if (notMatched) ${ev.value} = false;
     }
     if (${ev.value} && hasNull) ${ev.isNull} = true;
   }
   ```
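
   The flow above can be tried out as plain Scala. This is only a sketch under stated assumptions: `evalLikeAll` is a hypothetical name, the patterns are passed in already compiled, the result is returned as an `(isNull, value)` pair mirroring `ev.isNull` and `ev.value`, and the loop stops at the first failing pattern, which the outline above does not require:

   ```scala
   import java.util.regex.Pattern

   object LikeAllFlowSketch {
     // `hasNull` records whether the original pattern list contained a NULL literal.
     def evalLikeAll(
         input: String,
         patterns: Seq[Pattern],
         hasNull: Boolean,
         isNot: Boolean): (Boolean, Boolean) = {
       var isNull = false
       var value = true
       if (input == null) {
         isNull = true
       } else {
         val it = patterns.iterator
         // Early exit: once one pattern breaks the predicate, the result is false.
         while (value && it.hasNext) {
           val matched = it.next().matcher(input).matches()
           val notMatched = if (isNot) matched else !matched
           if (notMatched) value = false
         }
         // A NULL pattern only matters when every other pattern was satisfied.
         if (value && hasNull) isNull = true
       }
       (isNull, value)
     }
   }
   ```

   As in generated code, `value` is only meaningful when `isNull` is false.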






[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708698527








[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708832273


   **[Test build #129782 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129782/testReport)** for PR 29999 at commit [`1754f0d`](https://github.com/apache/spark/commit/1754f0d3e234afbd69d408a27a2ca9dea11b4ba1).




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-726784775


   **[Test build #131050 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131050/testReport)** for PR 29999 at commit [`7af8ffe`](https://github.com/apache/spark/commit/7af8ffe49fc02765a80a85faccaa7209fe8b9c57).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708874795








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725304059


   Merged build finished. Test FAILed.




[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725255595


   **[Test build #130915 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130915/testReport)** for PR 29999 at commit [`d039c33`](https://github.com/apache/spark/commit/d039c33de33ea4bab4cea3170925c0c4f92ca771).




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725122529


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35506/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706535126








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708943814


   **[Test build #129782 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129782/testReport)** for PR 29999 at commit [`1754f0d`](https://github.com/apache/spark/commit/1754f0d3e234afbd69d408a27a2ca9dea11b4ba1).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] wangyum commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
wangyum commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r521106614



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala
##########
@@ -102,6 +102,8 @@ package object dsl {
     def like(other: Expression, escapeChar: Char = '\\'): Expression =
       Like(expr, other, escapeChar)
     def rlike(other: Expression): Expression = RLike(expr, other)
+    def likeAll(others: Literal*): Expression = LikeAll(expr, others.map(_.eval(EmptyRow)))

Review comment:
       `others: Literal*` -> `others: String*`?






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725271492


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130915/
   Test FAILed.




[GitHub] [spark] beliefer commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-707458123


   @maropu If there are many `LIKE` patterns, `reduceLeft` builds a very deep expression tree. Traversing that tree recurses once per level, so the call stack keeps growing until it overflows.
   ```
   	at org.apache.spark.sql.catalyst.trees.TreeNode.foreach(TreeNode.scala:175)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1(TreeNode.scala:175)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1$adapted(TreeNode.scala:175)
   	at scala.collection.immutable.List.foreach(List.scala:392)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.foreach(TreeNode.scala:175)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1(TreeNode.scala:175)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1$adapted(TreeNode.scala:175)
   	at scala.collection.immutable.List.foreach(List.scala:392)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.foreach(TreeNode.scala:175)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1(TreeNode.scala:175)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1$adapted(TreeNode.scala:175)
   	at scala.collection.immutable.List.foreach(List.scala:392)
   ......
   ```
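
The depth problem described above can be illustrated without Spark. The following is only a sketch: `Expr`, `Like`, and `And` are hypothetical stand-in types, not the Catalyst classes, and `chain` and `leftDepth` are names invented for this example. It shows that folding N predicates with `reduceLeft` gives a left-deep tree whose height grows linearly with N, so a recursive traversal such as the `TreeNode.foreach` in the trace needs roughly N nested calls.

```scala
object LikeAllDepthSketch {
  // Hypothetical stand-ins for Catalyst expressions; not the real Spark classes.
  sealed trait Expr
  final case class Like(pattern: String) extends Expr
  final case class And(left: Expr, right: Expr) extends Expr

  // Combine predicates as a chained AND: ((p1 AND p2) AND p3) AND ...
  // Assumes a non-empty pattern list; reduceLeft throws on an empty one.
  def chain(patterns: Seq[String]): Expr =
    patterns.map(p => Like(p): Expr).reduceLeft((l, r) => And(l, r))

  // Height of the tree along its left spine. Tail-recursive, so measuring
  // the depth cannot itself overflow the stack.
  @annotation.tailrec
  def leftDepth(e: Expr, acc: Int = 1): Int = e match {
    case And(l, _) => leftDepth(l, acc + 1)
    case _         => acc
  }

  def main(args: Array[String]): Unit = {
    val tree = chain((1 to 10000).map(_.toString))
    println(leftDepth(tree)) // prints 10000: one level per pattern
  }
}
```

An expression that keeps all patterns in a flat sequence has constant tree height regardless of N, which is the motivation for a dedicated `LikeAll` node.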




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706863628


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34264/
   




[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706535126








[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714280358








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708930779


   Merged build finished. Test FAILed.




[GitHub] [spark] cloud-fan commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-730505334


   thanks, merging to master!




[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-714370504


   **[Test build #130151 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130151/testReport)** for PR 29999 at commit [`f160c64`](https://github.com/apache/spark/commit/f160c64b4c2bf8f07aaba09cffddb51fd727401c).




[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r505212166



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -176,6 +177,125 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+abstract class LikeAllBase extends Expression with ImplicitCastInputTypes with NullIntolerant {
+  def value: Expression = children.head
+  def list: Seq[Expression] = children.tail
+  def isNot: Boolean
+
+  override def inputTypes: Seq[AbstractDataType] = {
+    StringType +: Seq.fill(children.size - 1)(StringType)
+  }
+
+  override def dataType: DataType = BooleanType
+
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  override def nullable: Boolean = true
+
+  def matches(regex: Pattern, str: String): Boolean = regex.matcher(str).matches()
+
+  override def eval(input: InternalRow): Any = {
+    val evaluatedValue = value.eval(input)
+    if (evaluatedValue == null) {
+      null
+    } else {
+      var hasNull = false
+      var match = true
+      list.foreach { e =>
+        val str = e.eval(input)

Review comment:
       Yes






[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-730202933








[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r524214644



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +180,90 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant {
+
+  protected def patterns: Seq[UTF8String]
+
+  protected def isNotDefined: Boolean
+
+  override def inputTypes: Seq[DataType] = StringType :: Nil
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = true
+
+  private lazy val hasNull: Boolean = patterns.contains(null)
+
+  private lazy val cache = patterns.filterNot(_ == null)
+    .map(s => Pattern.compile(StringUtils.escapeLikeRegex(s.toString, '\\')))
+
+  override def eval(input: InternalRow): Any = {
+    val exprValue = child.eval(input)
+    if (exprValue == null) {
+      null
+    } else {
+      val allMatched = if (isNotDefined) {
+        !cache.exists(p => p.matcher(exprValue.toString).matches())
+      } else {
+        cache.forall(p => p.matcher(exprValue.toString).matches())
+      }
+      if (allMatched && hasNull) {
+        null
+      } else {
+        allMatched
+      }
+    }
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val eval = child.genCode(ctx)
+    val patternClass = classOf[Pattern].getName
+    val javaDataType = CodeGenerator.javaType(child.dataType)
+    val pattern = ctx.freshName("pattern")
+    val allMatched = ctx.freshName("allMatched")
+    val valueIsNull = ctx.freshName("valueIsNull")
+    val valueArg = ctx.freshName("valueArg")
+    val patternHasNull = ctx.addReferenceObj("hasNull", hasNull)

Review comment:
       It's a boolean constant. We can change the generated code based on it.
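
One possible reading of this suggestion: because `hasNull` is already known when the code is generated, the generator can choose which Java snippet to emit instead of registering the flag as a reference object and testing it at runtime. The sketch below is an illustration only. `nullHandling` and its parameter names are hypothetical, and it uses plain string templates rather than Spark's `CodegenContext`.

```scala
object ConstantFoldedCodegenSketch {
  // Hypothetical helper, not part of Spark. Returns the Java statement that
  // sets the result's null flag after all patterns have been checked.
  // `hasNull` is a Scala value fixed at code-generation time, so only one of
  // the two variants ever appears in the generated source.
  def nullHandling(hasNull: Boolean, isNullVar: String, allMatchedVar: String): String =
    if (hasNull) {
      // Some pattern is null: a full match must yield NULL,
      // mirroring `if (allMatched && hasNull) null` in eval.
      s"if ($allMatchedVar) { $isNullVar = true; }"
    } else {
      // No null pattern: the result cannot become NULL here, so emit nothing.
      ""
    }
}
```

For example, `nullHandling(true, "isNull_0", "allMatched_0")` yields a single `if` statement, while the `false` case yields an empty string, so the generated method carries no runtime check on `hasNull` at all.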






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706960737








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-724802450








[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708601868


   **[Test build #129757 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129757/testReport)** for PR 29999 at commit [`d841b54`](https://github.com/apache/spark/commit/d841b54007d36963ede98a3745d4dd69c8f65c3e).




[GitHub] [spark] mridulm commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
mridulm commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-730583151


   @cloud-fan This is causing failures in the Scala 2.13 build.
   See [this](https://github.com/apache/spark/pull/30164/checks?check_run_id=1425957338) for example.
   
   +CC @dongjoon-hyun, @srowen 
   
   I believe @sunchao's PR is attempting to address it [here](https://github.com/apache/spark/pull/30431)




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706851744


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34263/
   




[GitHub] [spark] maropu commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706538725


   One more question: does this PR's approach have the same performance as the current one when `LIKE ALL` has only a small number of elements?




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-724637618


   **[Test build #130864 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130864/testReport)** for PR 29999 at commit [`53406d3`](https://github.com/apache/spark/commit/53406d349a46dad7edf61e5eb2e27b11e92e508a).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-708926967


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/129807/
   Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728649075


   **[Test build #131189 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131189/testReport)** for PR 29999 at commit [`97c1c73`](https://github.com/apache/spark/commit/97c1c7389e537f0d38f1b6a17bbe9ba70c9bc6ea).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-709864803


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/129872/
   Test FAILed.




[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r505196281



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -176,6 +177,125 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+abstract class LikeAllBase extends Expression with ImplicitCastInputTypes with NullIntolerant {
+  def value: Expression = children.head
+  def list: Seq[Expression] = children.tail
+  def isNot: Boolean
+
+  override def inputTypes: Seq[AbstractDataType] = {
+    StringType +: Seq.fill(children.size - 1)(StringType)
+  }
+
+  override def dataType: DataType = BooleanType
+
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  override def nullable: Boolean = true
+
+  def matches(regex: Pattern, str: String): Boolean = regex.matcher(str).matches()
+
+  override def eval(input: InternalRow): Any = {
+    val evaluatedValue = value.eval(input)
+    if (evaluatedValue == null) {
+      null
+    } else {
+      var hasNull = false
+      var match = true

Review comment:
       `matched`
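
       For context, `match` is a reserved keyword in Scala, so `var match = true` would not compile without backticks; renaming the flag avoids that. A minimal sketch of the same flag-driven loop, assuming plain regex strings instead of Catalyst expressions (the object and method names are hypothetical):

       ```scala
       object MatchedNamingSketch {
         // Returns true only if `value` fully matches every regex in `regexes`.
         def allMatch(value: String, regexes: Seq[String]): Boolean = {
           // `var match = true` is rejected by the compiler: `match` is reserved.
           var matched = true
           val it = regexes.iterator
           // Stop at the first pattern that does not match.
           while (matched && it.hasNext) {
             matched = value.matches(it.next())
           }
           matched
         }
       }
       ```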






[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-709863988


   **[Test build #129872 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129872/testReport)** for PR 29999 at commit [`be5eb8a`](https://github.com/apache/spark/commit/be5eb8a1f092e15c941d39d517284aed67de72c9).
    * This patch **fails due to an unknown error code, -9**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] wangyum edited a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
wangyum edited a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-707634498


   @maropu We can reproduce the `java.lang.StackOverflowError` in this way:
   ```scala
   spark.sql("create table SPARK_33045(id string) using parquet")
   val values = Range(1, 10000)
   spark.sql(s"select * from SPARK_33045 where id like all (${values.mkString(", ")})").show
   ```
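   This happens because `LIKE ALL`/`LIKE ANY` is currently rewritten into a chain of `LIKE` predicates joined with `AND`/`OR`, so the query above is equivalent to: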
   ```scala
   spark.sql(s"select * from SPARK_33045 where ${values.map(i => s"id like $i").mkString(" and ")}").show
   ```
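
   As a rough illustration of why the chained form is a problem: combining N `LIKE` predicates with `AND` gives a left-deep tree whose depth grows with N, and any recursive traversal then needs roughly one stack frame per level. The sketch below uses a hypothetical toy `Expr` hierarchy (not Catalyst's classes) and compares that depth with a single n-ary node that holds all patterns:

   ```scala
   import scala.annotation.tailrec

   object LikeAllDepthSketch {
     // Toy expression tree; these types are illustrative assumptions only.
     sealed trait Expr
     final case class Like(column: String, pattern: String) extends Expr
     final case class And(left: Expr, right: Expr) extends Expr
     final case class LikeAll(column: String, patterns: Seq[String]) extends Expr

     // Mimics the rewrite: p1 AND p2 AND ... AND pN as a left-deep tree.
     def chained(column: String, patterns: Seq[String]): Expr = {
       val likes = patterns.map(p => (Like(column, p): Expr))
       likes.reduceLeft[Expr]((l, r) => And(l, r))
     }

     // Number of nodes along the left spine, computed with a tail-recursive
     // loop so that measuring the depth does not itself overflow the stack.
     @tailrec
     def leftDepth(expr: Expr, acc: Int = 1): Int = expr match {
       case And(left, _) => leftDepth(left, acc + 1)
       case _            => acc
     }
   }
   ```

   With the 9999 values from the reproduction above, the chained tree is thousands of levels deep along its left spine, while a single `LikeAll`-style node keeps the depth constant regardless of how many patterns are listed.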




[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-706919007








[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728834744








[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r505098317



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -176,6 +177,195 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+abstract class LikeAllBase extends Expression with ImplicitCastInputTypes with NullIntolerant {
+  def value: Expression = children.head
+  def list: Seq[Expression] = children.tail
+  def isNot: Boolean
+
+  override def inputTypes: Seq[AbstractDataType] = {
+    val arrayOrStr = TypeCollection(ArrayType(StringType), StringType)
+    StringType +: Seq.fill(children.size - 1)(arrayOrStr)
+  }
+
+  override def dataType: DataType = BooleanType
+
+  override def foldable: Boolean = value.foldable && list.forall(_.foldable)
+
+  override def nullable: Boolean = true
+
+  def escape(v: String): String = StringUtils.escapeLikeRegex(v, '\\')
+
+  def matches(regex: Pattern, str: String): Boolean = regex.matcher(str).matches()
+
+  override def eval(input: InternalRow): Any = {
+    val evaluatedValue = value.eval(input)
+    if (evaluatedValue == null) {
+      null
+    } else {
+      list.foreach { e =>
+        val str = e.eval(input)
+        if (str == null) {
+          return null
+        }
+        val regex = Pattern.compile(escape(str.asInstanceOf[UTF8String].toString))
+        if(regex == null) {
+          return null
+        } else if (isNot && matches(regex, evaluatedValue.asInstanceOf[UTF8String].toString)) {
+          return false
+        } else if (!isNot && !matches(regex, evaluatedValue.asInstanceOf[UTF8String].toString)) {
+          return false
+        }
+      }
+      return true
+    }
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val patternClass = classOf[Pattern].getName
+    val escapeFunc = StringUtils.getClass.getName.stripSuffix("$") + ".escapeLikeRegex"
+    val javaDataType = CodeGenerator.javaType(value.dataType)
+    val valueGen = value.genCode(ctx)
+    val listGen = list.map(_.genCode(ctx))
+    val pattern = ctx.freshName("pattern")
+    val rightStr = ctx.freshName("rightStr")
+    val escapedEscapeChar = StringEscapeUtils.escapeJava("\\")
+    val hasNull = ctx.freshName("hasNull")
+    val matched = ctx.freshName("matched")
+    val valueArg = ctx.freshName("valueArg")
+    val listCode = listGen.map(x =>
+      s"""
+         |${x.code}
+         |if (${x.isNull}) {
+         |  $hasNull = true; // ${ev.isNull} = true;
+         |} else if (!$hasNull && $matched) {
+         |  String $rightStr = ${x.value}.toString();
+         |  $patternClass $pattern =
+         |    $patternClass.compile($escapeFunc($rightStr, '$escapedEscapeChar'));
+         |  if ($isNot && $pattern.matcher($valueArg.toString()).matches()) {
+         |    $matched = false;
+         |  } else if (!$isNot && !$pattern.matcher($valueArg.toString()).matches()) {
+         |    $matched = false;
+         |  }
+         |}
+       """.stripMargin)
+
+    val resultType = CodeGenerator.javaType(dataType)
+    val codes = ctx.splitExpressionsWithCurrentInputs(
+      expressions = listCode,
+      funcName = "likeAll",
+      extraArguments = (javaDataType, valueArg) :: (CodeGenerator.JAVA_BOOLEAN, hasNull) ::
+        (resultType, matched) :: Nil,
+      returnType = resultType,
+      makeSplitFunction = body =>
+        s"""
+           |if (!$hasNull && $matched) {
+           |  $body;
+           |}
+         """.stripMargin,
+      foldFunctions = _.map { funcCall =>
+        s"""
+           |if (!$hasNull && $matched) {
+           |  $funcCall;
+           |}
+         """.stripMargin
+      }.mkString("\n"))
+    ev.copy(code =
+      code"""
+            |${valueGen.code}
+            |boolean $hasNull = false;
+            |boolean $matched = true;
+            |if (${valueGen.isNull}) {
+            |  $hasNull = true;
+            |} else {
+            |  $javaDataType $valueArg = ${valueGen.value};
+            |  $codes
+            |}
+            |final boolean ${ev.isNull} = ($hasNull == true);

Review comment:
       Yeah!
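
       For readers unfamiliar with the `escape` helper in the hunk above: it delegates to `StringUtils.escapeLikeRegex`, which turns a SQL `LIKE` pattern into a Java regex. The following is only a simplified stand-in for that idea, assuming no escape-character handling at all; `likeToRegex` and `like` are hypothetical names, not Spark APIs:

       ```scala
       import java.util.regex.Pattern

       object LikeToRegexSketch {
         // Simplified translation: `%` matches any sequence, `_` matches one
         // character, everything else is quoted so it matches literally.
         // Escape characters are deliberately NOT handled in this sketch.
         def likeToRegex(like: String): String = {
           val sb = new StringBuilder("(?s)") // let `.` also match line breaks
           like.foreach {
             case '%' => sb.append(".*")
             case '_' => sb.append(".")
             case c   => sb.append(Pattern.quote(c.toString))
           }
           sb.toString
         }

         // Full-string match, as LIKE requires.
         def like(value: String, pattern: String): Boolean =
           Pattern.compile(likeToRegex(pattern)).matcher(value).matches()
       }
       ```

       Note that the hunk compiles a `Pattern` for every list element on every row, which is one reason to compile the patterns once when they are literals.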






[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728912508








[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r526626891



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##########
@@ -178,6 +180,89 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant {
+
+  protected def patterns: Seq[UTF8String]
+
+  protected def isNotLikeAll: Boolean
+
+  override def inputTypes: Seq[DataType] = StringType :: Nil
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = true
+
+  private lazy val hasNull: Boolean = patterns.contains(null)
+
+  private lazy val cache = patterns.filterNot(_ == null)
+    .map(s => Pattern.compile(StringUtils.escapeLikeRegex(s.toString, '\\')))
+
+  override def eval(input: InternalRow): Any = {
+    val exprValue = child.eval(input)
+    if (exprValue == null) {
+      null
+    } else {
+      val allMatched = if (isNotLikeAll) {
+        !cache.exists(p => p.matcher(exprValue.toString).matches())
+      } else {
+        cache.forall(p => p.matcher(exprValue.toString).matches())
+      }
+      if (allMatched && hasNull) {
+        null
+      } else {
+        allMatched
+      }
+    }
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val eval = child.genCode(ctx)
+    val patternClass = classOf[Pattern].getName
+    val javaDataType = CodeGenerator.javaType(child.dataType)
+    val pattern = ctx.freshName("pattern")
+    val allMatched = ctx.freshName("allMatched")
+    val valueIsNull = ctx.freshName("valueIsNull")
+    val valueArg = ctx.freshName("valueArg")
+    val patternCache = ctx.addReferenceObj("patternCache", cache.asJava)
+
+    val matchCode = if (isNotLikeAll) {
+      s"$pattern.matcher($valueArg.toString()).matches()"
+    } else {
+      s"!$pattern.matcher($valueArg.toString()).matches()"
+    }
+
+    ev.copy(code =
+      code"""
+            |${eval.code}
+            |boolean $allMatched = true;

Review comment:
       I learned more!
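
       The `eval` logic in the hunk above can be sketched outside Catalyst as follows. This is an illustration under stated assumptions, not the PR's code: `Option` stands in for SQL NULL, the patterns are taken to be Java regexes already, and `LikeAllEval` is a hypothetical name.

       ```scala
       import java.util.regex.Pattern

       object LikeAllNullSketch {
         final class LikeAllEval(patterns: Seq[Option[String]], isNotLikeAll: Boolean) {
           // Computed once per expression rather than once per row.
           private val hasNull: Boolean = patterns.contains(None)
           private val cache: Seq[Pattern] = patterns.flatten.map(p => Pattern.compile(p))

           def eval(value: Option[String]): Option[Boolean] = value.flatMap { v =>
             val allMatched =
               if (isNotLikeAll) !cache.exists(_.matcher(v).matches())
               else cache.forall(_.matcher(v).matches())
             // A definite `false` wins over NULL; otherwise a NULL pattern
             // makes the overall result NULL (three-valued AND).
             if (allMatched && hasNull) None else Some(allMatched)
           }
         }
       }
       ```

       The design choice worth noting is that a NULL pattern only affects the result when every non-NULL pattern has already passed, which mirrors how `true AND NULL` is NULL while `false AND NULL` is `false`.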






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-728680575





