You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by heary-cao <gi...@git.apache.org> on 2017/08/16 10:24:27 UTC

[GitHub] spark pull request #18961: [SQL]nondeterministic expressions correctly for f...

GitHub user heary-cao opened a pull request:

    https://github.com/apache/spark/pull/18961

    [SQL]nondeterministic expressions correctly for filter predicates

    ## What changes were proposed in this pull request?
    Currently, We do interpretedpredicate optimization, but not very well, because when our filter contained an indeterminate expression, it would have an exception. This PR describes solving this problem by adding the initialize method in InterpretedPredicate.
    
    java.lang.IllegalArgumentException:
    ```
    java.lang.IllegalArgumentException: requirement failed: Nondeterministic expression org.apache.spark.sql.catalyst.expressions.Rand should be initialized before eval.
    	at scala.Predef$.require(Predef.scala:224)
    	at org.apache.spark.sql.catalyst.expressions.Nondeterministic$class.eval(Expression.scala:291)
    	at org.apache.spark.sql.catalyst.expressions.RDG.eval(randomExpressions.scala:34)
    	at org.apache.spark.sql.catalyst.expressions.BinaryExpression.eval(Expression.scala:415)
    	at org.apache.spark.sql.catalyst.expressions.InterpretedPredicate.eval(predicates.scala:38)
    	at org.apache.spark.sql.catalyst.catalog.ExternalCatalogUtils$$anonfun$prunePartitionsByFilter$1.apply(ExternalCatalogUtils.scala:158)
    	at org.apache.spark.sql.catalyst.catalog.ExternalCatalogUtils$$anonfun$prunePartitionsByFilter$1.apply(ExternalCatalogUtils.scala:157)
    	at scala.collection.immutable.Stream.filter(Stream.scala:519)
    	at scala.collection.immutable.Stream.filter(Stream.scala:202)
    	at org.apache.spark.sql.catalyst.catalog.ExternalCatalogUtils$.prunePartitionsByFilter(ExternalCatalogUtils.scala:157)
    	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$listPartitionsByFilter$1.apply(HiveExternalCatalog.scala:1129)
    	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$listPartitionsByFilter$1.apply(HiveExternalCatalog.scala:1119)
    	at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
    	at org.apache.spark.sql.hive.HiveExternalCatalog.listPartitionsByFilter(HiveExternalCatalog.scala:1119)
    	at org.apache.spark.sql.catalyst.catalog.SessionCatalog.listPartitionsByFilter(SessionCatalog.scala:925)
    	at org.apache.spark.sql.execution.datasources.CatalogFileIndex.filterPartitions(CatalogFileIndex.scala:73)
    	at org.apache.spark.sql.execution.datasources.PruneFileSourcePartitions$$anonfun$apply$1.applyOrElse(PruneFileSourcePartitions.scala:60)
    	at org.apache.spark.sql.execution.datasources.PruneFileSourcePartitions$$anonfun$apply$1.applyOrElse(PruneFileSourcePartitions.scala:27)
    	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267)
    	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267)
    	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
    	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:266)
    	at org.apache.spark.sql.execution.datasources.PruneFileSourcePartitions$.apply(PruneFileSourcePartitions.scala:27)
    	at org.apache.spark.sql.execution.datasources.PruneFileSourcePartitions$.apply(PruneFileSourcePartitions.scala:26)
    
    ```
    ## How was this patch tested?
    
    Should be covered existing test cases and add new test cases.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/heary-cao/spark Predicate

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18961.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18961
    
----
commit f42ede0c4bb68534b9a28dca322a833d2b253f10
Author: caoxuewen <ca...@zte.com.cn>
Date:   2017-08-16T10:12:26Z

    nondeterministic expressions correctly for filter predicates

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r134702688
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2029,4 +2029,13 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           testData2.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
           Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
       }
    +
    +  test("SPARK-21746: nondeterministic expressions correctly for filter predicates") {
    +    withTempPath { path =>
    +      val p = path.getAbsolutePath
    +      Seq(1 -> "a").toDF("a", "b").write.partitionBy("a").parquet(p)
    +      val df = spark.read.parquet(p)
    +      checkAnswer(df.filter(rand(10) <= 1.0).select($"a"), Row(1))
    --- End diff --
    
    I think that predicates is not empty.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r135771515
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2029,4 +2029,13 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           testData2.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
           Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
       }
    +
    +  test("SPARK-21746: nondeterministic expressions correctly for filter predicates") {
    +    withTempPath { path =>
    +      val p = path.getAbsolutePath
    +      Seq(1 -> "a").toDF("a", "b").write.partitionBy("a").parquet(p)
    +      val df = spark.read.parquet(p)
    +      checkAnswer(df.filter(rand(10) <= 1.0).select($"a"), Row(1))
    --- End diff --
    
    I'm updating to the latest version for validation.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r134592009
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ---
    @@ -31,11 +31,22 @@ object InterpretedPredicate {
       def create(expression: Expression, inputSchema: Seq[Attribute]): InterpretedPredicate =
         create(BindReferences.bindReference(expression, inputSchema))
     
    -  def create(expression: Expression): InterpretedPredicate = new InterpretedPredicate(expression)
    +  def create(expression: Expression): InterpretedPredicate = {
    +    val predicate = new InterpretedPredicate(expression)
    +    predicate.initialize(0)
    --- End diff --
    
    `0` is not a `partitionIndex` and  `0` will go to `n.initialize(0)` in line 46. Is it correct?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    Leave a comment


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao closed the pull request at:

    https://github.com/apache/spark/pull/18961


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r135737817
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/sources/SimpleTextRelation.scala ---
    @@ -90,6 +90,7 @@ class SimpleTextSource extends TextBasedFileFormat with DataSourceRegister {
             }.reduceOption(expressions.And).getOrElse(Literal(true))
             InterpretedPredicate.create(filterCondition, inputAttributes)
           }
    +      predicate.initialize(0)
    --- End diff --
    
    thanks, modify it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r134641150
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ---
    @@ -31,11 +31,22 @@ object InterpretedPredicate {
       def create(expression: Expression, inputSchema: Seq[Attribute]): InterpretedPredicate =
         create(BindReferences.bindReference(expression, inputSchema))
     
    -  def create(expression: Expression): InterpretedPredicate = new InterpretedPredicate(expression)
    +  def create(expression: Expression): InterpretedPredicate = {
    +    val predicate = new InterpretedPredicate(expression)
    +    predicate.initialize(0)
    --- End diff --
    
    we not need InterpretedPredicate.initialize and modify _object InterpretedPredicate.create_
    ```
      def create(expression: Expression): InterpretedPredicate = {
        expression.foreach {
          case n: Nondeterministic => n.initialize(0)
          case _ =>
        }
        new InterpretedPredicate(expression)
      }
    ```



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r134639568
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2029,4 +2029,15 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           testData2.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
           Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
       }
    +
    +  test("SPARK-21746: nondeterministic expressions correctly for filter predicates") {
    +    withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
    --- End diff --
    
    aha,you are right.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    cc @gatorsmile  @cloud-fan 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    @dongjoon-hyun @cloud-fan Do you have any suggestions?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r134672900
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2029,4 +2029,13 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           testData2.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
           Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
       }
    +
    +  test("SPARK-21746: nondeterministic expressions correctly for filter predicates") {
    +    withTempPath { path =>
    +      val p = path.getAbsolutePath
    +      Seq(1 -> "a").toDF("a", "b").write.partitionBy("a").parquet(p)
    +      val df = spark.read.parquet(p)
    +      checkAnswer(df.filter(rand(10) <= 1.0).select($"a"), Row(1))
    --- End diff --
    
    why this test case will trigger `InterpretedPredicate`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    **[Test build #81203 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81203/testReport)** for PR 18961 at commit [`8be86e5`](https://github.com/apache/spark/commit/8be86e5c597c825e0bb5f3886eddb70f1ec6bea3).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    ok to test


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r135737972
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala ---
    @@ -362,7 +362,9 @@ abstract class SparkPlan extends QueryPlan[SparkPlan] with Logging with Serializ
           str
         }
         logWarning(s"Codegen disabled for this expression:\n $logMessage")
    -    InterpretedPredicate.create(expression, inputSchema)
    +    val predicate = InterpretedPredicate.create(expression, inputSchema)
    +    predicate.initialize(0)
    +    predicate
    --- End diff --
    
    thanks, remove it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    **[Test build #80810 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80810/testReport)** for PR 18961 at commit [`a1fd5e1`](https://github.com/apache/spark/commit/a1fd5e1b76c353520f51c282ba026ae0af706db2).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    **[Test build #81210 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81210/testReport)** for PR 18961 at commit [`349a2d2`](https://github.com/apache/spark/commit/349a2d2724bf6b830c9d3423435f1501cbcf5021).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r135751702
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2029,4 +2029,13 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           testData2.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
           Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
       }
    +
    +  test("SPARK-21746: nondeterministic expressions correctly for filter predicates") {
    +    withTempPath { path =>
    +      val p = path.getAbsolutePath
    +      Seq(1 -> "a").toDF("a", "b").write.partitionBy("a").parquet(p)
    +      val df = spark.read.parquet(p)
    +      checkAnswer(df.filter(rand(10) <= 1.0).select($"a"), Row(1))
    --- End diff --
    
    The test you add should trigger `InterpretedPredicate` as @cloud-fan mentioned. It should hit the code path you modified in this PR, not depending on another PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81203/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    You do not need to open multiple PRs for the same issue.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    Will review it tonight.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r135720735
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala ---
    @@ -362,7 +362,9 @@ abstract class SparkPlan extends QueryPlan[SparkPlan] with Logging with Serializ
           str
         }
         logWarning(s"Codegen disabled for this expression:\n $logMessage")
    -    InterpretedPredicate.create(expression, inputSchema)
    +    val predicate = InterpretedPredicate.create(expression, inputSchema)
    +    predicate.initialize(0)
    +    predicate
    --- End diff --
    
    I checked the few places calling `newPredicate`. They are either already do `initialize` properly, or don't need to do it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    Any reason you closed it?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r135718190
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2029,4 +2029,13 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           testData2.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
           Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
       }
    +
    +  test("SPARK-21746: nondeterministic expressions correctly for filter predicates") {
    +    withTempPath { path =>
    +      val p = path.getAbsolutePath
    +      Seq(1 -> "a").toDF("a", "b").write.partitionBy("a").parquet(p)
    +      val df = spark.read.parquet(p)
    +      checkAnswer(df.filter(rand(10) <= 1.0).select($"a"), Row(1))
    --- End diff --
    
    I ran this test without any exception. Are you sure this test can reproduce this issue?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    **[Test build #81011 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81011/testReport)** for PR 18961 at commit [`954f0a7`](https://github.com/apache/spark/commit/954f0a724784ea7ae10d26f39ab76876cefdb9cb).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    @cloud-fan @gatorsmile 
    Could you take a look again?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r135785562
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2029,4 +2029,13 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           testData2.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
           Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
       }
    +
    +  test("SPARK-21746: nondeterministic expressions correctly for filter predicates") {
    +    withTempPath { path =>
    +      val p = path.getAbsolutePath
    +      Seq(1 -> "a").toDF("a", "b").write.partitionBy("a").parquet(p)
    +      val df = spark.read.parquet(p)
    +      checkAnswer(df.filter(rand(10) <= 1.0).select($"a"), Row(1))
    --- End diff --
    
    @viirya, @cloud-fan 
    This trigger condition is associated with #18918. It will be more prone to this exception
    The current spark master branch does not trigger this code path.
    put this change on #18918 and close this PR.
    thanks.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80810/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r135743719
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2029,4 +2029,13 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           testData2.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
           Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
       }
    +
    +  test("SPARK-21746: nondeterministic expressions correctly for filter predicates") {
    +    withTempPath { path =>
    +      val p = path.getAbsolutePath
    +      Seq(1 -> "a").toDF("a", "b").write.partitionBy("a").parquet(p)
    +      val df = spark.read.parquet(p)
    +      checkAnswer(df.filter(rand(10) <= 1.0).select($"a"), Row(1))
    --- End diff --
    
    sure, this exception comes from #18918 Unit testing.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    **[Test build #81210 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81210/testReport)** for PR 18961 at commit [`349a2d2`](https://github.com/apache/spark/commit/349a2d2724bf6b830c9d3423435f1501cbcf5021).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r135715971
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala ---
    @@ -362,7 +362,9 @@ abstract class SparkPlan extends QueryPlan[SparkPlan] with Logging with Serializ
           str
         }
         logWarning(s"Codegen disabled for this expression:\n $logMessage")
    -    InterpretedPredicate.create(expression, inputSchema)
    +    val predicate = InterpretedPredicate.create(expression, inputSchema)
    +    predicate.initialize(0)
    +    predicate
    --- End diff --
    
    The partition index is not always 0.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    **[Test build #81011 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81011/testReport)** for PR 18961 at commit [`954f0a7`](https://github.com/apache/spark/commit/954f0a724784ea7ae10d26f39ab76876cefdb9cb).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    **[Test build #81203 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81203/testReport)** for PR 18961 at commit [`8be86e5`](https://github.com/apache/spark/commit/8be86e5c597c825e0bb5f3886eddb70f1ec6bea3).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r135749554
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2029,4 +2029,13 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           testData2.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
           Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
       }
    +
    +  test("SPARK-21746: nondeterministic expressions correctly for filter predicates") {
    +    withTempPath { path =>
    +      val p = path.getAbsolutePath
    +      Seq(1 -> "a").toDF("a", "b").write.partitionBy("a").parquet(p)
    +      val df = spark.read.parquet(p)
    +      checkAnswer(df.filter(rand(10) <= 1.0).select($"a"), Row(1))
    --- End diff --
    
    First, you don't add any comment to explain it. Reviewers don't know that.
    Second, a new test should test against some regressions or the proposed changes. This test is neither of both.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r135753087
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2029,4 +2029,13 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           testData2.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
           Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
       }
    +
    +  test("SPARK-21746: nondeterministic expressions correctly for filter predicates") {
    +    withTempPath { path =>
    +      val p = path.getAbsolutePath
    +      Seq(1 -> "a").toDF("a", "b").write.partitionBy("a").parquet(p)
    +      val df = spark.read.parquet(p)
    +      checkAnswer(df.filter(rand(10) <= 1.0).select($"a"), Row(1))
    --- End diff --
    
    sorry,I update the description of PR. thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r135764611
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2029,4 +2029,13 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           testData2.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
           Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
       }
    +
    +  test("SPARK-21746: nondeterministic expressions correctly for filter predicates") {
    +    withTempPath { path =>
    +      val p = path.getAbsolutePath
    +      Seq(1 -> "a").toDF("a", "b").write.partitionBy("a").parquet(p)
    +      val df = spark.read.parquet(p)
    +      checkAnswer(df.filter(rand(10) <= 1.0).select($"a"), Row(1))
    --- End diff --
    
    can this test expose the bug?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    @cloud-fan, Thank you for your suggest. I found 4 calls to InterpretedPredicate.create and modify it. Can you take a look again if you have time?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SQL]nondeterministic expressions correctly for filter p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    @gatorsmile , This should be a problem for code execution, semantics, and consistency.
    This trigger condition: _FileSourceScanExec.partitionFilters_  is not null and contains nondeterministic function. however, the current spark master branch does not trigger this code path.  because _PhysicalOperation_ excluded that the condition is nondeterministic of Filter.
    Similar spark sql: ` SELECT t1.i3 from tableorc t1 where rand(10) <= 0.5`
     the current spark master branch  executed Plan:
    ```
    *Project [i3#0]
    +- *Filter (rand(10) <= 0.5)
       +- *FileScan orc default.tableorc[i3#0,i4#1,i5#2,i6#3,i7#4,i8#5,i9#6,i10#7,i11#8,s2#9,s3#10,s4#11,s5#12,s6#13,s7#14,d2#15,d3#16,d4#17,d5#18,d6#19,d7#20,i2#21] Batched: false, Format: ORC, Location: CatalogFileIndex[file:/home/cxw/spark/bin/spark-warehouse/tableorc], PartitionCount: 2, PartitionFilters: [], PushedFilters: [], ReadSchema: struct<i3:int,i4:int,i5:int,i6:int,i7:int,i8:int,i9:int,i10:int,i11:int,s2:string,s3:string,s4:st...
    ```
     PartitionFilters is **[]**
    
    Thus, this trigger condition is associated with #18918. it will be more prone to this exception.
    #18918 PR executed Plan:
    ```
    *Project [i3#0]
    +- *Filter (rand(10) <= 0.5)
       +- *FileScan orc default.tableorc[i3#0,i2#21] Batched: false, Format: ORC, Location: PrunedInMemoryFileIndex[file:/home/cxw/spark/bin/spark-warehouse/tableorc/i2=0], PartitionCount: 1, PartitionFilters: [(rand(10) <= 0.5)], PushedFilters: [], ReadSchema: struct<i3:int>
    ```
    PartitionFilters is **[(rand(10) <= 0.5)]**
    thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    **[Test build #80810 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80810/testReport)** for PR 18961 at commit [`a1fd5e1`](https://github.com/apache/spark/commit/a1fd5e1b76c353520f51c282ba026ae0af706db2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r134590273
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2029,4 +2029,15 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           testData2.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
           Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
       }
    +
    +  test("SPARK-21746: nondeterministic expressions correctly for filter predicates") {
    +    withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
    --- End diff --
    
    Do we need to turn off this? It looks irrelevant to me.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r134643234
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ---
    @@ -31,11 +31,22 @@ object InterpretedPredicate {
       def create(expression: Expression, inputSchema: Seq[Attribute]): InterpretedPredicate =
         create(BindReferences.bindReference(expression, inputSchema))
     
    -  def create(expression: Expression): InterpretedPredicate = new InterpretedPredicate(expression)
    +  def create(expression: Expression): InterpretedPredicate = {
    +    val predicate = new InterpretedPredicate(expression)
    +    predicate.initialize(0)
    --- End diff --
    
    The consideration then was the consistency of class InterpretedPredicate, so add initialize method of the class InterpretedPredicate.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81210/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r135717524
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/sources/SimpleTextRelation.scala ---
    @@ -90,6 +90,7 @@ class SimpleTextSource extends TextBasedFileFormat with DataSourceRegister {
             }.reduceOption(expressions.And).getOrElse(Literal(true))
             InterpretedPredicate.create(filterCondition, inputAttributes)
           }
    +      predicate.initialize(0)
    --- End diff --
    
    The predicate here can only be `sources.GreaterThan`. It can't include non-deterministic expressions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18961: [SPARK-21746][SQL]there is an java.lang.IllegalAr...

Posted by heary-cao <gi...@git.apache.org>.
Github user heary-cao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18961#discussion_r134707744
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2029,4 +2029,13 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           testData2.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
           Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
       }
    +
    +  test("SPARK-21746: nondeterministic expressions correctly for filter predicates") {
    +    withTempPath { path =>
    +      val p = path.getAbsolutePath
    +      Seq(1 -> "a").toDF("a", "b").write.partitionBy("a").parquet(p)
    +      val df = spark.read.parquet(p)
    +      checkAnswer(df.filter(rand(10) <= 1.0).select($"a"), Row(1))
    --- End diff --
    
    In addition, I tried to validate the spark 2.0.2 version, and it won't trigger  InterpretedPredicate Exception. 
    spark 2.0.2  _InterpretedPredicate.create_
    ```
      def create(expression: Expression): (InternalRow => Boolean) = {
        expression.foreach {
          case n: Nondeterministic => n.setInitialValues()
          case _ =>
        }
        (r: InternalRow) => expression.eval(r).asInstanceOf[Boolean]
      }
    ```



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18961
  
    Good catch! I think we do need `InterpretedPredicate.initialize`, but I don't think we should call `initialize` in `InterpretedPredicate.create`. Can you check out where we create `InterpretedPredicate` and call `initialize` there?
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org