You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by jliwork <gi...@git.apache.org> on 2017/11/18 01:17:26 UTC

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

GitHub user jliwork opened a pull request:

    https://github.com/apache/spark/pull/19776

    [SPARK-22548][SQL] Incorrect nested AND expression pushed down to JDB…

    …C data source
    
    ## What changes were proposed in this pull request?
    
    Let’s say I have a nested AND expression shown below and p2 can not be pushed down, 
    
    (p1 AND p2) OR p3
    
    In current Spark code, during data source filter translation, (p1 AND p2) is returned as p1 only and p2 is simply lost. This issue occurs with JDBC data source and is similar to SPARK-12218 for Parquet. When we have AND nested below another expression, we should either push both legs or nothing. Note that 1) the current Spark code will always split conjunctive predicate before it determines if a predicate can be pushed down or not; 2) the same translation method is also called by Data Source V2. 
    
    ## How was this patch tested?
    
    Added new unit test cases to JDBCSuite


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jliwork/spark spark-22548

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19776.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19776
    
----
commit 58de88c21210d469b8ef14b1f23764c31ca5651e
Author: Jia Li <ji...@us.ibm.com>
Date:   2017-11-18T01:15:01Z

    [SPARK-22548][SQL] Incorrect nested AND expression pushed down to JDBC data source

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152261544
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
    @@ -296,9 +296,15 @@ class JDBCSuite extends SparkFunSuite
         // The older versions of spark have this kind of bugs in parquet data source.
         val df1 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND NAME != 'mary')")
    --- End diff --
    
    As I leave the comment in https://github.com/apache/spark/pull/10468#discussion_r151900175, the above test doesn't actually test against SPARK-12218 issue.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152421982
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
    @@ -296,9 +296,15 @@ class JDBCSuite extends SparkFunSuite
         // The older versions of spark have this kind of bugs in parquet data source.
         val df1 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND NAME != 'mary')")
    --- End diff --
    
    Fixed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84086/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84018 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84018/testReport)** for PR 19776 at commit [`635768e`](https://github.com/apache/spark/commit/635768e65542cf66f5db92ccf5088136a55443f5).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152422048
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,307 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans.PlanTest
    +import org.apache.spark.sql.sources
    +import org.apache.spark.sql.test.SharedSQLContext
    +
    +
    +class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
    +
    +  test("translate simple expression") {
    +    val attrInt = 'cint.int
    +    val attrStr = 'cstr.string
    +
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(attrInt, 1))
    +    }
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualNullSafe(attrStr, Literal(null)))
    +    }
    +    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualNullSafe(Literal(null), attrStr))
    +    }
    +
    +    assertResult(Some(sources.GreaterThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThan(attrInt, 1))
    +    }
    +    assertResult(Some(sources.GreaterThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThan(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.LessThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThan(attrInt, 1))
    +    }
    +    assertResult(Some(sources.LessThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThan(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThanOrEqual(attrInt, 1))
    +    }
    +    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(attrInt, 1))
    +    }
    +    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThanOrEqual(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.InSet(attrInt, Set(1, 2, 3)))
    +    }
    +
    +    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.In(attrInt, Seq(1, 2, 3)))
    +    }
    +
    +    assertResult(Some(sources.IsNull("cint"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.IsNull(attrInt))
    +    }
    +    assertResult(Some(sources.IsNotNull("cint"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.IsNotNull(attrInt))
    +    }
    +
    +    assertResult(Some(sources.And(
    +      sources.GreaterThan("cint", 1),
    +      sources.LessThan("cint", 10)))) {
    +      DataSourceStrategy.translateFilter(expressions.And(
    +        expressions.GreaterThan(attrInt, 1),
    +        expressions.LessThan(attrInt, 10)
    +      ))
    +    }
    +
    +    assertResult(Some(sources.Or(
    +      sources.GreaterThanOrEqual("cint", 8),
    +      sources.LessThanOrEqual("cint", 2)))) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.GreaterThanOrEqual(attrInt, 8),
    +        expressions.LessThanOrEqual(attrInt, 2)
    +      ))
    +    }
    +
    +    assertResult(Some(sources.Not(
    +      sources.GreaterThanOrEqual("cint", 8)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Not(expressions.GreaterThanOrEqual(attrInt, 8)
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringStartsWith("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.StartsWith(attrStr, "a"
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringEndsWith("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EndsWith(attrStr, "a"
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringContains("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Contains(attrStr, "a"
    +        ))
    +    }
    +  }
    +
    +  test("translate complex expression") {
    +    val attrInt = 'cint.int
    +    // Functions such as 'Abs' are not supported
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(
    +          expressions.Subtract(expressions.Abs(attrInt), 2), 1))
    --- End diff --
    
    Fixed. Thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84055 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84055/testReport)** for PR 19776 at commit [`fc34568`](https://github.com/apache/spark/commit/fc34568e0e99f8455bc2e3b030401330b9dc430a).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19776


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84047/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84087 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84087/testReport)** for PR 19776 at commit [`7a19ac6`](https://github.com/apache/spark/commit/7a19ac63fcdae6b67ff989ca90d4a3652c7d02f3).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Few comments, otherwise LGTM.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152193833
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,305 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.{sources, QueryTest}
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.test.SharedSQLContext
    +import org.apache.spark.sql.types._
    +
    +
    +class DataSourceStrategySuite extends QueryTest with SharedSQLContext {
    +
    +  test("translate simple expression") {
    +    val attrInt = AttributeReference("cint", IntegerType)()
    +    val attrStr = AttributeReference("cstr", StringType)()
    +
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(attrInt, Literal(1)))
    --- End diff --
    
    Fixed. Thanks!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84018/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152263922
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,307 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    --- End diff --
    
    As you import `expressions._`, I think we can write `EqualTo` instead of `expressions.EqualTo` for catalyst predicates below?
    
    Because you always write `sources.`EqualTo`, I think we don't confuse with them?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152156397
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -497,7 +497,19 @@ object DataSourceStrategy {
             Some(sources.IsNotNull(a.name))
     
           case expressions.And(left, right) =>
    -        (translateFilter(left) ++ translateFilter(right)).reduceOption(sources.And)
    +        // See SPARK-12218 for detailed discussion
    +        // It is not safe to just convert one side if we do not understand the
    +        // other side. Here is an example used to explain the reason.
    +        // Let's say we have (a = 2 AND trim(b) = 'blah') OR (c > 0)
    +        // and we do not understand how to convert trim(b) = 'blah'.
    +        // If we only convert a = 2, we will end up with
    +        // (a = 2) OR (c > 0), which will generate wrong results.
    +        // Pushing one leg of AND down is only safe to do at the top level.
    +        // You can see ParquetFilters' createFilter for more details.
    +        for {
    +          leftFilter <- translateFilter(left)
    +          rightFilter <- translateFilter(right)
    +        } yield sources.And(leftFilter, rightFilter)
    --- End diff --
    
    We do not need to clean up the codes in this PR. Let us minimize the code changes and it can simplify the backport.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r151920212
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -497,7 +497,11 @@ object DataSourceStrategy {
             Some(sources.IsNotNull(a.name))
     
           case expressions.And(left, right) =>
    -        (translateFilter(left) ++ translateFilter(right)).reduceOption(sources.And)
    +        // See SPARK-12218 and PR 10362 for detailed discussion
    --- End diff --
    
    Usually we don't list PR number but just JIRA number is enough.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r151901160
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -497,7 +497,10 @@ object DataSourceStrategy {
             Some(sources.IsNotNull(a.name))
     
           case expressions.And(left, right) =>
    -        (translateFilter(left) ++ translateFilter(right)).reduceOption(sources.And)
    +        for {
    --- End diff --
    
    Sure. Will do. Thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152186116
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
    @@ -296,8 +296,33 @@ class JDBCSuite extends SparkFunSuite
         // The older versions of spark have this kind of bugs in parquet data source.
         val df1 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND NAME != 'mary')")
         val df2 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2) OR NOT (NAME != 'mary')")
    +    val df3 = sql("SELECT * FROM foobar WHERE (THEID > 0 AND NAME = 'mary') OR (NAME = 'fred')")
    +    val df4 = sql("SELECT * FROM foobar " +
    +      "WHERE (THEID > 0 AND TRIM(NAME) = 'mary') OR (NAME = 'fred')")
    +    val df5 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID > 0 AND TRIM(NAME) = 'mary' AND LENGTH(NAME) > 3")
    +    val df6 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID < 0 OR NAME = 'mary' OR NAME = 'fred'")
    +    val df7 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID < 0 OR TRIM(NAME) = 'mary' OR NAME = 'fred'")
    +    val df8 = sql("SELECT * FROM foobar " +
    +      "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR NAME != 'fred'))")
    +    val df9 = sql("SELECT * FROM foobar " +
    +      "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR TRIM(NAME) != 'fred'))")
    +    val df10 = sql("SELECT * FROM foobar " +
    +      "WHERE (NOT(THEID < 0 OR TRIM(NAME) != 'mary')) OR (THEID = 1 AND NAME = 'fred')")
    +
         assert(df1.collect.toSet === Set(Row("mary", 2)))
         assert(df2.collect.toSet === Set(Row("mary", 2)))
    +    assert(df3.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df4.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df5.collect.toSet === Set(Row("mary", 2)))
    +    assert(df6.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df7.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df8.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df9.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df10.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    --- End diff --
    
    I went ahead and added a new ```DataSourceStrategySuite``` to test the ```translateFilter```. Please free feel to let me know of any further comments. Thanks! 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152200223
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,302 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans.PlanTest
    +import org.apache.spark.sql.sources
    +import org.apache.spark.sql.test.SharedSQLContext
    +
    +
    +class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
    +
    +  test("translate simple expression") {
    +    val attrInt = 'cint.int
    +    val attrStr = 'cstr.string
    +
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(attrInt, 1))
    +    }
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualNullSafe(attrStr, Literal(null)))
    +    }
    +    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualNullSafe(Literal(null), attrStr))
    +    }
    +
    +    assertResult(Some(sources.GreaterThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThan(attrInt, 1))
    +    }
    +    assertResult(Some(sources.GreaterThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThan(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.LessThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThan(attrInt, 1))
    +    }
    +    assertResult(Some(sources.LessThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThan(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThanOrEqual(attrInt, 1))
    +    }
    +    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(attrInt, 1))
    +    }
    +    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThanOrEqual(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.InSet(attrInt, Set(1, 2, 3)))
    +    }
    +
    +    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.In(attrInt, Seq(1, 2, 3)))
    +    }
    +
    +    assertResult(Some(sources.IsNull("cint"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.IsNull(attrInt))
    +    }
    +    assertResult(Some(sources.IsNotNull("cint"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.IsNotNull(attrInt))
    +    }
    +
    +    assertResult(Some(sources.And(
    +      sources.GreaterThan("cint", 1),
    +      sources.LessThan("cint", 10)))) {
    +      DataSourceStrategy.translateFilter(expressions.And(
    +        expressions.GreaterThan(attrInt, 1),
    +        expressions.LessThan(attrInt, 10)
    +      ))
    +    }
    +
    +    assertResult(Some(sources.Or(
    +      sources.GreaterThanOrEqual("cint", 8),
    +      sources.LessThanOrEqual("cint", 2)))) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.GreaterThanOrEqual(attrInt, 8),
    +        expressions.LessThanOrEqual(attrInt, 2)
    +      ))
    +    }
    +
    +    assertResult(Some(sources.Not(
    +      sources.GreaterThanOrEqual("cint", 8)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Not(expressions.GreaterThanOrEqual(attrInt, 8)
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringStartsWith("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.StartsWith(attrStr, "a"
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringEndsWith("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EndsWith(attrStr, "a"
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringContains("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Contains(attrStr, "a"
    +        ))
    +    }
    +  }
    +
    +  test("translate complex expression") {
    +    val attrInt = 'cint.int
    +
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(
    +          expressions.Subtract(expressions.Abs(attrInt), 2), 1))
    +    }
    +
    +    assertResult(Some(sources.Or(
    +      sources.And(
    +        sources.GreaterThan("cint", 1),
    +        sources.LessThan("cint", 10)),
    +      sources.And(
    +        sources.GreaterThan("cint", 50),
    +        sources.LessThan("cint", 100))))) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 1),
    +          expressions.LessThan(attrInt, 10)
    +        ),
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 50),
    +          expressions.LessThan(attrInt, 100)
    +        )
    +      ))
    +    }
    +    // SPARK-22548 Incorrect nested AND expression pushed down to JDBC data source
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 1),
    +          expressions.LessThan(
    +            expressions.Abs(attrInt), 10)
    +        ),
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 50),
    +          expressions.LessThan(attrInt, 100)
    +        )
    +      ))
    +    }
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Not(expressions.And(
    +          expressions.Or(
    +            expressions.LessThanOrEqual(attrInt, 1),
    +            expressions.GreaterThanOrEqual(
    +              expressions.Abs(attrInt),
    +              10)
    +          ),
    +          expressions.Or(
    +            expressions.LessThanOrEqual(attrInt, 50),
    +            expressions.GreaterThanOrEqual(attrInt, 100)
    +          )
    +        )))
    +    }
    +
    +    assertResult(Some(sources.Or(
    +      sources.Or(
    +        sources.EqualTo("cint", 1),
    +        sources.EqualTo("cint", 10)),
    +      sources.Or(
    +        sources.GreaterThan("cint", 0),
    +        sources.LessThan("cint", -10))))) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.Or(
    +          expressions.EqualTo(attrInt, 1),
    +          expressions.EqualTo(attrInt, 10)
    +        ),
    +        expressions.Or(
    +          expressions.GreaterThan(attrInt, 0),
    +          expressions.LessThan(attrInt, -10)
    +        )
    +      ))
    +    }
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.Or(
    +          expressions.EqualTo(attrInt, 1),
    +          expressions.EqualTo(
    +            expressions.Abs(attrInt), 10)
    +        ),
    +        expressions.Or(
    +          expressions.GreaterThan(attrInt, 0),
    +          expressions.LessThan(attrInt, -10)
    +        )
    +      ))
    +    }
    +
    +    assertResult(Some(sources.And(
    +      sources.And(
    +        sources.GreaterThan("cint", 1),
    +        sources.LessThan("cint", 10)),
    +      sources.And(
    +        sources.EqualTo("cint", 6),
    +        sources.IsNotNull("cint"))))) {
    +      DataSourceStrategy.translateFilter(expressions.And(
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 1),
    +          expressions.LessThan(attrInt, 10)
    +        ),
    +        expressions.And(
    +          expressions.EqualTo(attrInt, 6),
    +          expressions.IsNotNull(attrInt)
    +        )
    +      ))
    +    }
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(expressions.And(
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 1),
    +          expressions.LessThan(attrInt, 10)
    +        ),
    +        expressions.And(
    +          expressions.EqualTo(expressions.Abs(attrInt), 6),
    +          expressions.IsNotNull(attrInt)
    +        )
    +      ))
    +    }
    +
    +    assertResult(Some(sources.And(
    +      sources.Or(
    +        sources.GreaterThan("cint", 1),
    +        sources.LessThan("cint", 10)),
    +      sources.Or(
    +        sources.EqualTo("cint", 6),
    +        sources.IsNotNull("cint"))))) {
    +      DataSourceStrategy.translateFilter(expressions.And(
    +        expressions.Or(
    +          expressions.GreaterThan(attrInt, 1),
    +          expressions.LessThan(attrInt, 10)
    +        ),
    +        expressions.Or(
    +          expressions.EqualTo(attrInt, 6),
    +          expressions.IsNotNull(attrInt)
    +        )
    +      ))
    +    }
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(expressions.And(
    +        expressions.Or(
    +          expressions.GreaterThan(attrInt, 1),
    +          expressions.LessThan(attrInt, 10)
    +        ),
    +        expressions.Or(
    +          expressions.EqualTo(expressions.Abs(attrInt), 6),
    --- End diff --
    
    ditto


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    @viirya I'm fine with backport to 2.2 unless anyone objects.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152192018
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,306 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans.PlanTest
    +import org.apache.spark.sql.sources
    +import org.apache.spark.sql.test.SharedSQLContext
    +import org.apache.spark.sql.types._
    +
    +
    +class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
    +
    +  test("translate simple expression") {
    +    val attrInt = AttributeReference("cint", IntegerType)()
    +    val attrStr = AttributeReference("cstr", StringType)()
    --- End diff --
    
    ```Scala
    import org.apache.spark.sql.catalyst.dsl.expressions._
    ```
    
    You can simplify your test cases. 
    
    ```Scala
        val attrInt = 'cint.int
        val attrStr = 'cstr.string
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84017/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152260557
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -497,7 +497,19 @@ object DataSourceStrategy {
             Some(sources.IsNotNull(a.name))
     
           case expressions.And(left, right) =>
    -        (translateFilter(left) ++ translateFilter(right)).reduceOption(sources.And)
    +        // See SPARK-12218 for detailed discussion
    +        // It is not safe to just convert one side if we do not understand the
    +        // other side. Here is an example used to explain the reason.
    +        // Let's say we have (a = 2 AND trim(b) = 'blah') OR (c > 0)
    +        // and we do not understand how to convert trim(b) = 'blah'.
    +        // If we only convert a = 2, we will end up with
    +        // (a = 2) OR (c > 0), which will generate wrong results.
    +        // Pushing one leg of AND down is only safe to do at the top level.
    +        // You can see ParquetFilters' createFilter for more details.
    +        for {
    +          leftFilter <- translateFilter(left)
    +          rightFilter <- translateFilter(right)
    +        } yield sources.And(leftFilter, rightFilter)
    --- End diff --
    
    Although Catalyst predicate expressions are all converted to `sources.Filter` when we try to push down them. Not all convertible filters can be handled by Parquet and ORC. So I think we still can face the case only one sub-filter of `AND` can be pushed down by the file format.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84053/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84019 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84019/testReport)** for PR 19776 at commit [`3bb7d3c`](https://github.com/apache/spark/commit/3bb7d3cc83d660e1fb2aefd9681243fb17ed18fa).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    LGTM otherwise too.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84087/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152200107
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,302 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans.PlanTest
    +import org.apache.spark.sql.sources
    +import org.apache.spark.sql.test.SharedSQLContext
    +
    +
    +class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
    +
    +  test("translate simple expression") {
    +    val attrInt = 'cint.int
    +    val attrStr = 'cstr.string
    +
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(attrInt, 1))
    +    }
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualNullSafe(attrStr, Literal(null)))
    +    }
    +    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualNullSafe(Literal(null), attrStr))
    +    }
    +
    +    assertResult(Some(sources.GreaterThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThan(attrInt, 1))
    +    }
    +    assertResult(Some(sources.GreaterThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThan(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.LessThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThan(attrInt, 1))
    +    }
    +    assertResult(Some(sources.LessThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThan(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThanOrEqual(attrInt, 1))
    +    }
    +    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(attrInt, 1))
    +    }
    +    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThanOrEqual(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.InSet(attrInt, Set(1, 2, 3)))
    +    }
    +
    +    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.In(attrInt, Seq(1, 2, 3)))
    +    }
    +
    +    assertResult(Some(sources.IsNull("cint"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.IsNull(attrInt))
    +    }
    +    assertResult(Some(sources.IsNotNull("cint"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.IsNotNull(attrInt))
    +    }
    +
    +    assertResult(Some(sources.And(
    +      sources.GreaterThan("cint", 1),
    +      sources.LessThan("cint", 10)))) {
    +      DataSourceStrategy.translateFilter(expressions.And(
    +        expressions.GreaterThan(attrInt, 1),
    +        expressions.LessThan(attrInt, 10)
    +      ))
    +    }
    +
    +    assertResult(Some(sources.Or(
    +      sources.GreaterThanOrEqual("cint", 8),
    +      sources.LessThanOrEqual("cint", 2)))) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.GreaterThanOrEqual(attrInt, 8),
    +        expressions.LessThanOrEqual(attrInt, 2)
    +      ))
    +    }
    +
    +    assertResult(Some(sources.Not(
    +      sources.GreaterThanOrEqual("cint", 8)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Not(expressions.GreaterThanOrEqual(attrInt, 8)
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringStartsWith("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.StartsWith(attrStr, "a"
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringEndsWith("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EndsWith(attrStr, "a"
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringContains("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Contains(attrStr, "a"
    +        ))
    +    }
    +  }
    +
    +  test("translate complex expression") {
    +    val attrInt = 'cint.int
    +
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(
    +          expressions.Subtract(expressions.Abs(attrInt), 2), 1))
    +    }
    +
    +    assertResult(Some(sources.Or(
    +      sources.And(
    +        sources.GreaterThan("cint", 1),
    +        sources.LessThan("cint", 10)),
    +      sources.And(
    +        sources.GreaterThan("cint", 50),
    +        sources.LessThan("cint", 100))))) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 1),
    +          expressions.LessThan(attrInt, 10)
    +        ),
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 50),
    +          expressions.LessThan(attrInt, 100)
    +        )
    +      ))
    +    }
    +    // SPARK-22548 Incorrect nested AND expression pushed down to JDBC data source
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 1),
    +          expressions.LessThan(
    +            expressions.Abs(attrInt), 10)
    --- End diff --
    
    ditto


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Thanks for everyone's comments! I have polished the test cases. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    @jliwork Let's see what @cloud-fan @felixcheung think about it.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    ok to test


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84053 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84053/testReport)** for PR 19776 at commit [`ba06181`](https://github.com/apache/spark/commit/ba061815352b617b129a46c6482b27c111cba88c).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `class DataSourceStrategySuite extends PlanTest with SharedSQLContext `


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152421773
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,307 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    --- End diff --
    
    Thanks for the suggestion. Fixed. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84017 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84017/testReport)** for PR 19776 at commit [`e540790`](https://github.com/apache/spark/commit/e5407902c92b3936395ae1d38eba8bf20dfcbb43).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152200166
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,302 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans.PlanTest
    +import org.apache.spark.sql.sources
    +import org.apache.spark.sql.test.SharedSQLContext
    +
    +
    +class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
    +
    +  test("translate simple expression") {
    +    val attrInt = 'cint.int
    +    val attrStr = 'cstr.string
    +
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(attrInt, 1))
    +    }
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualNullSafe(attrStr, Literal(null)))
    +    }
    +    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualNullSafe(Literal(null), attrStr))
    +    }
    +
    +    assertResult(Some(sources.GreaterThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThan(attrInt, 1))
    +    }
    +    assertResult(Some(sources.GreaterThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThan(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.LessThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThan(attrInt, 1))
    +    }
    +    assertResult(Some(sources.LessThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThan(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThanOrEqual(attrInt, 1))
    +    }
    +    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(attrInt, 1))
    +    }
    +    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThanOrEqual(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.InSet(attrInt, Set(1, 2, 3)))
    +    }
    +
    +    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.In(attrInt, Seq(1, 2, 3)))
    +    }
    +
    +    assertResult(Some(sources.IsNull("cint"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.IsNull(attrInt))
    +    }
    +    assertResult(Some(sources.IsNotNull("cint"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.IsNotNull(attrInt))
    +    }
    +
    +    assertResult(Some(sources.And(
    +      sources.GreaterThan("cint", 1),
    +      sources.LessThan("cint", 10)))) {
    +      DataSourceStrategy.translateFilter(expressions.And(
    +        expressions.GreaterThan(attrInt, 1),
    +        expressions.LessThan(attrInt, 10)
    +      ))
    +    }
    +
    +    assertResult(Some(sources.Or(
    +      sources.GreaterThanOrEqual("cint", 8),
    +      sources.LessThanOrEqual("cint", 2)))) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.GreaterThanOrEqual(attrInt, 8),
    +        expressions.LessThanOrEqual(attrInt, 2)
    +      ))
    +    }
    +
    +    assertResult(Some(sources.Not(
    +      sources.GreaterThanOrEqual("cint", 8)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Not(expressions.GreaterThanOrEqual(attrInt, 8)
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringStartsWith("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.StartsWith(attrStr, "a"
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringEndsWith("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EndsWith(attrStr, "a"
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringContains("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Contains(attrStr, "a"
    +        ))
    +    }
    +  }
    +
    +  test("translate complex expression") {
    +    val attrInt = 'cint.int
    +
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(
    +          expressions.Subtract(expressions.Abs(attrInt), 2), 1))
    +    }
    +
    +    assertResult(Some(sources.Or(
    +      sources.And(
    +        sources.GreaterThan("cint", 1),
    +        sources.LessThan("cint", 10)),
    +      sources.And(
    +        sources.GreaterThan("cint", 50),
    +        sources.LessThan("cint", 100))))) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 1),
    +          expressions.LessThan(attrInt, 10)
    +        ),
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 50),
    +          expressions.LessThan(attrInt, 100)
    +        )
    +      ))
    +    }
    +    // SPARK-22548 Incorrect nested AND expression pushed down to JDBC data source
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 1),
    +          expressions.LessThan(
    +            expressions.Abs(attrInt), 10)
    +        ),
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 50),
    +          expressions.LessThan(attrInt, 100)
    +        )
    +      ))
    +    }
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Not(expressions.And(
    +          expressions.Or(
    +            expressions.LessThanOrEqual(attrInt, 1),
    +            expressions.GreaterThanOrEqual(
    +              expressions.Abs(attrInt),
    +              10)
    +          ),
    +          expressions.Or(
    +            expressions.LessThanOrEqual(attrInt, 50),
    +            expressions.GreaterThanOrEqual(attrInt, 100)
    +          )
    +        )))
    +    }
    +
    +    assertResult(Some(sources.Or(
    +      sources.Or(
    +        sources.EqualTo("cint", 1),
    +        sources.EqualTo("cint", 10)),
    +      sources.Or(
    +        sources.GreaterThan("cint", 0),
    +        sources.LessThan("cint", -10))))) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.Or(
    +          expressions.EqualTo(attrInt, 1),
    +          expressions.EqualTo(attrInt, 10)
    +        ),
    +        expressions.Or(
    +          expressions.GreaterThan(attrInt, 0),
    +          expressions.LessThan(attrInt, -10)
    +        )
    +      ))
    +    }
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.Or(
    +          expressions.EqualTo(attrInt, 1),
    +          expressions.EqualTo(
    +            expressions.Abs(attrInt), 10)
    --- End diff --
    
    ditto


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84087 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84087/testReport)** for PR 19776 at commit [`7a19ac6`](https://github.com/apache/spark/commit/7a19ac63fcdae6b67ff989ca90d4a3652c7d02f3).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152263135
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,307 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans.PlanTest
    +import org.apache.spark.sql.sources
    +import org.apache.spark.sql.test.SharedSQLContext
    +
    +
    +class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
    +
    +  test("translate simple expression") {
    +    val attrInt = 'cint.int
    +    val attrStr = 'cstr.string
    +
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(attrInt, 1))
    +    }
    --- End diff --
    
    Looks like we can have a small helper function:
    
    ```scala
    def testTranslateFilter(catalystFilter: Expression, result: Option[sources.Filter]): Unit = {
      assertResult(result) {
        DataSourceStrategy.translateFilter(catalystFilter)
      }
    }
    ```
    
    So the tests can be rewritten as:
    ```scala
    testTranslateFilter(expressions.EqualTo(attrInt, 1), Some(sources.EqualTo("cint", 1)))
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    @viirya Thanks for letting me know, Simon. I've fixed the title. Can someone please help trigger the tests please? 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152193847
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,306 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans.PlanTest
    +import org.apache.spark.sql.sources
    +import org.apache.spark.sql.test.SharedSQLContext
    +import org.apache.spark.sql.types._
    +
    +
    +class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
    +
    +  test("translate simple expression") {
    +    val attrInt = AttributeReference("cint", IntegerType)()
    +    val attrStr = AttributeReference("cstr", StringType)()
    --- End diff --
    
    Fixed. Thanks!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    @cloud-fan Thank you for your comments! I have updated the test cases as you suggested. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152429647
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,231 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans.PlanTest
    +import org.apache.spark.sql.sources
    +import org.apache.spark.sql.test.SharedSQLContext
    +
    +class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
    +
    +  test("translate simple expression") {
    +    val attrInt = 'cint.int
    +    val attrStr = 'cstr.string
    +
    +    testTranslateFilter(EqualTo(attrInt, 1), Some(sources.EqualTo("cint", 1)))
    +    testTranslateFilter(EqualTo(1, attrInt), Some(sources.EqualTo("cint", 1)))
    +
    +    testTranslateFilter(EqualNullSafe(attrStr, Literal(null)),
    +      Some(sources.EqualNullSafe("cstr", null)))
    +    testTranslateFilter(EqualNullSafe(Literal(null), attrStr),
    +      Some(sources.EqualNullSafe("cstr", null)))
    +
    +    testTranslateFilter(GreaterThan(attrInt, 1), Some(sources.GreaterThan("cint", 1)))
    +    testTranslateFilter(GreaterThan(1, attrInt), Some(sources.LessThan("cint", 1)))
    +
    +    testTranslateFilter(LessThan(attrInt, 1), Some(sources.LessThan("cint", 1)))
    +    testTranslateFilter(LessThan(1, attrInt), Some(sources.GreaterThan("cint", 1)))
    +
    +    testTranslateFilter(GreaterThanOrEqual(attrInt, 1), Some(sources.GreaterThanOrEqual("cint", 1)))
    +    testTranslateFilter(GreaterThanOrEqual(1, attrInt), Some(sources.LessThanOrEqual("cint", 1)))
    +
    +    testTranslateFilter(LessThanOrEqual(attrInt, 1), Some(sources.LessThanOrEqual("cint", 1)))
    +    testTranslateFilter(LessThanOrEqual(1, attrInt), Some(sources.GreaterThanOrEqual("cint", 1)))
    +
    +    testTranslateFilter(InSet(attrInt, Set(1, 2, 3)), Some(sources.In("cint", Array(1, 2, 3))))
    +
    +    testTranslateFilter(In(attrInt, Seq(1, 2, 3)), Some(sources.In("cint", Array(1, 2, 3))))
    +
    +    testTranslateFilter(IsNull(attrInt), Some(sources.IsNull("cint")))
    +    testTranslateFilter(IsNotNull(attrInt), Some(sources.IsNotNull("cint")))
    +
    +    // cint > 1 AND cint < 10
    +    testTranslateFilter(And(
    +      GreaterThan(attrInt, 1),
    +      LessThan(attrInt, 10)),
    +      Some(sources.And(
    +        sources.GreaterThan("cint", 1),
    +        sources.LessThan("cint", 10))))
    +
    +    // cint >= 8 OR cint <= 2
    +    testTranslateFilter(Or(
    +      GreaterThanOrEqual(attrInt, 8),
    +      LessThanOrEqual(attrInt, 2)),
    +      Some(sources.Or(
    +        sources.GreaterThanOrEqual("cint", 8),
    +        sources.LessThanOrEqual("cint", 2))))
    +
    +    testTranslateFilter(Not(GreaterThanOrEqual(attrInt, 8)),
    +      Some(sources.Not(sources.GreaterThanOrEqual("cint", 8))))
    +
    +    testTranslateFilter(StartsWith(attrStr, "a"), Some(sources.StringStartsWith("cstr", "a")))
    +
    +    testTranslateFilter(EndsWith(attrStr, "a"), Some(sources.StringEndsWith("cstr", "a")))
    +
    +    testTranslateFilter(Contains(attrStr, "a"), Some(sources.StringContains("cstr", "a")))
    +  }
    +
    +  test("translate complex expression") {
    +    val attrInt = 'cint.int
    +
    +    // ABS(cint) - 2 = 1
    +    testTranslateFilter(LessThanOrEqual(
    +      // Expressions are not supported
    --- End diff --
    
    good catch @_@ fixed the typo. Thanks!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152200310
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,302 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans.PlanTest
    +import org.apache.spark.sql.sources
    +import org.apache.spark.sql.test.SharedSQLContext
    +
    +
    +class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
    +
    +  test("translate simple expression") {
    +    val attrInt = 'cint.int
    +    val attrStr = 'cstr.string
    +
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(attrInt, 1))
    +    }
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualNullSafe(attrStr, Literal(null)))
    +    }
    +    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualNullSafe(Literal(null), attrStr))
    +    }
    +
    +    assertResult(Some(sources.GreaterThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThan(attrInt, 1))
    +    }
    +    assertResult(Some(sources.GreaterThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThan(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.LessThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThan(attrInt, 1))
    +    }
    +    assertResult(Some(sources.LessThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThan(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThanOrEqual(attrInt, 1))
    +    }
    +    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(attrInt, 1))
    +    }
    +    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThanOrEqual(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.InSet(attrInt, Set(1, 2, 3)))
    +    }
    +
    +    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.In(attrInt, Seq(1, 2, 3)))
    +    }
    +
    +    assertResult(Some(sources.IsNull("cint"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.IsNull(attrInt))
    +    }
    +    assertResult(Some(sources.IsNotNull("cint"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.IsNotNull(attrInt))
    +    }
    +
    +    assertResult(Some(sources.And(
    +      sources.GreaterThan("cint", 1),
    +      sources.LessThan("cint", 10)))) {
    +      DataSourceStrategy.translateFilter(expressions.And(
    +        expressions.GreaterThan(attrInt, 1),
    +        expressions.LessThan(attrInt, 10)
    +      ))
    +    }
    +
    +    assertResult(Some(sources.Or(
    +      sources.GreaterThanOrEqual("cint", 8),
    +      sources.LessThanOrEqual("cint", 2)))) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.GreaterThanOrEqual(attrInt, 8),
    +        expressions.LessThanOrEqual(attrInt, 2)
    +      ))
    +    }
    +
    +    assertResult(Some(sources.Not(
    +      sources.GreaterThanOrEqual("cint", 8)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Not(expressions.GreaterThanOrEqual(attrInt, 8)
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringStartsWith("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.StartsWith(attrStr, "a"
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringEndsWith("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EndsWith(attrStr, "a"
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringContains("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Contains(attrStr, "a"
    +        ))
    +    }
    +  }
    +
    +  test("translate complex expression") {
    +    val attrInt = 'cint.int
    +
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(
    +          expressions.Subtract(expressions.Abs(attrInt), 2), 1))
    +    }
    +
    +    assertResult(Some(sources.Or(
    +      sources.And(
    +        sources.GreaterThan("cint", 1),
    +        sources.LessThan("cint", 10)),
    +      sources.And(
    +        sources.GreaterThan("cint", 50),
    +        sources.LessThan("cint", 100))))) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 1),
    +          expressions.LessThan(attrInt, 10)
    +        ),
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 50),
    +          expressions.LessThan(attrInt, 100)
    +        )
    +      ))
    +    }
    +    // SPARK-22548 Incorrect nested AND expression pushed down to JDBC data source
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 1),
    +          expressions.LessThan(
    +            expressions.Abs(attrInt), 10)
    +        ),
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 50),
    +          expressions.LessThan(attrInt, 100)
    +        )
    +      ))
    +    }
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Not(expressions.And(
    +          expressions.Or(
    +            expressions.LessThanOrEqual(attrInt, 1),
    +            expressions.GreaterThanOrEqual(
    +              expressions.Abs(attrInt),
    +              10)
    +          ),
    +          expressions.Or(
    +            expressions.LessThanOrEqual(attrInt, 50),
    +            expressions.GreaterThanOrEqual(attrInt, 100)
    +          )
    +        )))
    +    }
    +
    +    assertResult(Some(sources.Or(
    +      sources.Or(
    +        sources.EqualTo("cint", 1),
    +        sources.EqualTo("cint", 10)),
    +      sources.Or(
    +        sources.GreaterThan("cint", 0),
    +        sources.LessThan("cint", -10))))) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.Or(
    +          expressions.EqualTo(attrInt, 1),
    +          expressions.EqualTo(attrInt, 10)
    +        ),
    +        expressions.Or(
    +          expressions.GreaterThan(attrInt, 0),
    +          expressions.LessThan(attrInt, -10)
    +        )
    +      ))
    +    }
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.Or(
    +          expressions.EqualTo(attrInt, 1),
    +          expressions.EqualTo(
    +            expressions.Abs(attrInt), 10)
    +        ),
    +        expressions.Or(
    +          expressions.GreaterThan(attrInt, 0),
    +          expressions.LessThan(attrInt, -10)
    +        )
    +      ))
    +    }
    +
    +    assertResult(Some(sources.And(
    +      sources.And(
    +        sources.GreaterThan("cint", 1),
    +        sources.LessThan("cint", 10)),
    +      sources.And(
    +        sources.EqualTo("cint", 6),
    +        sources.IsNotNull("cint"))))) {
    +      DataSourceStrategy.translateFilter(expressions.And(
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 1),
    +          expressions.LessThan(attrInt, 10)
    +        ),
    +        expressions.And(
    +          expressions.EqualTo(attrInt, 6),
    +          expressions.IsNotNull(attrInt)
    +        )
    +      ))
    +    }
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(expressions.And(
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 1),
    +          expressions.LessThan(attrInt, 10)
    +        ),
    +        expressions.And(
    +          expressions.EqualTo(expressions.Abs(attrInt), 6),
    --- End diff --
    
    ditto


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r151917654
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -497,7 +497,11 @@ object DataSourceStrategy {
             Some(sources.IsNotNull(a.name))
     
           case expressions.And(left, right) =>
    -        (translateFilter(left) ++ translateFilter(right)).reduceOption(sources.And)
    +        // See SPARK-12218 and PR 10362 for detailed discussion
    --- End diff --
    
    In the comment, you need to give an example to explain why. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84017 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84017/testReport)** for PR 19776 at commit [`e540790`](https://github.com/apache/spark/commit/e5407902c92b3936395ae1d38eba8bf20dfcbb43).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Looks like a bug and be there for a long while. cc @cloud-fan @HyukjinKwon can you help trigger the test? Thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    @jliwork Can you fix the PR title? The title is cut when pasting on.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152421950
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,307 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans.PlanTest
    +import org.apache.spark.sql.sources
    +import org.apache.spark.sql.test.SharedSQLContext
    +
    +
    +class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
    +
    +  test("translate simple expression") {
    +    val attrInt = 'cint.int
    +    val attrStr = 'cstr.string
    +
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(attrInt, 1))
    +    }
    --- End diff --
    
    Thanks! I've followed your suggestion and the test suite looks cleaner now. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84086 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84086/testReport)** for PR 19776 at commit [`a0b3d4e`](https://github.com/apache/spark/commit/a0b3d4e990cd7024b532593bca321499001fc89b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84019 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84019/testReport)** for PR 19776 at commit [`3bb7d3c`](https://github.com/apache/spark/commit/3bb7d3cc83d660e1fb2aefd9681243fb17ed18fa).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84010/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152200090
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,302 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans.PlanTest
    +import org.apache.spark.sql.sources
    +import org.apache.spark.sql.test.SharedSQLContext
    +
    +
    +class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
    +
    +  test("translate simple expression") {
    +    val attrInt = 'cint.int
    +    val attrStr = 'cstr.string
    +
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(attrInt, 1))
    +    }
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualNullSafe(attrStr, Literal(null)))
    +    }
    +    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualNullSafe(Literal(null), attrStr))
    +    }
    +
    +    assertResult(Some(sources.GreaterThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThan(attrInt, 1))
    +    }
    +    assertResult(Some(sources.GreaterThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThan(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.LessThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThan(attrInt, 1))
    +    }
    +    assertResult(Some(sources.LessThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThan(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThanOrEqual(attrInt, 1))
    +    }
    +    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(attrInt, 1))
    +    }
    +    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThanOrEqual(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.InSet(attrInt, Set(1, 2, 3)))
    +    }
    +
    +    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.In(attrInt, Seq(1, 2, 3)))
    +    }
    +
    +    assertResult(Some(sources.IsNull("cint"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.IsNull(attrInt))
    +    }
    +    assertResult(Some(sources.IsNotNull("cint"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.IsNotNull(attrInt))
    +    }
    +
    +    assertResult(Some(sources.And(
    +      sources.GreaterThan("cint", 1),
    +      sources.LessThan("cint", 10)))) {
    +      DataSourceStrategy.translateFilter(expressions.And(
    +        expressions.GreaterThan(attrInt, 1),
    +        expressions.LessThan(attrInt, 10)
    +      ))
    +    }
    +
    +    assertResult(Some(sources.Or(
    +      sources.GreaterThanOrEqual("cint", 8),
    +      sources.LessThanOrEqual("cint", 2)))) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.GreaterThanOrEqual(attrInt, 8),
    +        expressions.LessThanOrEqual(attrInt, 2)
    +      ))
    +    }
    +
    +    assertResult(Some(sources.Not(
    +      sources.GreaterThanOrEqual("cint", 8)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Not(expressions.GreaterThanOrEqual(attrInt, 8)
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringStartsWith("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.StartsWith(attrStr, "a"
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringEndsWith("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EndsWith(attrStr, "a"
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringContains("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Contains(attrStr, "a"
    +        ))
    +    }
    +  }
    +
    +  test("translate complex expression") {
    +    val attrInt = 'cint.int
    +
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(
    +          expressions.Subtract(expressions.Abs(attrInt), 2), 1))
    --- End diff --
    
    would be better to add a comment to say that `abs` is not supported


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152137096
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
    @@ -296,8 +296,33 @@ class JDBCSuite extends SparkFunSuite
         // The older versions of spark have this kind of bugs in parquet data source.
         val df1 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND NAME != 'mary')")
         val df2 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2) OR NOT (NAME != 'mary')")
    +    val df3 = sql("SELECT * FROM foobar WHERE (THEID > 0 AND NAME = 'mary') OR (NAME = 'fred')")
    +    val df4 = sql("SELECT * FROM foobar " +
    +      "WHERE (THEID > 0 AND TRIM(NAME) = 'mary') OR (NAME = 'fred')")
    +    val df5 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID > 0 AND TRIM(NAME) = 'mary' AND LENGTH(NAME) > 3")
    +    val df6 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID < 0 OR NAME = 'mary' OR NAME = 'fred'")
    +    val df7 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID < 0 OR TRIM(NAME) = 'mary' OR NAME = 'fred'")
    +    val df8 = sql("SELECT * FROM foobar " +
    +      "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR NAME != 'fred'))")
    +    val df9 = sql("SELECT * FROM foobar " +
    +      "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR TRIM(NAME) != 'fred'))")
    +    val df10 = sql("SELECT * FROM foobar " +
    +      "WHERE (NOT(THEID < 0 OR TRIM(NAME) != 'mary')) OR (THEID = 1 AND NAME = 'fred')")
    +
         assert(df1.collect.toSet === Set(Row("mary", 2)))
         assert(df2.collect.toSet === Set(Row("mary", 2)))
    +    assert(df3.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df4.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df5.collect.toSet === Set(Row("mary", 2)))
    +    assert(df6.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df7.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df8.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df9.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df10.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    --- End diff --
    
    Sure. I can help.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r151900595
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -497,7 +497,10 @@ object DataSourceStrategy {
             Some(sources.IsNotNull(a.name))
     
           case expressions.And(left, right) =>
    -        (translateFilter(left) ++ translateFilter(right)).reduceOption(sources.And)
    +        for {
    --- End diff --
    
    Let's add a small comment like the PR you pointed out. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r151962055
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
    @@ -296,8 +296,33 @@ class JDBCSuite extends SparkFunSuite
         // The older versions of spark have this kind of bugs in parquet data source.
         val df1 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND NAME != 'mary')")
         val df2 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2) OR NOT (NAME != 'mary')")
    +    val df3 = sql("SELECT * FROM foobar WHERE (THEID > 0 AND NAME = 'mary') OR (NAME = 'fred')")
    +    val df4 = sql("SELECT * FROM foobar " +
    +      "WHERE (THEID > 0 AND TRIM(NAME) = 'mary') OR (NAME = 'fred')")
    +    val df5 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID > 0 AND TRIM(NAME) = 'mary' AND LENGTH(NAME) > 3")
    +    val df6 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID < 0 OR NAME = 'mary' OR NAME = 'fred'")
    +    val df7 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID < 0 OR TRIM(NAME) = 'mary' OR NAME = 'fred'")
    +    val df8 = sql("SELECT * FROM foobar " +
    +      "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR NAME != 'fred'))")
    +    val df9 = sql("SELECT * FROM foobar " +
    +      "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR TRIM(NAME) != 'fred'))")
    +    val df10 = sql("SELECT * FROM foobar " +
    +      "WHERE (NOT(THEID < 0 OR TRIM(NAME) != 'mary')) OR (THEID = 1 AND NAME = 'fred')")
    +
         assert(df1.collect.toSet === Set(Row("mary", 2)))
         assert(df2.collect.toSet === Set(Row("mary", 2)))
    +    assert(df3.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df4.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df5.collect.toSet === Set(Row("mary", 2)))
    +    assert(df6.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df7.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df8.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df9.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df10.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    --- End diff --
    
    I'd like to create a new `DataSourceStrategySuite` to test the `translateFilter`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84086 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84086/testReport)** for PR 19776 at commit [`a0b3d4e`](https://github.com/apache/spark/commit/a0b3d4e990cd7024b532593bca321499001fc89b).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84010 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84010/testReport)** for PR 19776 at commit [`58de88c`](https://github.com/apache/spark/commit/58de88c21210d469b8ef14b1f23764c31ca5651e).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84047 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84047/testReport)** for PR 19776 at commit [`0aebdfb`](https://github.com/apache/spark/commit/0aebdfb9e823cc6cf062a41866040b7f37d1c7a4).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `class DataSourceStrategySuite extends QueryTest with SharedSQLContext `


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152191430
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,305 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.{sources, QueryTest}
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.test.SharedSQLContext
    +import org.apache.spark.sql.types._
    +
    +
    +class DataSourceStrategySuite extends QueryTest with SharedSQLContext {
    --- End diff --
    
    Fixed. Thanks!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    LGTM except a few minor comments


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r151960214
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -497,7 +497,19 @@ object DataSourceStrategy {
             Some(sources.IsNotNull(a.name))
     
           case expressions.And(left, right) =>
    -        (translateFilter(left) ++ translateFilter(right)).reduceOption(sources.And)
    +        // See SPARK-12218 for detailed discussion
    +        // It is not safe to just convert one side if we do not understand the
    +        // other side. Here is an example used to explain the reason.
    +        // Let's say we have (a = 2 AND trim(b) = 'blah') OR (c > 0)
    +        // and we do not understand how to convert trim(b) = 'blah'.
    +        // If we only convert a = 2, we will end up with
    +        // (a = 2) OR (c > 0), which will generate wrong results.
    +        // Pushing one leg of AND down is only safe to do at the top level.
    +        // You can see ParquetFilters' createFilter for more details.
    +        for {
    +          leftFilter <- translateFilter(left)
    +          rightFilter <- translateFilter(right)
    +        } yield sources.And(leftFilter, rightFilter)
    --- End diff --
    
    do we still need SPARK-12218 after this?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84053 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84053/testReport)** for PR 19776 at commit [`ba06181`](https://github.com/apache/spark/commit/ba061815352b617b129a46c6482b27c111cba88c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84063 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84063/testReport)** for PR 19776 at commit [`0cbb528`](https://github.com/apache/spark/commit/0cbb528d974167bfe3740d10e15f784dcb40f2f1).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r151916997
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -497,7 +497,10 @@ object DataSourceStrategy {
             Some(sources.IsNotNull(a.name))
     
           case expressions.And(left, right) =>
    -        (translateFilter(left) ++ translateFilter(right)).reduceOption(sources.And)
    +        for {
    --- End diff --
    
    Yeah. Follow what @yhuai  wrote in the PR https://github.com/apache/spark/pull/10362/files


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152425778
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,231 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans.PlanTest
    +import org.apache.spark.sql.sources
    +import org.apache.spark.sql.test.SharedSQLContext
    +
    +class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
    +
    +  test("translate simple expression") {
    +    val attrInt = 'cint.int
    +    val attrStr = 'cstr.string
    +
    +    testTranslateFilter(EqualTo(attrInt, 1), Some(sources.EqualTo("cint", 1)))
    +    testTranslateFilter(EqualTo(1, attrInt), Some(sources.EqualTo("cint", 1)))
    +
    +    testTranslateFilter(EqualNullSafe(attrStr, Literal(null)),
    +      Some(sources.EqualNullSafe("cstr", null)))
    +    testTranslateFilter(EqualNullSafe(Literal(null), attrStr),
    +      Some(sources.EqualNullSafe("cstr", null)))
    +
    +    testTranslateFilter(GreaterThan(attrInt, 1), Some(sources.GreaterThan("cint", 1)))
    +    testTranslateFilter(GreaterThan(1, attrInt), Some(sources.LessThan("cint", 1)))
    +
    +    testTranslateFilter(LessThan(attrInt, 1), Some(sources.LessThan("cint", 1)))
    +    testTranslateFilter(LessThan(1, attrInt), Some(sources.GreaterThan("cint", 1)))
    +
    +    testTranslateFilter(GreaterThanOrEqual(attrInt, 1), Some(sources.GreaterThanOrEqual("cint", 1)))
    +    testTranslateFilter(GreaterThanOrEqual(1, attrInt), Some(sources.LessThanOrEqual("cint", 1)))
    +
    +    testTranslateFilter(LessThanOrEqual(attrInt, 1), Some(sources.LessThanOrEqual("cint", 1)))
    +    testTranslateFilter(LessThanOrEqual(1, attrInt), Some(sources.GreaterThanOrEqual("cint", 1)))
    +
    +    testTranslateFilter(InSet(attrInt, Set(1, 2, 3)), Some(sources.In("cint", Array(1, 2, 3))))
    +
    +    testTranslateFilter(In(attrInt, Seq(1, 2, 3)), Some(sources.In("cint", Array(1, 2, 3))))
    +
    +    testTranslateFilter(IsNull(attrInt), Some(sources.IsNull("cint")))
    +    testTranslateFilter(IsNotNull(attrInt), Some(sources.IsNotNull("cint")))
    +
    +    // cint > 1 AND cint < 10
    +    testTranslateFilter(And(
    +      GreaterThan(attrInt, 1),
    +      LessThan(attrInt, 10)),
    +      Some(sources.And(
    +        sources.GreaterThan("cint", 1),
    +        sources.LessThan("cint", 10))))
    +
    +    // cint >= 8 OR cint <= 2
    +    testTranslateFilter(Or(
    +      GreaterThanOrEqual(attrInt, 8),
    +      LessThanOrEqual(attrInt, 2)),
    +      Some(sources.Or(
    +        sources.GreaterThanOrEqual("cint", 8),
    +        sources.LessThanOrEqual("cint", 2))))
    +
    +    testTranslateFilter(Not(GreaterThanOrEqual(attrInt, 8)),
    +      Some(sources.Not(sources.GreaterThanOrEqual("cint", 8))))
    +
    +    testTranslateFilter(StartsWith(attrStr, "a"), Some(sources.StringStartsWith("cstr", "a")))
    +
    +    testTranslateFilter(EndsWith(attrStr, "a"), Some(sources.StringEndsWith("cstr", "a")))
    +
    +    testTranslateFilter(Contains(attrStr, "a"), Some(sources.StringContains("cstr", "a")))
    +  }
    +
    +  test("translate complex expression") {
    +    val attrInt = 'cint.int
    +
    +    // ABS(cint) - 2 = 1
    +    testTranslateFilter(LessThanOrEqual(
    +      // Expressions are not supported
    --- End diff --
    
    ?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152191659
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,305 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.{sources, QueryTest}
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.test.SharedSQLContext
    +import org.apache.spark.sql.types._
    +
    +
    +class DataSourceStrategySuite extends QueryTest with SharedSQLContext {
    +
    +  test("translate simple expression") {
    +    val attrInt = AttributeReference("cint", IntegerType)()
    +    val attrStr = AttributeReference("cstr", StringType)()
    --- End diff --
    
    ```Scala
    import org.apache.spark.sql.catalyst.dsl.expressions._
    ```
    
    You can simplify your test cases. 
    
    ```Scala
        val attrInt = 'cint.int
        val attrStr = 'cstr.string
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84018 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84018/testReport)** for PR 19776 at commit [`635768e`](https://github.com/apache/spark/commit/635768e65542cf66f5db92ccf5088136a55443f5).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152191858
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,305 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.{sources, QueryTest}
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.test.SharedSQLContext
    +import org.apache.spark.sql.types._
    +
    +
    +class DataSourceStrategySuite extends QueryTest with SharedSQLContext {
    +
    +  test("translate simple expression") {
    +    val attrInt = AttributeReference("cint", IntegerType)()
    +    val attrStr = AttributeReference("cstr", StringType)()
    +
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(attrInt, Literal(1)))
    --- End diff --
    
    No need to call `Literal` here. It will be implicitly casted to Literal
    ```
    expressions.EqualTo(attrInt,1))
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84055 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84055/testReport)** for PR 19776 at commit [`fc34568`](https://github.com/apache/spark/commit/fc34568e0e99f8455bc2e3b030401330b9dc430a).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Thanks! Merged to master/2.2/2.1


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r151920126
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -497,7 +497,11 @@ object DataSourceStrategy {
             Some(sources.IsNotNull(a.name))
     
           case expressions.And(left, right) =>
    -        (translateFilter(left) ++ translateFilter(right)).reduceOption(sources.And)
    +        // See SPARK-12218 and PR 10362 for detailed discussion
    --- End diff --
    
    Sure. I have added more comments there with an example. Thanks, Sean!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84063 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84063/testReport)** for PR 19776 at commit [`0cbb528`](https://github.com/apache/spark/commit/0cbb528d974167bfe3740d10e15f784dcb40f2f1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84019/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84010 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84010/testReport)** for PR 19776 at commit [`58de88c`](https://github.com/apache/spark/commit/58de88c21210d469b8ef14b1f23764c31ca5651e).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84063/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84055/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r151922970
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -497,7 +497,11 @@ object DataSourceStrategy {
             Some(sources.IsNotNull(a.name))
     
           case expressions.And(left, right) =>
    -        (translateFilter(left) ++ translateFilter(right)).reduceOption(sources.And)
    +        // See SPARK-12218 and PR 10362 for detailed discussion
    --- End diff --
    
    @viirya I see. Thanks, Simon! I've removed the PR number from the comment.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152190564
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,305 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.{sources, QueryTest}
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.test.SharedSQLContext
    +import org.apache.spark.sql.types._
    +
    +
    +class DataSourceStrategySuite extends QueryTest with SharedSQLContext {
    --- End diff --
    
    `extends PlanTest`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152156940
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
    @@ -296,8 +296,33 @@ class JDBCSuite extends SparkFunSuite
         // The older versions of spark have this kind of bugs in parquet data source.
         val df1 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND NAME != 'mary')")
         val df2 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2) OR NOT (NAME != 'mary')")
    +    val df3 = sql("SELECT * FROM foobar WHERE (THEID > 0 AND NAME = 'mary') OR (NAME = 'fred')")
    +    val df4 = sql("SELECT * FROM foobar " +
    +      "WHERE (THEID > 0 AND TRIM(NAME) = 'mary') OR (NAME = 'fred')")
    +    val df5 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID > 0 AND TRIM(NAME) = 'mary' AND LENGTH(NAME) > 3")
    +    val df6 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID < 0 OR NAME = 'mary' OR NAME = 'fred'")
    +    val df7 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID < 0 OR TRIM(NAME) = 'mary' OR NAME = 'fred'")
    +    val df8 = sql("SELECT * FROM foobar " +
    +      "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR NAME != 'fred'))")
    +    val df9 = sql("SELECT * FROM foobar " +
    +      "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR TRIM(NAME) != 'fred'))")
    +    val df10 = sql("SELECT * FROM foobar " +
    +      "WHERE (NOT(THEID < 0 OR TRIM(NAME) != 'mary')) OR (THEID = 1 AND NAME = 'fred')")
    +
         assert(df1.collect.toSet === Set(Row("mary", 2)))
         assert(df2.collect.toSet === Set(Row("mary", 2)))
    +    assert(df3.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df4.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df5.collect.toSet === Set(Row("mary", 2)))
    +    assert(df6.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df7.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df8.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df9.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df10.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    --- End diff --
    
    They are end-to-end test cases. 
    
    If you can, we should also add such a unit test suite. In the future, we can add more unit test cases for verifying more complex cases. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152200130
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,302 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans.PlanTest
    +import org.apache.spark.sql.sources
    +import org.apache.spark.sql.test.SharedSQLContext
    +
    +
    +class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
    +
    +  test("translate simple expression") {
    +    val attrInt = 'cint.int
    +    val attrStr = 'cstr.string
    +
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(attrInt, 1))
    +    }
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualNullSafe(attrStr, Literal(null)))
    +    }
    +    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualNullSafe(Literal(null), attrStr))
    +    }
    +
    +    assertResult(Some(sources.GreaterThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThan(attrInt, 1))
    +    }
    +    assertResult(Some(sources.GreaterThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThan(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.LessThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThan(attrInt, 1))
    +    }
    +    assertResult(Some(sources.LessThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThan(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThanOrEqual(attrInt, 1))
    +    }
    +    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(attrInt, 1))
    +    }
    +    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThanOrEqual(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.InSet(attrInt, Set(1, 2, 3)))
    +    }
    +
    +    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.In(attrInt, Seq(1, 2, 3)))
    +    }
    +
    +    assertResult(Some(sources.IsNull("cint"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.IsNull(attrInt))
    +    }
    +    assertResult(Some(sources.IsNotNull("cint"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.IsNotNull(attrInt))
    +    }
    +
    +    assertResult(Some(sources.And(
    +      sources.GreaterThan("cint", 1),
    +      sources.LessThan("cint", 10)))) {
    +      DataSourceStrategy.translateFilter(expressions.And(
    +        expressions.GreaterThan(attrInt, 1),
    +        expressions.LessThan(attrInt, 10)
    +      ))
    +    }
    +
    +    assertResult(Some(sources.Or(
    +      sources.GreaterThanOrEqual("cint", 8),
    +      sources.LessThanOrEqual("cint", 2)))) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.GreaterThanOrEqual(attrInt, 8),
    +        expressions.LessThanOrEqual(attrInt, 2)
    +      ))
    +    }
    +
    +    assertResult(Some(sources.Not(
    +      sources.GreaterThanOrEqual("cint", 8)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Not(expressions.GreaterThanOrEqual(attrInt, 8)
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringStartsWith("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.StartsWith(attrStr, "a"
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringEndsWith("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EndsWith(attrStr, "a"
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringContains("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Contains(attrStr, "a"
    +        ))
    +    }
    +  }
    +
    +  test("translate complex expression") {
    +    val attrInt = 'cint.int
    +
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(
    +          expressions.Subtract(expressions.Abs(attrInt), 2), 1))
    +    }
    +
    +    assertResult(Some(sources.Or(
    +      sources.And(
    +        sources.GreaterThan("cint", 1),
    +        sources.LessThan("cint", 10)),
    +      sources.And(
    +        sources.GreaterThan("cint", 50),
    +        sources.LessThan("cint", 100))))) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 1),
    +          expressions.LessThan(attrInt, 10)
    +        ),
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 50),
    +          expressions.LessThan(attrInt, 100)
    +        )
    +      ))
    +    }
    +    // SPARK-22548 Incorrect nested AND expression pushed down to JDBC data source
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 1),
    +          expressions.LessThan(
    +            expressions.Abs(attrInt), 10)
    +        ),
    +        expressions.And(
    +          expressions.GreaterThan(attrInt, 50),
    +          expressions.LessThan(attrInt, 100)
    +        )
    +      ))
    +    }
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Not(expressions.And(
    +          expressions.Or(
    +            expressions.LessThanOrEqual(attrInt, 1),
    +            expressions.GreaterThanOrEqual(
    +              expressions.Abs(attrInt),
    --- End diff --
    
    ditto


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    This affects correctness, should we also backport to 2.2?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152200656
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
    @@ -296,8 +296,33 @@ class JDBCSuite extends SparkFunSuite
         // The older versions of spark have this kind of bugs in parquet data source.
         val df1 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND NAME != 'mary')")
         val df2 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2) OR NOT (NAME != 'mary')")
    +    val df3 = sql("SELECT * FROM foobar WHERE (THEID > 0 AND NAME = 'mary') OR (NAME = 'fred')")
    +    val df4 = sql("SELECT * FROM foobar " +
    +      "WHERE (THEID > 0 AND TRIM(NAME) = 'mary') OR (NAME = 'fred')")
    +    val df5 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID > 0 AND TRIM(NAME) = 'mary' AND LENGTH(NAME) > 3")
    +    val df6 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID < 0 OR NAME = 'mary' OR NAME = 'fred'")
    +    val df7 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID < 0 OR TRIM(NAME) = 'mary' OR NAME = 'fred'")
    +    val df8 = sql("SELECT * FROM foobar " +
    +      "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR NAME != 'fred'))")
    +    val df9 = sql("SELECT * FROM foobar " +
    +      "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR TRIM(NAME) != 'fred'))")
    +    val df10 = sql("SELECT * FROM foobar " +
    --- End diff --
    
    why do we need to test so many cases? as an end-to-end test, I think we only need a typical case.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    @gatorsmile @cloud-fan @viirya @HyukjinKwon Thanks a lot! =)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152136470
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -497,7 +497,19 @@ object DataSourceStrategy {
             Some(sources.IsNotNull(a.name))
     
           case expressions.And(left, right) =>
    -        (translateFilter(left) ++ translateFilter(right)).reduceOption(sources.And)
    +        // See SPARK-12218 for detailed discussion
    +        // It is not safe to just convert one side if we do not understand the
    +        // other side. Here is an example used to explain the reason.
    +        // Let's say we have (a = 2 AND trim(b) = 'blah') OR (c > 0)
    +        // and we do not understand how to convert trim(b) = 'blah'.
    +        // If we only convert a = 2, we will end up with
    +        // (a = 2) OR (c > 0), which will generate wrong results.
    +        // Pushing one leg of AND down is only safe to do at the top level.
    +        // You can see ParquetFilters' createFilter for more details.
    +        for {
    +          leftFilter <- translateFilter(left)
    +          rightFilter <- translateFilter(right)
    +        } yield sources.And(leftFilter, rightFilter)
    --- End diff --
    
    I would think so.  SPARK-12218 put fixes into ParquetFilters.createFilter and OrcFilters.createFilter. They're similar to DataSourceStrategy.translateFilter but have different signature customized for Parquet and Orc. For all datasources including JDBC, Parquet, etc, translateFilter is called to determine if a predicate Expression can be pushed down as a Filter or not. Next for Parquet and ORC, Fitlers get mapped to Parquet or ORC specific filters with their own createFilter method. 
    
    So this PR does help all datasources to get the correct set of push down predicates. Without this PR we simply got lucky with Parquet and ORC in terms of result correctness because 1) it looks like we always apply Filter on top of scan; 2) we end up with same number of or more rows returned with one leg missing from AND. 
    
    JDBC data source does not always come with Filter on top of scan therefore exposed the bug. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r151920447
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -497,7 +497,10 @@ object DataSourceStrategy {
             Some(sources.IsNotNull(a.name))
     
           case expressions.And(left, right) =>
    -        (translateFilter(left) ++ translateFilter(right)).reduceOption(sources.And)
    +        for {
    --- End diff --
    
    Thanks. Just did that as you suggested.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152206765
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala ---
    @@ -0,0 +1,307 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.datasources
    +
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.expressions
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans.PlanTest
    +import org.apache.spark.sql.sources
    +import org.apache.spark.sql.test.SharedSQLContext
    +
    +
    +class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
    +
    +  test("translate simple expression") {
    +    val attrInt = 'cint.int
    +    val attrStr = 'cstr.string
    +
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(attrInt, 1))
    +    }
    +    assertResult(Some(sources.EqualTo("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualTo(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualNullSafe(attrStr, Literal(null)))
    +    }
    +    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EqualNullSafe(Literal(null), attrStr))
    +    }
    +
    +    assertResult(Some(sources.GreaterThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThan(attrInt, 1))
    +    }
    +    assertResult(Some(sources.GreaterThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThan(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.LessThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThan(attrInt, 1))
    +    }
    +    assertResult(Some(sources.LessThan("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThan(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThanOrEqual(attrInt, 1))
    +    }
    +    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(attrInt, 1))
    +    }
    +    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.GreaterThanOrEqual(1, attrInt))
    +    }
    +
    +    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.InSet(attrInt, Set(1, 2, 3)))
    +    }
    +
    +    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.In(attrInt, Seq(1, 2, 3)))
    +    }
    +
    +    assertResult(Some(sources.IsNull("cint"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.IsNull(attrInt))
    +    }
    +    assertResult(Some(sources.IsNotNull("cint"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.IsNotNull(attrInt))
    +    }
    +
    +    assertResult(Some(sources.And(
    +      sources.GreaterThan("cint", 1),
    +      sources.LessThan("cint", 10)))) {
    +      DataSourceStrategy.translateFilter(expressions.And(
    +        expressions.GreaterThan(attrInt, 1),
    +        expressions.LessThan(attrInt, 10)
    +      ))
    +    }
    +
    +    assertResult(Some(sources.Or(
    +      sources.GreaterThanOrEqual("cint", 8),
    +      sources.LessThanOrEqual("cint", 2)))) {
    +      DataSourceStrategy.translateFilter(expressions.Or(
    +        expressions.GreaterThanOrEqual(attrInt, 8),
    +        expressions.LessThanOrEqual(attrInt, 2)
    +      ))
    +    }
    +
    +    assertResult(Some(sources.Not(
    +      sources.GreaterThanOrEqual("cint", 8)))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Not(expressions.GreaterThanOrEqual(attrInt, 8)
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringStartsWith("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.StartsWith(attrStr, "a"
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringEndsWith("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.EndsWith(attrStr, "a"
    +        ))
    +    }
    +
    +    assertResult(Some(sources.StringContains("cstr", "a"))) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.Contains(attrStr, "a"
    +        ))
    +    }
    +  }
    +
    +  test("translate complex expression") {
    +    val attrInt = 'cint.int
    +    // Functions such as 'Abs' are not supported
    +    assertResult(None) {
    +      DataSourceStrategy.translateFilter(
    +        expressions.LessThanOrEqual(
    +          expressions.Subtract(expressions.Abs(attrInt), 2), 1))
    --- End diff --
    
    can we move the comment to this line? i.e.
    ```
    // `Abs` expression cannot be pushed down
    expressions.Subtract(expressions.Abs(attrInt), 2), 1))
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    good catch! It's a long-standing bug and I think we should backport it all the way to 2.0


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19776
  
    **[Test build #84047 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84047/testReport)** for PR 19776 at commit [`0aebdfb`](https://github.com/apache/spark/commit/0aebdfb9e823cc6cf062a41866040b7f37d1c7a4).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

Posted by jliwork <gi...@git.apache.org>.
Github user jliwork commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19776#discussion_r152158653
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
    @@ -296,8 +296,33 @@ class JDBCSuite extends SparkFunSuite
         // The older versions of spark have this kind of bugs in parquet data source.
         val df1 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND NAME != 'mary')")
         val df2 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2) OR NOT (NAME != 'mary')")
    +    val df3 = sql("SELECT * FROM foobar WHERE (THEID > 0 AND NAME = 'mary') OR (NAME = 'fred')")
    +    val df4 = sql("SELECT * FROM foobar " +
    +      "WHERE (THEID > 0 AND TRIM(NAME) = 'mary') OR (NAME = 'fred')")
    +    val df5 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID > 0 AND TRIM(NAME) = 'mary' AND LENGTH(NAME) > 3")
    +    val df6 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID < 0 OR NAME = 'mary' OR NAME = 'fred'")
    +    val df7 = sql("SELECT * FROM foobar " +
    +      "WHERE THEID < 0 OR TRIM(NAME) = 'mary' OR NAME = 'fred'")
    +    val df8 = sql("SELECT * FROM foobar " +
    +      "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR NAME != 'fred'))")
    +    val df9 = sql("SELECT * FROM foobar " +
    +      "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR TRIM(NAME) != 'fred'))")
    +    val df10 = sql("SELECT * FROM foobar " +
    +      "WHERE (NOT(THEID < 0 OR TRIM(NAME) != 'mary')) OR (THEID = 1 AND NAME = 'fred')")
    +
         assert(df1.collect.toSet === Set(Row("mary", 2)))
         assert(df2.collect.toSet === Set(Row("mary", 2)))
    +    assert(df3.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df4.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df5.collect.toSet === Set(Row("mary", 2)))
    +    assert(df6.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df7.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df8.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df9.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    +    assert(df10.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
    --- End diff --
    
    @gatorsmile Thanks for comments, Sean =) You mean you'd like to see test cases calling ```DataSourceStrategy.translateFilter``` directly or end-to-end like ```sql("SELECT * FROM foobar WHERE THEID < 1")``` which goes thru the translateFilter code path to verify. I'm fine either way, just want to clarify. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org