You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/09/14 23:36:09 UTC

[GitHub] [spark] huaxingao opened a new pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

huaxingao opened a new pull request #34001:
URL: https://github.com/apache/spark/pull/34001


   Co-Authored-By: DB Tsai d_tsai@apple.com
   Co-Authored-By: Huaxin Gao huaxin_gao@apple.com
   ### What changes were proposed in this pull request?
   This is the 2nd PR for V2 Filter support. These PR does the following:
   
   - Add V1 Filter.toV2 method
   - Add V2Filter.toV1 method
   - Add translateFilter method in DataSourceV2Strategy to use the V2 Filters
   
   ### Why are the changes needed?
   To eliminate the unnecessary conversion between Catalyst types and Scala types when using Filters.
   
   
   ### Does this PR introduce _any_ user-facing change?
   yes
   V1 Filter.toV2 and V2Filter.toV1 methods
   
   
   ### How was this patch tested?
   Added new UT
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920468405


   **[Test build #143326 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143326/testReport)** for PR 34001 at commit [`899875a`](https://github.com/apache/spark/commit/899875a6ad8894e3058b01a5d9bbd6e932df6899).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-922159587


   **[Test build #143429 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143429/testReport)** for PR 34001 at commit [`71d6619`](https://github.com/apache/spark/commit/71d6619aea271f128b3c7abef1adde079dd27f0f).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34001: [SPARK-36760][SQL] Add interface SupportsPushDownV2Filters

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-922228160


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143429/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920369944


   **[Test build #143312 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143312/testReport)** for PR 34001 at commit [`e644eff`](https://github.com/apache/spark/commit/e644eff5e3eda78debce72e74e56f94933e3a7ac).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920144718


   **[Test build #143311 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143311/testReport)** for PR 34001 at commit [`4cbf845`](https://github.com/apache/spark/commit/4cbf845cabe5be6ffd1b0dbab1a12256f6f20520).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920371210


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143312/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #34001: [SPARK-36760][SQL] Add interface SupportsPushDownV2Filters

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34001:
URL: https://github.com/apache/spark/pull/34001#discussion_r713739870



##########
File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsPushDownV2Filters.java
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.read;
+
+import org.apache.spark.annotation.Evolving;
+import org.apache.spark.sql.connector.expressions.filter.Filter;
+
+/**
+ * A mix-in interface for {@link ScanBuilder}. Data sources can implement this interface to
+ * push down filters to the data source and reduce the size of the data to be read.

Review comment:
       We should highlight the difference between this new interface and the old one. We can do this in a followup.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920149652


   **[Test build #143312 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143312/testReport)** for PR 34001 at commit [`e644eff`](https://github.com/apache/spark/commit/e644eff5e3eda78debce72e74e56f94933e3a7ac).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920203158


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47816/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920203158


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47816/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-919587934


   **[Test build #143285 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143285/testReport)** for PR 34001 at commit [`7a90d37`](https://github.com/apache/spark/commit/7a90d37fd21000ac7ba1b276a83a6fc25c336b47).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920203113


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47816/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920382837


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143315/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34001: [SPARK-36760][SQL] Add interface SupportsPushDownV2Filters

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-922228160


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143429/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34001: [SPARK-36760][SQL] Add interface SupportsPushDownV2Filters

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-922166851


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47937/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920468405


   **[Test build #143326 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143326/testReport)** for PR 34001 at commit [`899875a`](https://github.com/apache/spark/commit/899875a6ad8894e3058b01a5d9bbd6e932df6899).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-921660282


   **[Test build #143395 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143395/testReport)** for PR 34001 at commit [`5df37fb`](https://github.com/apache/spark/commit/5df37fbd0fb12120d659e3e33242e252869d9500).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-921664929


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143395/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920362503


   **[Test build #143311 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143311/testReport)** for PR 34001 at commit [`4cbf845`](https://github.com/apache/spark/commit/4cbf845cabe5be6ffd1b0dbab1a12256f6f20520).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920195607


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47814/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920364285


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143311/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920378925


   **[Test build #143315 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143315/testReport)** for PR 34001 at commit [`a704a05`](https://github.com/apache/spark/commit/a704a051d3e64e60343a388e57d541a850153624).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920203770


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47813/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920572187


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143326/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34001:
URL: https://github.com/apache/spark/pull/34001#discussion_r709766470



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala
##########
@@ -261,4 +267,116 @@ object DataSourceUtils extends PredicateHelper {
       dataFilters.flatMap(extractPredicatesWithinOutputSet(_, partitionSet))
     (ExpressionSet(partitionFilters ++ extraPartitionFilter).toSeq, dataFilters)
   }
+
+  def convertV1FilterToV2(v1Filter: sources.Filter): V2Filter = {
+    v1Filter match {
+      case _: sources.AlwaysFalse =>
+        new V2AlwaysFalse
+      case _: sources.AlwaysTrue =>
+        new V2AlwaysTrue
+      case e: sources.EqualNullSafe =>
+        new V2EqualNullSafe(FieldReference(e.attribute), getLiteralValue(e.value))
+      case equal: sources.EqualTo =>
+        new V2EqualTo(FieldReference(equal.attribute), getLiteralValue(equal.value))
+      case g: sources.GreaterThan =>
+        new V2GreaterThan(FieldReference(g.attribute), getLiteralValue(g.value))
+      case ge: sources.GreaterThanOrEqual =>
+        new V2GreaterThanOrEqual(FieldReference(ge.attribute), getLiteralValue(ge.value))
+      case in: sources.In =>
+        new V2In(FieldReference(
+          in.attribute), in.values.map(value => getLiteralValue(value)))
+      case notNull: sources.IsNotNull =>
+        new V2IsNotNull(FieldReference(notNull.attribute))
+      case isNull: sources.IsNull =>
+        new V2IsNull(FieldReference(isNull.attribute))
+      case l: sources.LessThan =>
+        new V2LessThan(FieldReference(l.attribute), getLiteralValue(l.value))
+      case le: sources.LessThanOrEqual =>
+        new V2LessThanOrEqual(FieldReference(le.attribute), getLiteralValue(le.value))
+      case contains: sources.StringContains =>
+        new V2StringContains(
+          FieldReference(contains.attribute), UTF8String.fromString(contains.value))
+      case ends: sources.StringEndsWith =>
+        new V2StringEndsWith(FieldReference(ends.attribute), UTF8String.fromString(ends.value))
+      case starts: sources.StringStartsWith =>
+        new V2StringStartsWith(
+          FieldReference(starts.attribute), UTF8String.fromString(starts.value))
+      case and: sources.And =>
+        new V2And(convertV1FilterToV2(and.left), convertV1FilterToV2(and.right))
+      case or: sources.Or =>
+        new V2Or(convertV1FilterToV2(or.left), convertV1FilterToV2(or.right))
+      case not: sources.Not =>
+        new V2Not(convertV1FilterToV2(not.child))
+      case _ => throw new IllegalStateException("Invalid v1Filter: " + v1Filter)
+    }
+  }
+
+  def getLiteralValue(value: Any): LiteralValue[_] = value match {
+    case _: JavaBigDecimal =>

Review comment:
       According to `DataSourceStrategy.translateLeafNodeFilter`, the value of decimal type can only be `Decimal`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-921664929


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143395/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34001:
URL: https://github.com/apache/spark/pull/34001#discussion_r709342997



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala
##########
@@ -261,4 +267,116 @@ object DataSourceUtils extends PredicateHelper {
       dataFilters.flatMap(extractPredicatesWithinOutputSet(_, partitionSet))
     (ExpressionSet(partitionFilters ++ extraPartitionFilter).toSeq, dataFilters)
   }
+
+  def convertV1FilterToV2(v1Filter: sources.Filter): V2Filter = {
+    v1Filter match {
+      case _: sources.AlwaysFalse =>
+        new V2AlwaysFalse
+      case _: sources.AlwaysTrue =>
+        new V2AlwaysTrue
+      case e: sources.EqualNullSafe =>
+        new V2EqualNullSafe(FieldReference(e.attribute), getLiteralValue(e.value))
+      case equal: sources.EqualTo =>
+        new V2EqualTo(FieldReference(equal.attribute), getLiteralValue(equal.value))
+      case g: sources.GreaterThan =>
+        new V2GreaterThan(FieldReference(g.attribute), getLiteralValue(g.value))
+      case ge: sources.GreaterThanOrEqual =>
+        new V2GreaterThanOrEqual(FieldReference(ge.attribute), getLiteralValue(ge.value))
+      case in: sources.In =>
+        new V2In(FieldReference(
+          in.attribute), in.values.map(value => getLiteralValue(value)))
+      case notNull: sources.IsNotNull =>
+        new V2IsNotNull(FieldReference(notNull.attribute))
+      case isNull: sources.IsNull =>
+        new V2IsNull(FieldReference(isNull.attribute))
+      case l: sources.LessThan =>
+        new V2LessThan(FieldReference(l.attribute), getLiteralValue(l.value))
+      case le: sources.LessThanOrEqual =>
+        new V2LessThanOrEqual(FieldReference(le.attribute), getLiteralValue(le.value))
+      case contains: sources.StringContains =>
+        new V2StringContains(
+          FieldReference(contains.attribute), UTF8String.fromString(contains.value))
+      case ends: sources.StringEndsWith =>
+        new V2StringEndsWith(FieldReference(ends.attribute), UTF8String.fromString(ends.value))
+      case starts: sources.StringStartsWith =>
+        new V2StringStartsWith(
+          FieldReference(starts.attribute), UTF8String.fromString(starts.value))
+      case and: sources.And =>
+        new V2And(convertV1FilterToV2(and.left), convertV1FilterToV2(and.right))
+      case or: sources.Or =>
+        new V2Or(convertV1FilterToV2(or.left), convertV1FilterToV2(or.right))
+      case not: sources.Not =>
+        new V2Not(convertV1FilterToV2(not.child))
+      case _ => throw QueryCompilationErrors.invalidFilter(v1Filter)

Review comment:
       We can only hit this if there is a bug, so it's not a user-facing error and we shouldn't put it in `QueryCompilationErrors`.
   
   We can just throw IllegleStateException here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34001:
URL: https://github.com/apache/spark/pull/34001#discussion_r709767560



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala
##########
@@ -261,4 +267,116 @@ object DataSourceUtils extends PredicateHelper {
       dataFilters.flatMap(extractPredicatesWithinOutputSet(_, partitionSet))
     (ExpressionSet(partitionFilters ++ extraPartitionFilter).toSeq, dataFilters)
   }
+
+  def convertV1FilterToV2(v1Filter: sources.Filter): V2Filter = {
+    v1Filter match {
+      case _: sources.AlwaysFalse =>
+        new V2AlwaysFalse
+      case _: sources.AlwaysTrue =>
+        new V2AlwaysTrue
+      case e: sources.EqualNullSafe =>
+        new V2EqualNullSafe(FieldReference(e.attribute), getLiteralValue(e.value))
+      case equal: sources.EqualTo =>
+        new V2EqualTo(FieldReference(equal.attribute), getLiteralValue(equal.value))
+      case g: sources.GreaterThan =>
+        new V2GreaterThan(FieldReference(g.attribute), getLiteralValue(g.value))
+      case ge: sources.GreaterThanOrEqual =>
+        new V2GreaterThanOrEqual(FieldReference(ge.attribute), getLiteralValue(ge.value))
+      case in: sources.In =>
+        new V2In(FieldReference(
+          in.attribute), in.values.map(value => getLiteralValue(value)))
+      case notNull: sources.IsNotNull =>
+        new V2IsNotNull(FieldReference(notNull.attribute))
+      case isNull: sources.IsNull =>
+        new V2IsNull(FieldReference(isNull.attribute))
+      case l: sources.LessThan =>
+        new V2LessThan(FieldReference(l.attribute), getLiteralValue(l.value))
+      case le: sources.LessThanOrEqual =>
+        new V2LessThanOrEqual(FieldReference(le.attribute), getLiteralValue(le.value))
+      case contains: sources.StringContains =>
+        new V2StringContains(
+          FieldReference(contains.attribute), UTF8String.fromString(contains.value))
+      case ends: sources.StringEndsWith =>
+        new V2StringEndsWith(FieldReference(ends.attribute), UTF8String.fromString(ends.value))
+      case starts: sources.StringStartsWith =>
+        new V2StringStartsWith(
+          FieldReference(starts.attribute), UTF8String.fromString(starts.value))
+      case and: sources.And =>
+        new V2And(convertV1FilterToV2(and.left), convertV1FilterToV2(and.right))
+      case or: sources.Or =>
+        new V2Or(convertV1FilterToV2(or.left), convertV1FilterToV2(or.right))
+      case not: sources.Not =>
+        new V2Not(convertV1FilterToV2(not.child))
+      case _ => throw new IllegalStateException("Invalid v1Filter: " + v1Filter)
+    }
+  }
+
+  def getLiteralValue(value: Any): LiteralValue[_] = value match {
+    case _: JavaBigDecimal =>
+      LiteralValue(Decimal(value.asInstanceOf[JavaBigDecimal]), DecimalType.SYSTEM_DEFAULT)
+    case _: JavaBigInteger =>
+      LiteralValue(Decimal(value.asInstanceOf[JavaBigInteger]), DecimalType.SYSTEM_DEFAULT)
+    case _: BigDecimal =>
+      LiteralValue(Decimal(value.asInstanceOf[BigDecimal]), DecimalType.SYSTEM_DEFAULT)
+    case _: Boolean => LiteralValue(value, BooleanType)
+    case _: Byte => LiteralValue(value, ByteType)
+    case _: Array[Byte] => LiteralValue(value, BinaryType)

Review comment:
       This can be ambiguous, as it can be array type.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920489734


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47830/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920160561


   **[Test build #143315 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143315/testReport)** for PR 34001 at commit [`a704a05`](https://github.com/apache/spark/commit/a704a051d3e64e60343a388e57d541a850153624).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34001:
URL: https://github.com/apache/spark/pull/34001#discussion_r709768914



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala
##########
@@ -261,4 +267,116 @@ object DataSourceUtils extends PredicateHelper {
       dataFilters.flatMap(extractPredicatesWithinOutputSet(_, partitionSet))
     (ExpressionSet(partitionFilters ++ extraPartitionFilter).toSeq, dataFilters)
   }
+
+  def convertV1FilterToV2(v1Filter: sources.Filter): V2Filter = {
+    v1Filter match {
+      case _: sources.AlwaysFalse =>
+        new V2AlwaysFalse
+      case _: sources.AlwaysTrue =>
+        new V2AlwaysTrue
+      case e: sources.EqualNullSafe =>
+        new V2EqualNullSafe(FieldReference(e.attribute), getLiteralValue(e.value))
+      case equal: sources.EqualTo =>
+        new V2EqualTo(FieldReference(equal.attribute), getLiteralValue(equal.value))
+      case g: sources.GreaterThan =>
+        new V2GreaterThan(FieldReference(g.attribute), getLiteralValue(g.value))
+      case ge: sources.GreaterThanOrEqual =>
+        new V2GreaterThanOrEqual(FieldReference(ge.attribute), getLiteralValue(ge.value))
+      case in: sources.In =>
+        new V2In(FieldReference(
+          in.attribute), in.values.map(value => getLiteralValue(value)))
+      case notNull: sources.IsNotNull =>
+        new V2IsNotNull(FieldReference(notNull.attribute))
+      case isNull: sources.IsNull =>
+        new V2IsNull(FieldReference(isNull.attribute))
+      case l: sources.LessThan =>
+        new V2LessThan(FieldReference(l.attribute), getLiteralValue(l.value))
+      case le: sources.LessThanOrEqual =>
+        new V2LessThanOrEqual(FieldReference(le.attribute), getLiteralValue(le.value))
+      case contains: sources.StringContains =>
+        new V2StringContains(
+          FieldReference(contains.attribute), UTF8String.fromString(contains.value))
+      case ends: sources.StringEndsWith =>
+        new V2StringEndsWith(FieldReference(ends.attribute), UTF8String.fromString(ends.value))
+      case starts: sources.StringStartsWith =>
+        new V2StringStartsWith(
+          FieldReference(starts.attribute), UTF8String.fromString(starts.value))
+      case and: sources.And =>
+        new V2And(convertV1FilterToV2(and.left), convertV1FilterToV2(and.right))
+      case or: sources.Or =>
+        new V2Or(convertV1FilterToV2(or.left), convertV1FilterToV2(or.right))
+      case not: sources.Not =>
+        new V2Not(convertV1FilterToV2(not.child))
+      case _ => throw new IllegalStateException("Invalid v1Filter: " + v1Filter)
+    }
+  }
+
+  def getLiteralValue(value: Any): LiteralValue[_] = value match {
+    case _: JavaBigDecimal =>
+      LiteralValue(Decimal(value.asInstanceOf[JavaBigDecimal]), DecimalType.SYSTEM_DEFAULT)
+    case _: JavaBigInteger =>
+      LiteralValue(Decimal(value.asInstanceOf[JavaBigInteger]), DecimalType.SYSTEM_DEFAULT)
+    case _: BigDecimal =>
+      LiteralValue(Decimal(value.asInstanceOf[BigDecimal]), DecimalType.SYSTEM_DEFAULT)
+    case _: Boolean => LiteralValue(value, BooleanType)
+    case _: Byte => LiteralValue(value, ByteType)
+    case _: Array[Byte] => LiteralValue(value, BinaryType)
+    case _: Date =>
+      val date = DateTimeUtils.fromJavaDate(value.asInstanceOf[Date])
+      LiteralValue(date, DateType)
+    case _: LocalDate =>
+      val date = DateTimeUtils.localDateToDays(value.asInstanceOf[LocalDate])
+      LiteralValue(date, DateType)
+    case _: Double => LiteralValue(value, DoubleType)
+    case _: Float => LiteralValue(value, FloatType)
+    case _: Integer => LiteralValue(value, IntegerType)
+    case _: Long => LiteralValue(value, LongType)
+    case _: Short => LiteralValue(value, ShortType)
+    case _: String => LiteralValue(UTF8String.fromString(value.toString), StringType)
+    case _: Timestamp =>
+      val ts = DateTimeUtils.fromJavaTimestamp(value.asInstanceOf[Timestamp])
+      LiteralValue(ts, TimestampType)
+    case _: Instant =>
+      val ts = DateTimeUtils.instantToMicros(value.asInstanceOf[Instant])
+      LiteralValue(ts, TimestampType)
+    case _ =>

Review comment:
       how about array/map/struct?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] viirya commented on a change in pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #34001:
URL: https://github.com/apache/spark/pull/34001#discussion_r710827740



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/csv/CSVScanBuilder.scala
##########
@@ -42,16 +42,18 @@ case class CSVScanBuilder(
       readDataSchema(),
       readPartitionSchema(),
       options,
-      pushedDataFilters,
+      pushedDataFilters.map(DataSourceUtils.convertV2FilterToV1),
       partitionFilters,
       dataFilters)
   }
 
-  override def pushDataFilters(dataFilters: Array[Filter]): Array[Filter] = {
+  override def pushDataFilters(dataFilters: Array[V2Filter]): Array[V2Filter] = {
     if (sparkSession.sessionState.conf.csvFilterPushDown) {
-      StructFilters.pushedFilters(dataFilters, dataSchema)
+      val pushedFilterInV1Format = StructFilters.pushedFilters(
+        dataFilters.map(DataSourceUtils.convertV2FilterToV1), dataSchema)
+      convertPushFilterFromV1ToV2(pushedFilterInV1Format, dataFilters)

Review comment:
       Why do we need to convert v2 filter to v1, then convert it back to v2?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-919613857


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47788/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-919695451


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143285/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920364285


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143311/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #34001: [SPARK-36760][SQL] Add interface SupportsPushDownV2Filters

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-924724941


   The last commit just changes the since version. I'm merging it to master, thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add interface SupportsPushDownV2Filters

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-922227177


   **[Test build #143429 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143429/testReport)** for PR 34001 at commit [`71d6619`](https://github.com/apache/spark/commit/71d6619aea271f128b3c7abef1adde079dd27f0f).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-919687729


   **[Test build #143285 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143285/testReport)** for PR 34001 at commit [`7a90d37`](https://github.com/apache/spark/commit/7a90d37fd21000ac7ba1b276a83a6fc25c336b47).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-919695451


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143285/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-921544034


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47903/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920194448


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47816/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-921544034


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47903/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920486852


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47830/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-922159233


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47936/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34001: [SPARK-36760][SQL] Add interface SupportsPushDownV2Filters

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-922159587


   **[Test build #143429 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143429/testReport)** for PR 34001 at commit [`71d6619`](https://github.com/apache/spark/commit/71d6619aea271f128b3c7abef1adde079dd27f0f).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-921530402


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47903/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-921505325


   **[Test build #143395 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143395/testReport)** for PR 34001 at commit [`5df37fb`](https://github.com/apache/spark/commit/5df37fbd0fb12120d659e3e33242e252869d9500).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920195607


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47814/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920200511


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47818/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920567889


   **[Test build #143326 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143326/testReport)** for PR 34001 at commit [`899875a`](https://github.com/apache/spark/commit/899875a6ad8894e3058b01a5d9bbd6e932df6899).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920149652


   **[Test build #143312 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143312/testReport)** for PR 34001 at commit [`e644eff`](https://github.com/apache/spark/commit/e644eff5e3eda78debce72e74e56f94933e3a7ac).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-921534475


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47903/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920144718


   **[Test build #143311 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143311/testReport)** for PR 34001 at commit [`4cbf845`](https://github.com/apache/spark/commit/4cbf845cabe5be6ffd1b0dbab1a12256f6f20520).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920195507


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47814/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-919587934


   **[Test build #143285 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143285/testReport)** for PR 34001 at commit [`7a90d37`](https://github.com/apache/spark/commit/7a90d37fd21000ac7ba1b276a83a6fc25c336b47).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-919610708


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47788/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920200486


   Kubernetes integration test unable to build dist.
   
   exiting with code: 1
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47818/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920160561


   **[Test build #143315 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143315/testReport)** for PR 34001 at commit [`a704a05`](https://github.com/apache/spark/commit/a704a051d3e64e60343a388e57d541a850153624).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-919613838


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47788/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920382837


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143315/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920484059


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47830/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920489734


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47830/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan closed pull request #34001: [SPARK-36760][SQL] Add interface SupportsPushDownV2Filters

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #34001:
URL: https://github.com/apache/spark/pull/34001


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34001:
URL: https://github.com/apache/spark/pull/34001#discussion_r709343205



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala
##########
@@ -261,4 +267,116 @@ object DataSourceUtils extends PredicateHelper {
       dataFilters.flatMap(extractPredicatesWithinOutputSet(_, partitionSet))
     (ExpressionSet(partitionFilters ++ extraPartitionFilter).toSeq, dataFilters)
   }
+
+  def convertV1FilterToV2(v1Filter: sources.Filter): V2Filter = {
+    v1Filter match {
+      case _: sources.AlwaysFalse =>
+        new V2AlwaysFalse
+      case _: sources.AlwaysTrue =>
+        new V2AlwaysTrue
+      case e: sources.EqualNullSafe =>
+        new V2EqualNullSafe(FieldReference(e.attribute), getLiteralValue(e.value))
+      case equal: sources.EqualTo =>
+        new V2EqualTo(FieldReference(equal.attribute), getLiteralValue(equal.value))
+      case g: sources.GreaterThan =>
+        new V2GreaterThan(FieldReference(g.attribute), getLiteralValue(g.value))
+      case ge: sources.GreaterThanOrEqual =>
+        new V2GreaterThanOrEqual(FieldReference(ge.attribute), getLiteralValue(ge.value))
+      case in: sources.In =>
+        new V2In(FieldReference(
+          in.attribute), in.values.map(value => getLiteralValue(value)))
+      case notNull: sources.IsNotNull =>
+        new V2IsNotNull(FieldReference(notNull.attribute))
+      case isNull: sources.IsNull =>
+        new V2IsNull(FieldReference(isNull.attribute))
+      case l: sources.LessThan =>
+        new V2LessThan(FieldReference(l.attribute), getLiteralValue(l.value))
+      case le: sources.LessThanOrEqual =>
+        new V2LessThanOrEqual(FieldReference(le.attribute), getLiteralValue(le.value))
+      case contains: sources.StringContains =>
+        new V2StringContains(
+          FieldReference(contains.attribute), UTF8String.fromString(contains.value))
+      case ends: sources.StringEndsWith =>
+        new V2StringEndsWith(FieldReference(ends.attribute), UTF8String.fromString(ends.value))
+      case starts: sources.StringStartsWith =>
+        new V2StringStartsWith(
+          FieldReference(starts.attribute), UTF8String.fromString(starts.value))
+      case and: sources.And =>
+        new V2And(convertV1FilterToV2(and.left), convertV1FilterToV2(and.right))
+      case or: sources.Or =>
+        new V2Or(convertV1FilterToV2(or.left), convertV1FilterToV2(or.right))
+      case not: sources.Not =>
+        new V2Not(convertV1FilterToV2(not.child))
+      case _ => throw QueryCompilationErrors.invalidFilter(v1Filter)
+    }
+  }
+
+  def getLiteralValue(value: Any): LiteralValue[_] = value match {
+    case _: JavaBigDecimal =>
+      LiteralValue(Decimal(value.asInstanceOf[JavaBigDecimal]), DecimalType.SYSTEM_DEFAULT)
+    case _: JavaBigInteger =>
+      LiteralValue(Decimal(value.asInstanceOf[JavaBigInteger]), DecimalType.SYSTEM_DEFAULT)
+    case _: BigDecimal =>
+      LiteralValue(Decimal(value.asInstanceOf[BigDecimal]), DecimalType.SYSTEM_DEFAULT)
+    case _: Boolean => LiteralValue(value, BooleanType)
+    case _: Byte => LiteralValue(value, ByteType)
+    case _: Array[Byte] => LiteralValue(value, BinaryType)
+    case _: Date =>
+      val date = DateTimeUtils.fromJavaDate(value.asInstanceOf[Date])
+      LiteralValue(date, DateType)
+    case _: LocalDate =>
+      val date = DateTimeUtils.localDateToDays(value.asInstanceOf[LocalDate])
+      LiteralValue(date, DateType)
+    case _: Double => LiteralValue(value, DoubleType)
+    case _: Float => LiteralValue(value, FloatType)
+    case _: Integer => LiteralValue(value, IntegerType)
+    case _: Long => LiteralValue(value, LongType)
+    case _: Short => LiteralValue(value, ShortType)
+    case _: String => LiteralValue(UTF8String.fromString(value.toString), StringType)
+    case _: Timestamp =>
+      val ts = DateTimeUtils.fromJavaTimestamp(value.asInstanceOf[Timestamp])
+      LiteralValue(ts, TimestampType)
+    case _: Instant =>
+      val ts = DateTimeUtils.instantToMicros(value.asInstanceOf[Instant])
+      LiteralValue(ts, TimestampType)
+    case _ =>
+      throw QueryCompilationErrors.invalidDataTypeForFilterValue(value)

Review comment:
       ditto

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala
##########
@@ -261,4 +267,116 @@ object DataSourceUtils extends PredicateHelper {
       dataFilters.flatMap(extractPredicatesWithinOutputSet(_, partitionSet))
     (ExpressionSet(partitionFilters ++ extraPartitionFilter).toSeq, dataFilters)
   }
+
+  def convertV1FilterToV2(v1Filter: sources.Filter): V2Filter = {
+    v1Filter match {
+      case _: sources.AlwaysFalse =>
+        new V2AlwaysFalse
+      case _: sources.AlwaysTrue =>
+        new V2AlwaysTrue
+      case e: sources.EqualNullSafe =>
+        new V2EqualNullSafe(FieldReference(e.attribute), getLiteralValue(e.value))
+      case equal: sources.EqualTo =>
+        new V2EqualTo(FieldReference(equal.attribute), getLiteralValue(equal.value))
+      case g: sources.GreaterThan =>
+        new V2GreaterThan(FieldReference(g.attribute), getLiteralValue(g.value))
+      case ge: sources.GreaterThanOrEqual =>
+        new V2GreaterThanOrEqual(FieldReference(ge.attribute), getLiteralValue(ge.value))
+      case in: sources.In =>
+        new V2In(FieldReference(
+          in.attribute), in.values.map(value => getLiteralValue(value)))
+      case notNull: sources.IsNotNull =>
+        new V2IsNotNull(FieldReference(notNull.attribute))
+      case isNull: sources.IsNull =>
+        new V2IsNull(FieldReference(isNull.attribute))
+      case l: sources.LessThan =>
+        new V2LessThan(FieldReference(l.attribute), getLiteralValue(l.value))
+      case le: sources.LessThanOrEqual =>
+        new V2LessThanOrEqual(FieldReference(le.attribute), getLiteralValue(le.value))
+      case contains: sources.StringContains =>
+        new V2StringContains(
+          FieldReference(contains.attribute), UTF8String.fromString(contains.value))
+      case ends: sources.StringEndsWith =>
+        new V2StringEndsWith(FieldReference(ends.attribute), UTF8String.fromString(ends.value))
+      case starts: sources.StringStartsWith =>
+        new V2StringStartsWith(
+          FieldReference(starts.attribute), UTF8String.fromString(starts.value))
+      case and: sources.And =>
+        new V2And(convertV1FilterToV2(and.left), convertV1FilterToV2(and.right))
+      case or: sources.Or =>
+        new V2Or(convertV1FilterToV2(or.left), convertV1FilterToV2(or.right))
+      case not: sources.Not =>
+        new V2Not(convertV1FilterToV2(not.child))
+      case _ => throw QueryCompilationErrors.invalidFilter(v1Filter)
+    }
+  }
+
+  def getLiteralValue(value: Any): LiteralValue[_] = value match {
+    case _: JavaBigDecimal =>
+      LiteralValue(Decimal(value.asInstanceOf[JavaBigDecimal]), DecimalType.SYSTEM_DEFAULT)
+    case _: JavaBigInteger =>
+      LiteralValue(Decimal(value.asInstanceOf[JavaBigInteger]), DecimalType.SYSTEM_DEFAULT)
+    case _: BigDecimal =>
+      LiteralValue(Decimal(value.asInstanceOf[BigDecimal]), DecimalType.SYSTEM_DEFAULT)
+    case _: Boolean => LiteralValue(value, BooleanType)
+    case _: Byte => LiteralValue(value, ByteType)
+    case _: Array[Byte] => LiteralValue(value, BinaryType)
+    case _: Date =>
+      val date = DateTimeUtils.fromJavaDate(value.asInstanceOf[Date])
+      LiteralValue(date, DateType)
+    case _: LocalDate =>
+      val date = DateTimeUtils.localDateToDays(value.asInstanceOf[LocalDate])
+      LiteralValue(date, DateType)
+    case _: Double => LiteralValue(value, DoubleType)
+    case _: Float => LiteralValue(value, FloatType)
+    case _: Integer => LiteralValue(value, IntegerType)
+    case _: Long => LiteralValue(value, LongType)
+    case _: Short => LiteralValue(value, ShortType)
+    case _: String => LiteralValue(UTF8String.fromString(value.toString), StringType)
+    case _: Timestamp =>
+      val ts = DateTimeUtils.fromJavaTimestamp(value.asInstanceOf[Timestamp])
+      LiteralValue(ts, TimestampType)
+    case _: Instant =>
+      val ts = DateTimeUtils.instantToMicros(value.asInstanceOf[Instant])
+      LiteralValue(ts, TimestampType)
+    case _ =>
+      throw QueryCompilationErrors.invalidDataTypeForFilterValue(value)
+  }
+
+  def convertV2FilterToV1(v2Filter: V2Filter): sources.Filter = {
+    v2Filter match {
+      case _: V2AlwaysFalse => sources.AlwaysFalse
+      case _: V2AlwaysTrue => sources.AlwaysTrue
+      case e: V2EqualNullSafe => sources.EqualNullSafe(e.column.describe,
+        CatalystTypeConverters.convertToScala(e.value.value, e.value.dataType))
+      case equal: V2EqualTo => sources.EqualTo(equal.column.describe,
+        CatalystTypeConverters.convertToScala(equal.value.value, equal.value.dataType))
+      case g: V2GreaterThan => sources.GreaterThan(g.column.describe,
+        CatalystTypeConverters.convertToScala(g.value.value, g.value.dataType))
+      case ge: V2GreaterThanOrEqual => sources.GreaterThanOrEqual(ge.column.describe,
+        CatalystTypeConverters.convertToScala(ge.value.value, ge.value.dataType))
+      case in: V2In =>
+        var array: Array[Any] = Array.empty
+        for (value <- in.values) {
+          array = array :+ CatalystTypeConverters.convertToScala(value.value, value.dataType)
+        }
+        sources.In(in.column.describe, array)
+      case notNull: V2IsNotNull => sources.IsNotNull(notNull.column.describe)
+      case isNull: V2IsNull => sources.IsNull(isNull.column.describe)
+      case l: V2LessThan => sources.LessThan(l.column.describe,
+        CatalystTypeConverters.convertToScala(l.value.value, l.value.dataType))
+      case le: V2LessThanOrEqual => sources.LessThanOrEqual(le.column.describe,
+        CatalystTypeConverters.convertToScala(le.value.value, le.value.dataType))
+      case contains: V2StringContains =>
+        sources.StringContains(contains.column.describe, contains.value.toString)
+      case ends: V2StringEndsWith =>
+        sources.StringEndsWith(ends.column.describe, ends.value.toString)
+      case starts: V2StringStartsWith =>
+        sources.StringStartsWith(starts.column.describe, starts.value.toString)
+      case and: V2And => sources.And(convertV2FilterToV1(and.left), convertV2FilterToV1(and.right))
+      case or: V2Or => sources.Or(convertV2FilterToV1(or.left), convertV2FilterToV1(or.right))
+      case not: V2Not => sources.Not(convertV2FilterToV1(not.child))
+      case _ => throw QueryCompilationErrors.invalidFilter(v2Filter)

Review comment:
       ditto




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920572187


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143326/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] huaxingao commented on pull request #34001: [SPARK-36760][SQL] Add interface SupportsPushDownV2Filters

Posted by GitBox <gi...@apache.org>.
huaxingao commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-925356890


   Thanks you all!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920371210


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143312/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920193811


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47813/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #34001: [SPARK-36760][SQL] Add interface SupportsPushDownV2Filters

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34001:
URL: https://github.com/apache/spark/pull/34001#discussion_r713730655



##########
File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsPushDownV2Filters.java
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.read;
+
+import org.apache.spark.annotation.Evolving;
+import org.apache.spark.sql.connector.expressions.filter.Filter;
+
+/**
+ * A mix-in interface for {@link ScanBuilder}. Data sources can implement this interface to
+ * push down filters to the data source and reduce the size of the data to be read.
+ *
+ * @since 3.0.0

Review comment:
       ```suggestion
    * @since 3.3.0
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-922155653


   Kubernetes integration test unable to build dist.
   
   exiting with code: 1
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47936/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920188392


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47814/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-919613857


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47788/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34001:
URL: https://github.com/apache/spark/pull/34001#discussion_r709343963



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
##########
@@ -427,3 +433,158 @@ class DataSourceV2Strategy(session: SparkSession) extends Strategy with Predicat
     case _ => Nil
   }
 }
+
+private[sql] object DataSourceV2Strategy {
+
+  private def translateLeafNodeFilterV2(
+      predicate: Expression,
+      pushableColumn: PushableColumnBase): Option[V2Filter] = predicate match {
+    case expressions.EqualTo(pushableColumn(name), Literal(v, t)) =>
+      Some(new V2EqualTo(FieldReference(name), LiteralValue(v, t)))
+    case expressions.EqualTo(Literal(v, t), pushableColumn(name)) =>
+      Some(new V2EqualTo(FieldReference(name), LiteralValue(v, t)))
+
+    case expressions.EqualNullSafe(pushableColumn(name), Literal(v, t)) =>
+      Some(new V2EqualNullSafe(FieldReference(name), LiteralValue(v, t)))
+    case expressions.EqualNullSafe(Literal(v, t), pushableColumn(name)) =>
+      Some(new V2EqualNullSafe(FieldReference(name), LiteralValue(v, t)))
+
+    case expressions.GreaterThan(pushableColumn(name), Literal(v, t)) =>
+      Some(new V2GreaterThan(FieldReference(name), LiteralValue(v, t)))
+    case expressions.GreaterThan(Literal(v, t), pushableColumn(name)) =>
+      Some(new V2LessThan(FieldReference(name), LiteralValue(v, t)))
+
+    case expressions.LessThan(pushableColumn(name), Literal(v, t)) =>
+      Some(new V2LessThan(FieldReference(name), LiteralValue(v, t)))
+    case expressions.LessThan(Literal(v, t), pushableColumn(name)) =>
+      Some(new V2GreaterThan(FieldReference(name), LiteralValue(v, t)))
+
+    case expressions.GreaterThanOrEqual(pushableColumn(name), Literal(v, t)) =>
+      Some(new V2GreaterThanOrEqual(FieldReference(name), LiteralValue(v, t)))
+    case expressions.GreaterThanOrEqual(Literal(v, t), pushableColumn(name)) =>
+      Some(new V2LessThanOrEqual(FieldReference(name), LiteralValue(v, t)))
+
+    case expressions.LessThanOrEqual(pushableColumn(name), Literal(v, t)) =>
+      Some(new V2LessThanOrEqual(FieldReference(name), LiteralValue(v, t)))
+    case expressions.LessThanOrEqual(Literal(v, t), pushableColumn(name)) =>
+      Some(new V2GreaterThanOrEqual(FieldReference(name), LiteralValue(v, t)))
+
+    case in @ expressions.InSet(pushableColumn(name), set) =>
+      val values: Array[V2Literal[_]] =
+        set.toSeq.map(elem => LiteralValue(elem, in.dataType)).toArray
+      Some(new V2In(FieldReference(name), values))
+
+    // Because we only convert In to InSet in Optimizer when there are more than certain
+    // items. So it is possible we still get an In expression here that needs to be pushed
+    // down.
+    case in @ expressions.In(pushableColumn(name), list) if list.forall(_.isInstanceOf[Literal]) =>
+      val hSet = list.map(_.eval(EmptyRow))
+      Some(new V2In(FieldReference(name),
+        hSet.toArray.map(LiteralValue(_, in.value.dataType))))
+
+    case expressions.IsNull(pushableColumn(name)) =>
+      Some(new V2IsNull(FieldReference(name)))
+    case expressions.IsNotNull(pushableColumn(name)) =>
+      Some(new V2IsNotNull(FieldReference(name)))
+
+    case expressions.StartsWith(pushableColumn(name), Literal(v: UTF8String, StringType)) =>
+      Some(new V2StringStartsWith(FieldReference(name), v))
+
+    case expressions.EndsWith(pushableColumn(name), Literal(v: UTF8String, StringType)) =>
+      Some(new V2StringEndsWith(FieldReference(name), v))
+
+    case expressions.Contains(pushableColumn(name), Literal(v: UTF8String, StringType)) =>
+      Some(new V2StringContains(FieldReference(name), v))
+
+    case expressions.Literal(true, BooleanType) =>
+      Some(new V2AlwaysTrue)
+
+    case expressions.Literal(false, BooleanType) =>
+      Some(new V2AlwaysFalse)
+
+    case _ => None
+  }
+
+    /**
+     * Tries to translate a Catalyst [[Expression]] into data source [[Filter]].
+     *
+     * @return a `Some[Filter]` if the input [[Expression]] is convertible, otherwise a `None`.
+     */
+    protected[sql] def translateFilterV2(
+        predicate: Expression,
+        supportNestedPredicatePushdown: Boolean): Option[V2Filter] = {
+      translateFilterV2WithMapping(predicate, None, supportNestedPredicatePushdown)
+    }
+
+  /**
+   * Tries to translate a Catalyst [[Expression]] into data source [[Filter]].
+   *
+   * @param predicate The input [[Expression]] to be translated as [[Filter]]
+   * @param translatedFilterToExpr An optional map from leaf node filter expressions to its
+   *                               translated [[Filter]]. The map is used for rebuilding
+   *                               [[Expression]] from [[Filter]].
+   * @return a `Some[Filter]` if the input [[Expression]] is convertible, otherwise a `None`.
+   */
+  protected[sql] def translateFilterV2WithMapping(
+      predicate: Expression,
+      translatedFilterToExpr: Option[mutable.HashMap[V2Filter, Expression]],
+      nestedPredicatePushdownEnabled: Boolean)
+  : Option[V2Filter] = {
+    predicate match {
+      case expressions.And(left, right) =>
+        // See SPARK-12218 for detailed discussion
+        // It is not safe to just convert one side if we do not understand the
+        // other side. Here is an example used to explain the reason.
+        // Let's say we have (a = 2 AND trim(b) = 'blah') OR (c > 0)
+        // and we do not understand how to convert trim(b) = 'blah'.
+        // If we only convert a = 2, we will end up with
+        // (a = 2) OR (c > 0), which will generate wrong results.
+        // Pushing one leg of AND down is only safe to do at the top level.
+        // You can see ParquetFilters' createFilter for more details.
+        for {
+          leftFilter <- translateFilterV2WithMapping(
+            left, translatedFilterToExpr, nestedPredicatePushdownEnabled)
+          rightFilter <- translateFilterV2WithMapping(
+            right, translatedFilterToExpr, nestedPredicatePushdownEnabled)
+        } yield new V2And(leftFilter, rightFilter)
+
+      case expressions.Or(left, right) =>
+        for {
+          leftFilter <- translateFilterV2WithMapping(
+            left, translatedFilterToExpr, nestedPredicatePushdownEnabled)
+          rightFilter <- translateFilterV2WithMapping(
+            right, translatedFilterToExpr, nestedPredicatePushdownEnabled)
+        } yield new V2Or(leftFilter, rightFilter)
+
+      case expressions.Not(child) =>
+        translateFilterV2WithMapping(child, translatedFilterToExpr, nestedPredicatePushdownEnabled)
+          .map(new V2Not(_))
+
+      case other =>
+        val filter = translateLeafNodeFilterV2(
+          other, PushableColumn(nestedPredicatePushdownEnabled))
+        if (filter.isDefined && translatedFilterToExpr.isDefined) {
+          translatedFilterToExpr.get(filter.get) = predicate
+        }
+        filter
+    }
+  }
+
+  protected[sql] def rebuildExpressionFromFilter(
+      filter: V2Filter,
+      translatedFilterToExpr: mutable.HashMap[V2Filter, Expression]): Expression = {
+    filter match {
+      case and: V2And =>
+        expressions.And(rebuildExpressionFromFilter(and.left, translatedFilterToExpr),
+          rebuildExpressionFromFilter(and.right, translatedFilterToExpr))
+      case or: V2Or =>
+        expressions.Or(rebuildExpressionFromFilter(or.left, translatedFilterToExpr),
+          rebuildExpressionFromFilter(or.right, translatedFilterToExpr))
+      case not: V2Not =>
+        expressions.Not(rebuildExpressionFromFilter(not.child, translatedFilterToExpr))
+      case other =>
+        translatedFilterToExpr.getOrElse(other,
+          throw QueryCompilationErrors.failedToRebuildExpressionError(filter))

Review comment:
       ditto




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] huaxingao commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
huaxingao commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-919626863


   cc @cloud-fan @sunchao @viirya 
   I have splitted the PR into smaller ones. It should be easier to review now :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34001:
URL: https://github.com/apache/spark/pull/34001#discussion_r708806656



##########
File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/filter/AlwaysFalse.java
##########
@@ -47,4 +47,9 @@ public int hashCode() {
 
   @Override
   public NamedReference[] references() { return EMPTY_REFERENCE; }
+
+  @Override
+  public org.apache.spark.sql.sources.Filter toV1() {

Review comment:
       Do we expect data source developers to use it? I thought we will add internal utils to convert between v1 and v2 filters, if it's for internal usage only.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-921505325


   **[Test build #143395 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143395/testReport)** for PR 34001 at commit [`5df37fb`](https://github.com/apache/spark/commit/5df37fbd0fb12120d659e3e33242e252869d9500).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920200511


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47818/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920203717


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47813/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34001: [SPARK-36760][SQL] Add V1 Filter.toV2 and V2Filter.toV1 methods

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-920203770


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47813/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34001: [SPARK-36760][SQL] Add interface SupportsPushDownV2Filters

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-922166851


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47937/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add interface SupportsPushDownV2Filters

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-922166412


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47937/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34001: [SPARK-36760][SQL] Add internal utils to convert between v1 and v2 filters

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-922159233


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47936/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34001: [SPARK-36760][SQL] Add interface SupportsPushDownV2Filters

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34001:
URL: https://github.com/apache/spark/pull/34001#issuecomment-922165400


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47937/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org