Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/05/25 08:28:43 UTC

[GitHub] [spark] chenzhx opened a new pull request, #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

chenzhx opened a new pull request, #36663:
URL: https://github.com/apache/spark/pull/36663

   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
     2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
     4. Be sure to keep the PR description updated to reflect all changes.
     5. Please write your PR title to summarize what this PR proposes.
     6. If possible, provide a concise example to reproduce the issue for a faster review.
     7. If you want to add a new configuration, please read the guideline first for naming configurations in
        'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
     8. If you want to add or modify an error type or message, please read the guideline first in
        'core/src/main/resources/error/README.md'.
   -->
   
   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   
   
   ### Does this PR introduce _any_ user-facing change?
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description and/or an example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Spark versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   If benchmark tests were added, please run the benchmarks in GitHub Actions for the consistent environment, and the instructions could accord to: https://spark.apache.org/developer-tools.html#github-workflow-benchmarks.
   -->
   




[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r903444969


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##########
@@ -121,4 +124,30 @@ private[sql] object H2Dialect extends JdbcDialect {
     }
     super.classifyException(message, e)
   }
+
+  override def compileExpression(expr: Expression): Option[String] = {
+    val jdbcSQLBuilder = new H2JDBCSQLBuilder()
+    try {
+      Some(jdbcSQLBuilder.build(expr))
+    } catch {
+      case NonFatal(e) =>
+        logWarning("Error occurs while compiling V2 expression", e)
+        None
+    }
+  }
+
+  class H2JDBCSQLBuilder extends JDBCSQLBuilder {
+
+    override def visitExtract(field: String, inputs: Array[String]): String = {
+      field match {
+        case "DAY_OF_MONTH" =>
+          "EXTRACT(DAY FROM " + inputs.mkString(",") + ")"
+        case "WEEK_OF_YEAR" =>
+          "EXTRACT(ISO_WEEK FROM " + inputs.mkString(",") + ")"
+        case "YEAR_OF_WEEK" =>
+          "EXTRACT(ISO_WEEK_YEAR FROM " + inputs.mkString(",") + ")"
+        case _ => super.visitSQLFunction(field, inputs)
+      }
+    }

Review Comment:
   ```suggestion
       override def visitExtract(field: String, source: String): String = {
         field match {
           case "DAY_OF_MONTH" => s"EXTRACT(DAY FROM $source)"
           case "WEEK_OF_YEAR" => s"EXTRACT(ISO_WEEK FROM $source)"
           case "YEAR_OF_WEEK" => s"EXTRACT(ISO_WEEK_YEAR FROM $source)"
           case _ => super.visitExtract(field, source)
         }
       }
   ```





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r905936364


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Extract.java:
##########
@@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.expressions;
+
+import org.apache.spark.annotation.Evolving;
+
+import java.io.Serializable;
+
+/**
+ * Represent an extract expression, which contains a field to be extracted
+ * and a source expression where the field should be extracted.

Review Comment:
   ```
   Represent an extract function, which extracts and returns the value of a
   specified datetime field from a datetime or interval value expression.
   ```





[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r881484738


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,84 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATEDIFF</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATEDIFF(end_date, start_date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>TRUNC</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>TRUNC(date, format)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>EXTRACT</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>EXTRACT(field, source, replacement)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>SECOND</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>SECOND(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MINUTE</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MINUTE(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>HOUR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>HOUR(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MONTH</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MONTH(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>QUARTER</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>QUARTER(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>YEAR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>YEAR(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DAYOFWEEK</code>

Review Comment:
   But the name is `DAYOFWEEK` in Spark.
   https://github.com/apache/spark/blob/2f8613f22c0750c00cf1dcfb2f31c431d8dc1be7/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala#L612





[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r881493340


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,84 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATEDIFF</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATEDIFF(end_date, start_date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>TRUNC</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>TRUNC(date, format)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>EXTRACT</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>EXTRACT(field, source, replacement)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>SECOND</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>SECOND(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MINUTE</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MINUTE(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>HOUR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>HOUR(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MONTH</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MONTH(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>QUARTER</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>QUARTER(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>YEAR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>YEAR(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DAYOFWEEK</code>

Review Comment:
   I know. `DAY_OF_WEEK` is a style decision that @cloud-fan and I discussed offline.





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r910683281


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Extract.java:
##########
@@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.expressions;
+
+import org.apache.spark.annotation.Evolving;
+
+import java.io.Serializable;
+
+/**
+ * Represent an extract expression, which contains a field to be extracted
+ * and a source expression where the field should be extracted.

Review Comment:
   This is not resolved yet.





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r893067221


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -259,6 +259,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects should avoid following ISO semantics when handling DAY_OF_WEEK.
+    case DayOfWeek(child) => generateExpression(child)

Review Comment:
   It's a bit weird to expose Spark-specific functions to the data source. Spark does support `extract('dayofweek_iso', ...)`, which ends up with `Add(WeekDay(source), Literal(1))`. Shall we match it instead?
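   For illustration, a minimal sketch of what that match could look like in `V2ExpressionBuilder` (the `DAY_OF_WEEK_ISO` name is only an assumption here, not a settled API, and constructor arities are simplified):
   
   ```scala
   // Sketch only: match the Add(WeekDay(child), Literal(1)) shape that
   // extract('dayofweek_iso', ...) resolves to, and push it down as a
   // single ISO day-of-week extract.
   case Add(WeekDay(child), Literal(1, IntegerType), _) =>
     generateExpression(child)
       .map(v => new GeneralScalarExpression("DAY_OF_WEEK_ISO", Array[V2Expression](v)))
   ```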





[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r903433839


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Extract.java:
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.expressions;
+
+import org.apache.spark.annotation.Evolving;
+
+import java.io.Serializable;
+
+/**
+ * The general representation of extract expressions, which contains the upper-cased expression
+ * name and all the children expressions. Please also see {@link Extract}
+ * for the supported extract expressions.
+ * <p>
+ * The currently supported predicate expressions:
+ * <ol
+ *  <li>Name: <code>EXTRACT</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>EXTRACT(field FROM source)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ * </ol>
+ * @since 3.4.0
+ */
+
+@Evolving
+public class Extract implements Expression, Serializable {
+
+  private Expression expression;
+  private String field;
+
+  public Extract(Expression expression, String field) {
+    this.expression = expression;
+    this.field = field;
+  }
+
+  public Expression expression() { return expression; }
+  public String field() { return field; }

Review Comment:
   ```suggestion
     private String field;
     private Expression source;
   
     public Extract(String field, Expression source) {
       this.field = field;
       this.source = source;
     }
   
     public String field() { return field; }
     public Expression source() { return source; }
   ```



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Extract.java:
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.expressions;
+
+import org.apache.spark.annotation.Evolving;
+
+import java.io.Serializable;
+
+/**
+ * The general representation of extract expressions, which contains the upper-cased expression
+ * name and all the children expressions. Please also see {@link Extract}
+ * for the supported extract expressions.
+ * <p>
+ * The currently supported predicate expressions:
+ * <ol
+ *  <li>Name: <code>EXTRACT</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>EXTRACT(field FROM source)</code></li>

Review Comment:
   Please remove these comments.



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Extract.java:
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.expressions;
+
+import org.apache.spark.annotation.Evolving;
+
+import java.io.Serializable;
+
+/**
+ * The general representation of extract expressions, which contains the upper-cased expression
+ * name and all the children expressions. Please also see {@link Extract}

Review Comment:
   `Represent an extract expression, which contains a field to be extracted and a source expression where the field should be extracted.`



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java:
##########
@@ -43,6 +44,10 @@ public String build(Expression expr) {
     } else if (expr instanceof Cast) {
       Cast cast = (Cast) expr;
       return visitCast(build(cast.expression()), cast.dataType());
+    } else if (expr instanceof Extract) {
+      Extract e = (Extract) expr;
+      return visitExtract(e.field(),
+        Arrays.stream(e.children()).map(c -> build(c)).toArray(String[]::new));

Review Comment:
   ```suggestion
         Extract extract = (Extract) expr;
         return visitExtract(extract.field(), build(extract.source()));
   ```



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -262,6 +262,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract(v, "SECOND"))

Review Comment:
   ```suggestion
         generateExpression(child).map(v => new V2Extract("SECOND", v))
   ```



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Extract.java:
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.expressions;
+
+import org.apache.spark.annotation.Evolving;
+
+import java.io.Serializable;
+
+/**
+ * The general representation of extract expressions, which contains the upper-cased expression
+ * name and all the children expressions. Please also see {@link Extract}
+ * for the supported extract expressions.
+ * <p>
+ * The currently supported predicate expressions:
+ * <ol
+ *  <li>Name: <code>EXTRACT</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>EXTRACT(field FROM source)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ * </ol>
+ * @since 3.4.0
+ */
+
+@Evolving
+public class Extract implements Expression, Serializable {
+
+  private Expression expression;
+  private String field;
+
+  public Extract(Expression expression, String field) {
+    this.expression = expression;
+    this.field = field;
+  }
+
+  public Expression expression() { return expression; }
+  public String field() { return field; }
+
+  @Override
+  public Expression[] children() { return new Expression[]{ expression() }; }

Review Comment:
   ```suggestion
     public Expression[] children() { return new Expression[]{ source() }; }
   ```



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java:
##########
@@ -265,4 +273,8 @@ protected String visitTrim(String direction, String[] inputs) {
       return "TRIM(" + direction + " " + inputs[1] + " FROM " + inputs[0] + ")";
     }
   }
+
+  protected String visitExtract(String field, String[] inputs) {
+    return "EXTRACT(" + field + " FROM " + inputs[0] + ")";
+  }

Review Comment:
   ```suggestion
     protected String visitExtract(String field, String source) {
       return "EXTRACT(" + field + " FROM " + source + ")";
     }
   ```



##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala:
##########
@@ -270,6 +270,17 @@ abstract class JdbcDialect extends Serializable with Logging{
           s"${this.getClass.getSimpleName} does not support function: TRIM")
       }
     }
+
+    override def visitExtract(field: String, inputs: Array[String]): String = {
+      if (isSupportedFunction(field)) {
+        super.visitExtract(field, inputs)
+      } else {
+        // The framework will catch the error and give up the push-down.
+        // Please see `JdbcDialect.compileExpression(expr: Expression)` for more details.
+        throw new UnsupportedOperationException(
+          s"${this.getClass.getSimpleName} does not support function: EXTRACT")
+      }
+    }

Review Comment:
   ```suggestion
       override def visitExtract(field: String, source: String): String = {
         if (isSupportedFunction(field)) {
           super.visitExtract(field, source)
         } else {
           // The framework will catch the error and give up the push-down.
           // Please see `JdbcDialect.compileExpression(expr: Expression)` for more details.
           throw new UnsupportedOperationException(
             s"${this.getClass.getSimpleName} does not support function: EXTRACT")
         }
       }
   ```



##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##########
@@ -121,4 +124,30 @@ private[sql] object H2Dialect extends JdbcDialect {
     }
     super.classifyException(message, e)
   }
+
+  override def compileExpression(expr: Expression): Option[String] = {
+    val jdbcSQLBuilder = new H2JDBCSQLBuilder()
+    try {
+      Some(jdbcSQLBuilder.build(expr))
+    } catch {
+      case NonFatal(e) =>
+        logWarning("Error occurs while compiling V2 expression", e)
+        None
+    }
+  }
+
+  class H2JDBCSQLBuilder extends JDBCSQLBuilder {
+
+    override def visitExtract(field: String, inputs: Array[String]): String = {
+      field match {
+        case "DAY_OF_MONTH" =>
+          "EXTRACT(DAY FROM " + inputs.mkString(",") + ")"
+        case "WEEK_OF_YEAR" =>
+          "EXTRACT(ISO_WEEK FROM " + inputs.mkString(",") + ")"
+        case "YEAR_OF_WEEK" =>
+          "EXTRACT(ISO_WEEK_YEAR FROM " + inputs.mkString(",") + ")"
+        case _ => super.visitSQLFunction(field, inputs)
+      }
+    }

Review Comment:
   ```suggestion
       override def visitExtract(field: String, source: String): String = {
         field match {
           case "DAY_OF_MONTH" => "EXTRACT(DAY FROM " + source + ")"
           case "WEEK_OF_YEAR" => "EXTRACT(ISO_WEEK FROM " + source + ")"
           case "YEAR_OF_WEEK" => "EXTRACT(ISO_WEEK_YEAR FROM " + source + ")"
           case _ => super.visitExtract(field, source)
         }
       }
   ```



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -262,6 +262,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract(v, "SECOND"))

Review Comment:
   Update other places too.





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r902350436


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,90 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATE_DIFF</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_DIFF(end_date, start_date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>TRUNC</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>TRUNC(date, format)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>SECOND</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>SECOND(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MINUTE</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MINUTE(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>HOUR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>HOUR(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MONTH</code>

Review Comment:
   I'd be surprised if the EXTRACT function performed differently from the field extraction functions. They are semantically doing the same thing, right?
   
   The principle here is not to define pushable functions that are semantically equal to each other. Keep in mind that this is an API, and it's better to have fewer/simpler APIs with the same abilities. I think adding a single `EXTRACT` function to the pushdown API is better than adding a bunch of field extraction functions.
   
   A ClickHouse data source can translate the EXTRACT function into field extraction functions when generating the SQL string.
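   For example, a dialect-side translation could look roughly like this (a hypothetical sketch with ClickHouse-style function names, not an actual builder in this PR):
   
   ```scala
   // Hypothetical sketch: translate the generic EXTRACT back into
   // engine-specific field extraction functions when generating SQL.
   override def visitExtract(field: String, source: String): String = field match {
     case "YEAR"  => s"toYear($source)"   // ClickHouse-style spelling, assumed
     case "MONTH" => s"toMonth($source)"  // ClickHouse-style spelling, assumed
     case _ => super.visitExtract(field, source)
   }
   ```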





[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r908011517


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -254,6 +254,55 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract("SECOND", v))
+    case Minute(child, _) =>
+      generateExpression(child).map(v => new V2Extract("MINUTE", v))
+    case Hour(child, _) =>
+      generateExpression(child).map(v => new V2Extract("HOUR", v))
+    case Month(child) =>
+      generateExpression(child).map(v => new V2Extract("MONTH", v))
+    case Quarter(child) =>
+      generateExpression(child).map(v => new V2Extract("QUARTER", v))
+    case Year(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR", v))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects should avoid following ISO semantics when handling DAY_OF_WEEK.
+    case DayOfWeek(child) =>
+      generateExpression(child).map(v => new V2Extract("DAY_OF_WEEK", v))
+    case DayOfMonth(child) =>
+      generateExpression(child).map(v => new V2Extract("DAY_OF_MONTH", v))
+    case DayOfYear(child) =>
+      generateExpression(child).map(v => new V2Extract("DAY_OF_YEAR", v))
+    // The WEEK_OF_YEAR function in Spark returns the ISO week from a date/timestamp.
+    // Database dialects need to follow ISO semantics when handling WEEK_OF_YEAR.
+    case WeekOfYear(child) =>
+      generateExpression(child).map(v => new V2Extract("WEEK_OF_YEAR", v))
+    // The YEAR_OF_WEEK function in Spark returns the ISO week year from a date/timestamp.
+    // Database dialects need to follow ISO semantics when handling YEAR_OF_WEEK.
+    case YearOfWeek(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR_OF_WEEK", v))

Review Comment:
   But in Spark,
   WeekDay is (0 = Monday, 1 = Tuesday, ..., 6 = Sunday) and
   DayOfWeek is (1 = Sunday, 2 = Monday, ..., 7 = Saturday).
   
   It seems like:
   Spark DayOfWeek -> (EXTRACT(ISO_DOW FROM ...) % 7) + 1
   Spark WeekDay -> EXTRACT(ISO_DOW FROM ...) - 1
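   A small self-contained check of that arithmetic (plain Scala with `java.time`, where `getDayOfWeek.getValue` is the ISO day-of-week):
   
   ```scala
   import java.time.LocalDate
   
   // ISO day-of-week: Monday = 1, ..., Sunday = 7.
   val isoDow = LocalDate.of(2022, 5, 16).getDayOfWeek.getValue // Monday -> 1
   
   val sparkDayOfWeek = isoDow % 7 + 1 // 1 = Sunday, ..., 7 = Saturday: Monday -> 2
   val sparkWeekDay = isoDow - 1       // 0 = Monday, ..., 6 = Sunday:   Monday -> 0
   ```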





[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r913364254


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1026,14 +1034,85 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
             |AND cast(dept as short) > 1
             |AND cast(bonus as decimal(20, 2)) > 1200""".stripMargin)
         checkFiltersRemoved(df6, ansiMode)
-        val expectedPlanFragment8 = if (ansiMode) {
+        val expectedPlanFragment6 = if (ansiMode) {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL, " +
             "CAST(BONUS AS string) LIKE '%30%', CAST(DEPT AS byte) > 1, ...,"
         } else {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL],"
         }
-        checkPushedInfo(df6, expectedPlanFragment8)
+        checkPushedInfo(df6, expectedPlanFragment6)
         checkAnswer(df6, Seq(Row(2, "david", 10000, 1300, true)))
+
+        val df7 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "dayofyear(date1) > 100 AND dayofmonth(date1) > 10 ")
+        checkFiltersRemoved(df7)
+        val expectedPlanFragment7 =
+          "PushedFilters: [DATE1 IS NOT NULL, EXTRACT(DAY_OF_YEAR FROM DATE1) > 100, " +
+          "EXTRACT(DAY FROM DATE1) > 10]"
+        checkPushedInfo(df7, expectedPlanFragment7)
+        checkAnswer(df7, Seq(Row("amy"), Row("alex")))
+
+        val df8 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "year(date1) = 2022 AND quarter(date1) = 2")
+        checkFiltersRemoved(df8)
+        val expectedPlanFragment8 =
+          "[DATE1 IS NOT NULL, EXTRACT(YEAR FROM DATE1) = 2022, " +
+          "EXTRACT(QUARTER FROM DATE1) = 2]"
+        checkPushedInfo(df8, expectedPlanFragment8)
+        checkAnswer(df8, Seq(Row("amy"), Row("alex")))
+
+        val df9 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "second(time1) = 0 AND month(date1) = 5")
+        checkFiltersRemoved(df9)
+        val expectedPlanFragment9 =
+          "PushedFilters: [TIME1 IS NOT NULL, DATE1 IS NOT NULL, EXTRACT(SECOND FROM TIME1) = 0, " +
+          "EXTRACT(MONTH FROM DATE1) ..."

Review Comment:
   OK. I will help him.





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r911805738


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1026,14 +1034,70 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
             |AND cast(dept as short) > 1
             |AND cast(bonus as decimal(20, 2)) > 1200""".stripMargin)
         checkFiltersRemoved(df6, ansiMode)
-        val expectedPlanFragment8 = if (ansiMode) {
+        val expectedPlanFragment6 = if (ansiMode) {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL, " +
             "CAST(BONUS AS string) LIKE '%30%', CAST(DEPT AS byte) > 1, ...,"
         } else {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL],"
         }
-        checkPushedInfo(df6, expectedPlanFragment8)
+        checkPushedInfo(df6, expectedPlanFragment6)
         checkAnswer(df6, Seq(Row(2, "david", 10000, 1300, true)))
+
+        val df7 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "dayofyear(date1) > 100 AND dayofmonth(date1) > 10 ")
+        checkFiltersRemoved(df7)
+        val expectedPlanFragment7 =
+          "PushedFilters: [DATE1 IS NOT NULL, EXTRACT(DOY FROM DATE1) > 100, " +
+            "EXTRACT(DAY FROM DATE1) > 10]"
+        checkPushedInfo(df7, expectedPlanFragment7)
+        checkAnswer(df7, Seq(Row("amy"), Row("alex")))
+
+        val df8 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "year(date1) = 2022 AND quarter(date1) = 2 AND month(date1) = 5")
+        checkFiltersRemoved(df8)
+        val expectedPlanFragment8 =
+          "[DATE1 IS NOT NULL, EXTRACT(YEAR FROM DATE1) = 2022, " +
+            "EXTRACT(QUARTER FROM DATE1) = 2, EXTRACT(MON...,"

Review Comment:
   Can we update the test framework to not truncate it? Otherwise the test coverage is not good enough.





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r911800754


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java:
##########
@@ -327,4 +334,13 @@ protected String visitTrim(String direction, String[] inputs) {
       return "TRIM(" + direction + " " + inputs[1] + " FROM " + inputs[0] + ")";
     }
   }
+
+  protected String visitExtract(String field, String source) {
+    switch (field) {

Review Comment:
   It should be just `EXTRACT($field FROM $source)`.





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r910686772


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java:
##########
@@ -290,4 +297,15 @@ protected String visitTrim(String direction, String[] inputs) {
       return "TRIM(" + direction + " " + inputs[1] + " FROM " + inputs[0] + ")";
     }
   }
+
+  protected String visitExtract(String field, String source) {
+    switch (field) {
+      case "DAY_OF_WEEK":
+        return "(EXTRACT(ISO_DAY_OF_WEEK FROM " + source + ") % 7)+ 1";

Review Comment:
   The field names are already user-facing (DS v2 API). We should do the translation earlier, when converting Catalyst expressions to v2 expressions.
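   As a sketch, the adjustment could live in `V2ExpressionBuilder` so the dialect only ever sees ISO semantics (the `DAY_OF_WEEK_ISO` field name is an assumption for illustration):
   
   ```scala
   // Sketch only: build (EXTRACT(DAY_OF_WEEK_ISO FROM child) % 7) + 1 as a v2
   // expression, so the Spark-specific numbering never reaches the dialect.
   case DayOfWeek(child) =>
     generateExpression(child).map { v =>
       new GeneralScalarExpression("+", Array[V2Expression](
         new GeneralScalarExpression("%", Array[V2Expression](
           new V2Extract("DAY_OF_WEEK_ISO", v), LiteralValue(7, IntegerType))),
         LiteralValue(1, IntegerType)))
     }
   ```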





[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r910737639


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##########
@@ -123,4 +127,26 @@ private[sql] object H2Dialect extends JdbcDialect {
     }
     super.classifyException(message, e)
   }
+
+  override def compileExpression(expr: Expression): Option[String] = {
+    val jdbcSQLBuilder = new H2JDBCSQLBuilder()
+    try {
+      Some(jdbcSQLBuilder.build(expr))
+    } catch {
+      case NonFatal(e) =>
+        logWarning("Error occurs while compiling V2 expression", e)
+        None
+    }
+  }
+
+  class H2JDBCSQLBuilder extends JDBCSQLBuilder {
+
+    override def visitExtract(field: String, source: String): String = {
+      field match {
+        case "WEEK" => s"EXTRACT(ISO_WEEK FROM $source)"
+        case "YEAR_OF_WEEK" => s"EXTRACT(ISO_WEEK_YEAR FROM $source)"

Review Comment:
   The ISO field names differ across databases:
   PostgreSQL uses `isoyear` and `isodow`.
   Snowflake uses `dayofweekiso`, `weekiso`, and `yearofweekiso`.
   H2 uses `ISO_WEEK` and `ISO_DAY_OF_WEEK`.
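   So each dialect would override the mapping; for instance, a PostgreSQL-style builder might look like this (a hypothetical sketch mirroring the H2 override above, with assumed generic field names):
   
   ```scala
   class PostgresJDBCSQLBuilder extends JDBCSQLBuilder {
     override def visitExtract(field: String, source: String): String = field match {
       // PostgreSQL's EXTRACT(week ...) is already the ISO week.
       case "WEEK" => s"EXTRACT(week FROM $source)"
       case "YEAR_OF_WEEK" => s"EXTRACT(isoyear FROM $source)"
       case _ => super.visitExtract(field, source)
     }
   }
   ```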





[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r887826927


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -542,6 +550,44 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
         }
         checkPushedInfo(df8, expectedPlanFragment8)
         checkAnswer(df8, Seq(Row(2, "david", 10000, 1300, true)))
+
+        val df9 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "dayofyear(date1) > 100 and dayofmonth(date1) > 10 and dayofweek(date1) > 1")
+        checkFiltersRemoved(df9)
+        val expectedPlanFragment9 =
+          "PushedFilters: [DATE1 IS NOT NULL, DAY_OF_YEAR(DATE1) > 100, " +
+          "DAY_OF_MONTH(DATE1) > 10, DAY_OF_WEEK(DATE1) > 1]"
+        checkPushedInfo(df9, expectedPlanFragment9)
+        checkAnswer(df9, Seq(Row("amy"), Row("alex")))
+
+        val df10 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "year(date1) = 2022 and quarter(date1) = 2 and month(date1) = 5")
+        checkFiltersRemoved(df10)
+        val expectedPlanFragment10 =
+          "PushedFilters: [DATE1 IS NOT NULL, YEAR(DATE1) = 2022, " +
+          "QUARTER(DATE1) = 2, MONTH(DATE1) = 5]"
+        checkPushedInfo(df10, expectedPlanFragment10)
+        checkAnswer(df10, Seq(Row("amy"), Row("alex")))
+
+        val df11 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "hour(time1) = 0 and minute(time1) = 0 and second(time1) = 0")
+        checkFiltersRemoved(df11)
+        val expectedPlanFragment11 =
+          "PushedFilters: [TIME1 IS NOT NULL, HOUR(TIME1) = 0, " +
+          "MINUTE(TIME1) = 0, SECOND(TIME1) = 0]"
+        checkPushedInfo(df11, expectedPlanFragment11)
+        checkAnswer(df11, Seq(Row("amy"), Row("alex")))
+
+        // H2 does not support
+        val df12 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "trunc(date1, 'week') = date'2022-05-16' and date_add(date1, 1) = date'2022-05-20' " +
+          "and datediff(date1, '2022-05-10') > 0 and extract(week from date1) = 20 " +
+          "and extract(YEAROFWEEK from date1) = 2022")

Review Comment:
   Could we add tests that exercise just `extract(week from date1)` and `extract(YEAROFWEEK from date1)`? Then we can see that `week` and `YEAROFWEEK` are pushed down.
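   Something along these lines, reusing the suite's `h2.test.datetime` fixture (the expected pushed-filter fragment is illustrative; the exact field name depends on the naming decision above):
   
   ```scala
   val dfWeek = sql("SELECT name FROM h2.test.datetime WHERE extract(week from date1) = 20")
   checkFiltersRemoved(dfWeek)
   // Illustrative expected fragment, assuming the field is pushed as WEEK.
   checkPushedInfo(dfWeek, "PushedFilters: [DATE1 IS NOT NULL, EXTRACT(WEEK FROM DATE1) = 20]")
   ```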





[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r890253947


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -259,6 +259,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects do not need to follow ISO semantics when handling DAY_OF_WEEK.

Review Comment:
   Yes. In ISO semantics, 1 represents Monday.
   But for the DayOfWeek function in Spark, 1 represents Sunday.





[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r889799255


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -259,6 +259,49 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    case DayOfWeek(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAY_OF_WEEK", Array[V2Expression](v)))
+    case DayOfMonth(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAY_OF_MONTH", Array[V2Expression](v)))
+    case DayOfYear(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAY_OF_YEAR", Array[V2Expression](v)))
+    case WeekOfYear(child) => generateExpression(child)

Review Comment:
   Add a comment telling users that `WeekOfYear` returns the ISO week and that JDBC dialects should compile `WEEK_OF_YEAR` with ISO semantics.




[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r893141595


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,90 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATE_DIFF</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_DIFF(end_date, start_date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>TRUNC</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>TRUNC(date, format)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>SECOND</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>SECOND(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MINUTE</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MINUTE(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>HOUR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>HOUR(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MONTH</code>

Review Comment:
   Yes. In some databases, such as H2, these field extraction functions are deprecated and EXTRACT is used instead.
   http://h2database.com/html/functions.html#month




[GitHub] [spark] chenzhx commented on pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on PR #36663:
URL: https://github.com/apache/spark/pull/36663#issuecomment-1139695064

   retest this please



[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r890188771


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -259,6 +259,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects do not need to follow ISO semantics when handling DAY_OF_WEEK.

Review Comment:
   hmm, do you mean the Spark behavior is non-standard for the DAY_OF_WEEK function?




[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r911802407


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##########
@@ -38,7 +40,9 @@ private[sql] object H2Dialect extends JdbcDialect {
     Set("ABS", "COALESCE", "GREATEST", "LEAST", "RAND", "LOG", "LOG10", "LN", "EXP",
       "POWER", "SQRT", "FLOOR", "CEIL", "ROUND", "SIN", "SINH", "COS", "COSH", "TAN",
       "TANH", "COT", "ASIN", "ACOS", "ATAN", "ATAN2", "DEGREES", "RADIANS", "SIGN",
-      "PI", "SUBSTRING", "UPPER", "LOWER", "TRANSLATE", "TRIM")
+      "PI", "SUBSTRING", "UPPER", "LOWER", "TRANSLATE", "TRIM", "SECOND", "MINUTE",
+      "HOUR", "MONTH", "QUARTER", "YEAR", "DAY", "DOY", "ISO_DAY_OF_WEEK", "YEAR_OF_WEEK",

Review Comment:
   we should put function names here, not EXTRACT field names.




[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r910737639


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##########
@@ -123,4 +127,26 @@ private[sql] object H2Dialect extends JdbcDialect {
     }
     super.classifyException(message, e)
   }
+
+  override def compileExpression(expr: Expression): Option[String] = {
+    val jdbcSQLBuilder = new H2JDBCSQLBuilder()
+    try {
+      Some(jdbcSQLBuilder.build(expr))
+    } catch {
+      case NonFatal(e) =>
+        logWarning("Error occurs while compiling V2 expression", e)
+        None
+    }
+  }
+
+  class H2JDBCSQLBuilder extends JDBCSQLBuilder {
+
+    override def visitExtract(field: String, source: String): String = {
+      field match {
+        case "WEEK" => s"EXTRACT(ISO_WEEK FROM $source)"
+        case "YEAR_OF_WEEK" => s"EXTRACT(ISO_WEEK_YEAR FROM $source)"

Review Comment:
   The ISO field names differ across databases:
   PostgreSQL uses `isodow`.
   Snowflake uses `dayofweekiso`.
   H2 uses `ISO_DAY_OF_WEEK`.
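   So each dialect has to override the translation itself. For instance, a PostgreSQL builder might look like this (a sketch; the class name is hypothetical, and `ISODOW`/`ISOYEAR` are the field spellings from the PostgreSQL docs):
   ```
   class PostgresJDBCSQLBuilder extends JDBCSQLBuilder {
     override def visitExtract(field: String, source: String): String = {
       field match {
         case "DAY_OF_WEEK"  => s"EXTRACT(ISODOW FROM $source)"
         case "YEAR_OF_WEEK" => s"EXTRACT(ISOYEAR FROM $source)"
         case _ => super.visitExtract(field, source)
       }
     }
   }
   ```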




[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r911718141


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -344,6 +344,55 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract("SECOND", v))
+    case Minute(child, _) =>
+      generateExpression(child).map(v => new V2Extract("MINUTE", v))
+    case Hour(child, _) =>
+      generateExpression(child).map(v => new V2Extract("HOUR", v))
+    case Month(child) =>
+      generateExpression(child).map(v => new V2Extract("MONTH", v))
+    case Quarter(child) =>
+      generateExpression(child).map(v => new V2Extract("QUARTER", v))
+    case Year(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR", v))
+    case DayOfWeek(child) =>
+      generateExpression(child).map(v => new GeneralScalarExpression("+",
+        Array[V2Expression](new GeneralScalarExpression("%",
+          Array[V2Expression](new V2Extract("DAY_OF_WEEK", v), LiteralValue(7, IntegerType))),

Review Comment:
   Do we need `ISO_DAY_OF_WEEK`?




[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r905937667


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -254,6 +254,55 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract("SECOND", v))
+    case Minute(child, _) =>
+      generateExpression(child).map(v => new V2Extract("MINUTE", v))
+    case Hour(child, _) =>
+      generateExpression(child).map(v => new V2Extract("HOUR", v))
+    case Month(child) =>
+      generateExpression(child).map(v => new V2Extract("MONTH", v))
+    case Quarter(child) =>
+      generateExpression(child).map(v => new V2Extract("QUARTER", v))
+    case Year(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR", v))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects should avoid to follow ISO semantics when handling DAY_OF_WEEK.

Review Comment:
   It's better to put this doc in the user-facing class, i.e. `org.apache.spark.sql.connector.expressions.Extract`




[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r902350436


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,90 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATE_DIFF</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_DIFF(end_date, start_date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>TRUNC</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>TRUNC(date, format)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>SECOND</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>SECOND(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MINUTE</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MINUTE(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>HOUR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>HOUR(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MONTH</code>

Review Comment:
   I'd be surprised if the EXTRACT function performed differently from the field extraction functions. They are semantically doing the same thing, right?
   
   The principle here is to not define push-able functions that are semantically equal to each other. Keep in mind that this is an API, and it's better to have fewer APIs with the same abilities. I think adding a single `EXTRACT` function to the pushdown API is better than adding a bunch of field extraction functions.
   
   A ClickHouse data source can translate the EXTRACT function into field extraction functions when generating the SQL string.
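   To illustrate, such a translation could live in the data source's SQL builder (a hypothetical ClickHouse-flavored sketch; `toYear`/`toMonth`/`toDayOfMonth`/`toHour` are ClickHouse functions, the builder class is assumed):
   ```
   class ClickHouseSQLBuilder extends JDBCSQLBuilder {
     override def visitExtract(field: String, source: String): String = {
       field match {
         case "YEAR"  => s"toYear($source)"
         case "MONTH" => s"toMonth($source)"
         case "DAY"   => s"toDayOfMonth($source)"
         case "HOUR"  => s"toHour($source)"
         case _ => super.visitExtract(field, source)
       }
     }
   }
   ```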




[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r907173999


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -254,6 +254,55 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract("SECOND", v))
+    case Minute(child, _) =>
+      generateExpression(child).map(v => new V2Extract("MINUTE", v))
+    case Hour(child, _) =>
+      generateExpression(child).map(v => new V2Extract("HOUR", v))
+    case Month(child) =>
+      generateExpression(child).map(v => new V2Extract("MONTH", v))
+    case Quarter(child) =>
+      generateExpression(child).map(v => new V2Extract("QUARTER", v))
+    case Year(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR", v))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects should avoid to follow ISO semantics when handling DAY_OF_WEEK.

Review Comment:
   OK




[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r911800128


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Extract.java:
##########
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.expressions;
+
+import org.apache.spark.annotation.Evolving;
+
+import java.io.Serializable;
+
+/**
+ * Represent an extract function, which extracts and returns the value of a
+ * specified datetime field from a datetime or interval value expression.
+ * <p>
+ * The currently supported fields names following the ISO standard:
+ * <ol>
+ *  <li> <code>SECOND</code> Since 3.4.0 </li>
+ *  <li> <code>MINUTE</code> Since 3.4.0 </li>
+ *  <li> <code>HOUR</code> Since 3.4.0 </li>
+ *  <li> <code>MONTH</code> Since 3.4.0 </li>
+ *  <li> <code>QUARTER</code> Since 3.4.0 </li>
+ *  <li> <code>YEAR</code> Since 3.4.0 </li>
+ *  <li> <code>ISO_DAY_OF_WEEK</code> Since 3.4.0 </li>
+ *  <li> <code>DAY</code> Since 3.4.0 </li>
+ *  <li> <code>DOY</code> Since 3.4.0 </li>

Review Comment:
   to be consistent with `DAY_OF_WEEK` and `YEAR_OF_WEEK`, how about `DAY_OF_YEAR`?




[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r911797821


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Extract.java:
##########
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.expressions;
+
+import org.apache.spark.annotation.Evolving;
+
+import java.io.Serializable;
+
+/**
+ * Represent an extract function, which extracts and returns the value of a
+ * specified datetime field from a datetime or interval value expression.
+ * <p>
+ * The currently supported fields names following the ISO standard:
+ * <ol>
+ *  <li> <code>SECOND</code> Since 3.4.0 </li>
+ *  <li> <code>MINUTE</code> Since 3.4.0 </li>
+ *  <li> <code>HOUR</code> Since 3.4.0 </li>
+ *  <li> <code>MONTH</code> Since 3.4.0 </li>
+ *  <li> <code>QUARTER</code> Since 3.4.0 </li>
+ *  <li> <code>YEAR</code> Since 3.4.0 </li>
+ *  <li> <code>ISO_DAY_OF_WEEK</code> Since 3.4.0 </li>

Review Comment:
   shall we just use `DAY_OF_WEEK`? The doc already says we follow the ISO standard.




[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r889863328


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##########
@@ -121,4 +124,34 @@ private[sql] object H2Dialect extends JdbcDialect {
     }
     super.classifyException(message, e)
   }
+
+  override def compileExpression(expr: Expression): Option[String] = {
+    val jdbcSQLBuilder = new H2JDBCSQLBuilder()
+    try {
+      Some(jdbcSQLBuilder.build(expr))
+    } catch {
+      case NonFatal(e) =>
+        logWarning("Error occurs while compiling V2 expression", e)
+        None
+    }
+  }
+
+  class H2JDBCSQLBuilder extends JDBCSQLBuilder {
+
+    override def visitSQLFunction(funcName: String, inputs: Array[String]): String = {
+      funcName match {
+        case "WEEK_OF_YEAR" => visitWeekOfYear(inputs)
+        case "YEAR_OF_WEEK" => visitYearOfWeek(inputs)
+        case _ => super.visitSQLFunction(funcName, inputs)
+      }
+    }
+
+    def visitWeekOfYear(inputs: Array[String]): String = {

Review Comment:
   No need to create a new def.



##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##########
@@ -121,4 +124,34 @@ private[sql] object H2Dialect extends JdbcDialect {
     }
     super.classifyException(message, e)
   }
+
+  override def compileExpression(expr: Expression): Option[String] = {
+    val jdbcSQLBuilder = new H2JDBCSQLBuilder()
+    try {
+      Some(jdbcSQLBuilder.build(expr))
+    } catch {
+      case NonFatal(e) =>
+        logWarning("Error occurs while compiling V2 expression", e)
+        None
+    }
+  }
+
+  class H2JDBCSQLBuilder extends JDBCSQLBuilder {
+
+    override def visitSQLFunction(funcName: String, inputs: Array[String]): String = {
+      funcName match {
+        case "WEEK_OF_YEAR" => visitWeekOfYear(inputs)
+        case "YEAR_OF_WEEK" => visitYearOfWeek(inputs)
+        case _ => super.visitSQLFunction(funcName, inputs)
+      }
+    }
+
+    def visitWeekOfYear(inputs: Array[String]): String = {
+      "ISO_WEEK(" + inputs.mkString(",") + ")"
+    }
+
+    def visitYearOfWeek(inputs: Array[String]): String = {

Review Comment:
   ditto.
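   i.e. the string building could be inlined into the match, something like this (a sketch; `ISO_YEAR` is assumed to be the H2 counterpart for the week-based year, per the H2 docs):
   ```
   override def visitSQLFunction(funcName: String, inputs: Array[String]): String = {
     funcName match {
       case "WEEK_OF_YEAR" => "ISO_WEEK(" + inputs.mkString(",") + ")"
       case "YEAR_OF_WEEK" => "ISO_YEAR(" + inputs.mkString(",") + ")"
       case _ => super.visitSQLFunction(funcName, inputs)
     }
   }
   ```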





[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r907122926


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -254,6 +254,55 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract("SECOND", v))
+    case Minute(child, _) =>
+      generateExpression(child).map(v => new V2Extract("MINUTE", v))
+    case Hour(child, _) =>
+      generateExpression(child).map(v => new V2Extract("HOUR", v))
+    case Month(child) =>
+      generateExpression(child).map(v => new V2Extract("MONTH", v))
+    case Quarter(child) =>
+      generateExpression(child).map(v => new V2Extract("QUARTER", v))
+    case Year(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR", v))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects should avoid to follow ISO semantics when handling DAY_OF_WEEK.
+    case DayOfWeek(child) =>
+      generateExpression(child).map(v => new V2Extract("DAY_OF_WEEK", v))
+    case DayOfMonth(child) =>
+      generateExpression(child).map(v => new V2Extract("DAY_OF_MONTH", v))
+    case DayOfYear(child) =>
+      generateExpression(child).map(v => new V2Extract("DAY_OF_YEAR", v))
+    // The WEEK_OF_YEAR function in Spark returns the ISO week from a date/timestamp.
+    // Database dialects need to follow ISO semantics when handling WEEK_OF_YEAR.
+    case WeekOfYear(child) =>
+      generateExpression(child).map(v => new V2Extract("WEEK_OF_YEAR", v))
+    // The YEAR_OF_WEEK function in Spark returns the ISO week year from a date/timestamp.
+    // Database dialects need to follow ISO semantics when handling YEAR_OF_WEEK.
+    case YearOfWeek(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR_OF_WEEK", v))

Review Comment:
   Yes. We can replace `WEEK_OF_YEAR` with `WEEK`, `DAY_OF_YEAR` with `DOY`, and `DAY_OF_MONTH` with `DAY`.
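   i.e. the cases above would become something like (a sketch of the rename only):
   ```
   case DayOfMonth(child) =>
     generateExpression(child).map(v => new V2Extract("DAY", v))
   case DayOfYear(child) =>
     generateExpression(child).map(v => new V2Extract("DOY", v))
   case WeekOfYear(child) =>
     generateExpression(child).map(v => new V2Extract("WEEK", v))
   ```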




[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r910685251


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Extract.java:
##########
@@ -0,0 +1,120 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.expressions;
+
+import org.apache.spark.annotation.Evolving;
+
+import java.io.Serializable;
+
+/**
+ * Represent an extract function, which extracts and returns the value of a
+ * and a source expression where the field should be extracted.
+ * <p>
+ * The currently supported field names:

Review Comment:
   how about:
   ```
   The currently supported fields names following the ISO standard:
   <ol>
   ...
   <li> <code>SECOND</code> Since 3.4.0 </li>
   ...
   ```




[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r913413624


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1026,18 +1034,97 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
             |AND cast(dept as short) > 1
             |AND cast(bonus as decimal(20, 2)) > 1200""".stripMargin)
         checkFiltersRemoved(df6, ansiMode)
-        val expectedPlanFragment8 = if (ansiMode) {
+        val expectedPlanFragment6 = if (ansiMode) {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL, " +
             "CAST(BONUS AS string) LIKE '%30%', CAST(DEPT AS byte) > 1, ...,"
         } else {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL],"
         }
-        checkPushedInfo(df6, expectedPlanFragment8)
+        checkPushedInfo(df6, expectedPlanFragment6)
         checkAnswer(df6, Seq(Row(2, "david", 10000, 1300, true)))
       }
     }
   }
 
+  test("scan with filter push-down with date time functions") {
+    Seq(false, true).foreach { ansiMode =>
+      withSQLConf(SQLConf.ANSI_ENABLED.key -> ansiMode.toString,

Review Comment:
   OK




[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r913430428


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1026,18 +1034,93 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
             |AND cast(dept as short) > 1
             |AND cast(bonus as decimal(20, 2)) > 1200""".stripMargin)
         checkFiltersRemoved(df6, ansiMode)
-        val expectedPlanFragment8 = if (ansiMode) {
+        val expectedPlanFragment6 = if (ansiMode) {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL, " +
             "CAST(BONUS AS string) LIKE '%30%', CAST(DEPT AS byte) > 1, ...,"
         } else {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL],"
         }
-        checkPushedInfo(df6, expectedPlanFragment8)
+        checkPushedInfo(df6, expectedPlanFragment6)
         checkAnswer(df6, Seq(Row(2, "david", 10000, 1300, true)))
       }
     }
   }
 
+  test("scan with filter push-down with date time functions") {
+    withSQLConf(SQLConf.MAX_METADATA_STRING_LENGTH.key -> "200") {

Review Comment:
   can we do it for the entire `JDBCV2Suite`? I think we just need to wrap the invocation of `checkKeywordsExistsInExplain` with this `withSQLConf`
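   For example, the suite could route every pushed-info check through one helper (a sketch; the limit value here is arbitrary):
   ```
   // Hypothetical helper: set the metadata-length config once for all explain checks.
   private def checkPushedInfo(df: DataFrame, expectedPlanFragment: String): Unit = {
     withSQLConf(SQLConf.MAX_METADATA_STRING_LENGTH.key -> "1000") {
       checkKeywordsExistsInExplain(df, expectedPlanFragment)
     }
   }
   ```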




[GitHub] [spark] AmplabJenkins commented on pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on PR #36663:
URL: https://github.com/apache/spark/pull/36663#issuecomment-1139764288

   Can one of the admins verify this patch?



[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r913411941


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1026,18 +1034,97 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
             |AND cast(dept as short) > 1
             |AND cast(bonus as decimal(20, 2)) > 1200""".stripMargin)
         checkFiltersRemoved(df6, ansiMode)
-        val expectedPlanFragment8 = if (ansiMode) {
+        val expectedPlanFragment6 = if (ansiMode) {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL, " +
             "CAST(BONUS AS string) LIKE '%30%', CAST(DEPT AS byte) > 1, ...,"
         } else {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL],"
         }
-        checkPushedInfo(df6, expectedPlanFragment8)
+        checkPushedInfo(df6, expectedPlanFragment6)
         checkAnswer(df6, Seq(Row(2, "david", 10000, 1300, true)))
       }
     }
   }
 
+  test("scan with filter push-down with date time functions") {
+    Seq(false, true).foreach { ansiMode =>
+      withSQLConf(SQLConf.ANSI_ENABLED.key -> ansiMode.toString,

Review Comment:
   Could we avoid ANSI mode?




[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r893244434


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,90 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATE_DIFF</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_DIFF(end_date, start_date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>TRUNC</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>TRUNC(date, format)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>SECOND</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>SECOND(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MINUTE</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MINUTE(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>HOUR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>HOUR(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MONTH</code>

Review Comment:
   But in different databases the performance of the EXTRACT function and the field extraction functions can differ. For example, PostgreSQL, MySQL, Snowflake and the like support both forms, while something like ClickHouse only uses the field extraction functions.




[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r908137438


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -254,6 +254,55 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract("SECOND", v))
+    case Minute(child, _) =>
+      generateExpression(child).map(v => new V2Extract("MINUTE", v))
+    case Hour(child, _) =>
+      generateExpression(child).map(v => new V2Extract("HOUR", v))
+    case Month(child) =>
+      generateExpression(child).map(v => new V2Extract("MONTH", v))
+    case Quarter(child) =>
+      generateExpression(child).map(v => new V2Extract("QUARTER", v))
+    case Year(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR", v))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects should avoid to follow ISO semantics when handling DAY_OF_WEEK.
+    case DayOfWeek(child) =>
+      generateExpression(child).map(v => new V2Extract("DAY_OF_WEEK", v))
+    case DayOfMonth(child) =>
+      generateExpression(child).map(v => new V2Extract("DAY_OF_MONTH", v))
+    case DayOfYear(child) =>
+      generateExpression(child).map(v => new V2Extract("DAY_OF_YEAR", v))
+    // The WEEK_OF_YEAR function in Spark returns the ISO week from a date/timestamp.
+    // Database dialects need to follow ISO semantics when handling WEEK_OF_YEAR.
+    case WeekOfYear(child) =>
+      generateExpression(child).map(v => new V2Extract("WEEK_OF_YEAR", v))
+    // The YEAR_OF_WEEK function in Spark returns the ISO week year from a date/timestamp.
+    // Database dialects need to follow ISO semantics when handling YEAR_OF_WEEK.
+    case YearOfWeek(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR_OF_WEEK", v))

Review Comment:
   sorry typo, should be `-1`




[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r907174118


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -254,6 +254,55 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract("SECOND", v))
+    case Minute(child, _) =>
+      generateExpression(child).map(v => new V2Extract("MINUTE", v))
+    case Hour(child, _) =>
+      generateExpression(child).map(v => new V2Extract("HOUR", v))
+    case Month(child) =>
+      generateExpression(child).map(v => new V2Extract("MONTH", v))
+    case Quarter(child) =>
+      generateExpression(child).map(v => new V2Extract("QUARTER", v))
+    case Year(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR", v))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects should avoid following ISO semantics when handling DAY_OF_WEEK.

Review Comment:
   OK





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r911801946


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -344,6 +344,55 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract("SECOND", v))
+    case Minute(child, _) =>
+      generateExpression(child).map(v => new V2Extract("MINUTE", v))
+    case Hour(child, _) =>
+      generateExpression(child).map(v => new V2Extract("HOUR", v))
+    case Month(child) =>
+      generateExpression(child).map(v => new V2Extract("MONTH", v))
+    case Quarter(child) =>
+      generateExpression(child).map(v => new V2Extract("QUARTER", v))
+    case Year(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR", v))
+    case DayOfWeek(child) =>
+      generateExpression(child).map(v => new GeneralScalarExpression("+",
+        Array[V2Expression](new GeneralScalarExpression("%",
+          Array[V2Expression](new V2Extract("ISO_DAY_OF_WEEK", v), LiteralValue(7, IntegerType))),
+          LiteralValue(1, IntegerType))))
+    case WeekDay(child) =>
+      generateExpression(child).map(v => new GeneralScalarExpression("-",

Review Comment:
   ditto





[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r883471600


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,96 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATEDIFF</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATEDIFF(end_date, start_date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>TRUNC</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>TRUNC(date, format)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>EXTRACT</code>

Review Comment:
   Extract is replaced by Year, YearOfWeek, Quarter, and so on, so we should not do this translation.





[GitHub] [spark] huaxingao commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

huaxingao commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r882790464


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -539,6 +547,44 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
         }
         checkPushedInfo(df8, expectedPlanFragment8)
         checkAnswer(df8, Seq(Row(2, "david", 10000, 1300, true)))
+
+        val df9 = sql("select name FROM h2.test.datetime where " +

Review Comment:
   nit: capitalize SQL keywords such as `select` and `where`?





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r882418668


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,96 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATEDIFF</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATEDIFF(end_date, start_date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>TRUNC</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>TRUNC(date, format)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>EXTRACT</code>

Review Comment:
   It seems this PR does not support EXTRACT.





[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r887826927


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -542,6 +550,44 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
         }
         checkPushedInfo(df8, expectedPlanFragment8)
         checkAnswer(df8, Seq(Row(2, "david", 10000, 1300, true)))
+
+        val df9 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "dayofyear(date1) > 100 and dayofmonth(date1) > 10 and dayofweek(date1) > 1")
+        checkFiltersRemoved(df9)
+        val expectedPlanFragment9 =
+          "PushedFilters: [DATE1 IS NOT NULL, DAY_OF_YEAR(DATE1) > 100, " +
+          "DAY_OF_MONTH(DATE1) > 10, DAY_OF_WEEK(DATE1) > 1]"
+        checkPushedInfo(df9, expectedPlanFragment9)
+        checkAnswer(df9, Seq(Row("amy"), Row("alex")))
+
+        val df10 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "year(date1) = 2022 and quarter(date1) = 2 and month(date1) = 5")
+        checkFiltersRemoved(df10)
+        val expectedPlanFragment10 =
+          "PushedFilters: [DATE1 IS NOT NULL, YEAR(DATE1) = 2022, " +
+          "QUARTER(DATE1) = 2, MONTH(DATE1) = 5]"
+        checkPushedInfo(df10, expectedPlanFragment10)
+        checkAnswer(df10, Seq(Row("amy"), Row("alex")))
+
+        val df11 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "hour(time1) = 0 and minute(time1) = 0 and second(time1) = 0")
+        checkFiltersRemoved(df11)
+        val expectedPlanFragment11 =
+          "PushedFilters: [TIME1 IS NOT NULL, HOUR(TIME1) = 0, " +
+          "MINUTE(TIME1) = 0, SECOND(TIME1) = 0]"
+        checkPushedInfo(df11, expectedPlanFragment11)
+        checkAnswer(df11, Seq(Row("amy"), Row("alex")))
+
+        // H2 cannot compile some of these expressions
+        val df12 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "trunc(date1, 'week') = date'2022-05-16' and date_add(date1, 1) = date'2022-05-20' " +
+          "and datediff(date1, '2022-05-10') > 0 and extract(week from date1) = 20 " +
+          "and extract(YEAROFWEEK from date1) = 2022")

Review Comment:
   Could we add a test case that only tests `extract(week from date1)` and `extract(YEAROFWEEK from date1)`? Then we can see that `week` and `YEAROFWEEK` are pushed down.
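
   A minimal sketch of such a standalone test, reusing the `h2.test.datetime` fixture and the helpers already used in this suite (the expected plan text assumes the `EXTRACT` form of the translation):

   ```scala
   // Exercises only the two ISO fields, so a pushdown regression here
   // cannot be masked by the other datetime functions in the combined test.
   val df = sql("SELECT name FROM h2.test.datetime WHERE " +
     "extract(week from date1) > 10 AND extract(YEAROFWEEK from date1) = 2022")
   checkFiltersRemoved(df)
   checkPushedInfo(df, "PushedFilters: [DATE1 IS NOT NULL, " +
     "EXTRACT(WEEK FROM DATE1) > 10, EXTRACT(YEAR_OF_WEEK FROM DATE1) = 2022]")
   checkAnswer(df, Seq(Row("alex"), Row("amy")))
   ```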





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r905936600


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Extract.java:
##########
@@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.expressions;
+
+import org.apache.spark.annotation.Evolving;
+
+import java.io.Serializable;
+
+/**
+ * Represent an extract expression, which contains a field to be extracted
+ * and a source expression from which the field should be extracted.

Review Comment:
   let's also document all the possible field names here.





[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r910737639


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##########
@@ -123,4 +127,26 @@ private[sql] object H2Dialect extends JdbcDialect {
     }
     super.classifyException(message, e)
   }
+
+  override def compileExpression(expr: Expression): Option[String] = {
+    val jdbcSQLBuilder = new H2JDBCSQLBuilder()
+    try {
+      Some(jdbcSQLBuilder.build(expr))
+    } catch {
+      case NonFatal(e) =>
+        logWarning("Error occurs while compiling V2 expression", e)
+        None
+    }
+  }
+
+  class H2JDBCSQLBuilder extends JDBCSQLBuilder {
+
+    override def visitExtract(field: String, source: String): String = {
+      field match {
+        case "WEEK" => s"EXTRACT(ISO_WEEK FROM $source)"
+        case "YEAR_OF_WEEK" => s"EXTRACT(ISO_WEEK_YEAR FROM $source)"

Review Comment:
   But the ISO field names differ across databases.
   PostgreSQL uses `isoyear` and `isodow`.
   Snowflake uses `dayofweekiso`, `weekiso`, and `yearofweekiso`.
   H2 uses `ISO_WEEK` and `ISO_DAY_OF_WEEK`.
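
   As a hypothetical illustration (not part of this PR), a PostgreSQL dialect could handle the same mismatch by overriding `visitExtract` with its own spellings, following the same pattern as the H2 builder above:

   ```scala
   class PostgresJDBCSQLBuilder extends JDBCSQLBuilder {
     override def visitExtract(field: String, source: String): String = {
       // Map the engine-neutral field names to PostgreSQL's ISO spellings;
       // any other field is assumed to match PostgreSQL's name as-is.
       val newField = field match {
         case "DAY_OF_WEEK" => "ISODOW"
         case "YEAR_OF_WEEK" => "ISOYEAR"
         case _ => field
       }
       s"EXTRACT($newField FROM $source)"
     }
   }
   ```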





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r911798925


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Extract.java:
##########
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.expressions;
+
+import org.apache.spark.annotation.Evolving;
+
+import java.io.Serializable;
+
+/**
+ * Represent an extract function, which extracts and returns the value of a
+ * specified datetime field from a datetime or interval value expression.
+ * <p>
+ * The currently supported field names, following the ISO standard:
+ * <ol>
+ *  <li> <code>SECOND</code> Since 3.4.0 </li>
+ *  <li> <code>MINUTE</code> Since 3.4.0 </li>
+ *  <li> <code>HOUR</code> Since 3.4.0 </li>
+ *  <li> <code>MONTH</code> Since 3.4.0 </li>
+ *  <li> <code>QUARTER</code> Since 3.4.0 </li>
+ *  <li> <code>YEAR</code> Since 3.4.0 </li>
+ *  <li> <code>ISO_DAY_OF_WEEK</code> Since 3.4.0 </li>

Review Comment:
   or `DOW`, to be consistent with `DOY`





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r911805060


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1026,14 +1034,70 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
             |AND cast(dept as short) > 1
             |AND cast(bonus as decimal(20, 2)) > 1200""".stripMargin)
         checkFiltersRemoved(df6, ansiMode)
-        val expectedPlanFragment8 = if (ansiMode) {
+        val expectedPlanFragment6 = if (ansiMode) {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL, " +
             "CAST(BONUS AS string) LIKE '%30%', CAST(DEPT AS byte) > 1, ...,"
         } else {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL],"
         }
-        checkPushedInfo(df6, expectedPlanFragment8)
+        checkPushedInfo(df6, expectedPlanFragment6)
         checkAnswer(df6, Seq(Row(2, "david", 10000, 1300, true)))
+
+        val df7 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "dayofyear(date1) > 100 AND dayofmonth(date1) > 10 ")
+        checkFiltersRemoved(df7)
+        val expectedPlanFragment7 =
+          "PushedFilters: [DATE1 IS NOT NULL, EXTRACT(DOY FROM DATE1) > 100, " +
+            "EXTRACT(DAY FROM DATE1) > 10]"
+        checkPushedInfo(df7, expectedPlanFragment7)
+        checkAnswer(df7, Seq(Row("amy"), Row("alex")))
+
+        val df8 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "year(date1) = 2022 AND quarter(date1) = 2 AND month(date1) = 5")
+        checkFiltersRemoved(df8)
+        val expectedPlanFragment8 =
+          "[DATE1 IS NOT NULL, EXTRACT(YEAR FROM DATE1) = 2022, " +
+            "EXTRACT(QUARTER FROM DATE1) = 2, EXTRACT(MON...,"
+        checkPushedInfo(df8, expectedPlanFragment8)
+        checkAnswer(df8, Seq(Row("amy"), Row("alex")))
+
+        val df9 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "hour(time1) = 0 AND minute(time1) = 0 AND second(time1) = 0")
+        checkFiltersRemoved(df9)
+        val expectedPlanFragment9 =
+          "PushedFilters: [TIME1 IS NOT NULL, EXTRACT(HOUR FROM TIME1) = 0, " +
+            "EXTRACT(MINUTE FROM TIME1) = 0, EXTRACT(SECOND ..."
+        checkPushedInfo(df9, expectedPlanFragment9)
+        checkAnswer(df9, Seq(Row("amy"), Row("alex")))
+
+        val df10 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "extract(WEEk from date1) > 10 AND extract(YEAROFWEEK from date1) = 2022")
+        checkFiltersRemoved(df10)
+        val expectedPlanFragment10 =
+          "PushedFilters: [DATE1 IS NOT NULL, EXTRACT(WEEK FROM DATE1) > 10, " +
+            "EXTRACT(YEAR_OF_WEEK FROM DATE1) = 2022]"
+        checkPushedInfo(df10, expectedPlanFragment10)
+        checkAnswer(df10, Seq(Row("alex"), Row("amy")))
+
+        // H2 cannot compile these expressions, so the filters are not pushed down
+        val df11 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "trunc(date1, 'week') = date'2022-05-16' AND date_add(date1, 1) = date'2022-05-20' " +
+          "AND datediff(date1, '2022-05-10') > 0")
+        checkFiltersRemoved(df11, false)
+        val expectedPlanFragment11 =
+          "PushedFilters: [DATE1 IS NOT NULL]"
+        checkPushedInfo(df11, expectedPlanFragment11)
+        checkAnswer(df11, Seq(Row("amy")))
+
+        val df12 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "weekday(date1) = 2 AND dayofweek(date1) = 4")
+        checkFiltersRemoved(df12)
+        val expectedPlanFragment12 =
+          "PushedFilters: [DATE1 IS NOT NULL, " +
+          "(((EXTRACT(DAY_OF_WEEK FROM DATE1) + 5) % 7)+ 1 - 1) = 2, " +

Review Comment:
   Why is there a `-1` right after the `+1`?
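
   (For reference, the same test also constrains `dayofweek(date1) = 4`, i.e. a Wednesday, for which `weekday` is 2; plugging in gives ((4 + 5) % 7) = 2 already, so the trailing `+ 1 - 1` cancels out.)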





[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r912328995


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##########
@@ -123,4 +125,27 @@ private[sql] object H2Dialect extends JdbcDialect {
     }
     super.classifyException(message, e)
   }
+
+  override def compileExpression(expr: Expression): Option[String] = {
+    val jdbcSQLBuilder = new H2JDBCSQLBuilder()
+    try {
+      Some(jdbcSQLBuilder.build(expr))
+    } catch {
+      case NonFatal(e) =>
+        logWarning("Error occurs while compiling V2 expression", e)
+        None
+    }
+  }
+
+  class H2JDBCSQLBuilder extends JDBCSQLBuilder {
+
+    override def visitExtract(field: String, source: String): String = {
+      field match {
+        case "DAY_OF_WEEK" => s"EXTRACT(ISO_DAY_OF_WEEK FROM $source)"
+        case "WEEK" => s"EXTRACT(ISO_WEEK FROM $source)"
+        case "YEAR_OF_WEEK" => s"EXTRACT(ISO_WEEK_YEAR FROM $source)"

Review Comment:
   ```
   val newField = field match {
     case "DAY_OF_WEEK" => "ISO_DAY_OF_WEEK"
     case "WEEK" => "ISO_WEEK"
     case "YEAR_OF_WEEK" => "ISO_WEEK_YEAR"
     case _ => field
   }
   s"EXTRACT($newField FROM $source)"
   ```





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r911583956


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java:
##########
@@ -290,4 +297,15 @@ protected String visitTrim(String direction, String[] inputs) {
       return "TRIM(" + direction + " " + inputs[1] + " FROM " + inputs[0] + ")";
     }
   }
+
+  protected String visitExtract(String field, String source) {
+    switch (field) {
+      case "DAY_OF_WEEK":
+        return "(EXTRACT(ISO_DAY_OF_WEEK FROM " + source + ") % 7)+ 1";

Review Comment:
   Then I think `DAY_OF_WEEK` is fine. Different dialects must override it anyway.





[GitHub] [spark] cloud-fan commented on pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

cloud-fan commented on PR #36663:
URL: https://github.com/apache/spark/pull/36663#issuecomment-1178506261

   thanks, merging to master!




[GitHub] [spark] chenzhx commented on pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

chenzhx commented on PR #36663:
URL: https://github.com/apache/spark/pull/36663#issuecomment-1178542638

   @cloud-fan @beliefer @huaxingao Thank you for your review.




[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r903309304


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Extract.java:
##########
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.expressions;
+
+import org.apache.spark.annotation.Evolving;
+
+/**
+ * The general representation of extract expressions, which contains the upper-cased expression
+ * name and all the children expressions. Please also see {@link Extract}
+ * for the supported extract expressions.
+ * <p>
+ * The currently supported extract expressions:
+ * <ol>
+ *  <li>Name: <code>EXTRACT</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>EXTRACT(field FROM source)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ * </ol>
+ * @since 3.4.0
+ */
+
+@Evolving
+public class Extract extends GeneralScalarExpression {
+
+  public Extract(String field, Expression[] children) {
+    super(field, children);
+  }
+
+}

Review Comment:
   newline at the end of file.



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Extract.java:
##########
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.expressions;
+
+import org.apache.spark.annotation.Evolving;
+
+/**
+ * The general representation of extract expressions, which contains the upper-cased expression
+ * name and all the children expressions. Please also see {@link Extract}
+ * for the supported extract expressions.
+ * <p>
+ * The currently supported extract expressions:
+ * <ol>
+ *  <li>Name: <code>EXTRACT</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>EXTRACT(field FROM source)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ * </ol>
+ * @since 3.4.0
+ */
+
+@Evolving
+public class Extract extends GeneralScalarExpression {
+

Review Comment:
   If we need to add `Extract`, please reference `Cast`.
   





[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r903240469


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -262,6 +262,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))

Review Comment:
   We can translate `SECOND` to `EXTRACT` directly.



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -262,6 +262,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))

Review Comment:
   ditto



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -262,6 +262,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))

Review Comment:
   ditto



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -262,6 +262,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))

Review Comment:
   ditto



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -262,6 +262,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects should avoid following ISO semantics when handling DAY_OF_WEEK.
+    case DayOfWeek(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAY_OF_WEEK", Array[V2Expression](v)))
+    case DayOfMonth(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAY_OF_MONTH", Array[V2Expression](v)))
+    case DayOfYear(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAY_OF_YEAR", Array[V2Expression](v)))

Review Comment:
   ditto



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -262,6 +262,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects should avoid following ISO semantics when handling DAY_OF_WEEK.
+    case DayOfWeek(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAY_OF_WEEK", Array[V2Expression](v)))
+    case DayOfMonth(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAY_OF_MONTH", Array[V2Expression](v)))
+    case DayOfYear(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAY_OF_YEAR", Array[V2Expression](v)))
+    // The WEEK_OF_YEAR function in Spark returns the ISO week from a date/timestamp.
+    // Database dialects need to follow ISO semantics when handling WEEK_OF_YEAR.
+    case WeekOfYear(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("WEEK_OF_YEAR", Array[V2Expression](v)))

Review Comment:
   ditto



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -262,6 +262,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects should avoid following ISO semantics when handling DAY_OF_WEEK.
+    case DayOfWeek(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAY_OF_WEEK", Array[V2Expression](v)))
+    case DayOfMonth(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAY_OF_MONTH", Array[V2Expression](v)))
+    case DayOfYear(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAY_OF_YEAR", Array[V2Expression](v)))
+    // The WEEK_OF_YEAR function in Spark returns the ISO week from a date/timestamp.
+    // Database dialects need to follow ISO semantics when handling WEEK_OF_YEAR.
+    case WeekOfYear(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("WEEK_OF_YEAR", Array[V2Expression](v)))
+    // The YEAR_OF_WEEK function in Spark returns the ISO week year from a date/timestamp.
+    // Database dialects need to follow ISO semantics when handling YEAR_OF_WEEK.
+    case YearOfWeek(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR_OF_WEEK", Array[V2Expression](v)))

Review Comment:
   ditto



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -262,6 +262,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects should avoid following ISO semantics when handling DAY_OF_WEEK.
+    case DayOfWeek(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAY_OF_WEEK", Array[V2Expression](v)))

Review Comment:
   ditto



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -262,6 +262,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects should avoid following ISO semantics when handling DAY_OF_WEEK.
+    case DayOfWeek(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAY_OF_WEEK", Array[V2Expression](v)))
+    case DayOfMonth(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAY_OF_MONTH", Array[V2Expression](v)))

Review Comment:
   ditto



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -262,6 +262,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))

Review Comment:
   ditto



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -262,6 +262,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))

Review Comment:
   ditto





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r903241878


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -262,6 +262,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))

Review Comment:
   let's translate to `EXTRACT` earlier. We should translate these catalyst field extraction functions to the v2 `EXTRACT` function, which has 2 parameters: the input datetime, and the field string.
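   
   A hedged sketch of that shape, reusing the two-argument `Extract(field, source)` v2 expression that this PR introduces later (referenced as `V2Extract` in the builder):
   ```scala
   // Translate catalyst field-extraction expressions into a single v2 EXTRACT
   // node (field name string + input datetime) instead of one scalar function each.
   case Second(child, _) =>
     generateExpression(child).map(v => new V2Extract("SECOND", v))
   case Minute(child, _) =>
     generateExpression(child).map(v => new V2Extract("MINUTE", v))
   ```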





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r882456701


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -103,6 +103,14 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
         "(1, 'bottle', 11111111111111111111.123)").executeUpdate()
       conn.prepareStatement("INSERT INTO \"test\".\"item\" VALUES " +
         "(1, 'bottle', 99999999999999999999.123)").executeUpdate()
+
+      conn.prepareStatement(
+        "CREATE TABLE \"test\".\"datetime\" (name TEXT(32),date1 DATE , time1 TIMESTAMP)")

Review Comment:
   ```suggestion
           "CREATE TABLE \"test\".\"datetime\" (name TEXT(32), date1 DATE, time1 TIMESTAMP)")
   ```





[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r883473729


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -539,6 +547,44 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
         }
         checkPushedInfo(df8, expectedPlanFragment8)
         checkAnswer(df8, Seq(Row(2, "david", 10000, 1300, true)))
+
+        val df9 = sql("select name FROM h2.test.datetime where " +

Review Comment:
   OK 





[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r881416450


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -723,6 +731,53 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
     checkAnswer(df5, Seq(Row(6, "jen", 12000, 1200, true)))
   }
 
+  test("scan with filter push-down with datetime functions") {

Review Comment:
   We can put these tests after `test("scan with filter push-down with ansi mode") {`.



##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##########
@@ -32,7 +32,9 @@ private object H2Dialect extends JdbcDialect {
 
   private val supportedFunctions =
     Set("ABS", "COALESCE", "LN", "EXP", "POWER", "SQRT", "FLOOR", "CEIL",
-      "SUBSTRING", "UPPER", "LOWER", "TRANSLATE", "TRIM")
+      "SUBSTRING", "UPPER", "LOWER", "TRANSLATE", "TRIM",
+      "EXTRACT", "SECOND", "MINUTE", "HOUR", "MONTH", "QUARTER", "YEAR",
+      "DAYOFYEAR", "DAYOFMONTH", "DAYOFWEEK")

Review Comment:
   ```
   Set("ABS", "COALESCE", "LN", "EXP", "POWER", "SQRT", "FLOOR", "CEIL", "SUBSTRING", "UPPER",
     "LOWER", "TRANSLATE", "TRIM", "EXTRACT", "SECOND", "MINUTE", "HOUR", "MONTH",
     "QUARTER", "YEAR", "DAYOFYEAR", "DAYOFMONTH", "DAYOFWEEK")
   ```



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -259,6 +259,52 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATEDIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case extract: Extract =>

Review Comment:
   `Extract` is replaced by `Year`, `YearOfWeek`, `Quarter` and so on during analysis,
   so we should not do this translation.
   Could you investigate functions such as `YearOfWeek` and `WeekOfYear` and translate them?
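   
   For illustration (hedged; the exact analyzed-plan text may differ by version), `EXTRACT` is resolved away before the optimizer runs, so a `case extract: Extract` branch would never match:
   ```scala
   // EXTRACT(YEAR FROM date1) is replaced by the catalyst Year expression
   // during analysis, so the pushdown builder only sees year(date1).
   spark.sql("SELECT EXTRACT(YEAR FROM date1) FROM t").queryExecution.analyzed
   ```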



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -259,6 +259,52 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATEDIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case extract: Extract =>
+      val childrenExpressions = extract.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == extract.children.length) {
+        Some(new GeneralScalarExpression("EXTRACT", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    case DayOfWeek(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAYOFWEEK", Array[V2Expression](v)))

Review Comment:
   `DAY_OF_WEEK`



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,84 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATEDIFF</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATEDIFF(end_date, start_date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>TRUNC</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>TRUNC(date, format)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>EXTRACT</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>EXTRACT(field, source, replacement)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>SECOND</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>SECOND(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MINUTE</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MINUTE(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>HOUR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>HOUR(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MONTH</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MONTH(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>QUARTER</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>QUARTER(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>YEAR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>YEAR(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DAYOFWEEK</code>

Review Comment:
   The name does not seem correct.



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,84 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATEDIFF</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATEDIFF(end_date, start_date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>TRUNC</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>TRUNC(date, format)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>EXTRACT</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>EXTRACT(field, source, replacement)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>SECOND</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>SECOND(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MINUTE</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MINUTE(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>HOUR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>HOUR(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MONTH</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MONTH(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>QUARTER</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>QUARTER(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>YEAR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>YEAR(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DAYOFWEEK</code>

Review Comment:
   Refer to http://www.h2database.com/html/functions.html#day_of_week
   and https://prestodb.io/docs/current/functions/datetime.html#day_of_week



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -259,6 +259,52 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATEDIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case extract: Extract =>
+      val childrenExpressions = extract.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == extract.children.length) {
+        Some(new GeneralScalarExpression("EXTRACT", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    case DayOfWeek(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAYOFWEEK", Array[V2Expression](v)))
+    case DayOfMonth(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAYOFMONTH", Array[V2Expression](v)))

Review Comment:
   `DAY_OF_MONTH`



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -259,6 +259,52 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATEDIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case extract: Extract =>
+      val childrenExpressions = extract.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == extract.children.length) {
+        Some(new GeneralScalarExpression("EXTRACT", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    case DayOfWeek(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAYOFWEEK", Array[V2Expression](v)))
+    case DayOfMonth(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAYOFMONTH", Array[V2Expression](v)))
+    case DayOfYear(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAYOFYEAR", Array[V2Expression](v)))

Review Comment:
   `DAY_OF_YEAR`





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r882407577


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java:
##########
@@ -263,4 +280,8 @@ protected String visitTrim(String direction, String[] inputs) {
       return "TRIM(" + direction + " " + inputs[1] + " FROM " + inputs[0] + ")";
     }
   }
+
+  protected String visitExtract(String[] inputs) {
+    return "EXTRACT(" + inputs[0] + " FROM " + inputs[1] + ")";

Review Comment:
   the doc says EXTRACT has 3 parameters: https://github.com/apache/spark/pull/36663/files#diff-85a02a33061cf83f3f1cac24e0ba805eabb8d833905c76a4181258431d894755R219
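   
   For reference, a small sketch of what the two-argument form above renders (illustrative values, not tied to any dialect):
   ```scala
   // inputs(0) is the field name and inputs(1) is the source expression's SQL,
   // so the method emits standard EXTRACT syntax.
   val inputs = Array("YEAR", "date1")
   val sql = s"EXTRACT(${inputs(0)} FROM ${inputs(1)})"  // EXTRACT(YEAR FROM date1)
   ```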





[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r889908186


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -259,6 +259,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects do not need to follow ISO semantics when handling DAY_OF_WEEK.

Review Comment:
   ```suggestion
       // Database dialects should avoid following ISO semantics when handling DAY_OF_WEEK.
   ```





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r893067774


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,90 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATE_DIFF</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_DIFF(end_date, start_date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>TRUNC</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>TRUNC(date, format)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>SECOND</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>SECOND(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MINUTE</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MINUTE(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>HOUR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>HOUR(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MONTH</code>

Review Comment:
   I'm wondering if the `EXTRACT` function is more common than these field extraction functions?





[GitHub] [spark] beliefer commented on pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on PR #36663:
URL: https://github.com/apache/spark/pull/36663#issuecomment-1178525722

   @cloud-fan @huaxingao Thank you for your review.
   @chenzhx Thank you for your work.




[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r913359294


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -344,6 +344,59 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract("SECOND", v))
+    case Minute(child, _) =>
+      generateExpression(child).map(v => new V2Extract("MINUTE", v))
+    case Hour(child, _) =>
+      generateExpression(child).map(v => new V2Extract("HOUR", v))
+    case Month(child) =>
+      generateExpression(child).map(v => new V2Extract("MONTH", v))
+    case Quarter(child) =>
+      generateExpression(child).map(v => new V2Extract("QUARTER", v))
+    case Year(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR", v))
+    // DayOfWeek uses 1 = Sunday, 2 = Monday, ... and ISO standard is Monday=1, ...,
+    // so we use the formula ((ISO_standard % 7) + 1) to do translation.
+    case DayOfWeek(child) =>
+      generateExpression(child).map(v => new GeneralScalarExpression("+",
+        Array[V2Expression](new GeneralScalarExpression("%",
+          Array[V2Expression](new V2Extract("DAY_OF_WEEK", v), LiteralValue(7, IntegerType))),
+          LiteralValue(1, IntegerType))))
+    // WeekDay uses 0 = Monday, 1 = Tuesday, ... and ISO standard is Monday=1, ...,

Review Comment:
   ```suggestion
       // WeekDay uses Monday=0, Tuesday=1, ... and ISO standard is Monday=1, ...,
   ```
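   
   As a quick sanity check of the `((ISO_standard % 7) + 1)` rewrite above, assuming ISO numbering Monday=1 ... Sunday=7 and Spark's `DayOfWeek` numbering Sunday=1 ... Saturday=7:
   ```scala
   // Push every ISO day-of-week value through the rewrite; the result matches
   // Spark's numbering: Monday(1) -> 2, ..., Saturday(6) -> 7, Sunday(7) -> 1.
   val isoToSpark = (1 to 7).map(iso => iso -> ((iso % 7) + 1))
   // Vector((1,2), (2,3), (3,4), (4,5), (5,6), (6,7), (7,1))
   ```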





[GitHub] [spark] cloud-fan closed pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions
URL: https://github.com/apache/spark/pull/36663




[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r907162042


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Extract.java:
##########
@@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.expressions;
+
+import org.apache.spark.annotation.Evolving;
+
+import java.io.Serializable;
+
+/**
+ * Represent an extract expression, which contains a field to be extracted
+ * and a source expression from which the field should be extracted.

Review Comment:
   OK





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r910687319


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##########
@@ -123,4 +127,26 @@ private[sql] object H2Dialect extends JdbcDialect {
     }
     super.classifyException(message, e)
   }
+
+  override def compileExpression(expr: Expression): Option[String] = {
+    val jdbcSQLBuilder = new H2JDBCSQLBuilder()
+    try {
+      Some(jdbcSQLBuilder.build(expr))
+    } catch {
+      case NonFatal(e) =>
+        logWarning("Error occurs while compiling V2 expression", e)
+        None
+    }
+  }
+
+  class H2JDBCSQLBuilder extends JDBCSQLBuilder {
+
+    override def visitExtract(field: String, source: String): String = {
+      field match {
+        case "WEEK" => s"EXTRACT(ISO_WEEK FROM $source)"
+        case "YEAR_OF_WEEK" => s"EXTRACT(ISO_WEEK_YEAR FROM $source)"

Review Comment:
   Is it common that databases add the `ISO_` prefix to field names?





[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r910881528


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java:
##########
@@ -290,4 +297,15 @@ protected String visitTrim(String direction, String[] inputs) {
       return "TRIM(" + direction + " " + inputs[1] + " FROM " + inputs[0] + ")";
     }
   }
+
+  protected String visitExtract(String field, String source) {
+    switch (field) {
+      case "DAY_OF_WEEK":
+        return "(EXTRACT(ISO_DAY_OF_WEEK FROM " + source + ") % 7)+ 1";

Review Comment:
   In different databases, the ISO day-of-week field name is different.
   PostgreSQL uses `isodow`.
   Snowflake uses `dow_iso`.
   H2 uses `ISO_DAY_OF_WEEK`.
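   
   A hypothetical sketch of how another dialect could map the same v2 field name onto its own keyword, following the `H2JDBCSQLBuilder` pattern in this PR (the PostgreSQL builder below is illustrative only, not part of this change):
   ```scala
   class PostgresJDBCSQLBuilder extends JDBCSQLBuilder {
     override def visitExtract(field: String, source: String): String = field match {
       // PostgreSQL spells the ISO day-of-week field "isodow".
       case "DAY_OF_WEEK" => s"EXTRACT(ISODOW FROM $source)"
       case _ => super.visitExtract(field, source)
     }
   }
   ```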





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r911584417


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -344,6 +344,51 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract("SECOND", v))
+    case Minute(child, _) =>
+      generateExpression(child).map(v => new V2Extract("MINUTE", v))
+    case Hour(child, _) =>
+      generateExpression(child).map(v => new V2Extract("HOUR", v))
+    case Month(child) =>
+      generateExpression(child).map(v => new V2Extract("MONTH", v))
+    case Quarter(child) =>
+      generateExpression(child).map(v => new V2Extract("QUARTER", v))
+    case Year(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR", v))
+    case DayOfWeek(child) =>
+      generateExpression(child).map(v => new V2Extract("DAY_OF_WEEK", v))

Review Comment:
   Please make sure we follow the ISO standard when translating to v2 expressions. Spark `DayOfWeek` does not follow the ISO standard and needs extra translation.
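   
   To make the mismatch concrete (a hedged example; 2022-06-27 is a Monday):
   ```scala
   // Spark numbers days Sunday=1 ... Saturday=7, so a Monday yields 2,
   // while the ISO day-of-week for the same date is 1 (Monday=1 ... Sunday=7).
   spark.sql("SELECT dayofweek(DATE'2022-06-27')").show()  // 2
   ```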





[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r883359744


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,90 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATEDIFF</code>

Review Comment:
   `DATE_DIFF`



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,90 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATEDIFF</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATEDIFF(end_date, start_date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>TRUNC</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>TRUNC(date, format)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>SECOND</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>SECOND(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MINUTE</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MINUTE(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>HOUR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>HOUR(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MONTH</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MONTH(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>QUARTER</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>QUARTER(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>YEAR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>YEAR(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DAY_OF_WEEK</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DAY_OF_WEEK(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DAY_OF_MONTH</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DAY_OF_MONTH(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DAY_OF_YEAR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DAY_OF_YEAR(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>WEEK_OF_YEAR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>WEEK_OF_YEAR(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>YEAR_OF_WEEK</code>

Review Comment:
   ditto



##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -539,6 +547,44 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
         }
         checkPushedInfo(df8, expectedPlanFragment8)
         checkAnswer(df8, Seq(Row(2, "david", 10000, 1300, true)))
+
+        val df9 = sql("select name FROM h2.test.datetime where " +

Review Comment:
   +1



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,90 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATEDIFF</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATEDIFF(end_date, start_date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>TRUNC</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>TRUNC(date, format)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>SECOND</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>SECOND(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MINUTE</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MINUTE(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>HOUR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>HOUR(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MONTH</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MONTH(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>QUARTER</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>QUARTER(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>YEAR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>YEAR(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DAY_OF_WEEK</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DAY_OF_WEEK(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DAY_OF_MONTH</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DAY_OF_MONTH(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DAY_OF_YEAR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DAY_OF_YEAR(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>WEEK_OF_YEAR</code>

Review Comment:
   Could you add the investigation details to the PR description?



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,90 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATEDIFF</code>

Review Comment:
   Please update it.



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -259,6 +259,49 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATEDIFF", childrenExpressions.toArray[V2Expression]))

Review Comment:
   ditto





[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r893144700


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -259,6 +259,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects should avoid following ISO semantics when handling DAY_OF_WEEK.
+    case DayOfWeek(child) => generateExpression(child)

Review Comment:
   `DayOfWeek` is also available in other databases, but `dayofweek_iso` is not common.
   So only the `DayOfWeek` function is translated, not the `WeekDay` function.





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r893163447


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,90 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATE_DIFF</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_DIFF(end_date, start_date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>TRUNC</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>TRUNC(date, format)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>SECOND</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>SECOND(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MINUTE</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MINUTE(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>HOUR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>HOUR(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MONTH</code>

Review Comment:
   Can we generate the `EXTRACT` function instead? I'm more comfortable exposing some Spark-specific fields in the `EXTRACT` function, instead of exposing Spark-specific functions.





[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r903468324


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -262,6 +262,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))

Review Comment:
   ok 





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r907164502


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -254,6 +254,55 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract("SECOND", v))
+    case Minute(child, _) =>
+      generateExpression(child).map(v => new V2Extract("MINUTE", v))
+    case Hour(child, _) =>
+      generateExpression(child).map(v => new V2Extract("HOUR", v))
+    case Month(child) =>
+      generateExpression(child).map(v => new V2Extract("MONTH", v))
+    case Quarter(child) =>
+      generateExpression(child).map(v => new V2Extract("QUARTER", v))
+    case Year(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR", v))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects should avoid following ISO semantics when handling DAY_OF_WEEK.
+    case DayOfWeek(child) =>
+      generateExpression(child).map(v => new V2Extract("DAY_OF_WEEK", v))
+    case DayOfMonth(child) =>
+      generateExpression(child).map(v => new V2Extract("DAY_OF_MONTH", v))
+    case DayOfYear(child) =>
+      generateExpression(child).map(v => new V2Extract("DAY_OF_YEAR", v))
+    // The WEEK_OF_YEAR function in Spark returns the ISO week from a date/timestamp.
+    // Database dialects need to follow ISO semantics when handling WEEK_OF_YEAR.
+    case WeekOfYear(child) =>
+      generateExpression(child).map(v => new V2Extract("WEEK_OF_YEAR", v))
+    // The YEAR_OF_WEEK function in Spark returns the ISO week year from a date/timestamp.
+    // Database dialects need to follow ISO semantics when handling YEAR_OF_WEEK.
+    case YearOfWeek(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR_OF_WEEK", v))

Review Comment:
   SGTM. My only concern is the non-standard ones: `DAY_OF_WEEK`. One idea is to translate the Spark expression to standard functions. ISO DOW is 1 (Monday) to 7 (Sunday).
   Spark `DayOfWeek` -> `EXTRACT(DOW FROM ...) % 7 + 1`
   Spark `WeekDay` -> `EXTRACT(DOW FROM ...) - 1`
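
   For anyone checking the arithmetic, here is a minimal standalone sketch (illustrative only, not code from this PR) that verifies the mapping, assuming ISO DOW is Monday=1 ... Sunday=7:

   ```scala
   object DowTranslationCheck {
     // Spark DayOfWeek: Sunday=1 ... Saturday=7
     def sparkDayOfWeek(isoDow: Int): Int = isoDow % 7 + 1
     // Spark WeekDay: Monday=0 ... Sunday=6
     def sparkWeekDay(isoDow: Int): Int = isoDow - 1

     def main(args: Array[String]): Unit = {
       assert(sparkDayOfWeek(1) == 2) // ISO Monday -> Spark Monday
       assert(sparkDayOfWeek(7) == 1) // ISO Sunday -> Spark Sunday
       assert(sparkWeekDay(1) == 0)   // ISO Monday -> Spark Monday
       assert(sparkWeekDay(7) == 6)   // ISO Sunday -> Spark Sunday
     }
   }
   ```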





[GitHub] [spark] AmplabJenkins commented on pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on PR #36663:
URL: https://github.com/apache/spark/pull/36663#issuecomment-1137840961

   Can one of the admins verify this patch?




[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r903241123


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,30 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATE_DIFF</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_DIFF(end_date, start_date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>TRUNC</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>TRUNC(date, format)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>EXTRACT</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>EXTRACT(field, source)</code></li>

Review Comment:
   ```suggestion
    *    <li>SQL semantic: <code>EXTRACT(field FROM source)</code></li>
   ```





[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r911717939


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -344,6 +344,55 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract("SECOND", v))
+    case Minute(child, _) =>
+      generateExpression(child).map(v => new V2Extract("MINUTE", v))
+    case Hour(child, _) =>
+      generateExpression(child).map(v => new V2Extract("HOUR", v))
+    case Month(child) =>
+      generateExpression(child).map(v => new V2Extract("MONTH", v))
+    case Quarter(child) =>
+      generateExpression(child).map(v => new V2Extract("QUARTER", v))
+    case Year(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR", v))
+    case DayOfWeek(child) =>
+      generateExpression(child).map(v => new GeneralScalarExpression("+",
+        Array[V2Expression](new GeneralScalarExpression("%",
+          Array[V2Expression](new V2Extract("DAY_OF_WEEK", v), LiteralValue(7, IntegerType))),

Review Comment:
   Do we need `ISO_DAY_OF_WEEK`?





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r911801086


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -344,6 +344,55 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract("SECOND", v))
+    case Minute(child, _) =>
+      generateExpression(child).map(v => new V2Extract("MINUTE", v))
+    case Hour(child, _) =>
+      generateExpression(child).map(v => new V2Extract("HOUR", v))
+    case Month(child) =>
+      generateExpression(child).map(v => new V2Extract("MONTH", v))
+    case Quarter(child) =>
+      generateExpression(child).map(v => new V2Extract("QUARTER", v))
+    case Year(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR", v))
+    case DayOfWeek(child) =>
+      generateExpression(child).map(v => new GeneralScalarExpression("+",

Review Comment:
   can we add some comments to explain the translation?





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r911803758


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala:
##########
@@ -270,6 +270,17 @@ abstract class JdbcDialect extends Serializable with Logging{
           s"${this.getClass.getSimpleName} does not support function: TRIM")
       }
     }
+
+    override def visitExtract(field: String, source: String): String = {
+      if (isSupportedFunction(field)) {
+        super.visitExtract(field, source)
+      } else {
+        // The framework will catch the error and give up the push-down.
+        // Please see `JdbcDialect.compileExpression(expr: Expression)` for more details.
+        throw new UnsupportedOperationException(

Review Comment:
   I think it makes sense to fail by default in `JDBCSQLBuilder`, because the extract field names differ so much across databases.
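
   For readers following along: a dialect that does support EXTRACT would override the visitor and map the V2 field names onto its own spelling. A minimal sketch, written as it would sit inside a JdbcDialect like the H2 version later in this thread (the `MyJDBCSQLBuilder` name and the supported-field list are hypothetical):

   ```scala
   class MyJDBCSQLBuilder extends JDBCSQLBuilder {
     override def visitExtract(field: String, source: String): String = {
       field match {
         // Fields this database spells the same way as the V2 API.
         case "YEAR" | "QUARTER" | "MONTH" | "HOUR" | "MINUTE" | "SECOND" =>
           s"EXTRACT($field FROM $source)"
         // Anything else: fail, so the framework gives up the push-down.
         case _ =>
           throw new UnsupportedOperationException(s"Unsupported field: $field")
       }
     }
   }
   ```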





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r912073385


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -202,7 +209,7 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
   private def checkPushedInfo(df: DataFrame, expectedPlanFragment: String*): Unit = {
     df.queryExecution.optimizedPlan.collect {
       case _: DataSourceV2ScanRelation =>
-        checkKeywordsExistsInExplain(df, expectedPlanFragment: _*)
+        checkKeywordsExistsInExplain(df, SimpleMode, expectedPlanFragment: _*)

Review Comment:
   does this avoid truncation? I thought we need to set `SQLConf.MAX_PLAN_STRING_LENGTH`
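
   (For reference, the fix this thread converges on is to raise the metadata string limit around the assertion; a sketch, with `df` and `expectedPlanFragment` standing in for the actual test values:)

   ```scala
   withSQLConf(SQLConf.MAX_METADATA_STRING_LENGTH.key -> "200") {
     // With a larger limit, "PushedFilters: [...]" in the explain output is
     // not truncated to "...", so the full fragment can be matched.
     checkPushedInfo(df, expectedPlanFragment)
   }
   ```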





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r913359517


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1026,14 +1034,85 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
             |AND cast(dept as short) > 1
             |AND cast(bonus as decimal(20, 2)) > 1200""".stripMargin)
         checkFiltersRemoved(df6, ansiMode)
-        val expectedPlanFragment8 = if (ansiMode) {
+        val expectedPlanFragment6 = if (ansiMode) {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL, " +
             "CAST(BONUS AS string) LIKE '%30%', CAST(DEPT AS byte) > 1, ...,"
         } else {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL],"
         }
-        checkPushedInfo(df6, expectedPlanFragment8)
+        checkPushedInfo(df6, expectedPlanFragment6)
         checkAnswer(df6, Seq(Row(2, "david", 10000, 1300, true)))
+
+        val df7 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "dayofyear(date1) > 100 AND dayofmonth(date1) > 10 ")
+        checkFiltersRemoved(df7)
+        val expectedPlanFragment7 =
+          "PushedFilters: [DATE1 IS NOT NULL, EXTRACT(DAY_OF_YEAR FROM DATE1) > 100, " +
+          "EXTRACT(DAY FROM DATE1) > 10]"
+        checkPushedInfo(df7, expectedPlanFragment7)
+        checkAnswer(df7, Seq(Row("amy"), Row("alex")))
+
+        val df8 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "year(date1) = 2022 AND quarter(date1) = 2")
+        checkFiltersRemoved(df8)
+        val expectedPlanFragment8 =
+          "[DATE1 IS NOT NULL, EXTRACT(YEAR FROM DATE1) = 2022, " +
+          "EXTRACT(QUARTER FROM DATE1) = 2]"
+        checkPushedInfo(df8, expectedPlanFragment8)
+        checkAnswer(df8, Seq(Row("amy"), Row("alex")))
+
+        val df9 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "second(time1) = 0 AND month(date1) = 5")
+        checkFiltersRemoved(df9)
+        val expectedPlanFragment9 =
+          "PushedFilters: [TIME1 IS NOT NULL, DATE1 IS NOT NULL, EXTRACT(SECOND FROM TIME1) = 0, " +
+          "EXTRACT(MONTH FROM DATE1) ..."

Review Comment:
   If you don't know how to avoid explain string truncation, @beliefer can you help?





[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r913449545


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1026,18 +1034,93 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
             |AND cast(dept as short) > 1
             |AND cast(bonus as decimal(20, 2)) > 1200""".stripMargin)
         checkFiltersRemoved(df6, ansiMode)
-        val expectedPlanFragment8 = if (ansiMode) {
+        val expectedPlanFragment6 = if (ansiMode) {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL, " +
             "CAST(BONUS AS string) LIKE '%30%', CAST(DEPT AS byte) > 1, ...,"
         } else {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL],"
         }
-        checkPushedInfo(df6, expectedPlanFragment8)
+        checkPushedInfo(df6, expectedPlanFragment6)
         checkAnswer(df6, Seq(Row(2, "david", 10000, 1300, true)))
       }
     }
   }
 
+  test("scan with filter push-down with date time functions") {
+    withSQLConf(SQLConf.MAX_METADATA_STRING_LENGTH.key -> "200") {

Review Comment:
   OK 





[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r889908186


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -259,6 +259,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects do not need to follow ISO semantics when handling DAY_OF_WEEK.

Review Comment:
   ```suggestion
       // Database dialects should avoid following ISO semantics when handling DAY_OF_WEEK.
   ```





[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r887825468


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##########
@@ -35,8 +35,9 @@ private[sql] object H2Dialect extends JdbcDialect {
     url.toLowerCase(Locale.ROOT).startsWith("jdbc:h2")
 
   private val supportedFunctions =
-    Set("ABS", "COALESCE", "LN", "EXP", "POWER", "SQRT", "FLOOR", "CEIL",
-      "SUBSTRING", "UPPER", "LOWER", "TRANSLATE", "TRIM")
+    Set("ABS", "COALESCE", "LN", "EXP", "POWER", "SQRT", "FLOOR", "CEIL", "SUBSTRING", "UPPER",
+      "LOWER", "TRANSLATE", "TRIM", "EXTRACT", "SECOND", "MINUTE", "HOUR", "MONTH",

Review Comment:
   Please remove `EXTRACT`





[GitHub] [spark] chenzhx commented on pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on PR #36663:
URL: https://github.com/apache/spark/pull/36663#issuecomment-1162512514

   retest this please




[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r904495438


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Extract.java:
##########
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.expressions;
+
+import org.apache.spark.annotation.Evolving;
+
+import java.io.Serializable;
+
+/**
+ * Represent an extract expression, which contains a field to be extracted
+ * and a source expression where the field should be extracted.
+ * @since 3.4.0

Review Comment:
   ```suggestion
   
    * @since 3.4.0
   ```
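
   For context, the new expression pairs a field name with a source V2 expression; a usage sketch (`FieldReference` is the existing connector expression for column references, and `date1` is just an example column):

   ```scala
   import org.apache.spark.sql.connector.expressions.{Extract, FieldReference}

   // Rendered by the SQL builder as EXTRACT(YEAR FROM date1).
   val year = new Extract("YEAR", FieldReference("date1"))
   ```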





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r905938389


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -254,6 +254,55 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract("SECOND", v))
+    case Minute(child, _) =>
+      generateExpression(child).map(v => new V2Extract("MINUTE", v))
+    case Hour(child, _) =>
+      generateExpression(child).map(v => new V2Extract("HOUR", v))
+    case Month(child) =>
+      generateExpression(child).map(v => new V2Extract("MONTH", v))
+    case Quarter(child) =>
+      generateExpression(child).map(v => new V2Extract("QUARTER", v))
+    case Year(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR", v))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects should avoid following ISO semantics when handling DAY_OF_WEEK.
+    case DayOfWeek(child) =>
+      generateExpression(child).map(v => new V2Extract("DAY_OF_WEEK", v))
+    case DayOfMonth(child) =>
+      generateExpression(child).map(v => new V2Extract("DAY_OF_MONTH", v))
+    case DayOfYear(child) =>
+      generateExpression(child).map(v => new V2Extract("DAY_OF_YEAR", v))
+    // The WEEK_OF_YEAR function in Spark returns the ISO week from a date/timestamp.
+    // Database dialects need to follow ISO semantics when handling WEEK_OF_YEAR.
+    case WeekOfYear(child) =>
+      generateExpression(child).map(v => new V2Extract("WEEK_OF_YEAR", v))
+    // The YEAR_OF_WEEK function in Spark returns the ISO week year from a date/timestamp.
+    // Database dialects need to follow ISO semantics when handling YEAR_OF_WEEK.
+    case YearOfWeek(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR_OF_WEEK", v))

Review Comment:
   BTW, we don't have to follow Spark to define these field names. Is there any standard we can follow?





[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r910737639


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##########
@@ -123,4 +127,26 @@ private[sql] object H2Dialect extends JdbcDialect {
     }
     super.classifyException(message, e)
   }
+
+  override def compileExpression(expr: Expression): Option[String] = {
+    val jdbcSQLBuilder = new H2JDBCSQLBuilder()
+    try {
+      Some(jdbcSQLBuilder.build(expr))
+    } catch {
+      case NonFatal(e) =>
+        logWarning("Error occurs while compiling V2 expression", e)
+        None
+    }
+  }
+
+  class H2JDBCSQLBuilder extends JDBCSQLBuilder {
+
+    override def visitExtract(field: String, source: String): String = {
+      field match {
+        case "WEEK" => s"EXTRACT(ISO_WEEK FROM $source)"
+        case "YEAR_OF_WEEK" => s"EXTRACT(ISO_WEEK_YEAR FROM $source)"

Review Comment:
   The ISO field names differ across databases:
   PostgreSQL uses `isoyear`.
   Snowflake uses `week_iso`.
   H2 uses `ISO_WEEK_YEAR`.
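
   Concretely, each dialect ends up renaming the field in its own `visitExtract`; a hypothetical PostgreSQL-flavored counterpart to the H2 override above (Spark's PostgresDialect does not gain this override in this PR):

   ```scala
   override def visitExtract(field: String, source: String): String = {
     field match {
       case "YEAR_OF_WEEK" => s"EXTRACT(ISOYEAR FROM $source)"
       case _ => super.visitExtract(field, source)
     }
   }
   ```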





[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r910881528


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java:
##########
@@ -290,4 +297,15 @@ protected String visitTrim(String direction, String[] inputs) {
       return "TRIM(" + direction + " " + inputs[1] + " FROM " + inputs[0] + ")";
     }
   }
+
+  protected String visitExtract(String field, String source) {
+    switch (field) {
+      case "DAY_OF_WEEK":
+        return "(EXTRACT(ISO_DAY_OF_WEEK FROM " + source + ") % 7) + 1";

Review Comment:
   The ISO day-of-week field name differs across databases:
   PostgreSQL uses `isodow`.
   Snowflake uses `dow_iso`.
   H2 uses `ISO_DAY_OF_WEEK`.
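
   In other words, only the field name changes; the surrounding formula stays the same. A sketch with a hypothetical per-dialect table, per the list above (only the H2 spelling is exercised in this PR):

   ```scala
   // Hypothetical mapping from dialect to its ISO day-of-week field name.
   val isoDowField = Map(
     "postgresql" -> "ISODOW",
     "snowflake"  -> "DOW_ISO",
     "h2"         -> "ISO_DAY_OF_WEEK"
   )

   def dayOfWeekSql(dialect: String, source: String): String =
     s"(EXTRACT(${isoDowField(dialect)} FROM $source) % 7) + 1"
   ```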





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r912071434


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -344,6 +344,57 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract("SECOND", v))
+    case Minute(child, _) =>
+      generateExpression(child).map(v => new V2Extract("MINUTE", v))
+    case Hour(child, _) =>
+      generateExpression(child).map(v => new V2Extract("HOUR", v))
+    case Month(child) =>
+      generateExpression(child).map(v => new V2Extract("MONTH", v))
+    case Quarter(child) =>
+      generateExpression(child).map(v => new V2Extract("QUARTER", v))
+    case Year(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR", v))
+    // translate the DayOfWeek function in Spark using ISO standards

Review Comment:
   can we say more? e.g. `DayOfWeek` uses `Sunday=1, Monday=2, ...` and the ISO standard is `Monday=1, ...`, so we use the formula ... to do the translation
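
   A worked version of the kind of comment being asked for (illustrative only):

   ```scala
   // Spark DayOfWeek: Sunday=1, Monday=2, ..., Saturday=7
   // ISO DAY_OF_WEEK: Monday=1, ..., Sunday=7
   // so: sparkDayOfWeek = isoDayOfWeek % 7 + 1
   //   ISO Sunday (7) -> 7 % 7 + 1 = 1
   //   ISO Monday (1) -> 1 % 7 + 1 = 2
   ```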



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -344,6 +344,57 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract("SECOND", v))
+    case Minute(child, _) =>
+      generateExpression(child).map(v => new V2Extract("MINUTE", v))
+    case Hour(child, _) =>
+      generateExpression(child).map(v => new V2Extract("HOUR", v))
+    case Month(child) =>
+      generateExpression(child).map(v => new V2Extract("MONTH", v))
+    case Quarter(child) =>
+      generateExpression(child).map(v => new V2Extract("QUARTER", v))
+    case Year(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR", v))
+    // translate the DayOfWeek function in Spark using ISO standards
+    case DayOfWeek(child) =>
+      generateExpression(child).map(v => new GeneralScalarExpression("+",
+        Array[V2Expression](new GeneralScalarExpression("%",
+          Array[V2Expression](new V2Extract("DAY_OF_WEEK", v), LiteralValue(7, IntegerType))),
+          LiteralValue(1, IntegerType))))
+    // translate the WeekDay function in Spark using ISO standards

Review Comment:
   ditto





[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r913359202


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -344,6 +344,59 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) =>
+      generateExpression(child).map(v => new V2Extract("SECOND", v))
+    case Minute(child, _) =>
+      generateExpression(child).map(v => new V2Extract("MINUTE", v))
+    case Hour(child, _) =>
+      generateExpression(child).map(v => new V2Extract("HOUR", v))
+    case Month(child) =>
+      generateExpression(child).map(v => new V2Extract("MONTH", v))
+    case Quarter(child) =>
+      generateExpression(child).map(v => new V2Extract("QUARTER", v))
+    case Year(child) =>
+      generateExpression(child).map(v => new V2Extract("YEAR", v))
+    // DayOfWeek uses 1 = Sunday, 2 = Monday, ... and ISO standard is Monday=1, ...,

Review Comment:
   ```suggestion
       // DayOfWeek uses Sunday=1, Monday=2, ... and ISO standard is Monday=1, ...,
   ```





[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r913395615


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1026,14 +1035,86 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
             |AND cast(dept as short) > 1
             |AND cast(bonus as decimal(20, 2)) > 1200""".stripMargin)
         checkFiltersRemoved(df6, ansiMode)
-        val expectedPlanFragment8 = if (ansiMode) {
+        val expectedPlanFragment6 = if (ansiMode) {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL, " +
-            "CAST(BONUS AS string) LIKE '%30%', CAST(DEPT AS byte) > 1, ...,"
+          "CAST(BONUS AS string) LIKE '%30%', CAST(DEPT AS byte) > 1, CAST(DEPT AS short) > 1, " +
+          "CAST(BONUS AS decimal(20,2)) > 1200.00]"
         } else {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL],"
         }
-        checkPushedInfo(df6, expectedPlanFragment8)
+        checkPushedInfo(df6, expectedPlanFragment6)
         checkAnswer(df6, Seq(Row(2, "david", 10000, 1300, true)))
+
+        val df7 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "dayofyear(date1) > 100 AND dayofmonth(date1) > 10 ")

Review Comment:
   Could you put these test cases into a separate test, like `test("scan with filter push-down with date time functions") {`?





[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r913409855


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1026,14 +1035,86 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
             |AND cast(dept as short) > 1
             |AND cast(bonus as decimal(20, 2)) > 1200""".stripMargin)
         checkFiltersRemoved(df6, ansiMode)
-        val expectedPlanFragment8 = if (ansiMode) {
+        val expectedPlanFragment6 = if (ansiMode) {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL, " +
-            "CAST(BONUS AS string) LIKE '%30%', CAST(DEPT AS byte) > 1, ...,"
+          "CAST(BONUS AS string) LIKE '%30%', CAST(DEPT AS byte) > 1, CAST(DEPT AS short) > 1, " +
+          "CAST(BONUS AS decimal(20,2)) > 1200.00]"
         } else {
           "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL],"
         }
-        checkPushedInfo(df6, expectedPlanFragment8)
+        checkPushedInfo(df6, expectedPlanFragment6)
         checkAnswer(df6, Seq(Row(2, "david", 10000, 1300, true)))
+
+        val df7 = sql("SELECT name FROM h2.test.datetime WHERE " +
+          "dayofyear(date1) > 100 AND dayofmonth(date1) > 10 ")

Review Comment:
   OK





[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

Posted by GitBox <gi...@apache.org>.
chenzhx commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r890253947


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -259,6 +259,55 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_DIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    // The DAY_OF_WEEK function in Spark returns the day of the week for date/timestamp.
+    // Database dialects do not need to follow ISO semantics when handling DAY_OF_WEEK.

Review Comment:
   In ISO semantics, 1 represents Monday.
   But in Spark's DayOfWeek function, 1 represents Sunday.


