You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/05/25 09:31:59 UTC

[GitHub] [spark] beliefer commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r881416450


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -723,6 +731,53 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
     checkAnswer(df5, Seq(Row(6, "jen", 12000, 1200, true)))
   }
 
+  test("scan with filter push-down with datetime functions") {

Review Comment:
   We can put these. tests after `test("scan with filter push-down with ansi mode") {`.



##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##########
@@ -32,7 +32,9 @@ private object H2Dialect extends JdbcDialect {
 
   private val supportedFunctions =
     Set("ABS", "COALESCE", "LN", "EXP", "POWER", "SQRT", "FLOOR", "CEIL",
-      "SUBSTRING", "UPPER", "LOWER", "TRANSLATE", "TRIM")
+      "SUBSTRING", "UPPER", "LOWER", "TRANSLATE", "TRIM",
+      "EXTRACT", "SECOND", "MINUTE", "HOUR", "MONTH", "QUARTER", "YEAR",
+      "DAYOFYEAR", "DAYOFMONTH", "DAYOFWEEK")

Review Comment:
   ```
   Set("ABS", "COALESCE", "LN", "EXP", "POWER", "SQRT", "FLOOR", "CEIL", "SUBSTRING", "UPPER",
     "LOWER", "TRANSLATE", "TRIM", "EXTRACT", "SECOND", "MINUTE", "HOUR", "MONTH",
     "QUARTER", "YEAR", "DAYOFYEAR", "DAYOFMONTH", "DAYOFWEEK")
   ```



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -259,6 +259,52 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATEDIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case extract: Extract =>

Review Comment:
   `Extract` replaced by `Year`, `YearOfWeek`, `Quarter` and so on.
   So we should not do the translate.
   Could you investigate the function such as `YearOfWeek`, `WeekOfYear` and translate them.



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -259,6 +259,52 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATEDIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case extract: Extract =>
+      val childrenExpressions = extract.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == extract.children.length) {
+        Some(new GeneralScalarExpression("EXTRACT", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    case DayOfWeek(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAYOFWEEK", Array[V2Expression](v)))

Review Comment:
   `DAY_OF_WEEK`



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,84 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATEDIFF</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATEDIFF(end_date, start_date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>TRUNC</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>TRUNC(date, format)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>EXTRACT</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>EXTRACT(field, source, replacement)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>SECOND</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>SECOND(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MINUTE</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MINUTE(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>HOUR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>HOUR(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MONTH</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MONTH(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>QUARTER</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>QUARTER(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>YEAR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>YEAR(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DAYOFWEEK</code>

Review Comment:
   The name seems not correct



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java:
##########
@@ -196,6 +196,84 @@
  *    <li>Since version: 3.4.0</li>
  *   </ul>
  *  </li>
+ *  <li>Name: <code>DATE_ADD</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATE_ADD(start_date, num_days)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DATEDIFF</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>DATEDIFF(end_date, start_date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>TRUNC</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>TRUNC(date, format)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>EXTRACT</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>EXTRACT(field, source, replacement)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>SECOND</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>SECOND(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MINUTE</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MINUTE(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>HOUR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>HOUR(timestamp, timezone)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>MONTH</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>MONTH(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>QUARTER</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>QUARTER(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>YEAR</code>
+ *   <ul>
+ *    <li>SQL semantic: <code>YEAR(date)</code></li>
+ *    <li>Since version: 3.4.0</li>
+ *   </ul>
+ *  </li>
+ *  <li>Name: <code>DAYOFWEEK</code>

Review Comment:
   Refer http://www.h2database.com/html/functions.html#day_of_week
   https://prestodb.io/docs/current/functions/datetime.html#day_of_week



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -259,6 +259,52 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATEDIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case extract: Extract =>
+      val childrenExpressions = extract.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == extract.children.length) {
+        Some(new GeneralScalarExpression("EXTRACT", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    case DayOfWeek(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAYOFWEEK", Array[V2Expression](v)))
+    case DayOfMonth(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAYOFMONTH", Array[V2Expression](v)))

Review Comment:
   `DAY_OF_MONTH`



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -259,6 +259,52 @@ class V2ExpressionBuilder(
       } else {
         None
       }
+    case date: DateAdd =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATE_ADD", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: DateDiff =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("DATEDIFF", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case date: TruncDate =>
+      val childrenExpressions = date.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == date.children.length) {
+        Some(new GeneralScalarExpression("TRUNC", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case extract: Extract =>
+      val childrenExpressions = extract.children.flatMap(generateExpression(_))
+      if (childrenExpressions.length == extract.children.length) {
+        Some(new GeneralScalarExpression("EXTRACT", childrenExpressions.toArray[V2Expression]))
+      } else {
+        None
+      }
+    case Second(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("SECOND", Array[V2Expression](v)))
+    case Minute(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MINUTE", Array[V2Expression](v)))
+    case Hour(child, _) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("HOUR", Array[V2Expression](v)))
+    case Month(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("MONTH", Array[V2Expression](v)))
+    case Quarter(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("QUARTER", Array[V2Expression](v)))
+    case Year(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("YEAR", Array[V2Expression](v)))
+    case DayOfWeek(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAYOFWEEK", Array[V2Expression](v)))
+    case DayOfMonth(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAYOFMONTH", Array[V2Expression](v)))
+    case DayOfYear(child) => generateExpression(child)
+      .map(v => new GeneralScalarExpression("DAYOFYEAR", Array[V2Expression](v)))

Review Comment:
   `DAY_OF_YEAR`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org