You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/08/06 09:49:04 UTC

[GitHub] [spark] zheniantoushipashi opened a new pull request, #37427: [SPARK-39929][sql] DS V2 supports push down string functions(non ANSI)

zheniantoushipashi opened a new pull request, #37427:
URL: https://github.com/apache/spark/pull/37427

   **What changes were proposed in this pull request?**
   
   support more  commonly used string functions
   
   BIT_LENGTH
   CHAR_LENGTH
   CONCAT
   
   The mainstream databases support these functions show below.
   
   
   
   Function | PostgreSQL | ClickHouse | H2 | MySQL | Oracle | Redshift | Presto | Teradata | Snowflake | DB2 | Vertica | Exasol | SqlServer | Yellowbrick | Impala | Mariadb | Druid | Pig | SQLite | Influxdata | Singlestore | ElasticSearch
   -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
   BIT_LENGTH | Yes | Yes | Yes | Yes | Yes | no | no | no | no | Yes | Yes | Yes | no | Yes | no | Yes | no | no | no | no | no | Yes
   CHAR_LENGTH | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | no | Yes | Yes | Yes | Yes
   CONCAT | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | no | no | no | Yes | Yes
   
   
   **Why are the changes needed?**
   DS V2 supports push down string functions
   
   **Does this PR introduce any user-facing change?**
   'No'.
   New feature.
   
   How was this patch tested?
   New tests.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zheniantoushipashi commented on a diff in pull request #37427: [SPARK-39929][SQL] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
zheniantoushipashi commented on code in PR #37427:
URL: https://github.com/apache/spark/pull/37427#discussion_r940847672


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1449,6 +1449,26 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
       "PushedFilters: [NAME IS NOT NULL]"
     checkPushedInfo(df5, expectedPlanFragment5)
     checkAnswer(df5, Seq(Row(6, "jen", 12000, 1200, true)))
+
+    val df6 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "bit_length(name) = 40")
+    checkFiltersRemoved(df6)
+    checkPushedInfo(df6, "[NAME IS NOT NULL, BIT_LENGTH(NAME) = 40]")
+    checkAnswer(df6, Seq(Row(1, "cathy", 9000, 1200, false),
+      Row(2, "david", 10000, 1300, true)))
+
+    val df7 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "char_length(name) = 5")

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #37427: [SPARK-39929][SQL] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37427:
URL: https://github.com/apache/spark/pull/37427#discussion_r941472084


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -217,6 +217,9 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       generateExpressionWithName("SUBSTRING", children)
     case Upper(child) => generateExpressionWithName("UPPER", Seq(child))
     case Lower(child) => generateExpressionWithName("LOWER", Seq(child))
+    case BitLength(child) => generateExpressionWithName("BIT_LENGTH", Seq(child))
+    case Length(child) => generateExpressionWithName("CHAR_LENGTH", Seq(child))

Review Comment:
   we should make sure the chid data type is string type, to safely translate to `CHAR_LENGTH`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on a diff in pull request #37427: [SPARK-39929][SQL] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #37427:
URL: https://github.com/apache/spark/pull/37427#discussion_r940072424


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1449,6 +1449,35 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
       "PushedFilters: [NAME IS NOT NULL]"
     checkPushedInfo(df5, expectedPlanFragment5)
     checkAnswer(df5, Seq(Row(6, "jen", 12000, 1200, true)))
+
+    val df6 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "bit_length(name) = 40")
+    checkFiltersRemoved(df6)
+    val expectedPlanFragment6 = {
+      "[NAME IS NOT NULL, BIT_LENGTH(NAME) = 40]"
+    }
+    checkPushedInfo(df6, expectedPlanFragment6)

Review Comment:
   ```suggestion
       checkPushedInfo(df6, "PushedFilters: [NAME IS NOT NULL, BIT_LENGTH(NAME) = 40]")
   ```



##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1449,6 +1449,35 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
       "PushedFilters: [NAME IS NOT NULL]"
     checkPushedInfo(df5, expectedPlanFragment5)
     checkAnswer(df5, Seq(Row(6, "jen", 12000, 1200, true)))
+
+    val df6 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "bit_length(name) = 40")
+    checkFiltersRemoved(df6)
+    val expectedPlanFragment6 = {
+      "[NAME IS NOT NULL, BIT_LENGTH(NAME) = 40]"
+    }
+    checkPushedInfo(df6, expectedPlanFragment6)
+    checkAnswer(df6, Seq(Row(1, "cathy", 9000, 1200, false),
+      Row(2, "david", 10000, 1300, true)))
+
+    val df7 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "char_length(name) = 5")
+    checkFiltersRemoved(df7)
+    val expectedPlanFragment7 = {
+      "[NAME IS NOT NULL, CHAR_LENGTH(NAME) = 5]"
+    }
+    checkPushedInfo(df7, expectedPlanFragment7)
+    checkAnswer(df6, Seq(Row(1, "cathy", 9000, 1200, false),
+      Row(2, "david", 10000, 1300, true)))
+
+    val df8 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "concat(name, ',' , cast(salary as string)) = 'cathy,9000.00'")
+    checkFiltersRemoved(df8)
+    val expectedPlanFragment8 = {
+      "[(CONCAT(NAME, ',', CAST(SALARY AS string))) = 'cathy,9000.00']"
+    }
+    checkPushedInfo(df8, expectedPlanFragment8)

Review Comment:
   ditto



##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1449,6 +1449,35 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
       "PushedFilters: [NAME IS NOT NULL]"
     checkPushedInfo(df5, expectedPlanFragment5)
     checkAnswer(df5, Seq(Row(6, "jen", 12000, 1200, true)))
+
+    val df6 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "bit_length(name) = 40")
+    checkFiltersRemoved(df6)
+    val expectedPlanFragment6 = {
+      "[NAME IS NOT NULL, BIT_LENGTH(NAME) = 40]"
+    }
+    checkPushedInfo(df6, expectedPlanFragment6)
+    checkAnswer(df6, Seq(Row(1, "cathy", 9000, 1200, false),
+      Row(2, "david", 10000, 1300, true)))
+
+    val df7 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "char_length(name) = 5")
+    checkFiltersRemoved(df7)
+    val expectedPlanFragment7 = {
+      "[NAME IS NOT NULL, CHAR_LENGTH(NAME) = 5]"
+    }
+    checkPushedInfo(df7, expectedPlanFragment7)

Review Comment:
   ditto



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zheniantoushipashi commented on a diff in pull request #37427: [SPARK-39929][sql] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
zheniantoushipashi commented on code in PR #37427:
URL: https://github.com/apache/spark/pull/37427#discussion_r939783611


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##########
@@ -50,7 +50,9 @@ private[sql] object H2Dialect extends JdbcDialect {
     Set("ABS", "COALESCE", "GREATEST", "LEAST", "RAND", "LOG", "LOG10", "LN", "EXP",
       "POWER", "SQRT", "FLOOR", "CEIL", "ROUND", "SIN", "SINH", "COS", "COSH", "TAN",
       "TANH", "COT", "ASIN", "ACOS", "ATAN", "ATAN2", "DEGREES", "RADIANS", "SIGN",
-      "PI", "SUBSTRING", "UPPER", "LOWER", "TRANSLATE", "TRIM", "MD5", "SHA1", "SHA2")
+

Review Comment:
   done



##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1404,6 +1416,7 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
   }
 
   test("scan with filter push-down with string functions") {
+

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on a diff in pull request #37427: [SPARK-39929][sql] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #37427:
URL: https://github.com/apache/spark/pull/37427#discussion_r939789711


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -154,6 +154,18 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
       conn.prepareStatement("INSERT INTO \"test\".\"dept\" VALUES (1, 1)").executeUpdate()
       conn.prepareStatement("INSERT INTO \"test\".\"dept\" VALUES (2, 1)").executeUpdate()
 
+
+      conn.prepareStatement(
+        "CREATE TABLE \"test\".\"employee1\" (dept INTEGER, name TEXT(32), " +

Review Comment:
   You can cast other column to string.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on pull request #37427: [SPARK-39929][sql] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
beliefer commented on PR #37427:
URL: https://github.com/apache/spark/pull/37427#issuecomment-1207584645

   @HyukjinKwon Thank you for ping me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zheniantoushipashi commented on a diff in pull request #37427: [SPARK-39929][SQL] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
zheniantoushipashi commented on code in PR #37427:
URL: https://github.com/apache/spark/pull/37427#discussion_r940094134


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1449,6 +1449,35 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
       "PushedFilters: [NAME IS NOT NULL]"
     checkPushedInfo(df5, expectedPlanFragment5)
     checkAnswer(df5, Seq(Row(6, "jen", 12000, 1200, true)))
+
+    val df6 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "bit_length(name) = 40")
+    checkFiltersRemoved(df6)
+    val expectedPlanFragment6 = {
+      "[NAME IS NOT NULL, BIT_LENGTH(NAME) = 40]"
+    }
+    checkPushedInfo(df6, expectedPlanFragment6)
+    checkAnswer(df6, Seq(Row(1, "cathy", 9000, 1200, false),
+      Row(2, "david", 10000, 1300, true)))
+
+    val df7 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "char_length(name) = 5")
+    checkFiltersRemoved(df7)
+    val expectedPlanFragment7 = {
+      "[NAME IS NOT NULL, CHAR_LENGTH(NAME) = 5]"
+    }
+    checkPushedInfo(df7, expectedPlanFragment7)
+    checkAnswer(df6, Seq(Row(1, "cathy", 9000, 1200, false),
+      Row(2, "david", 10000, 1300, true)))
+
+    val df8 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "concat(name, ',' , cast(salary as string)) = 'cathy,9000.00'")
+    checkFiltersRemoved(df8)
+    val expectedPlanFragment8 = {
+      "[(CONCAT(NAME, ',', CAST(SALARY AS string))) = 'cathy,9000.00']"
+    }
+    checkPushedInfo(df8, expectedPlanFragment8)

Review Comment:
   done



##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1449,6 +1449,35 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
       "PushedFilters: [NAME IS NOT NULL]"
     checkPushedInfo(df5, expectedPlanFragment5)
     checkAnswer(df5, Seq(Row(6, "jen", 12000, 1200, true)))
+
+    val df6 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "bit_length(name) = 40")
+    checkFiltersRemoved(df6)
+    val expectedPlanFragment6 = {
+      "[NAME IS NOT NULL, BIT_LENGTH(NAME) = 40]"
+    }
+    checkPushedInfo(df6, expectedPlanFragment6)
+    checkAnswer(df6, Seq(Row(1, "cathy", 9000, 1200, false),
+      Row(2, "david", 10000, 1300, true)))
+
+    val df7 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "char_length(name) = 5")
+    checkFiltersRemoved(df7)
+    val expectedPlanFragment7 = {
+      "[NAME IS NOT NULL, CHAR_LENGTH(NAME) = 5]"
+    }
+    checkPushedInfo(df7, expectedPlanFragment7)

Review Comment:
   done



##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1449,6 +1449,35 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
       "PushedFilters: [NAME IS NOT NULL]"
     checkPushedInfo(df5, expectedPlanFragment5)
     checkAnswer(df5, Seq(Row(6, "jen", 12000, 1200, true)))
+
+    val df6 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "bit_length(name) = 40")
+    checkFiltersRemoved(df6)
+    val expectedPlanFragment6 = {
+      "[NAME IS NOT NULL, BIT_LENGTH(NAME) = 40]"
+    }
+    checkPushedInfo(df6, expectedPlanFragment6)

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zheniantoushipashi commented on a diff in pull request #37427: [SPARK-39929][sql] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
zheniantoushipashi commented on code in PR #37427:
URL: https://github.com/apache/spark/pull/37427#discussion_r939784483


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -154,6 +154,18 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
       conn.prepareStatement("INSERT INTO \"test\".\"dept\" VALUES (1, 1)").executeUpdate()
       conn.prepareStatement("INSERT INTO \"test\".\"dept\" VALUES (2, 1)").executeUpdate()
 
+
+      conn.prepareStatement(
+        "CREATE TABLE \"test\".\"employee1\" (dept INTEGER, name TEXT(32), " +

Review Comment:
   employee table does not have two string columns that can be used to test the concat function



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zheniantoushipashi commented on a diff in pull request #37427: [SPARK-39929][SQL] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
zheniantoushipashi commented on code in PR #37427:
URL: https://github.com/apache/spark/pull/37427#discussion_r940854763


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1449,6 +1449,26 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
       "PushedFilters: [NAME IS NOT NULL]"
     checkPushedInfo(df5, expectedPlanFragment5)
     checkAnswer(df5, Seq(Row(6, "jen", 12000, 1200, true)))
+
+    val df6 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "bit_length(name) = 40")

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on a diff in pull request #37427: [SPARK-39929][sql] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #37427:
URL: https://github.com/apache/spark/pull/37427#discussion_r939774001


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##########
@@ -50,7 +50,9 @@ private[sql] object H2Dialect extends JdbcDialect {
     Set("ABS", "COALESCE", "GREATEST", "LEAST", "RAND", "LOG", "LOG10", "LN", "EXP",
       "POWER", "SQRT", "FLOOR", "CEIL", "ROUND", "SIN", "SINH", "COS", "COSH", "TAN",
       "TANH", "COT", "ASIN", "ACOS", "ATAN", "ATAN2", "DEGREES", "RADIANS", "SIGN",
-      "PI", "SUBSTRING", "UPPER", "LOWER", "TRANSLATE", "TRIM", "MD5", "SHA1", "SHA2")
+

Review Comment:
   ```suggestion
   ```



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java:
##########
@@ -155,13 +155,18 @@ public String build(Expression expr) {
         case "SHA2":
         case "MD5":
         case "CRC32":
+        case "BIT_LENGTH":
+        case "CHAR_LENGTH":
           return visitSQLFunction(name,
             Arrays.stream(e.children()).map(c -> build(c)).toArray(String[]::new));
         case "CASE_WHEN": {
           List<String> children =
             Arrays.stream(e.children()).map(c -> build(c)).collect(Collectors.toList());
           return visitCaseWhen(children.toArray(new String[e.children().length]));
         }
+        case "CONCAT":
+          return visitConcat("CONCAT",

Review Comment:
   Please use `visitSQLFunction` instead.



##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -154,6 +154,18 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
       conn.prepareStatement("INSERT INTO \"test\".\"dept\" VALUES (1, 1)").executeUpdate()
       conn.prepareStatement("INSERT INTO \"test\".\"dept\" VALUES (2, 1)").executeUpdate()
 
+
+      conn.prepareStatement(
+        "CREATE TABLE \"test\".\"employee1\" (dept INTEGER, name TEXT(32), " +

Review Comment:
   Why not use `employee` ?



##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1404,6 +1416,7 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
   }
 
   test("scan with filter push-down with string functions") {
+

Review Comment:
   ```suggestion
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on pull request #37427: [SPARK-39929][sql] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
beliefer commented on PR #37427:
URL: https://github.com/apache/spark/pull/37427#issuecomment-1207585728

   @zheniantoushipashi Please update the PR title with `[SPARK-39929][SQL] DS V2 supports push down string functions(non ANSI)`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan closed pull request #37427: [SPARK-39929][SQL] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #37427: [SPARK-39929][SQL] DS V2 supports push down string  functions(non ANSI)
URL: https://github.com/apache/spark/pull/37427


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zheniantoushipashi commented on a diff in pull request #37427: [SPARK-39929][SQL] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
zheniantoushipashi commented on code in PR #37427:
URL: https://github.com/apache/spark/pull/37427#discussion_r941937521


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -217,6 +217,9 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       generateExpressionWithName("SUBSTRING", children)
     case Upper(child) => generateExpressionWithName("UPPER", Seq(child))
     case Lower(child) => generateExpressionWithName("LOWER", Seq(child))
+    case BitLength(child) => generateExpressionWithName("BIT_LENGTH", Seq(child))
+    case Length(child) => generateExpressionWithName("CHAR_LENGTH", Seq(child))

Review Comment:
   fixed



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -217,6 +217,9 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       generateExpressionWithName("SUBSTRING", children)
     case Upper(child) => generateExpressionWithName("UPPER", Seq(child))
     case Lower(child) => generateExpressionWithName("LOWER", Seq(child))
+    case BitLength(child) => generateExpressionWithName("BIT_LENGTH", Seq(child))

Review Comment:
   fixed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zheniantoushipashi commented on a diff in pull request #37427: [SPARK-39929][sql] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
zheniantoushipashi commented on code in PR #37427:
URL: https://github.com/apache/spark/pull/37427#discussion_r939863089


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java:
##########
@@ -155,13 +155,18 @@ public String build(Expression expr) {
         case "SHA2":
         case "MD5":
         case "CRC32":
+        case "BIT_LENGTH":
+        case "CHAR_LENGTH":
           return visitSQLFunction(name,
             Arrays.stream(e.children()).map(c -> build(c)).toArray(String[]::new));
         case "CASE_WHEN": {
           List<String> children =
             Arrays.stream(e.children()).map(c -> build(c)).collect(Collectors.toList());
           return visitCaseWhen(children.toArray(new String[e.children().length]));
         }
+        case "CONCAT":
+          return visitConcat("CONCAT",

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zheniantoushipashi commented on a diff in pull request #37427: [SPARK-39929][SQL] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
zheniantoushipashi commented on code in PR #37427:
URL: https://github.com/apache/spark/pull/37427#discussion_r941937071


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -217,6 +217,9 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       generateExpressionWithName("SUBSTRING", children)
     case Upper(child) => generateExpressionWithName("UPPER", Seq(child))
     case Lower(child) => generateExpressionWithName("LOWER", Seq(child))
+    case BitLength(child) => generateExpressionWithName("BIT_LENGTH", Seq(child))

Review Comment:
   
   Yes, you are right, I have researched some databases, both BIT_LENGTH and CHAR_LENGTH functions only support string type, there should be a type restriction here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #37427: [SPARK-39929][SQL] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on PR #37427:
URL: https://github.com/apache/spark/pull/37427#issuecomment-1210182705

   thanks, merging to master!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zheniantoushipashi commented on a diff in pull request #37427: [SPARK-39929][SQL] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
zheniantoushipashi commented on code in PR #37427:
URL: https://github.com/apache/spark/pull/37427#discussion_r939887502


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -154,6 +154,18 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
       conn.prepareStatement("INSERT INTO \"test\".\"dept\" VALUES (1, 1)").executeUpdate()
       conn.prepareStatement("INSERT INTO \"test\".\"dept\" VALUES (2, 1)").executeUpdate()
 
+
+      conn.prepareStatement(
+        "CREATE TABLE \"test\".\"employee1\" (dept INTEGER, name TEXT(32), " +

Review Comment:
   done  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zheniantoushipashi commented on a diff in pull request #37427: [SPARK-39929][sql] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
zheniantoushipashi commented on code in PR #37427:
URL: https://github.com/apache/spark/pull/37427#discussion_r939886048


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -154,6 +154,18 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
       conn.prepareStatement("INSERT INTO \"test\".\"dept\" VALUES (1, 1)").executeUpdate()
       conn.prepareStatement("INSERT INTO \"test\".\"dept\" VALUES (2, 1)").executeUpdate()
 
+
+      conn.prepareStatement(
+        "CREATE TABLE \"test\".\"employee1\" (dept INTEGER, name TEXT(32), " +

Review Comment:
   ok, I thought it didn't support cast pushdown yet.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on pull request #37427: [SPARK-39929][SQL] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
beliefer commented on PR #37427:
URL: https://github.com/apache/spark/pull/37427#issuecomment-1209188259

   LGTM. cc @cloud-fan 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cody195 commented on a diff in pull request #37427: [SPARK-39929][SQL] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
cody195 commented on code in PR #37427:
URL: https://github.com/apache/spark/pull/37427#discussion_r941979858


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##########
@@ -50,7 +50,9 @@ private[sql] object H2Dialect extends JdbcDialect {
     Set("ABS", "COALESCE", "GREATEST", "LEAST", "RAND", "LOG", "LOG10", "LN", "EXP",
       "POWER", "SQRT", "FLOOR", "CEIL", "ROUND", "SIN", "SINH", "COS", "COSH", "TAN",
       "TANH", "COT", "ASIN", "ACOS", "ATAN", "ATAN2", "DEGREES", "RADIANS", "SIGN",
-      "PI", "SUBSTRING", "UPPER", "LOWER", "TRANSLATE", "TRIM", "MD5", "SHA1", "SHA2")
+

Review Comment:
   👍 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on a diff in pull request #37427: [SPARK-39929][SQL] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
beliefer commented on code in PR #37427:
URL: https://github.com/apache/spark/pull/37427#discussion_r940161522


##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1449,6 +1449,26 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
       "PushedFilters: [NAME IS NOT NULL]"
     checkPushedInfo(df5, expectedPlanFragment5)
     checkAnswer(df5, Seq(Row(6, "jen", 12000, 1200, true)))
+
+    val df6 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "bit_length(name) = 40")

Review Comment:
   ```suggestion
       val df6 = sql("SELECT * FROM h2.test.employee WHERE bit_length(name) = 40")
   ```



##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##########
@@ -1449,6 +1449,26 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
       "PushedFilters: [NAME IS NOT NULL]"
     checkPushedInfo(df5, expectedPlanFragment5)
     checkAnswer(df5, Seq(Row(6, "jen", 12000, 1200, true)))
+
+    val df6 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "bit_length(name) = 40")
+    checkFiltersRemoved(df6)
+    checkPushedInfo(df6, "[NAME IS NOT NULL, BIT_LENGTH(NAME) = 40]")
+    checkAnswer(df6, Seq(Row(1, "cathy", 9000, 1200, false),
+      Row(2, "david", 10000, 1300, true)))
+
+    val df7 = sql("SELECT * FROM h2.test.employee WHERE " +
+      "char_length(name) = 5")

Review Comment:
   ditto



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #37427: [SPARK-39929][SQL] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37427:
URL: https://github.com/apache/spark/pull/37427#discussion_r941472590


##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##########
@@ -217,6 +217,9 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
       generateExpressionWithName("SUBSTRING", children)
     case Upper(child) => generateExpressionWithName("UPPER", Seq(child))
     case Lower(child) => generateExpressionWithName("LOWER", Seq(child))
+    case BitLength(child) => generateExpressionWithName("BIT_LENGTH", Seq(child))

Review Comment:
   does `BIT_LENGTH` in other databases take both string and binary inputs?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #37427: [SPARK-39929][sql] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on PR #37427:
URL: https://github.com/apache/spark/pull/37427#issuecomment-1207245051

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #37427: [SPARK-39929][sql] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on PR #37427:
URL: https://github.com/apache/spark/pull/37427#issuecomment-1207521594

   cc @beliefer


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zheniantoushipashi commented on pull request #37427: [SPARK-39929][sql] DS V2 supports push down string functions(non ANSI)

Posted by GitBox <gi...@apache.org>.
zheniantoushipashi commented on PR #37427:
URL: https://github.com/apache/spark/pull/37427#issuecomment-1207549227

   retest this please


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org