Posted to reviews@spark.apache.org by "beliefer (via GitHub)" <gi...@apache.org> on 2023/03/10 05:55:31 UTC

[GitHub] [spark] beliefer opened a new pull request, #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

beliefer opened a new pull request, #40359:
URL: https://github.com/apache/spark/pull/40359

   ### What changes were proposed in this pull request?
   Currently, the DS V2 pushdown framework pushes an offset down as `OFFSET n` by default, and pushes it down together with a limit as `LIMIT m OFFSET n`. But some built-in dialects don't support this syntax, so when Spark pushes an offset down into these databases, they throw errors.
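
   A minimal sketch of the default clause shapes described above, for illustration only. The object and method names are hypothetical and are not the framework's actual code; a non-positive value is assumed to mean "nothing was pushed down".

   ```scala
   object DefaultPagingClauses {
     // Hypothetical helper: renders the default tail of a pushed-down query.
     // A non-positive limit or offset is treated as "not pushed down".
     def defaultTail(limit: Int, offset: Int): String = {
       val limitClause = if (limit > 0) s"LIMIT $limit" else ""
       val offsetClause = if (offset > 0) s"OFFSET $offset" else ""
       // Join only the clauses that are present, e.g. "LIMIT 10 OFFSET 5".
       Seq(limitClause, offsetClause).filter(_.nonEmpty).mkString(" ")
     }
   }
   ```

   With an offset but no limit, this yields a bare `OFFSET n`, which is one of the forms that not every dialect accepts.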
   
   
   ### Why are the changes needed?
   Fix the bug where pushing down offset or paging produces invalid SQL for some built-in dialects.
   
   
   ### Does this PR introduce _any_ user-facing change?
   'Yes'.
   The bug will be fixed.
   
   
   ### How was this patch tested?
   New test cases.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40359:
URL: https://github.com/apache/spark/pull/40359#discussion_r1132004161


##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala:
##########
@@ -410,6 +410,15 @@ private[v2] trait V2JDBCTest extends SharedSparkSession with DockerIntegrationFu
     assert(sorts.isEmpty)
   }
 
+  private def checkOffsetPushed(df: DataFrame, offset: Option[Int]): Unit = {

Review Comment:
   can we rename `limitPushed` to `checkLimitPushed` and follow the implementation here?





[GitHub] [spark] cloud-fan commented on a diff in pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40359:
URL: https://github.com/apache/spark/pull/40359#discussion_r1132005986


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala:
##########
@@ -181,19 +181,35 @@ private case object OracleDialect extends JdbcDialect {
     if (limit > 0) s"WHERE rownum <= $limit" else ""
   }
 
+  override def getOffsetClause(offset: Integer): String = {
+    // Oracle doesn't support OFFSET clause.
+    // We can use rownum > n to skip some rows in the result set.
+    // Note: rn is an alias of rownum.

Review Comment:
   nvm, I see the implementation.





[GitHub] [spark] beliefer commented on a diff in pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40359:
URL: https://github.com/apache/spark/pull/40359#discussion_r1132157169


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala:
##########
@@ -291,4 +291,22 @@ private case object MySQLDialect extends JdbcDialect with SQLConfHelper {
       throw QueryExecutionErrors.unsupportedDropNamespaceRestrictError()
     }
   }
+
+  class MySQLSQLQueryBuilder(dialect: JdbcDialect, options: JDBCOptions)
+    extends JdbcSQLQueryBuilder(dialect, options) {
+
+    override def build(): String = {
+      if (limit < 1 && offset > 0) {
+        val offsetClause = dialect.getOffsetClause(offset)
+        options.prepareQuery +
+          s"SELECT $columnList FROM ${options.tableOrQuery} $tableSampleClause" +
+          s" $whereClause $groupByClause $orderByClause LIMIT 18446744073709551610 $offsetClause"

Review Comment:
   MySQL doesn't support offset without limit.
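
   A minimal sketch of that workaround, assuming a hypothetical helper rather than the PR's builder. The MySQL manual suggests using "some large number" as the row count to read from an offset to the end of the result set, and its example uses 18446744073709551615 (2^64 - 1); the literal in the diff above differs slightly.

   ```scala
   object MySqlPagingSketch {
     // Largest unsigned 64-bit value, as used in the MySQL manual's example.
     val NoLimit: String = "18446744073709551615"

     // Hypothetical helper: a non-positive limit or offset means "not pushed down".
     def tail(limit: Int, offset: Int): String =
       if (limit > 0 && offset > 0) s"LIMIT $limit OFFSET $offset"
       else if (limit > 0) s"LIMIT $limit"
       // MySQL rejects a bare OFFSET, so pair it with a very large LIMIT.
       else if (offset > 0) s"LIMIT $NoLimit OFFSET $offset"
       else ""
   }
   ```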







[GitHub] [spark] beliefer commented on a diff in pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40359:
URL: https://github.com/apache/spark/pull/40359#discussion_r1133011143


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala:
##########
@@ -291,4 +291,26 @@ private case object MySQLDialect extends JdbcDialect with SQLConfHelper {
       throw QueryExecutionErrors.unsupportedDropNamespaceRestrictError()
     }
   }
+
+  class MySQLSQLQueryBuilder(dialect: JdbcDialect, options: JDBCOptions)
+    extends JdbcSQLQueryBuilder(dialect, options) {
+
+    override def build(): String = {
+      if (limit < 1 && offset > 0) {

Review Comment:
   If limit = 0, the optimizer will convert the query to an empty relation. But it's OK to use `limit < 0` too.





[GitHub] [spark] beliefer commented on pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40359:
URL: https://github.com/apache/spark/pull/40359#issuecomment-1465460479

   > thanks, merged to master! can you open a backport PR for 3.4?
   
   Thank you! I will create it.




[GitHub] [spark] beliefer commented on pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40359:
URL: https://github.com/apache/spark/pull/40359#issuecomment-1463340665

   ping @cloud-fan cc @sadikovi 




[GitHub] [spark] cloud-fan commented on a diff in pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40359:
URL: https://github.com/apache/spark/pull/40359#discussion_r1132005699


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala:
##########
@@ -181,19 +181,35 @@ private case object OracleDialect extends JdbcDialect {
     if (limit > 0) s"WHERE rownum <= $limit" else ""
   }
 
+  override def getOffsetClause(offset: Integer): String = {
+    // Oracle doesn't support OFFSET clause.
+    // We can use rownum > n to skip some rows in the result set.
+    // Note: rn is an alias of rownum.

Review Comment:
   Does every table in Oracle have the `rn` column?





[GitHub] [spark] cloud-fan commented on pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on PR #40359:
URL: https://github.com/apache/spark/pull/40359#issuecomment-1465448190

   thanks, merged to master! can you open a backport PR for 3.4?




[GitHub] [spark] beliefer commented on a diff in pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40359:
URL: https://github.com/apache/spark/pull/40359#discussion_r1132156623


##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala:
##########
@@ -410,6 +410,15 @@ private[v2] trait V2JDBCTest extends SharedSparkSession with DockerIntegrationFu
     assert(sorts.isEmpty)
   }
 
+  private def checkOffsetPushed(df: DataFrame, offset: Option[Int]): Unit = {

Review Comment:
   Let's do it in another PR.





[GitHub] [spark] cloud-fan commented on a diff in pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40359:
URL: https://github.com/apache/spark/pull/40359#discussion_r1132460230


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala:
##########
@@ -181,19 +181,35 @@ private case object OracleDialect extends JdbcDialect {
     if (limit > 0) s"WHERE rownum <= $limit" else ""
   }
 
+  override def getOffsetClause(offset: Integer): String = {
+    // Oracle doesn't support OFFSET clause.
+    // We can use rownum > n to skip some rows in the result set.
+    // Note: rn is an alias of rownum.
+    if (offset > 0) s"WHERE rn > $offset" else ""
+  }
+
   class OracleSQLQueryBuilder(dialect: JdbcDialect, options: JDBCOptions)
     extends JdbcSQLQueryBuilder(dialect, options) {
 
-    // TODO[SPARK-42289]: DS V2 pushdown could let JDBC dialect decide to push down offset
     override def build(): String = {
       val selectStmt = s"SELECT $columnList FROM ${options.tableOrQuery} $tableSampleClause" +
         s" $whereClause $groupByClause $orderByClause"
-      if (limit > 0) {
-        val limitClause = dialect.getLimitClause(limit)
-        options.prepareQuery + s"SELECT tab.* FROM ($selectStmt) tab $limitClause"
+      val finalSelectStmt = if (limit > 0) {
+        if (offset > 0) {
+          s"SELECT $columnList FROM (SELECT tab.*, rownum rn FROM ($selectStmt) tab)" +

Review Comment:
   or we can use the new syntax in oracle 12+, which should be the widely used versions today.
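
   For reference, a minimal sketch of what that could look like, assuming the row limiting clause available since Oracle 12c (`OFFSET n ROWS FETCH NEXT m ROWS ONLY`). The helper name is hypothetical and this is not what the PR implements.

   ```scala
   object Oracle12PagingSketch {
     // Hypothetical helper: a non-positive limit or offset means "not pushed down".
     def tail(limit: Int, offset: Int): String = {
       val offsetPart = if (offset > 0) s"OFFSET $offset ROWS" else ""
       val fetchPart = if (limit > 0) s"FETCH NEXT $limit ROWS ONLY" else ""
       // Either part may appear on its own; OFFSET comes first when both are present.
       Seq(offsetPart, fetchPart).filter(_.nonEmpty).mkString(" ")
     }
   }
   ```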





[GitHub] [spark] beliefer commented on a diff in pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40359:
URL: https://github.com/apache/spark/pull/40359#discussion_r1133009134


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala:
##########
@@ -181,19 +181,35 @@ private case object OracleDialect extends JdbcDialect {
     if (limit > 0) s"WHERE rownum <= $limit" else ""
   }
 
+  override def getOffsetClause(offset: Integer): String = {
+    // Oracle doesn't support OFFSET clause.
+    // We can use rownum > n to skip some rows in the result set.
+    // Note: rn is an alias of rownum.
+    if (offset > 0) s"WHERE rn > $offset" else ""
+  }
+
   class OracleSQLQueryBuilder(dialect: JdbcDialect, options: JDBCOptions)
     extends JdbcSQLQueryBuilder(dialect, options) {
 
-    // TODO[SPARK-42289]: DS V2 pushdown could let JDBC dialect decide to push down offset
     override def build(): String = {
       val selectStmt = s"SELECT $columnList FROM ${options.tableOrQuery} $tableSampleClause" +
         s" $whereClause $groupByClause $orderByClause"
-      if (limit > 0) {
-        val limitClause = dialect.getLimitClause(limit)
-        options.prepareQuery + s"SELECT tab.* FROM ($selectStmt) tab $limitClause"
+      val finalSelectStmt = if (limit > 0) {
+        if (offset > 0) {
+          s"SELECT $columnList FROM (SELECT tab.*, rownum rn FROM ($selectStmt) tab)" +

Review Comment:
   > or we can use the new syntax in oracle 12+, which should be the widely used versions today.
   
   Could we upgrade the Oracle version in another PR, if it is widely used?





[GitHub] [spark] cloud-fan commented on a diff in pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40359:
URL: https://github.com/apache/spark/pull/40359#discussion_r1132457170


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala:
##########
@@ -181,19 +181,35 @@ private case object OracleDialect extends JdbcDialect {
     if (limit > 0) s"WHERE rownum <= $limit" else ""
   }
 
+  override def getOffsetClause(offset: Integer): String = {
+    // Oracle doesn't support OFFSET clause.
+    // We can use rownum > n to skip some rows in the result set.
+    // Note: rn is an alias of rownum.
+    if (offset > 0) s"WHERE rn > $offset" else ""
+  }
+
   class OracleSQLQueryBuilder(dialect: JdbcDialect, options: JDBCOptions)
     extends JdbcSQLQueryBuilder(dialect, options) {
 
-    // TODO[SPARK-42289]: DS V2 pushdown could let JDBC dialect decide to push down offset
     override def build(): String = {
       val selectStmt = s"SELECT $columnList FROM ${options.tableOrQuery} $tableSampleClause" +
         s" $whereClause $groupByClause $orderByClause"
-      if (limit > 0) {
-        val limitClause = dialect.getLimitClause(limit)
-        options.prepareQuery + s"SELECT tab.* FROM ($selectStmt) tab $limitClause"
+      val finalSelectStmt = if (limit > 0) {
+        if (offset > 0) {
+          s"SELECT $columnList FROM (SELECT tab.*, rownum rn FROM ($selectStmt) tab)" +

Review Comment:
   https://stackoverflow.com/questions/31186166/oracle-sql-filtering-by-rownum-not-returning-results-when-it-should
   
   Let's mention the reason as well: the `rownum` is calculated when the value is returned.
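
   A small in-memory model of that behaviour, as a sketch only (the names are hypothetical and this simulates the semantics in Scala; it does not talk to Oracle). Each candidate row is tagged with the current `rownum`, and the counter advances only when the row passes the predicate, so `rownum > n` never accepts the first row and therefore never accepts any row.

   ```scala
   object RownumSketch {
     // Models filtering on rownum directly: the counter only advances
     // when a row is actually returned.
     def filterByRownum[A](rows: Seq[A])(pred: Int => Boolean): Seq[A] = {
       var rownum = 1
       val kept = Seq.newBuilder[A]
       rows.foreach { row =>
         if (pred(rownum)) {
           kept += row
           rownum += 1
         }
       }
       kept.result()
     }

     // Models the nested form: rn is materialised for every row first,
     // then the outer query filters on the alias.
     def filterByAlias[A](rows: Seq[A])(pred: Int => Boolean): Seq[A] =
       rows.zipWithIndex.collect { case (row, i) if pred(i + 1) => row }
   }
   ```

   With four rows, `filterByRownum(rows)(_ > 2)` keeps nothing, while `filterByAlias(rows)(_ > 2)` keeps the last two.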





[GitHub] [spark] cloud-fan commented on a diff in pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40359:
URL: https://github.com/apache/spark/pull/40359#discussion_r1132448651


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala:
##########
@@ -291,4 +291,26 @@ private case object MySQLDialect extends JdbcDialect with SQLConfHelper {
       throw QueryExecutionErrors.unsupportedDropNamespaceRestrictError()
     }
   }
+
+  class MySQLSQLQueryBuilder(dialect: JdbcDialect, options: JDBCOptions)
+    extends JdbcSQLQueryBuilder(dialect, options) {
+
+    override def build(): String = {
+      if (limit < 1 && offset > 0) {

Review Comment:
   is it possible to have limit = 0? seems safer to use `limit < 0` to indicate no limit, as the default value is -1
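
   A tiny sketch of the difference between the two checks, assuming -1 is the "no limit" default mentioned above. The names are hypothetical.

   ```scala
   object LimitCheckSketch {
     // Assumed default meaning "no limit was pushed down".
     val DefaultLimit: Int = -1

     // The check in the diff: also treats an explicit LIMIT 0 as "no limit".
     def noLimitLoose(limit: Int): Boolean = limit < 1

     // The suggested check: only negative values mean "no limit".
     def noLimitStrict(limit: Int): Boolean = limit < 0
   }
   ```

   The two checks disagree only for `limit = 0`.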





[GitHub] [spark] cloud-fan closed pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan closed pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect
URL: https://github.com/apache/spark/pull/40359




[GitHub] [spark] cloud-fan commented on a diff in pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40359:
URL: https://github.com/apache/spark/pull/40359#discussion_r1132007081


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala:
##########
@@ -181,19 +181,35 @@ private case object OracleDialect extends JdbcDialect {
     if (limit > 0) s"WHERE rownum <= $limit" else ""
   }
 
+  override def getOffsetClause(offset: Integer): String = {
+    // Oracle doesn't support OFFSET clause.
+    // We can use rownum > n to skip some rows in the result set.
+    // Note: rn is an alias of rownum.
+    if (offset > 0) s"WHERE rn > $offset" else ""
+  }
+
   class OracleSQLQueryBuilder(dialect: JdbcDialect, options: JDBCOptions)
     extends JdbcSQLQueryBuilder(dialect, options) {
 
-    // TODO[SPARK-42289]: DS V2 pushdown could let JDBC dialect decide to push down offset
     override def build(): String = {
       val selectStmt = s"SELECT $columnList FROM ${options.tableOrQuery} $tableSampleClause" +
         s" $whereClause $groupByClause $orderByClause"
-      if (limit > 0) {
-        val limitClause = dialect.getLimitClause(limit)
-        options.prepareQuery + s"SELECT tab.* FROM ($selectStmt) tab $limitClause"
+      val finalSelectStmt = if (limit > 0) {
+        if (offset > 0) {
+          s"SELECT $columnList FROM (SELECT tab.*, rownum rn FROM ($selectStmt) tab)" +

Review Comment:
   how about
   ```
   SELECT * FROM ($selectStmt) tab WHERE rownum > ...
   ```





[GitHub] [spark] cloud-fan commented on a diff in pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40359:
URL: https://github.com/apache/spark/pull/40359#discussion_r1132005206


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala:
##########
@@ -291,4 +291,22 @@ private case object MySQLDialect extends JdbcDialect with SQLConfHelper {
       throw QueryExecutionErrors.unsupportedDropNamespaceRestrictError()
     }
   }
+
+  class MySQLSQLQueryBuilder(dialect: JdbcDialect, options: JDBCOptions)
+    extends JdbcSQLQueryBuilder(dialect, options) {
+
+    override def build(): String = {
+      if (limit < 1 && offset > 0) {
+        val offsetClause = dialect.getOffsetClause(offset)
+        options.prepareQuery +
+          s"SELECT $columnList FROM ${options.tableOrQuery} $tableSampleClause" +
+          s" $whereClause $groupByClause $orderByClause LIMIT 18446744073709551610 $offsetClause"

Review Comment:
   what does this `LIMIT 18446744073709551610` mean?





[GitHub] [spark] beliefer commented on a diff in pull request #40359: [SPARK-42740][SQL] Fix the bug that pushdown offset or paging is invalid for some built-in dialect

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40359:
URL: https://github.com/apache/spark/pull/40359#discussion_r1132181650


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala:
##########
@@ -181,19 +181,35 @@ private case object OracleDialect extends JdbcDialect {
     if (limit > 0) s"WHERE rownum <= $limit" else ""
   }
 
+  override def getOffsetClause(offset: Integer): String = {
+    // Oracle doesn't support OFFSET clause.
+    // We can use rownum > n to skip some rows in the result set.
+    // Note: rn is an alias of rownum.
+    if (offset > 0) s"WHERE rn > $offset" else ""
+  }
+
   class OracleSQLQueryBuilder(dialect: JdbcDialect, options: JDBCOptions)
     extends JdbcSQLQueryBuilder(dialect, options) {
 
-    // TODO[SPARK-42289]: DS V2 pushdown could let JDBC dialect decide to push down offset
     override def build(): String = {
       val selectStmt = s"SELECT $columnList FROM ${options.tableOrQuery} $tableSampleClause" +
         s" $whereClause $groupByClause $orderByClause"
-      if (limit > 0) {
-        val limitClause = dialect.getLimitClause(limit)
-        options.prepareQuery + s"SELECT tab.* FROM ($selectStmt) tab $limitClause"
+      val finalSelectStmt = if (limit > 0) {
+        if (offset > 0) {
+          s"SELECT $columnList FROM (SELECT tab.*, rownum rn FROM ($selectStmt) tab)" +

Review Comment:
   Maybe it is a bug in Oracle.
   If we use rownum directly, the result will be incorrect.
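
   A minimal sketch of the aliased form, under stated assumptions: the helper is hypothetical, the inner `SELECT tab.*, rownum rn` shape is taken from the diff above, and the `rn <= limit + offset` upper bound is an assumption for illustration, since the rest of the statement is not shown in this hunk.

   ```scala
   object OracleRownumPagingSketch {
     // Hypothetical helper: a non-positive limit or offset means "not pushed down".
     def paged(selectStmt: String, columnList: String, limit: Int, offset: Int): String =
       if (offset > 0) {
         // Assumed upper bound; widened to Long so limit + offset cannot overflow.
         val upper = if (limit > 0) s" AND rn <= ${limit.toLong + offset}" else ""
         // Materialise rownum as rn in a subquery, then filter on the alias.
         s"SELECT $columnList FROM (SELECT tab.*, rownum rn FROM ($selectStmt) tab)" +
           s" WHERE rn > $offset$upper"
       } else if (limit > 0) {
         // Without an offset, filtering on rownum directly is fine.
         s"SELECT tab.* FROM ($selectStmt) tab WHERE rownum <= $limit"
       } else {
         selectStmt
       }
   }
   ```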


