You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Luis Fernando Kauer <lf...@yahoo.com.br.INVALID> on 2017/08/31 00:23:31 UTC
JdbcAdapter Sort Limit and Offset
Hi,
I was really upset that Calcite wouldn't push limit keyword to the database, not
even in simple queries like "select * from t limit 10".
Actually, not even ORDER BY was pushed to the database even in simple queries.
So I decided to take a look into it and opened Jira [CALCITE-1906] some time ago.
I found out that SortRemoveRule is applied when there is a RelCollationTraitDef, because it assumes that the table is already ordered, so JdbcSortRule must replace the input's treatSet with RelCollations.EMPTY, just like MongoDB and Cassandra adapters do.
Besides that, a simple fix to JdbcSortRule and JdbcSort allows it to push limit and fetch to the database.
Unfortunately, after fixing locally this other issues appeared.
When issuing a query with sort and no limit/fetch, like:
SELECT empno, ename FROM "SCOTT"."EMP" where ename='CLARK' order by ename
the selected plan is:
JdbcToEnumerableConverter
JdbcProject(EMPNO=[$0], ENAME=[$1])
JdbcSort(sort0=[$1], dir0=[ASC])
JdbcFilter(condition=[=($1, 'CLARK')])
JdbcTableScan(table=[[SCOTT, EMP]]))Notice that JdbcProject is above JdbcSort and that's because SortProjectTransposeRule is applied.
The problem is that the generated SQL is:
SELECT "EMPNO", "ENAME"
FROM (SELECT *
FROM "SCOTT"."EMP"
WHERE "ENAME" = 'CLARK'
ORDER BY "ENAME" NULLS LAST) AS "t0"
Noticed that the JdbcProject uses a subquery instead of being built at the same query level.
The problem is that SqlImplementor.Clause is ordered and "SELECT" is before "ORDER_BY" and SqlImplementor.Builder has the following check:
for (Clause clause : clauses) {
if (maxClause.ordinal() > clause.ordinal() || (maxClause == clause && !nonWrapSet.contains(clause))) {
needNew = true;
}
}
So what happens is that "PROJECT" is added after "ORDER_BY" and the code above flags the need for a new query.
Changing SELECT order in SqlImplementor.Clause to the last fixes this issue but I don't know if there is any side effect or problem doing this.
I also noticed that after fixing all this, the JdbcAdapterTests.testColumnNullability was failing and the problem seems to be how JdbcAdapter is generating the SQL for HSQLDB now that limit clause is pushed to the database. The generated SQL is:
SELECT "employee_id", "position_id"
FROM "foodmart"."employee"
FETCH NEXT 10 ROWS ONLY
And the error is:
Caused by: org.hsqldb.HsqlException: unexpected token: NEXT : line: 3
Seems that HSQLDB has a problem with FETCH NEXT without OFFSET. One solution could be to change "NEXT" with "FIRST" in SqlPrettyWriter, but this is done to all the databases that support this feature, or could just make HSQLDB not support OffsetFetch in SqlDialect, so it would use LIMIT/OFFSET instead.
Sorry to bring all this together, but one thing led to another, and fixing one thing started to break other tests and I would really like to have this fixed.
Best regards,
Luis Fernando
Re: JdbcAdapter Sort Limit and Offset
Posted by Christian Beikov <ch...@gmail.com>.
I might not be qualified to answer how to handle limit/offset, but I'd
like to step in and say that we had a related discussion regarding where
to put sequence support. How about introducing unparse methods in the
SqlDialect for that stuff and let the dialects handle that however a
DBMS supports it. Since the dialect is constructed with the database
metadata it could even do further self configuration i.e. check if
configurations are activated on the DBMS that enable the use of a
certain capability.
If we want to support DBMSs that don't have a clause which we can use
for doing limit/offset we need to wrap the query and use whatever the
DBMS has to offer like e.g. ROWNUM.
I'd like to suggest we introduce a SqlDialectFactory which can be
configured on the JdbcSchema. The sole purpose of that factory is to
read the database metadata and construct an appropriate SqlDialect. The
default factory looks at the database metadata and constructs an
appropriate SqlDialect, although that dialect can then do some self
configuration with that metadata. We'd make the current SqlDialect
abstract and create implementations for at least every DBMS that we
support. We could still have components like e.g. a SequenceExtractor or
LimitOffsetRenderer, but the dialects choose the implementation based on
the database metadata.
What do you think?
Mit freundlichen Grüßen,
------------------------------------------------------------------------
*Christian Beikov*
Am 31.08.2017 um 02:23 schrieb Luis Fernando Kauer:
> Hi,
>
> I was really upset that Calcite wouldn't push limit keyword to the database, not
> even in simple queries like "select * from t limit 10".
> Actually, not even ORDER BY was pushed to the database even in simple queries.
> So I decided to take a look into it and opened Jira [CALCITE-1906] some time ago.
> I found out that SortRemoveRule is applied when there is a RelCollationTraitDef, because it assumes that the table is already ordered, so JdbcSortRule must replace the input's treatSet with RelCollations.EMPTY, just like MongoDB and Cassandra adapters do.
> Besides that, a simple fix to JdbcSortRule and JdbcSort allows it to push limit and fetch to the database.
> Unfortunately, after fixing locally this other issues appeared.
> When issuing a query with sort and no limit/fetch, like:
> SELECT empno, ename FROM "SCOTT"."EMP" where ename='CLARK' order by ename
> the selected plan is:
> JdbcToEnumerableConverter
> JdbcProject(EMPNO=[$0], ENAME=[$1])
> JdbcSort(sort0=[$1], dir0=[ASC])
> JdbcFilter(condition=[=($1, 'CLARK')])
> JdbcTableScan(table=[[SCOTT, EMP]]))Notice that JdbcProject is above JdbcSort and that's because SortProjectTransposeRule is applied.
> The problem is that the generated SQL is:
> SELECT "EMPNO", "ENAME"
> FROM (SELECT *
> FROM "SCOTT"."EMP"
> WHERE "ENAME" = 'CLARK'
> ORDER BY "ENAME" NULLS LAST) AS "t0"
> Noticed that the JdbcProject uses a subquery instead of being built at the same query level.
> The problem is that SqlImplementor.Clause is ordered and "SELECT" is before "ORDER_BY" and SqlImplementor.Builder has the following check:
> for (Clause clause : clauses) {
> if (maxClause.ordinal() > clause.ordinal() || (maxClause == clause && !nonWrapSet.contains(clause))) {
> needNew = true;
> }
> }
> So what happens is that "PROJECT" is added after "ORDER_BY" and the code above flags the need for a new query.
> Changing SELECT order in SqlImplementor.Clause to the last fixes this issue but I don't know if there is any side effect or problem doing this.
> I also noticed that after fixing all this, the JdbcAdapterTests.testColumnNullability was failing and the problem seems to be how JdbcAdapter is generating the SQL for HSQLDB now that limit clause is pushed to the database. The generated SQL is:
> SELECT "employee_id", "position_id"
> FROM "foodmart"."employee"
> FETCH NEXT 10 ROWS ONLY
> And the error is:
> Caused by: org.hsqldb.HsqlException: unexpected token: NEXT : line: 3
> Seems that HSQLDB has a problem with FETCH NEXT without OFFSET. One solution could be to change "NEXT" with "FIRST" in SqlPrettyWriter, but this is done to all the databases that support this feature, or could just make HSQLDB not support OffsetFetch in SqlDialect, so it would use LIMIT/OFFSET instead.
> Sorry to bring all this together, but one thing led to another, and fixing one thing started to break other tests and I would really like to have this fixed.
> Best regards,
> Luis Fernando