You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@calcite.apache.org by Luis Fernando Kauer <lf...@yahoo.com.br.INVALID> on 2017/08/31 00:23:31 UTC

JdbcAdapter Sort Limit and Offset

Hi,

I was really upset that Calcite wouldn't push limit keyword to the database, not 
even in simple queries like "select * from t limit 10".
Actually, not even ORDER BY was pushed to the database even in simple queries.
So I decided to take a look into it and opened Jira [CALCITE-1906] some time ago.
I found out that SortRemoveRule is applied when there is a RelCollationTraitDef, because it assumes that the table is already ordered, so JdbcSortRule must replace the input's treatSet with RelCollations.EMPTY, just like MongoDB and Cassandra adapters do.
Besides that, a simple fix to JdbcSortRule and JdbcSort allows it to push limit and fetch to the database.
Unfortunately, after fixing locally this other issues appeared.
When issuing a query with sort and no limit/fetch, like:
SELECT empno, ename FROM "SCOTT"."EMP" where ename='CLARK' order by ename
the selected plan is:
JdbcToEnumerableConverter 
 JdbcProject(EMPNO=[$0], ENAME=[$1]) 
   JdbcSort(sort0=[$1], dir0=[ASC]) 
     JdbcFilter(condition=[=($1, 'CLARK')]) 
       JdbcTableScan(table=[[SCOTT, EMP]]))Notice that JdbcProject is above JdbcSort and that's because SortProjectTransposeRule is applied.
The problem is that the generated SQL is:
SELECT "EMPNO", "ENAME"
FROM (SELECT *
FROM "SCOTT"."EMP"
WHERE "ENAME" = 'CLARK'
ORDER BY "ENAME" NULLS LAST) AS "t0"
Noticed that the JdbcProject uses a subquery instead of being built at the same query level.
The problem is that SqlImplementor.Clause is ordered and "SELECT" is before "ORDER_BY" and SqlImplementor.Builder has the following check:
for (Clause clause : clauses) { 
  if (maxClause.ordinal() > clause.ordinal() || (maxClause == clause && !nonWrapSet.contains(clause))) { 
    needNew = true; 
  } 
}
So what happens is that "PROJECT" is added after "ORDER_BY" and the code above flags the need for a new query.
Changing SELECT order in SqlImplementor.Clause to the last fixes this issue but I don't know if there is any side effect or problem doing this.
I also noticed that after fixing all this, the JdbcAdapterTests.testColumnNullability was failing and the problem seems to be how JdbcAdapter is generating the SQL for HSQLDB now that limit clause is pushed to the database. The generated SQL is:
SELECT "employee_id", "position_id" 
FROM "foodmart"."employee" 
FETCH NEXT 10 ROWS ONLY
And the error is:
Caused by: org.hsqldb.HsqlException: unexpected token: NEXT : line: 3
Seems that HSQLDB has a problem with FETCH NEXT without OFFSET. One solution could be to change "NEXT" with "FIRST" in SqlPrettyWriter, but this is done to all the databases that support this feature, or could just make HSQLDB not support OffsetFetch in SqlDialect, so it would use LIMIT/OFFSET instead.
Sorry to bring all this together, but one thing led to another, and fixing one thing started to break other tests and I would really like to have this fixed.
Best regards,
Luis Fernando

Re: JdbcAdapter Sort Limit and Offset

Posted by Christian Beikov <ch...@gmail.com>.

I might not be qualified to answer how to handle limit/offset, but I'd 
like to step in and say that we had a related discussion regarding where 
to put sequence support. How about introducing unparse methods in the 
SqlDialect for that stuff and let the dialects handle that however a 
DBMS supports it. Since the dialect is constructed with the database 
metadata it could even do further self configuration i.e. check if 
configurations are activated on the DBMS that enable the use of a 
certain capability.

If we want to support DBMSs that don't have a clause which we can use 
for doing limit/offset we need to wrap the query and use whatever the 
DBMS has to offer like e.g. ROWNUM.

I'd like to suggest we introduce a SqlDialectFactory which can be 
configured on the JdbcSchema. The sole purpose of that factory is to 
read the database metadata and construct an appropriate SqlDialect. The 
default factory looks at the database metadata and constructs an 
appropriate SqlDialect, although that dialect can then do some self 
configuration with that metadata. We'd make the current SqlDialect 
abstract and create implementations for at least every DBMS that we 
support. We could still have components like e.g. a SequenceExtractor or 
LimitOffsetRenderer, but the dialects choose the implementation based on 
the database metadata.

What do you think?


Mit freundlichen Grüßen,
------------------------------------------------------------------------
*Christian Beikov*
Am 31.08.2017 um 02:23 schrieb Luis Fernando Kauer:
> Hi,
>
> I was really upset that Calcite wouldn't push limit keyword to the database, not
> even in simple queries like "select * from t limit 10".
> Actually, not even ORDER BY was pushed to the database even in simple queries.
> So I decided to take a look into it and opened Jira [CALCITE-1906] some time ago.
> I found out that SortRemoveRule is applied when there is a RelCollationTraitDef, because it assumes that the table is already ordered, so JdbcSortRule must replace the input's treatSet with RelCollations.EMPTY, just like MongoDB and Cassandra adapters do.
> Besides that, a simple fix to JdbcSortRule and JdbcSort allows it to push limit and fetch to the database.
> Unfortunately, after fixing locally this other issues appeared.
> When issuing a query with sort and no limit/fetch, like:
> SELECT empno, ename FROM "SCOTT"."EMP" where ename='CLARK' order by ename
> the selected plan is:
> JdbcToEnumerableConverter
>   JdbcProject(EMPNO=[$0], ENAME=[$1])
>     JdbcSort(sort0=[$1], dir0=[ASC])
>       JdbcFilter(condition=[=($1, 'CLARK')])
>         JdbcTableScan(table=[[SCOTT, EMP]]))Notice that JdbcProject is above JdbcSort and that's because SortProjectTransposeRule is applied.
> The problem is that the generated SQL is:
> SELECT "EMPNO", "ENAME"
> FROM (SELECT *
> FROM "SCOTT"."EMP"
> WHERE "ENAME" = 'CLARK'
> ORDER BY "ENAME" NULLS LAST) AS "t0"
> Noticed that the JdbcProject uses a subquery instead of being built at the same query level.
> The problem is that SqlImplementor.Clause is ordered and "SELECT" is before "ORDER_BY" and SqlImplementor.Builder has the following check:
> for (Clause clause : clauses) {
>    if (maxClause.ordinal() > clause.ordinal() || (maxClause == clause && !nonWrapSet.contains(clause))) {
>      needNew = true;
>    }
> }
> So what happens is that "PROJECT" is added after "ORDER_BY" and the code above flags the need for a new query.
> Changing SELECT order in SqlImplementor.Clause to the last fixes this issue but I don't know if there is any side effect or problem doing this.
> I also noticed that after fixing all this, the JdbcAdapterTests.testColumnNullability was failing and the problem seems to be how JdbcAdapter is generating the SQL for HSQLDB now that limit clause is pushed to the database. The generated SQL is:
> SELECT "employee_id", "position_id"
> FROM "foodmart"."employee"
> FETCH NEXT 10 ROWS ONLY
> And the error is:
> Caused by: org.hsqldb.HsqlException: unexpected token: NEXT : line: 3
> Seems that HSQLDB has a problem with FETCH NEXT without OFFSET. One solution could be to change "NEXT" with "FIRST" in SqlPrettyWriter, but this is done to all the databases that support this feature, or could just make HSQLDB not support OffsetFetch in SqlDialect, so it would use LIMIT/OFFSET instead.
> Sorry to bring all this together, but one thing led to another, and fixing one thing started to break other tests and I would really like to have this fixed.
> Best regards,
> Luis Fernando