Posted to dev@calcite.apache.org by Miguel Oliveira <mi...@gmail.com> on 2016/03/04 17:24:05 UTC

Re: Sort push down rule on custom adapter

Hi,

Thanks for this tip, it helped a lot. Now I have two connections for my
adapter: one through JDBC and one direct connection (like the Mongo tests).
Query 1: "select LeadSource from MyAdapter.Opportunity order by LeadSource"

Plan with JDBC:
MyAdapterToEnumerableConverter: rowcount = 100.0, cumulative cost = {30.0
rows, 30.1 cpu, 0.0 io}, id = 135
  MyAdapterProject(LeadSource=[$1]): rowcount = 100.0, cumulative cost =
{20.0 rows, 20.1 cpu, 0.0 io}, id = 133
    MyAdapterTableScan(table=[[MyAdapter, Opportunity]]): rowcount = 100.0,
cumulative cost = {10.0 rows, 10.100000000000001 cpu, 0.0 io}, id = 105

Plan with direct connection:
MyAdapterToEnumerableConverter: rowcount = 1.0, cumulative cost = {12.0
rows, 10.350000000000001 cpu, 0.0 io}, id = 114
  MyAdapterProject(LeadSource=[$1]): rowcount = 1.0, cumulative cost =
{11.9 rows, 10.250000000000002 cpu, 0.0 io}, id = 112
    MyAdapterSort(sort0=[$1], dir0=[ASC], fetch=[1]): rowcount = 1.0,
cumulative cost = {11.8 rows, 10.150000000000002 cpu, 0.0 io}, id = 110
      MyAdapterTableScan(table=[[MyAdapter, Opportunity]]): rowcount =
100.0, cumulative cost = {10.0 rows, 10.100000000000001 cpu, 0.0 io}, id = 0

As you can see, the sort is pushed down only when the connection is not
made through JDBC, and I would like to know what causes this. I checked the
logs but could not reach a conclusion. I know that SortRemoveRule (from the
package org.apache.calcite.rel.rules) is also triggered, and it might be
removing the sort when the connection is made through JDBC.
Any suggestions?
Thanks in advance.

Best regards,
Bruno Miguel
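
A minimal sketch of the cost-tuning idea quoted below ("tweak the cost
function of MyAdapterSort"), written as a toy model rather than real Calcite
code. The names SortCostSketch, Cost, genericSortSelfCost,
adapterSortSelfCost and choose are hypothetical, and the 0.05 factor and the
n*log(n) estimate are assumptions chosen for illustration. In Calcite itself
the corresponding hook would be the sort node's computeSelfCost override, and
RelOptCost offers multiplyBy for this kind of scaling. The point it shows: if
the adapter sort reports the same cost as the in-memory sort, the planner has
no reason to prefer it.

```java
public class SortCostSketch {

    /** Toy stand-in for a planner cost (rows, cpu, io). Not Calcite's RelOptCost. */
    static final class Cost {
        final double rows;
        final double cpu;
        final double io;

        Cost(double rows, double cpu, double io) {
            this.rows = rows;
            this.cpu = cpu;
            this.io = io;
        }

        Cost plus(Cost other) {
            return new Cost(rows + other.rows, cpu + other.cpu, io + other.io);
        }

        Cost multiplyBy(double factor) {
            return new Cost(rows * factor, cpu * factor, io * factor);
        }

        /** Simplified ordering (an assumption): compare rows, then cpu, then io. */
        boolean isLt(Cost other) {
            if (rows != other.rows) {
                return rows < other.rows;
            }
            if (cpu != other.cpu) {
                return cpu < other.cpu;
            }
            return io < other.io;
        }
    }

    /** Assumed self-cost of a generic sort: n rows, n*log(n) cpu. */
    static Cost genericSortSelfCost(double rowCount) {
        double cpu = rowCount * Math.log(Math.max(rowCount, 2.0));
        return new Cost(rowCount, cpu, 0.0);
    }

    /** Adapter sort: the generic cost scaled by a factor (hypothetical). */
    static Cost adapterSortSelfCost(double rowCount, double factor) {
        return genericSortSelfCost(rowCount).multiplyBy(factor);
    }

    /** Picks the pushed-down sort only when it is strictly cheaper. */
    static String choose(Cost inMemoryPlan, Cost pushedDownPlan) {
        return pushedDownPlan.isLt(inMemoryPlan) ? "MyAdapterSort" : "EnumerableSort";
    }

    public static void main(String[] args) {
        // Scan cost taken from the plans above: {10.0 rows, 10.1 cpu, 0.0 io}.
        Cost scan = new Cost(10.0, 10.1, 0.0);
        double rowCount = 100.0;

        Cost inMemory = scan.plus(genericSortSelfCost(rowCount));
        Cost unscaled = scan.plus(adapterSortSelfCost(rowCount, 1.0));
        Cost scaled = scan.plus(adapterSortSelfCost(rowCount, 0.05));

        System.out.println(choose(inMemory, unscaled)); // EnumerableSort (tie)
        System.out.println(choose(inMemory, scaled));   // MyAdapterSort
    }
}
```

With equal costs the toy planner keeps the in-memory sort; once the adapter
sort is scaled below it, the pushed-down plan wins. The real planner's
comparison is more involved, so treat this only as an illustration of why the
reported cost matters.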

2016-02-26 18:47 GMT+00:00 Julian Hyde <jh...@apache.org>:

> Turn on plan tracing and look at the final planner space. Hopefully you
> will see a plan with MyAdapterSort and another with EnumerableSort. Is the
> cumulative cost of EnumerableSort lower? If so, that would explain why
> Calcite is choosing it. You might need to tweak the cost function of
> MyAdapterSort to make sure that it is lower cost.
>
> Julian
>
> > On Feb 26, 2016, at 10:41 AM, Miguel Oliveira <
> migueloliveira1990@gmail.com> wrote:
> >
> > Hi,
> >
> > I’m implementing an adapter for Apache Calcite and I want to push down as
> > much processing as possible to the datasource side in order to minimize the
> > operations that are performed in memory. Despite having a node named
> > MyAdapterSort and a rule to convert Sort to MyAdapterSort, my sorting
> > operation is not included in the plan and is always performed in memory.
> > What can be done to force the usage of my sort node implementation (instead
> > of falling back to EnumerableSort)?
> > I’m adding an example of my query (1) and the query (2) of a test with the
> > Mongo adapter, which has the same behaviour (projection and sort).
> >
> > Query 1: "select LeadSource from MyAdapter.Opportunity order by LeadSource"
> > PLAN=EnumerableSort(sort0=[$0], dir0=[ASC])
> > MyAdapterToEnumerableConverter
> > MyAdapterProject(LEADSOURCE=[$0])
> >  MyAdapterTableScan(table=[[MyAdapter, OPPORTUNITY]])
> >
> > Query 2: "select * from zips order by state limit 2"
> > PLAN=MongoToEnumerableConverter"
> > MongoSort(sort0=[$4], dir0=[ASC])\n"
> > MongoProject(CITY=[CAST(ITEM($0, 'city')):VARCHAR(20) CHARACTER SET
> > \"ISO-8859-1\" COLLATE \"ISO-8859-1$en_US$primary\"],
> > LONGITUDE=[CAST(ITEM(ITEM($0, 'loc'), 0)):FLOAT],
> > LATITUDE=[CAST(ITEM(ITEM($0, 'loc'), 1)):FLOAT], POP=[CAST(ITEM($0,
> > 'pop')):INTEGER], STATE=[CAST(ITEM($0, 'state')):VARCHAR(2) CHARACTER SET
> > \"ISO-8859-1\" COLLATE \"ISO-8859-1$en_US$primary\"], ID=[CAST(ITEM($0,
> > '_id')):VARCHAR(5) CHARACTER SET \"ISO-8859-1\" COLLATE
> > \"ISO-8859-1$en_US$primary\"])\n"
> > MongoTableScan(table=[[mongo_raw, zips]])");
> >
> > Best regards,
> > Bruno Miguel
>
>