Posted to dev@calcite.apache.org by Gian Merlino <gi...@imply.io> on 2017/01/13 17:02:19 UTC

Re: Embedding Calcite, adjusting convertlets

This was totally user error: I was ignoring the "fields" and "collation" of
the RelRoot. Passing "collation" to planner.transform stopped it from removing
the sort.
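
For anyone hitting the same thing, a minimal sketch of what the fix looks like
with the Planner API (the rule-set index 0 and "druidConvention" are
placeholders for whatever your program uses):

  // Planner.rel(...) returns a RelRoot, not a RelNode: its "collation" and
  // "fields" describe the sort order and column mapping of the final result.
  final RelRoot root = planner.rel(validatedSqlNode);

  // Include the root collation in the required output traits; without it,
  // rules such as SortRemoveRule can legitimately drop the top-level Sort.
  final RelTraitSet requiredTraits = root.rel.getTraitSet()
      .replace(druidConvention)
      .replace(root.collation);

  final RelNode transformed = planner.transform(0, requiredTraits, root.rel);
  // "fields" still needs to be applied if the projected columns differ from
  // the rel's output (RelRoot.project() is one way to do that).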

Gian

On Wed, Dec 14, 2016 at 3:21 PM, Gian Merlino <gi...@imply.io> wrote:

> I don't think so, since my test query (SELECT dim1 FROM s.foo GROUP BY
> dim1 ORDER BY dim1 DESC) is sorting on a column that is also projected.
>
> Gian
>
> On Wed, Dec 14, 2016 at 11:58 AM, Julian Hyde <jh...@apache.org> wrote:
>
>> Are you running into some variant of the problems that inspired
>> https://issues.apache.org/jira/browse/CALCITE-819 (at the root of the
>> tree, columns that are not projected are removed, and if the desired sort
>> order involves non-projected columns, the desired sort order is forgotten)?
>>
>> > On Dec 14, 2016, at 11:19 AM, Gian Merlino <gi...@imply.io> wrote:
>> >
>> > Ah, thanks. So if that sort of thing is not a smoking gun, do you have
>> an
>> > idea about where I should look next? If not I'll keep poking around.
>> >
>> > Gian
>> >
>> > On Wed, Dec 14, 2016 at 11:06 AM, Julian Hyde <jh...@apache.org> wrote:
>> >
>> >>> - But its "set" field points to a RelSet with "rels" that _don't_ have
>> >>> _any_ collation traits.
>> >>
>> >> That’s OK. A “subset” (RelSubset) is a collection of RelNodes that are
>> >> logically and physically equivalent (same results, same physical
>> >> properties) whereas a “set” (RelSet) is a collection of RelNodes that
>> are
>> >> logically equivalent.
>> >>
>> >> A set can therefore be considered to be a collection of subsets, each
>> of
>> >> which contains RelNodes. And it used to be implemented that way, but in
>> >> https://issues.apache.org/jira/browse/CALCITE-88 we introduced collation
>> >> as a trait, and that made subsets non-disjoint (a RelNode can be
>> sorted on
>> >> (x, y), and also on (x), and also on (), and also on (z)) so we made
>> >> RelSubset just a view onto a RelSet, filtering the list of RelNodes
>> >> according to the ones that have (“subsume”) the desired traits.
>> >>
>> >> Julian
>> >>
>> >>
>> >>> On Dec 14, 2016, at 10:45 AM, Gian Merlino <gi...@imply.io> wrote:
>> >>>
>> >>> I spent some more time looking into (3) and found that when I had
>> things
>> >>> going through the Planner rather than the JDBC driver, SortRemoveRule
>> was
>> >>> removing sorts when it shouldn't have been. This happens even for
>> simple
>> >>> queries like "SELECT dim1 FROM s.foo GROUP BY dim1 ORDER BY dim1
>> >>> DESC". Removing SortRemoveRule from the planner fixed the broken
>> tests on
>> >>> my end.
>> >>>
>> >>> I dug into that a bit and saw that the call to
>> "convert(sort.getInput(),
>> >>> traits)" in SortRemoveRule was returning a RelSubset that looked a bit
>> >>> funny in the debugger:
>> >>>
>> >>> - The RelSubset's "traitSet" _does_ have the proper collation trait.
>> >>> - But its "set" field points to a RelSet with "rels" that _don't_ have
>> >>> _any_ collation traits.
>> >>>
>> >>> From what I understand that causes Calcite to treat the unsorted and
>> >> sorted
>> >>> rels as equivalent when they in fact aren't. I'm still not sure if
>> this
>> >> is
>> >>> a Calcite bug or user error on my part… I'll keep looking into it
>> unless
>> >>> someone has any bright ideas.
>> >>>
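(For reference, the convert call mentioned above sits in a pattern roughly
like this inside the rule's onMatch; a simplified sketch, not the exact
SortRemoveRule source:)

  // Match the Sort, then ask the planner for its input with the sort's
  // collation as a required trait; convert() hands back a RelSubset that
  // advertises that collation.
  final Sort sort = call.rel(0);
  final RelTraitSet traits = sort.getInput().getTraitSet()
      .replace(sort.getCollation());
  call.transformTo(convert(sort.getInput(), traits));
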
>> >>> fwiw, my Planner construction looks like this:
>> >>>
>> >>>   final FrameworkConfig frameworkConfig = Frameworks
>> >>>       .newConfigBuilder()
>> >>>       .parserConfig(
>> >>>           SqlParser.configBuilder()
>> >>>                    .setCaseSensitive(true)
>> >>>                    .setUnquotedCasing(Casing.UNCHANGED)
>> >>>                    .build()
>> >>>       )
>> >>>       .defaultSchema(rootSchema)
>> >>>       .traitDefs(ConventionTraitDef.INSTANCE,
>> >>>                  RelCollationTraitDef.INSTANCE)
>> >>>       .programs(Programs.ofRules(myRules))
>> >>>       .executor(new RexExecutorImpl(Schemas.createDataContext(null)))
>> >>>       .context(Contexts.EMPTY_CONTEXT)
>> >>>       .build();
>> >>>
>> >>>   return Frameworks.getPlanner(frameworkConfig);
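
(Aside: if the convertlet override discussed further down ends up going
through this route, the table can presumably be plugged into the same builder,
assuming Frameworks.ConfigBuilder has a setter mirroring
FrameworkConfig.getConvertletTable; the delegating table below is just a
sketch:)

  // Fall back to Calcite's standard convertlets, but leave room to
  // special-case individual operators such as EXTRACT.
  final SqlRexConvertletTable myConvertletTable = new SqlRexConvertletTable() {
    @Override public SqlRexConvertlet get(SqlCall call) {
      // e.g. return a custom convertlet when
      // call.getOperator() == SqlStdOperatorTable.EXTRACT; otherwise:
      return StandardConvertletTable.INSTANCE.get(call);
    }
  };

  // ... then, on the builder above:
  //     .convertletTable(myConvertletTable)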
>> >>>
>> >>> Gian
>> >>>
>> >>> On Sat, Dec 3, 2016 at 5:53 PM, Gian Merlino <gi...@imply.io> wrote:
>> >>>
>> >>>> Sure, I added those first two to the ticket.
>> >>>>
>> >>>> I don't think those are happening with (3) but I'll double check next
>> >> time
>> >>>> I take a look at using the Planner.
>> >>>>
>> >>>> Gian
>> >>>>
>> >>>> On Fri, Dec 2, 2016 at 12:20 PM, Julian Hyde <jh...@apache.org>
>> wrote:
>> >>>>
>> >>>>> Can you please add (1) and (2) to
>> >>>>> https://issues.apache.org/jira/browse/CALCITE-1525,
>> >>>>> which deals with the whole issue of using “Planner” within the JDBC
>> >> driver,
>> >>>>> so we can be consistent.
>> >>>>>
>> >>>>> (3) doesn’t look likely to be related. Do your queries have UNION or
>> >>>>> other set-ops? Are you sorting on columns that do not appear in the
>> >> final
>> >>>>> result?
>> >>>>>
>> >>>>> Julian
>> >>>>>
>> >>>>>
>> >>>>>> On Nov 28, 2016, at 10:45 AM, Gian Merlino <gi...@imply.io> wrote:
>> >>>>>>
>> >>>>>> I traveled a bit down the Frameworks/Planner road and got most of
>> my
>> >>>>> tests
>> >>>>>> passing, but ran into some problems getting them all to work:
>> >>>>>>
>> >>>>>> (1) "EXPLAIN PLAN FOR" throws NullPointerException during
>> >>>>> Planner.validate.
>> >>>>>> It looks like CalcitePrepareImpl has some special code to handle
>> >>>>> validation
>> >>>>>> of EXPLAIN, but PlannerImpl doesn't. I'm not sure if this is
>> >> something I
>> >>>>>> should be doing on my end, or if it's a bug in PlannerImpl.
>> >>>>>> (2) I don't see a way to do ?-style prepared statements with bound
>> >>>>>> variables, which _is_ possible with the JDBC driver route.
>> >>>>>> (3) Not sure why this is happening, but for some reason ORDER BY /
>> >> LIMIT
>> >>>>>> clauses are getting ignored sometimes, even when they work with the
>> >> JDBC
>> >>>>>> driver route. This may be something messed up with my rules though
>> and
>> >>>>> may
>> >>>>>> not be Calcite's fault.
>> >>>>>>
>> >>>>>> Julian, do any of these look like bugs that should be raised in
>> jira,
>> >> or
>> >>>>>> are they just stuff I should be dealing with on my side?
>> >>>>>>
>> >>>>>> Btw, I do like that the Frameworks/Planner route gives me back the
>> >>>>> RelNode
>> >>>>>> itself, since that means I can make the Druid queries directly
>> without
>> >>>>>> needing to go through the extra layers of the JDBC driver. That
>> part
>> >> is
>> >>>>>> nice.
>> >>>>>>
>> >>>>>> Gian
>> >>>>>>
>> >>>>>> On Wed, Nov 23, 2016 at 10:11 PM, Julian Hyde <jh...@apache.org>
>> >> wrote:
>> >>>>>>
>> >>>>>>> I don’t know how it’s used outside Calcite. Maybe some others can
>> >>>>> chime in.
>> >>>>>>>
>> >>>>>>> Thanks for the PR. I logged
>> >>>>>>> https://issues.apache.org/jira/browse/CALCITE-1509
>> >>>>>>> for it, and will commit shortly.
>> >>>>>>>
>> >>>>>>> Julian
>> >>>>>>>
>> >>>>>>>> On Nov 23, 2016, at 12:32 PM, Gian Merlino <gi...@imply.io>
>> wrote:
>> >>>>>>>>
>> >>>>>>>> Do you know examples of projects that use Planner or PlannerImpl
>> >>>>>>> currently
>> >>>>>>>> (from "outside")? As far as I can tell, within Calcite itself
>> it's
>> >>>>> only
>> >>>>>>>> used in test code. Maybe that'd be a better entry point.
>> >>>>>>>>
>> >>>>>>>> In the meantime I raised a PR here for allowing a convertlet
>> table
>> >>>>>>> override
>> >>>>>>>> in a CalcitePrepareImpl: https://github.com/apache/calcite/pull/330.
>> >>>>>>> That
>> >>>>>>>> was enough to get the JDBC driver on my end to behave how I want
>> it
>> >>>>> to.
>> >>>>>>>>
>> >>>>>>>> Gian
>> >>>>>>>>
>> >>>>>>>> On Thu, Nov 17, 2016 at 5:23 PM, Julian Hyde <jh...@apache.org>
>> >>>>> wrote:
>> >>>>>>>>
>> >>>>>>>>> I was wrong earlier… FrameworkConfig already has a
>> >> getConvertletTable
>> >>>>>>>>> method. But regarding using FrameworkConfig from within the JDBC
>> >>>>> driver,
>> >>>>>>>>> it’s complicated. FrameworkConfig only works if you are
>> “outside”
>> >>>>>>> Calcite,
>> >>>>>>>>> whereas CalcitePrepare is when you are customizing from the
>> inside,
>> >>>>> and
>> >>>>>>>>> sadly CalcitePrepare does not use a FrameworkConfig.
>> >>>>>>>>>
>> >>>>>>>>> Compare and contrast:
>> >>>>>>>>> * CalcitePrepareImpl.getSqlToRelConverter [
>> >>>>>>>>> https://github.com/apache/calcite/blob/3f92157d5742dd10f3b828d22d7a753e0a2899cc/core/src/main/java/org/apache/calcite/prepare/CalcitePrepareImpl.java#L1114 ]
>> >>>>>>>>> * PlannerImpl.rel [
>> >>>>>>>>> https://github.com/apache/calcite/blob/105bba1f83cd9631e8e1211d262e4886a4a863b7/core/src/main/java/org/apache/calcite/prepare/PlannerImpl.java#L225 ]
>> >>>>>>>>>
>> >>>>>>>>> The latter uses a convertletTable sourced from a
>> FrameworkConfig.
>> >>>>>>>>>
>> >>>>>>>>> The ideal thing would be to get CalcitePrepareImpl to use a
>> >>>>> PlannerImpl
>> >>>>>>> to
>> >>>>>>>>> do its dirty work. Then “inside” and “outside” would work the
>> same.
>> >>>>>>> Would
>> >>>>>>>>> definitely appreciate that as a patch.
>> >>>>>>>>>
>> >>>>>>>>> If you choose to go the JDBC driver route, you could override
>> >>>>>>>>> Driver.createPrepareFactory to produce a sub-class of
>> >> CalcitePrepare
>> >>>>>>> that
>> >>>>>>>>> works for your environment, one with an explicit convertletTable
>> >>>>> rather
>> >>>>>>>>> than just using the default.
>> >>>>>>>>>
>> >>>>>>>>> Julian
>> >>>>>>>>>
>> >>>>>>>>>
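(For concreteness, a rough sketch of that override; the subclass name is made
up, and the Function0 return type is an assumption based on the Calcite
version at the time:)

  import org.apache.calcite.jdbc.CalcitePrepare;
  import org.apache.calcite.jdbc.Driver;
  import org.apache.calcite.linq4j.function.Function0;
  import org.apache.calcite.prepare.CalcitePrepareImpl;

  public class DruidCalciteDriver extends Driver {
    // A real driver would also register itself and define its own JDBC URL
    // prefix; this only shows the prepare-factory hook.
    @Override protected Function0<CalcitePrepare> createPrepareFactory() {
      return new Function0<CalcitePrepare>() {
        @Override public CalcitePrepare apply() {
          // Return a CalcitePrepareImpl subclass that supplies the custom
          // convertlet table (CALCITE-1509 adds a hook for that).
          return new CalcitePrepareImpl();
        }
      };
    }
  }
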
>> >>>>>>>>>> On Nov 17, 2016, at 5:01 PM, Gian Merlino <gi...@imply.io>
>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>> Hey Julian,
>> >>>>>>>>>>
>> >>>>>>>>>> If the convertlets were customizable with a FrameworkConfig,
>> how
>> >>>>> would
>> >>>>>>> I
>> >>>>>>>>>> use that to configure the JDBC driver (given that I'm doing it
>> with
>> >> the
>> >>>>>>> code
>> >>>>>>>>>> upthread)? Or would that suggest using a different approach to
>> >>>>>>> embedding
>> >>>>>>>>>> Calcite?
>> >>>>>>>>>>
>> >>>>>>>>>> Gian
>> >>>>>>>>>>
>> >>>>>>>>>> On Thu, Nov 17, 2016 at 4:02 PM, Julian Hyde <jhyde@apache.org> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> Convertlets have a similar effect to planner rules (albeit
>> they
>> >>>>> act on
>> >>>>>>>>>>> scalar expressions, not relational expressions) so people
>> should
>> >> be
>> >>>>>>>>> able to
>> >>>>>>>>>>> change the set of active convertlets.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Would you like to propose a change that makes the convertlet
>> >> table
>> >>>>>>>>>>> pluggable? Maybe as part of FrameworkConfig? Regardless,
>> please
>> >>>>> log a
>> >>>>>>>>> JIRA
>> >>>>>>>>>>> to track this.
>> >>>>>>>>>>>
>> >>>>>>>>>>> And by the way, RexImpTable, which defines how operators are
>> >>>>>>> implemented
>> >>>>>>>>>>> by generating java code, should also be pluggable. It’s been
>> on
>> >> my
>> >>>>>>> mind
>> >>>>>>>>> for
>> >>>>>>>>>>> a long time to allow the “engine” — related to the data
>> format,
>> >> and
>> >>>>>>> how
>> >>>>>>>>>>> code is generated to access fields and evaluate expressions
>> and
>> >>>>>>>>> operators —
>> >>>>>>>>>>> to be pluggable.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Regarding whether the JDBC driver is the right way to embed
>> >>>>> Calcite.
>> >>>>>>>>>>> There’s no easy answer. You might want to embed Calcite as a
>> >>>>> library
>> >>>>>>> in
>> >>>>>>>>>>> your own server (as Drill and Hive do). Or you might want to
>> make
>> >>>>>>>>> yourself
>> >>>>>>>>>>> just an adapter that runs inside a Calcite JDBC server (as the
>> >> CSV
>> >>>>>>>>> adapter
>> >>>>>>>>>>> does). Or something in the middle, like what Phoenix does:
>> using
>> >>>>>>> Calcite
>> >>>>>>>>>>> for JDBC, SQL, planning, but with your own metadata and
>> runtime
>> >>>>>>> engine.
>> >>>>>>>>>>>
>> >>>>>>>>>>> As long as you build the valuable stuff into planner rules,
>> new
>> >>>>>>>>> relational
>> >>>>>>>>>>> operators (if necessary) and use the schema SPI, you should be
>> >>>>> able to
>> >>>>>>>>>>> change packaging in the future.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Julian
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>> On Nov 17, 2016, at 1:59 PM, Gian Merlino <gi...@imply.io>
>> >> wrote:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Hey Calcites,
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> I'm working on embedding Calcite into Druid (
>> http://druid.io/,
>> >>>>>>>>>>>> https://github.com/druid-io/druid/pull/3682) and am running
>> >> into
>> >>>>> a
>> >>>>>>>>>>> problem
>> >>>>>>>>>>>> that is making me wonder if the approach I'm using makes
>> sense.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Consider the expression EXTRACT(YEAR FROM __time). Calcite
>> has a
>> >>>>>>>>> standard
>> >>>>>>>>>>>> convertlet rule "convertExtract" that changes this into some
>> >>>>>>> arithmetic
>> >>>>>>>>>>> on
>> >>>>>>>>>>>> __time cast to an int type. But Druid has some builtin
>> >>>>> functions to
>> >>>>>>>>> do
>> >>>>>>>>>>>> this, and I'd rather use those than arithmetic (for a bunch
>> of
>> >>>>>>>>> reasons).
>> >>>>>>>>>>>> Ideally, in my RelOptRules that convert Calcite rels to Druid
>> >>>>>>> queries,
>> >>>>>>>>>>> I'd
>> >>>>>>>>>>>> see the EXTRACT as a normal RexCall with the time flag and an
>> >>>>>>>>> expression
>> >>>>>>>>>>> to
>> >>>>>>>>>>>> apply it to. That's a lot easier to translate than the
>> >> arithmetic
>> >>>>>>>>> stuff,
>> >>>>>>>>>>>> which I'd have to pattern match and undo first before
>> >> translating.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> So the problem I have is that I want to disable
>> convertExtract,
>> >>>>> but I
>> >>>>>>>>>>> don't
>> >>>>>>>>>>>> see a way to do that or to swap out the convertlet table.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> The code I'm using to set up a connection is:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> public CalciteConnection createCalciteConnection(
>> >>>>>>>>>>>>     final DruidSchema druidSchema
>> >>>>>>>>>>>> ) throws SQLException
>> >>>>>>>>>>>> {
>> >>>>>>>>>>>>   final Properties props = new Properties();
>> >>>>>>>>>>>>   props.setProperty("caseSensitive", "true");
>> >>>>>>>>>>>>   props.setProperty("unquotedCasing", "UNCHANGED");
>> >>>>>>>>>>>>   final Connection connection =
>> >>>>>>>>>>>>       DriverManager.getConnection("jdbc:calcite:", props);
>> >>>>>>>>>>>>   final CalciteConnection calciteConnection =
>> >>>>>>>>>>>>       connection.unwrap(CalciteConnection.class);
>> >>>>>>>>>>>>   calciteConnection.getRootSchema().setCacheEnabled(false);
>> >>>>>>>>>>>>   calciteConnection.getRootSchema().add(DRUID_SCHEMA_NAME, druidSchema);
>> >>>>>>>>>>>>   return calciteConnection;
>> >>>>>>>>>>>> }
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> This CalciteConnection is then used by the Druid HTTP server
>> to
>> >>>>>>> offer a
>> >>>>>>>>>>> SQL
>> >>>>>>>>>>>> API.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Is there some way to swap out the convertlet table that I'm
>> >>>>> missing?
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Also, just in general, am I going about this the right way?
>> Is
>> >>>>> using
>> >>>>>>>>> the
>> >>>>>>>>>>>> JDBC driver the right way to embed Calcite? Or should I be
>> >> calling
>> >>>>>>> into
>> >>>>>>>>>>> it
>> >>>>>>>>>>>> at some lower level?
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Thanks!
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Gian
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>
>> >>>>>
>> >>>>
>> >>
>> >>
>>
>>
>

Re: Embedding Calcite, adjusting convertlets

Posted by Julian Hyde <jh...@apache.org>.
Ah, thanks for letting me know. RelRoot is tricky; because it is not a
RelNode, you are working outside the usual mechanism for sorts and
projects, hence it is error-prone, as you have discovered.
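
For what it's worth, a minimal sketch of handling both pieces from the
caller's side (withRel and project are assumptions about the RelRoot API;
requiredTraits is the trait set built with the root collation, as in the
sketch near the top):

  final RelRoot root = planner.rel(validatedSqlNode);
  final RelNode planned = planner.transform(0, requiredTraits, root.rel);

  // Re-attach the planned rel to the root and apply the "fields" mapping,
  // which records which output columns, and in what order, form the result.
  final RelNode result = root.withRel(planned).project();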

On Fri, Jan 13, 2017 at 9:02 AM, Gian Merlino <gi...@imply.io> wrote:
> This was totally user error, I was ignoring "fields" and "collation" of the
> RelRoot. Passing "collation" to planner.transform stopped it from removing
> the sort.
>
> Gian