You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@ignite.apache.org by Vladimir Ozerov <vo...@gridgain.com> on 2017/08/01 12:25:15 UTC

Create own SQL parser

Igniters,

As you know, we rely on H2 for SQL query parsing. This has several
drawbacks:

1) Limited and ugly syntax
Ignite has lot's of unique concepts which are in no way supported by
traditional RDBMS in general, or by H2 in particular. For example:
- query hints ("distributedJoins", "replicatedOnly", "colocated")
- index hints ("inline size")
- cache configuration (memory policy, affinity key, cache mode, etc)
- transaction mode (concurrency, isolation, timeouts) - not needed now, but
will be required when transactional SQL is ready

2) Performance implications
Typical SQL processing flow looks as follows
- Parse String to H2 object form (prepared statement)
- Convert it to Ignite object form (AST)
- Then convert it back to map and reduce queries in String form
- Convert map and reduce queries from String back to H2 PreparedStatement
again for final execution

This is way too much. Moreover, H2 optimizes query during parsing, but it's
optimizer is not smart enough. E.g., Ignite "IN" clauses are not optimized
and hence doesn't use indexes, so we force users to use intermediate tables
with very ugly syntax, while we should do that on our own instead. Another
example is common expression elimination - H2 cannot do that even for
deterministic functions, what cause performance problems frequently.

I propose to start some work in direction of our own parser. We can start
with something very simple, e.g. DDL support, which is not that complex,
but will improve usability significantly. And then gradually extend it to
more complex statements where both rich BNF and optimizer is necessary.

Thoughts?

Vladimir.

Re: Create own SQL parser

Posted by Alexander Paschenko <al...@gmail.com>.

Dmitry,

Ignite mode in H2 is about parsing only, it does not address other
issues pointed out by Vlad. Alas, implementing _parser_ surely won't
take no years, whilst things like smart query optimization, etc,
probably don't have and can't have proper "finished" state - it's
something that you can improve again and again as there's never too
much performance.

My opinion: we should use Ignite mode in H2 to extend classic commands
with keywords of our own fashion while laying foundation for our own
SQL engine by starting work on parser core to provide for processing
of Ignite specific commands. I'll try to come up with some prototype
in near future.

- Alex

2017-08-03 4:43 GMT+03:00 Dmitriy Setrakyan <ds...@apache.org>:
> Vladimir, this sounds like a task that would take years to design,
> implement, and polish. Can we just aim to improve the Ignite mode in H2,
> which is much more feasible  in my view?
>
> On Wed, Aug 2, 2017 at 2:59 PM, Vladimir Ozerov <vo...@gridgain.com>
> wrote:
>
>> Alex P.,
>>
>> The very problem with non-SELECT and non-DML commands is that we do not
>> support most of what is supported by H2, and vice versa - H2 doesn't
>> support most things that we need, like cache properties, templates, inline
>> indexes, etc.. Another important thing is that at some point we will add a
>> kind of SQL-based command-line or scripting utility(es) for Ignite [1].
>> Mature products has rich set of commands, which are outside of SQL
>> standard. E.g., we would like to manage transaction settings (concurrency,
>> isolation) on per-session basis, grant and rewoke Ignite-specific roles,
>> gather some metrics from the cluster, etc.. It doesn't make sense to
>> develop it in H2.
>>
>> Actual H2 parsing logic takes about a dozen KLOCs. But parsing core is much
>> smaller and most of overall parser's code relates to SELECT and DML
>> statements, which are mostly not needed for DDL and administrative
>> commands. That said, I think it is perfectly fine to move Ignite-specific
>> commands to Ignite's own parser.
>>
>> Alex K.,
>>
>> Having the whole own SQL engine is very cool thing, as it gives us
>> unlimited capabilities in terms of performance and UX. But this is a very
>> huge thing. H2's core which is used in Ignite is about as big as all
>> existing Ignite's SQL logic in terms of lines of codes. So I would put this
>> question out of scope for now. We should focus on new features, usability
>> and documentation for now, and try getting as much as possible from the
>> given architecture.
>>
>> [1] https://issues.apache.org/jira/browse/IGNITE-5608
>>
>>
>> On Wed, Aug 2, 2017 at 3:38 PM, Alexey Kuznetsov <ak...@apache.org>
>> wrote:
>>
>> > From my opinion we could start investing in our own parser and SQL engine
>> > step by step, little by little
>> >  and one day drop H2 at all.
>> >
>> >  Having own parser and engine will give us a freedom to any optimizations
>> > and any syntax we like.
>> >
>> > Also that will be for one dependency less and we could have SQL out of
>> the
>> > box with no third-party dependencies.
>> >
>> >
>> > On Wed, Aug 2, 2017 at 7:25 PM, Alexander Paschenko <
>> > alexander.a.paschenko@gmail.com> wrote:
>> >
>> > > I'd like to point out that we already do have Ignite mode in H2 parser
>> > > (thanks Sergi) and thus have AFFINITY keyword support. Is is suggested
>> > > that we should abandon H2 way at all? Or should we suggest adding to
>> > > H2 only rather minor stuff (like some keywords for existing commands)
>> > > whilst introducing completely new commands for our own parser?
>> > >
>> > > - Alex
>> > >
>> > > 2017-08-02 9:01 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:
>> > > > No, it will work as follows:
>> > > >
>> > > > Model parse(String sql) {
>> > > >     Model res = tryParseWithIgnite(sql); // Parse what we can
>> > > >
>> > > >     if (res == null)
>> > > >         res = parseWithH2(sql);
>> > > >
>> > > >     return res;
>> > > > }
>> > > >
>> > > > We will need a number of custom commands which are not present in H2.
>> > > >
>> > > > On Wed, Aug 2, 2017 at 3:44 AM, Dmitriy Setrakyan <
>> > dsetrakyan@apache.org
>> > > >
>> > > > wrote:
>> > > >
>> > > >> On Tue, Aug 1, 2017 at 11:08 PM, Vladimir Ozerov <
>> > vozerov@gridgain.com>
>> > > >> wrote:
>> > > >>
>> > > >> > Own parser capable of processing non-SELECT and non-DML
>> statements.
>> > > >> >
>> > > >>
>> > > >> And how will it integrate with H2 parser? Or are you suggesting that
>> > we
>> > > get
>> > > >> rid of H2 SQL parser?
>> > > >>
>> > > >>
>> > > >> >
>> > > >> > On Tue, Aug 1, 2017 at 9:44 PM, <ds...@apache.org> wrote:
>> > > >> >
>> > > >> > > Vova, I am not sure what you are proposing... extending H2
>> parser
>> > > with
>> > > >> > new
>> > > >> > > syntax or a brand new parser?
>> > > >> > >
>> > > >> > > ⁣D.
>> > > >> > >
>> > > >> > > On Aug 1, 2017, 4:26 PM, at 4:26 PM, Vladimir Ozerov <
>> > > >> > vozerov@gridgain.com>
>> > > >> > > wrote:
>> > > >> > > >Andrey,
>> > > >> > > >
>> > > >> > > >Note that I am not proposing to remove H2 as a whole. Main
>> point
>> > > for
>> > > >> > > >now is
>> > > >> > > >to support missing pieces of DDL syntax and possibly and some
>> > > >> > > >extensions.
>> > > >> > > >Several examples:
>> > > >> > > >
>> > > >> > > >1) Currently CREATE TABLE command looks ugly:
>> > > >> > > >CREATE TABLE Person (id LONG, name VARCHAR) WITH
>> > > >> "template=PARTITIONED,
>> > > >> > > >backups=1, ..."
>> > > >> > > >
>> > > >> > > >Commas typically treated in a special way in editors and IDEs,
>> so
>> > > user
>> > > >> > > >will
>> > > >> > > >have to escape them, making usability even worse.
>> > > >> > > >
>> > > >> > > >2) What if I need to introduce new template? Currently you have
>> > to
>> > > >> > > >restart
>> > > >> > > >the node with new config. With our own parser you will do:
>> > > >> > > >CREATE TEMPLATE my_template MODE=PARTITIONED, BACKUPS=1;
>> > > >> > > >CREATE TABLE Person (...) TEMPLATE my_template;
>> > > >> > > >
>> > > >> > > >No restarts, everything is done dynamically.
>> > > >> > > >
>> > > >> > > >
>> > > >> > > >
>> > > >> > > >On Tue, Aug 1, 2017 at 4:18 PM, Andrey Mashenkov
>> > > >> > > ><andrey.mashenkov@gmail.com
>> > > >> > > >> wrote:
>> > > >> > > >
>> > > >> > > >> Vovan,
>> > > >> > > >>
>> > > >> > > >> 1. What about ANSI-xx compliant? Will new syntax brake it in
>> > some
>> > > >> > > >cases or
>> > > >> > > >> just extend?
>> > > >> > > >>
>> > > >> > > >> 2. This would be great to have more ways for optimization.
>> > > >> > > >>
>> > > >> > > >> Does anyone know or may be have experience with some
>> frameworks
>> > > or
>> > > >> > > >open
>> > > >> > > >> projects which can be helpful? E.g. Apache Calcite?
>> > > >> > > >>
>> > > >> > > >> On Tue, Aug 1, 2017 at 3:25 PM, Vladimir Ozerov
>> > > >> > > ><vo...@gridgain.com>
>> > > >> > > >> wrote:
>> > > >> > > >>
>> > > >> > > >> > Igniters,
>> > > >> > > >> >
>> > > >> > > >> > As you know, we rely on H2 for SQL query parsing. This has
>> > > several
>> > > >> > > >> > drawbacks:
>> > > >> > > >> >
>> > > >> > > >> > 1) Limited and ugly syntax
>> > > >> > > >> > Ignite has lot's of unique concepts which are in no way
>> > > supported
>> > > >> > > >by
>> > > >> > > >> > traditional RDBMS in general, or by H2 in particular. For
>> > > example:
>> > > >> > > >> > - query hints ("distributedJoins", "replicatedOnly",
>> > > "colocated")
>> > > >> > > >> > - index hints ("inline size")
>> > > >> > > >> > - cache configuration (memory policy, affinity key, cache
>> > mode,
>> > > >> > > >etc)
>> > > >> > > >> > - transaction mode (concurrency, isolation, timeouts) - not
>> > > needed
>> > > >> > > >now,
>> > > >> > > >> but
>> > > >> > > >> > will be required when transactional SQL is ready
>> > > >> > > >> >
>> > > >> > > >> > 2) Performance implications
>> > > >> > > >> > Typical SQL processing flow looks as follows
>> > > >> > > >> > - Parse String to H2 object form (prepared statement)
>> > > >> > > >> > - Convert it to Ignite object form (AST)
>> > > >> > > >> > - Then convert it back to map and reduce queries in String
>> > form
>> > > >> > > >> > - Convert map and reduce queries from String back to H2
>> > > >> > > >PreparedStatement
>> > > >> > > >> > again for final execution
>> > > >> > > >> >
>> > > >> > > >> > This is way too much. Moreover, H2 optimizes query during
>> > > parsing,
>> > > >> > > >but
>> > > >> > > >> it's
>> > > >> > > >> > optimizer is not smart enough. E.g., Ignite "IN" clauses
>> are
>> > > not
>> > > >> > > >> optimized
>> > > >> > > >> > and hence doesn't use indexes, so we force users to use
>> > > >> > > >intermediate
>> > > >> > > >> tables
>> > > >> > > >> > with very ugly syntax, while we should do that on our own
>> > > instead.
>> > > >> > > >> Another
>> > > >> > > >> > example is common expression elimination - H2 cannot do
>> that
>> > > even
>> > > >> > > >for
>> > > >> > > >> > deterministic functions, what cause performance problems
>> > > >> > > >frequently.
>> > > >> > > >> >
>> > > >> > > >> > I propose to start some work in direction of our own
>> parser.
>> > We
>> > > >> can
>> > > >> > > >start
>> > > >> > > >> > with something very simple, e.g. DDL support, which is not
>> > that
>> > > >> > > >complex,
>> > > >> > > >> > but will improve usability significantly. And then
>> gradually
>> > > >> extend
>> > > >> > > >it to
>> > > >> > > >> > more complex statements where both rich BNF and optimizer
>> is
>> > > >> > > >necessary.
>> > > >> > > >> >
>> > > >> > > >> > Thoughts?
>> > > >> > > >> >
>> > > >> > > >> > Vladimir.
>> > > >> > > >> >
>> > > >> > > >>
>> > > >> > > >>
>> > > >> > > >>
>> > > >> > > >> --
>> > > >> > > >> Best regards,
>> > > >> > > >> Andrey V. Mashenkov
>> > > >> > > >>
>> > > >> > >
>> > > >> >
>> > > >>
>> > >
>> >
>> >
>> >
>> > --
>> > Alexey Kuznetsov
>> >
>>

Re: Create own SQL parser

Posted by Dmitriy Setrakyan <ds...@apache.org>.

Vladimir, this sounds like a task that would take years to design,
implement, and polish. Can we just aim to improve the Ignite mode in H2,
which is much more feasible  in my view?

On Wed, Aug 2, 2017 at 2:59 PM, Vladimir Ozerov <vo...@gridgain.com>
wrote:

> Alex P.,
>
> The very problem with non-SELECT and non-DML commands is that we do not
> support most of what is supported by H2, and vice versa - H2 doesn't
> support most things that we need, like cache properties, templates, inline
> indexes, etc.. Another important thing is that at some point we will add a
> kind of SQL-based command-line or scripting utility(es) for Ignite [1].
> Mature products has rich set of commands, which are outside of SQL
> standard. E.g., we would like to manage transaction settings (concurrency,
> isolation) on per-session basis, grant and rewoke Ignite-specific roles,
> gather some metrics from the cluster, etc.. It doesn't make sense to
> develop it in H2.
>
> Actual H2 parsing logic takes about a dozen KLOCs. But parsing core is much
> smaller and most of overall parser's code relates to SELECT and DML
> statements, which are mostly not needed for DDL and administrative
> commands. That said, I think it is perfectly fine to move Ignite-specific
> commands to Ignite's own parser.
>
> Alex K.,
>
> Having the whole own SQL engine is very cool thing, as it gives us
> unlimited capabilities in terms of performance and UX. But this is a very
> huge thing. H2's core which is used in Ignite is about as big as all
> existing Ignite's SQL logic in terms of lines of codes. So I would put this
> question out of scope for now. We should focus on new features, usability
> and documentation for now, and try getting as much as possible from the
> given architecture.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-5608
>
>
> On Wed, Aug 2, 2017 at 3:38 PM, Alexey Kuznetsov <ak...@apache.org>
> wrote:
>
> > From my opinion we could start investing in our own parser and SQL engine
> > step by step, little by little
> >  and one day drop H2 at all.
> >
> >  Having own parser and engine will give us a freedom to any optimizations
> > and any syntax we like.
> >
> > Also that will be for one dependency less and we could have SQL out of
> the
> > box with no third-party dependencies.
> >
> >
> > On Wed, Aug 2, 2017 at 7:25 PM, Alexander Paschenko <
> > alexander.a.paschenko@gmail.com> wrote:
> >
> > > I'd like to point out that we already do have Ignite mode in H2 parser
> > > (thanks Sergi) and thus have AFFINITY keyword support. Is is suggested
> > > that we should abandon H2 way at all? Or should we suggest adding to
> > > H2 only rather minor stuff (like some keywords for existing commands)
> > > whilst introducing completely new commands for our own parser?
> > >
> > > - Alex
> > >
> > > 2017-08-02 9:01 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:
> > > > No, it will work as follows:
> > > >
> > > > Model parse(String sql) {
> > > >     Model res = tryParseWithIgnite(sql); // Parse what we can
> > > >
> > > >     if (res == null)
> > > >         res = parseWithH2(sql);
> > > >
> > > >     return res;
> > > > }
> > > >
> > > > We will need a number of custom commands which are not present in H2.
> > > >
> > > > On Wed, Aug 2, 2017 at 3:44 AM, Dmitriy Setrakyan <
> > dsetrakyan@apache.org
> > > >
> > > > wrote:
> > > >
> > > >> On Tue, Aug 1, 2017 at 11:08 PM, Vladimir Ozerov <
> > vozerov@gridgain.com>
> > > >> wrote:
> > > >>
> > > >> > Own parser capable of processing non-SELECT and non-DML
> statements.
> > > >> >
> > > >>
> > > >> And how will it integrate with H2 parser? Or are you suggesting that
> > we
> > > get
> > > >> rid of H2 SQL parser?
> > > >>
> > > >>
> > > >> >
> > > >> > On Tue, Aug 1, 2017 at 9:44 PM, <ds...@apache.org> wrote:
> > > >> >
> > > >> > > Vova, I am not sure what you are proposing... extending H2
> parser
> > > with
> > > >> > new
> > > >> > > syntax or a brand new parser?
> > > >> > >
> > > >> > > ⁣D.
> > > >> > >
> > > >> > > On Aug 1, 2017, 4:26 PM, at 4:26 PM, Vladimir Ozerov <
> > > >> > vozerov@gridgain.com>
> > > >> > > wrote:
> > > >> > > >Andrey,
> > > >> > > >
> > > >> > > >Note that I am not proposing to remove H2 as a whole. Main
> point
> > > for
> > > >> > > >now is
> > > >> > > >to support missing pieces of DDL syntax and possibly and some
> > > >> > > >extensions.
> > > >> > > >Several examples:
> > > >> > > >
> > > >> > > >1) Currently CREATE TABLE command looks ugly:
> > > >> > > >CREATE TABLE Person (id LONG, name VARCHAR) WITH
> > > >> "template=PARTITIONED,
> > > >> > > >backups=1, ..."
> > > >> > > >
> > > >> > > >Commas typically treated in a special way in editors and IDEs,
> so
> > > user
> > > >> > > >will
> > > >> > > >have to escape them, making usability even worse.
> > > >> > > >
> > > >> > > >2) What if I need to introduce new template? Currently you have
> > to
> > > >> > > >restart
> > > >> > > >the node with new config. With our own parser you will do:
> > > >> > > >CREATE TEMPLATE my_template MODE=PARTITIONED, BACKUPS=1;
> > > >> > > >CREATE TABLE Person (...) TEMPLATE my_template;
> > > >> > > >
> > > >> > > >No restarts, everything is done dynamically.
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > >On Tue, Aug 1, 2017 at 4:18 PM, Andrey Mashenkov
> > > >> > > ><andrey.mashenkov@gmail.com
> > > >> > > >> wrote:
> > > >> > > >
> > > >> > > >> Vovan,
> > > >> > > >>
> > > >> > > >> 1. What about ANSI-xx compliant? Will new syntax brake it in
> > some
> > > >> > > >cases or
> > > >> > > >> just extend?
> > > >> > > >>
> > > >> > > >> 2. This would be great to have more ways for optimization.
> > > >> > > >>
> > > >> > > >> Does anyone know or may be have experience with some
> frameworks
> > > or
> > > >> > > >open
> > > >> > > >> projects which can be helpful? E.g. Apache Calcite?
> > > >> > > >>
> > > >> > > >> On Tue, Aug 1, 2017 at 3:25 PM, Vladimir Ozerov
> > > >> > > ><vo...@gridgain.com>
> > > >> > > >> wrote:
> > > >> > > >>
> > > >> > > >> > Igniters,
> > > >> > > >> >
> > > >> > > >> > As you know, we rely on H2 for SQL query parsing. This has
> > > several
> > > >> > > >> > drawbacks:
> > > >> > > >> >
> > > >> > > >> > 1) Limited and ugly syntax
> > > >> > > >> > Ignite has lot's of unique concepts which are in no way
> > > supported
> > > >> > > >by
> > > >> > > >> > traditional RDBMS in general, or by H2 in particular. For
> > > example:
> > > >> > > >> > - query hints ("distributedJoins", "replicatedOnly",
> > > "colocated")
> > > >> > > >> > - index hints ("inline size")
> > > >> > > >> > - cache configuration (memory policy, affinity key, cache
> > mode,
> > > >> > > >etc)
> > > >> > > >> > - transaction mode (concurrency, isolation, timeouts) - not
> > > needed
> > > >> > > >now,
> > > >> > > >> but
> > > >> > > >> > will be required when transactional SQL is ready
> > > >> > > >> >
> > > >> > > >> > 2) Performance implications
> > > >> > > >> > Typical SQL processing flow looks as follows
> > > >> > > >> > - Parse String to H2 object form (prepared statement)
> > > >> > > >> > - Convert it to Ignite object form (AST)
> > > >> > > >> > - Then convert it back to map and reduce queries in String
> > form
> > > >> > > >> > - Convert map and reduce queries from String back to H2
> > > >> > > >PreparedStatement
> > > >> > > >> > again for final execution
> > > >> > > >> >
> > > >> > > >> > This is way too much. Moreover, H2 optimizes query during
> > > parsing,
> > > >> > > >but
> > > >> > > >> it's
> > > >> > > >> > optimizer is not smart enough. E.g., Ignite "IN" clauses
> are
> > > not
> > > >> > > >> optimized
> > > >> > > >> > and hence doesn't use indexes, so we force users to use
> > > >> > > >intermediate
> > > >> > > >> tables
> > > >> > > >> > with very ugly syntax, while we should do that on our own
> > > instead.
> > > >> > > >> Another
> > > >> > > >> > example is common expression elimination - H2 cannot do
> that
> > > even
> > > >> > > >for
> > > >> > > >> > deterministic functions, what cause performance problems
> > > >> > > >frequently.
> > > >> > > >> >
> > > >> > > >> > I propose to start some work in direction of our own
> parser.
> > We
> > > >> can
> > > >> > > >start
> > > >> > > >> > with something very simple, e.g. DDL support, which is not
> > that
> > > >> > > >complex,
> > > >> > > >> > but will improve usability significantly. And then
> gradually
> > > >> extend
> > > >> > > >it to
> > > >> > > >> > more complex statements where both rich BNF and optimizer
> is
> > > >> > > >necessary.
> > > >> > > >> >
> > > >> > > >> > Thoughts?
> > > >> > > >> >
> > > >> > > >> > Vladimir.
> > > >> > > >> >
> > > >> > > >>
> > > >> > > >>
> > > >> > > >>
> > > >> > > >> --
> > > >> > > >> Best regards,
> > > >> > > >> Andrey V. Mashenkov
> > > >> > > >>
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> >
> >
> > --
> > Alexey Kuznetsov
> >
>

Re: Create own SQL parser

Posted by Vladimir Ozerov <vo...@gridgain.com>.

Alex P.,

The very problem with non-SELECT and non-DML commands is that we do not
support most of what is supported by H2, and vice versa - H2 doesn't
support most things that we need, like cache properties, templates, inline
indexes, etc.. Another important thing is that at some point we will add a
kind of SQL-based command-line or scripting utility(es) for Ignite [1].
Mature products has rich set of commands, which are outside of SQL
standard. E.g., we would like to manage transaction settings (concurrency,
isolation) on per-session basis, grant and rewoke Ignite-specific roles,
gather some metrics from the cluster, etc.. It doesn't make sense to
develop it in H2.

Actual H2 parsing logic takes about a dozen KLOCs. But parsing core is much
smaller and most of overall parser's code relates to SELECT and DML
statements, which are mostly not needed for DDL and administrative
commands. That said, I think it is perfectly fine to move Ignite-specific
commands to Ignite's own parser.

Alex K.,

Having the whole own SQL engine is very cool thing, as it gives us
unlimited capabilities in terms of performance and UX. But this is a very
huge thing. H2's core which is used in Ignite is about as big as all
existing Ignite's SQL logic in terms of lines of codes. So I would put this
question out of scope for now. We should focus on new features, usability
and documentation for now, and try getting as much as possible from the
given architecture.

[1] https://issues.apache.org/jira/browse/IGNITE-5608


On Wed, Aug 2, 2017 at 3:38 PM, Alexey Kuznetsov <ak...@apache.org>
wrote:

> From my opinion we could start investing in our own parser and SQL engine
> step by step, little by little
>  and one day drop H2 at all.
>
>  Having own parser and engine will give us a freedom to any optimizations
> and any syntax we like.
>
> Also that will be for one dependency less and we could have SQL out of the
> box with no third-party dependencies.
>
>
> On Wed, Aug 2, 2017 at 7:25 PM, Alexander Paschenko <
> alexander.a.paschenko@gmail.com> wrote:
>
> > I'd like to point out that we already do have Ignite mode in H2 parser
> > (thanks Sergi) and thus have AFFINITY keyword support. Is is suggested
> > that we should abandon H2 way at all? Or should we suggest adding to
> > H2 only rather minor stuff (like some keywords for existing commands)
> > whilst introducing completely new commands for our own parser?
> >
> > - Alex
> >
> > 2017-08-02 9:01 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:
> > > No, it will work as follows:
> > >
> > > Model parse(String sql) {
> > >     Model res = tryParseWithIgnite(sql); // Parse what we can
> > >
> > >     if (res == null)
> > >         res = parseWithH2(sql);
> > >
> > >     return res;
> > > }
> > >
> > > We will need a number of custom commands which are not present in H2.
> > >
> > > On Wed, Aug 2, 2017 at 3:44 AM, Dmitriy Setrakyan <
> dsetrakyan@apache.org
> > >
> > > wrote:
> > >
> > >> On Tue, Aug 1, 2017 at 11:08 PM, Vladimir Ozerov <
> vozerov@gridgain.com>
> > >> wrote:
> > >>
> > >> > Own parser capable of processing non-SELECT and non-DML statements.
> > >> >
> > >>
> > >> And how will it integrate with H2 parser? Or are you suggesting that
> we
> > get
> > >> rid of H2 SQL parser?
> > >>
> > >>
> > >> >
> > >> > On Tue, Aug 1, 2017 at 9:44 PM, <ds...@apache.org> wrote:
> > >> >
> > >> > > Vova, I am not sure what you are proposing... extending H2 parser
> > with
> > >> > new
> > >> > > syntax or a brand new parser?
> > >> > >
> > >> > > ⁣D.
> > >> > >
> > >> > > On Aug 1, 2017, 4:26 PM, at 4:26 PM, Vladimir Ozerov <
> > >> > vozerov@gridgain.com>
> > >> > > wrote:
> > >> > > >Andrey,
> > >> > > >
> > >> > > >Note that I am not proposing to remove H2 as a whole. Main point
> > for
> > >> > > >now is
> > >> > > >to support missing pieces of DDL syntax and possibly and some
> > >> > > >extensions.
> > >> > > >Several examples:
> > >> > > >
> > >> > > >1) Currently CREATE TABLE command looks ugly:
> > >> > > >CREATE TABLE Person (id LONG, name VARCHAR) WITH
> > >> "template=PARTITIONED,
> > >> > > >backups=1, ..."
> > >> > > >
> > >> > > >Commas typically treated in a special way in editors and IDEs, so
> > user
> > >> > > >will
> > >> > > >have to escape them, making usability even worse.
> > >> > > >
> > >> > > >2) What if I need to introduce new template? Currently you have
> to
> > >> > > >restart
> > >> > > >the node with new config. With our own parser you will do:
> > >> > > >CREATE TEMPLATE my_template MODE=PARTITIONED, BACKUPS=1;
> > >> > > >CREATE TABLE Person (...) TEMPLATE my_template;
> > >> > > >
> > >> > > >No restarts, everything is done dynamically.
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > >On Tue, Aug 1, 2017 at 4:18 PM, Andrey Mashenkov
> > >> > > ><andrey.mashenkov@gmail.com
> > >> > > >> wrote:
> > >> > > >
> > >> > > >> Vovan,
> > >> > > >>
> > >> > > >> 1. What about ANSI-xx compliant? Will new syntax brake it in
> some
> > >> > > >cases or
> > >> > > >> just extend?
> > >> > > >>
> > >> > > >> 2. This would be great to have more ways for optimization.
> > >> > > >>
> > >> > > >> Does anyone know or may be have experience with some frameworks
> > or
> > >> > > >open
> > >> > > >> projects which can be helpful? E.g. Apache Calcite?
> > >> > > >>
> > >> > > >> On Tue, Aug 1, 2017 at 3:25 PM, Vladimir Ozerov
> > >> > > ><vo...@gridgain.com>
> > >> > > >> wrote:
> > >> > > >>
> > >> > > >> > Igniters,
> > >> > > >> >
> > >> > > >> > As you know, we rely on H2 for SQL query parsing. This has
> > several
> > >> > > >> > drawbacks:
> > >> > > >> >
> > >> > > >> > 1) Limited and ugly syntax
> > >> > > >> > Ignite has lot's of unique concepts which are in no way
> > supported
> > >> > > >by
> > >> > > >> > traditional RDBMS in general, or by H2 in particular. For
> > example:
> > >> > > >> > - query hints ("distributedJoins", "replicatedOnly",
> > "colocated")
> > >> > > >> > - index hints ("inline size")
> > >> > > >> > - cache configuration (memory policy, affinity key, cache
> mode,
> > >> > > >etc)
> > >> > > >> > - transaction mode (concurrency, isolation, timeouts) - not
> > needed
> > >> > > >now,
> > >> > > >> but
> > >> > > >> > will be required when transactional SQL is ready
> > >> > > >> >
> > >> > > >> > 2) Performance implications
> > >> > > >> > Typical SQL processing flow looks as follows
> > >> > > >> > - Parse String to H2 object form (prepared statement)
> > >> > > >> > - Convert it to Ignite object form (AST)
> > >> > > >> > - Then convert it back to map and reduce queries in String
> form
> > >> > > >> > - Convert map and reduce queries from String back to H2
> > >> > > >PreparedStatement
> > >> > > >> > again for final execution
> > >> > > >> >
> > >> > > >> > This is way too much. Moreover, H2 optimizes query during
> > parsing,
> > >> > > >but
> > >> > > >> it's
> > >> > > >> > optimizer is not smart enough. E.g., Ignite "IN" clauses are
> > not
> > >> > > >> optimized
> > >> > > >> > and hence doesn't use indexes, so we force users to use
> > >> > > >intermediate
> > >> > > >> tables
> > >> > > >> > with very ugly syntax, while we should do that on our own
> > instead.
> > >> > > >> Another
> > >> > > >> > example is common expression elimination - H2 cannot do that
> > even
> > >> > > >for
> > >> > > >> > deterministic functions, what cause performance problems
> > >> > > >frequently.
> > >> > > >> >
> > >> > > >> > I propose to start some work in direction of our own parser.
> We
> > >> can
> > >> > > >start
> > >> > > >> > with something very simple, e.g. DDL support, which is not
> that
> > >> > > >complex,
> > >> > > >> > but will improve usability significantly. And then gradually
> > >> extend
> > >> > > >it to
> > >> > > >> > more complex statements where both rich BNF and optimizer is
> > >> > > >necessary.
> > >> > > >> >
> > >> > > >> > Thoughts?
> > >> > > >> >
> > >> > > >> > Vladimir.
> > >> > > >> >
> > >> > > >>
> > >> > > >>
> > >> > > >>
> > >> > > >> --
> > >> > > >> Best regards,
> > >> > > >> Andrey V. Mashenkov
> > >> > > >>
> > >> > >
> > >> >
> > >>
> >
>
>
>
> --
> Alexey Kuznetsov
>

Re: Create own SQL parser

Posted by Alexey Kuznetsov <ak...@apache.org>.

From my opinion we could start investing in our own parser and SQL engine
step by step, little by little
 and one day drop H2 at all.

 Having own parser and engine will give us a freedom to any optimizations
and any syntax we like.

Also that will be for one dependency less and we could have SQL out of the
box with no third-party dependencies.


On Wed, Aug 2, 2017 at 7:25 PM, Alexander Paschenko <
alexander.a.paschenko@gmail.com> wrote:

> I'd like to point out that we already do have Ignite mode in H2 parser
> (thanks Sergi) and thus have AFFINITY keyword support. Is is suggested
> that we should abandon H2 way at all? Or should we suggest adding to
> H2 only rather minor stuff (like some keywords for existing commands)
> whilst introducing completely new commands for our own parser?
>
> - Alex
>
> 2017-08-02 9:01 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:
> > No, it will work as follows:
> >
> > Model parse(String sql) {
> >     Model res = tryParseWithIgnite(sql); // Parse what we can
> >
> >     if (res == null)
> >         res = parseWithH2(sql);
> >
> >     return res;
> > }
> >
> > We will need a number of custom commands which are not present in H2.
> >
> > On Wed, Aug 2, 2017 at 3:44 AM, Dmitriy Setrakyan <dsetrakyan@apache.org
> >
> > wrote:
> >
> >> On Tue, Aug 1, 2017 at 11:08 PM, Vladimir Ozerov <vo...@gridgain.com>
> >> wrote:
> >>
> >> > Own parser capable of processing non-SELECT and non-DML statements.
> >> >
> >>
> >> And how will it integrate with H2 parser? Or are you suggesting that we
> get
> >> rid of H2 SQL parser?
> >>
> >>
> >> >
> >> > On Tue, Aug 1, 2017 at 9:44 PM, <ds...@apache.org> wrote:
> >> >
> >> > > Vova, I am not sure what you are proposing... extending H2 parser
> with
> >> > new
> >> > > syntax or a brand new parser?
> >> > >
> >> > > ⁣D.
> >> > >
> >> > > On Aug 1, 2017, 4:26 PM, at 4:26 PM, Vladimir Ozerov <
> >> > vozerov@gridgain.com>
> >> > > wrote:
> >> > > >Andrey,
> >> > > >
> >> > > >Note that I am not proposing to remove H2 as a whole. Main point
> for
> >> > > >now is
> >> > > >to support missing pieces of DDL syntax and possibly and some
> >> > > >extensions.
> >> > > >Several examples:
> >> > > >
> >> > > >1) Currently CREATE TABLE command looks ugly:
> >> > > >CREATE TABLE Person (id LONG, name VARCHAR) WITH
> >> "template=PARTITIONED,
> >> > > >backups=1, ..."
> >> > > >
> >> > > >Commas typically treated in a special way in editors and IDEs, so
> user
> >> > > >will
> >> > > >have to escape them, making usability even worse.
> >> > > >
> >> > > >2) What if I need to introduce new template? Currently you have to
> >> > > >restart
> >> > > >the node with new config. With our own parser you will do:
> >> > > >CREATE TEMPLATE my_template MODE=PARTITIONED, BACKUPS=1;
> >> > > >CREATE TABLE Person (...) TEMPLATE my_template;
> >> > > >
> >> > > >No restarts, everything is done dynamically.
> >> > > >
> >> > > >
> >> > > >
> >> > > >On Tue, Aug 1, 2017 at 4:18 PM, Andrey Mashenkov
> >> > > ><andrey.mashenkov@gmail.com
> >> > > >> wrote:
> >> > > >
> >> > > >> Vovan,
> >> > > >>
> >> > > >> 1. What about ANSI-xx compliant? Will new syntax brake it in some
> >> > > >cases or
> >> > > >> just extend?
> >> > > >>
> >> > > >> 2. This would be great to have more ways for optimization.
> >> > > >>
> >> > > >> Does anyone know or may be have experience with some frameworks
> or
> >> > > >open
> >> > > >> projects which can be helpful? E.g. Apache Calcite?
> >> > > >>
> >> > > >> On Tue, Aug 1, 2017 at 3:25 PM, Vladimir Ozerov
> >> > > ><vo...@gridgain.com>
> >> > > >> wrote:
> >> > > >>
> >> > > >> > Igniters,
> >> > > >> >
> >> > > >> > As you know, we rely on H2 for SQL query parsing. This has
> several
> >> > > >> > drawbacks:
> >> > > >> >
> >> > > >> > 1) Limited and ugly syntax
> >> > > >> > Ignite has lot's of unique concepts which are in no way
> supported
> >> > > >by
> >> > > >> > traditional RDBMS in general, or by H2 in particular. For
> example:
> >> > > >> > - query hints ("distributedJoins", "replicatedOnly",
> "colocated")
> >> > > >> > - index hints ("inline size")
> >> > > >> > - cache configuration (memory policy, affinity key, cache mode,
> >> > > >etc)
> >> > > >> > - transaction mode (concurrency, isolation, timeouts) - not
> needed
> >> > > >now,
> >> > > >> but
> >> > > >> > will be required when transactional SQL is ready
> >> > > >> >
> >> > > >> > 2) Performance implications
> >> > > >> > Typical SQL processing flow looks as follows
> >> > > >> > - Parse String to H2 object form (prepared statement)
> >> > > >> > - Convert it to Ignite object form (AST)
> >> > > >> > - Then convert it back to map and reduce queries in String form
> >> > > >> > - Convert map and reduce queries from String back to H2
> >> > > >PreparedStatement
> >> > > >> > again for final execution
> >> > > >> >
> >> > > >> > This is way too much. Moreover, H2 optimizes query during
> parsing,
> >> > > >but
> >> > > >> it's
> >> > > >> > optimizer is not smart enough. E.g., Ignite "IN" clauses are
> not
> >> > > >> optimized
> >> > > >> > and hence doesn't use indexes, so we force users to use
> >> > > >intermediate
> >> > > >> tables
> >> > > >> > with very ugly syntax, while we should do that on our own
> instead.
> >> > > >> Another
> >> > > >> > example is common expression elimination - H2 cannot do that
> even
> >> > > >for
> >> > > >> > deterministic functions, what cause performance problems
> >> > > >frequently.
> >> > > >> >
> >> > > >> > I propose to start some work in direction of our own parser. We
> >> can
> >> > > >start
> >> > > >> > with something very simple, e.g. DDL support, which is not that
> >> > > >complex,
> >> > > >> > but will improve usability significantly. And then gradually
> >> extend
> >> > > >it to
> >> > > >> > more complex statements where both rich BNF and optimizer is
> >> > > >necessary.
> >> > > >> >
> >> > > >> > Thoughts?
> >> > > >> >
> >> > > >> > Vladimir.
> >> > > >> >
> >> > > >>
> >> > > >>
> >> > > >>
> >> > > >> --
> >> > > >> Best regards,
> >> > > >> Andrey V. Mashenkov
> >> > > >>
> >> > >
> >> >
> >>
>



-- 
Alexey Kuznetsov

Re: Create own SQL parser

Posted by Alexander Paschenko <al...@gmail.com>.

I'd like to point out that we already do have Ignite mode in H2 parser
(thanks Sergi) and thus have AFFINITY keyword support. Is is suggested
that we should abandon H2 way at all? Or should we suggest adding to
H2 only rather minor stuff (like some keywords for existing commands)
whilst introducing completely new commands for our own parser?

- Alex

2017-08-02 9:01 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:
> No, it will work as follows:
>
> Model parse(String sql) {
>     Model res = tryParseWithIgnite(sql); // Parse what we can
>
>     if (res == null)
>         res = parseWithH2(sql);
>
>     return res;
> }
>
> We will need a number of custom commands which are not present in H2.
>
> On Wed, Aug 2, 2017 at 3:44 AM, Dmitriy Setrakyan <ds...@apache.org>
> wrote:
>
>> On Tue, Aug 1, 2017 at 11:08 PM, Vladimir Ozerov <vo...@gridgain.com>
>> wrote:
>>
>> > Own parser capable of processing non-SELECT and non-DML statements.
>> >
>>
>> And how will it integrate with H2 parser? Or are you suggesting that we get
>> rid of H2 SQL parser?
>>
>>
>> >
>> > On Tue, Aug 1, 2017 at 9:44 PM, <ds...@apache.org> wrote:
>> >
>> > > Vova, I am not sure what you are proposing... extending H2 parser with
>> > new
>> > > syntax or a brand new parser?
>> > >
>> > > ⁣D.
>> > >
>> > > On Aug 1, 2017, 4:26 PM, at 4:26 PM, Vladimir Ozerov <
>> > vozerov@gridgain.com>
>> > > wrote:
>> > > >Andrey,
>> > > >
>> > > >Note that I am not proposing to remove H2 as a whole. Main point for
>> > > >now is
>> > > >to support missing pieces of DDL syntax and possibly and some
>> > > >extensions.
>> > > >Several examples:
>> > > >
>> > > >1) Currently CREATE TABLE command looks ugly:
>> > > >CREATE TABLE Person (id LONG, name VARCHAR) WITH
>> "template=PARTITIONED,
>> > > >backups=1, ..."
>> > > >
>> > > >Commas typically treated in a special way in editors and IDEs, so user
>> > > >will
>> > > >have to escape them, making usability even worse.
>> > > >
>> > > >2) What if I need to introduce new template? Currently you have to
>> > > >restart
>> > > >the node with new config. With our own parser you will do:
>> > > >CREATE TEMPLATE my_template MODE=PARTITIONED, BACKUPS=1;
>> > > >CREATE TABLE Person (...) TEMPLATE my_template;
>> > > >
>> > > >No restarts, everything is done dynamically.
>> > > >
>> > > >
>> > > >
>> > > >On Tue, Aug 1, 2017 at 4:18 PM, Andrey Mashenkov
>> > > ><andrey.mashenkov@gmail.com
>> > > >> wrote:
>> > > >
>> > > >> Vovan,
>> > > >>
>> > > >> 1. What about ANSI-xx compliant? Will new syntax brake it in some
>> > > >cases or
>> > > >> just extend?
>> > > >>
>> > > >> 2. This would be great to have more ways for optimization.
>> > > >>
>> > > >> Does anyone know or may be have experience with some frameworks or
>> > > >open
>> > > >> projects which can be helpful? E.g. Apache Calcite?
>> > > >>
>> > > >> On Tue, Aug 1, 2017 at 3:25 PM, Vladimir Ozerov
>> > > ><vo...@gridgain.com>
>> > > >> wrote:
>> > > >>
>> > > >> > Igniters,
>> > > >> >
>> > > >> > As you know, we rely on H2 for SQL query parsing. This has several
>> > > >> > drawbacks:
>> > > >> >
>> > > >> > 1) Limited and ugly syntax
>> > > >> > Ignite has lot's of unique concepts which are in no way supported
>> > > >by
>> > > >> > traditional RDBMS in general, or by H2 in particular. For example:
>> > > >> > - query hints ("distributedJoins", "replicatedOnly", "colocated")
>> > > >> > - index hints ("inline size")
>> > > >> > - cache configuration (memory policy, affinity key, cache mode,
>> > > >etc)
>> > > >> > - transaction mode (concurrency, isolation, timeouts) - not needed
>> > > >now,
>> > > >> but
>> > > >> > will be required when transactional SQL is ready
>> > > >> >
>> > > >> > 2) Performance implications
>> > > >> > Typical SQL processing flow looks as follows
>> > > >> > - Parse String to H2 object form (prepared statement)
>> > > >> > - Convert it to Ignite object form (AST)
>> > > >> > - Then convert it back to map and reduce queries in String form
>> > > >> > - Convert map and reduce queries from String back to H2
>> > > >PreparedStatement
>> > > >> > again for final execution
>> > > >> >
>> > > >> > This is way too much. Moreover, H2 optimizes query during parsing,
>> > > >but
>> > > >> it's
>> > > >> > optimizer is not smart enough. E.g., Ignite "IN" clauses are not
>> > > >> optimized
>> > > >> > and hence doesn't use indexes, so we force users to use
>> > > >intermediate
>> > > >> tables
>> > > >> > with very ugly syntax, while we should do that on our own instead.
>> > > >> Another
>> > > >> > example is common expression elimination - H2 cannot do that even
>> > > >for
>> > > >> > deterministic functions, what cause performance problems
>> > > >frequently.
>> > > >> >
>> > > >> > I propose to start some work in direction of our own parser. We
>> can
>> > > >start
>> > > >> > with something very simple, e.g. DDL support, which is not that
>> > > >complex,
>> > > >> > but will improve usability significantly. And then gradually
>> extend
>> > > >it to
>> > > >> > more complex statements where both rich BNF and optimizer is
>> > > >necessary.
>> > > >> >
>> > > >> > Thoughts?
>> > > >> >
>> > > >> > Vladimir.
>> > > >> >
>> > > >>
>> > > >>
>> > > >>
>> > > >> --
>> > > >> Best regards,
>> > > >> Andrey V. Mashenkov
>> > > >>
>> > >
>> >
>>

Re: Create own SQL parser

Posted by Vladimir Ozerov <vo...@gridgain.com>.

No, it will work as follows:

Model parse(String sql) {
    Model res = tryParseWithIgnite(sql); // Parse what we can

    if (res == null)
        res = parseWithH2(sql);

    return res;
}

We will need a number of custom commands which are not present in H2.

On Wed, Aug 2, 2017 at 3:44 AM, Dmitriy Setrakyan <ds...@apache.org>
wrote:

> On Tue, Aug 1, 2017 at 11:08 PM, Vladimir Ozerov <vo...@gridgain.com>
> wrote:
>
> > Own parser capable of processing non-SELECT and non-DML statements.
> >
>
> And how will it integrate with H2 parser? Or are you suggesting that we get
> rid of H2 SQL parser?
>
>
> >
> > On Tue, Aug 1, 2017 at 9:44 PM, <ds...@apache.org> wrote:
> >
> > > Vova, I am not sure what you are proposing... extending H2 parser with
> > new
> > > syntax or a brand new parser?
> > >
> > > ⁣D.
> > >
> > > On Aug 1, 2017, 4:26 PM, at 4:26 PM, Vladimir Ozerov <
> > vozerov@gridgain.com>
> > > wrote:
> > > >Andrey,
> > > >
> > > >Note that I am not proposing to remove H2 as a whole. Main point for
> > > >now is
> > > >to support missing pieces of DDL syntax and possibly and some
> > > >extensions.
> > > >Several examples:
> > > >
> > > >1) Currently CREATE TABLE command looks ugly:
> > > >CREATE TABLE Person (id LONG, name VARCHAR) WITH
> "template=PARTITIONED,
> > > >backups=1, ..."
> > > >
> > > >Commas typically treated in a special way in editors and IDEs, so user
> > > >will
> > > >have to escape them, making usability even worse.
> > > >
> > > >2) What if I need to introduce new template? Currently you have to
> > > >restart
> > > >the node with new config. With our own parser you will do:
> > > >CREATE TEMPLATE my_template MODE=PARTITIONED, BACKUPS=1;
> > > >CREATE TABLE Person (...) TEMPLATE my_template;
> > > >
> > > >No restarts, everything is done dynamically.
> > > >
> > > >
> > > >
> > > >On Tue, Aug 1, 2017 at 4:18 PM, Andrey Mashenkov
> > > ><andrey.mashenkov@gmail.com
> > > >> wrote:
> > > >
> > > >> Vovan,
> > > >>
> > > >> 1. What about ANSI-xx compliant? Will new syntax brake it in some
> > > >cases or
> > > >> just extend?
> > > >>
> > > >> 2. This would be great to have more ways for optimization.
> > > >>
> > > >> Does anyone know or may be have experience with some frameworks or
> > > >open
> > > >> projects which can be helpful? E.g. Apache Calcite?
> > > >>
> > > >> On Tue, Aug 1, 2017 at 3:25 PM, Vladimir Ozerov
> > > ><vo...@gridgain.com>
> > > >> wrote:
> > > >>
> > > >> > Igniters,
> > > >> >
> > > >> > As you know, we rely on H2 for SQL query parsing. This has several
> > > >> > drawbacks:
> > > >> >
> > > >> > 1) Limited and ugly syntax
> > > >> > Ignite has lot's of unique concepts which are in no way supported
> > > >by
> > > >> > traditional RDBMS in general, or by H2 in particular. For example:
> > > >> > - query hints ("distributedJoins", "replicatedOnly", "colocated")
> > > >> > - index hints ("inline size")
> > > >> > - cache configuration (memory policy, affinity key, cache mode,
> > > >etc)
> > > >> > - transaction mode (concurrency, isolation, timeouts) - not needed
> > > >now,
> > > >> but
> > > >> > will be required when transactional SQL is ready
> > > >> >
> > > >> > 2) Performance implications
> > > >> > Typical SQL processing flow looks as follows
> > > >> > - Parse String to H2 object form (prepared statement)
> > > >> > - Convert it to Ignite object form (AST)
> > > >> > - Then convert it back to map and reduce queries in String form
> > > >> > - Convert map and reduce queries from String back to H2
> > > >PreparedStatement
> > > >> > again for final execution
> > > >> >
> > > >> > This is way too much. Moreover, H2 optimizes query during parsing,
> > > >but
> > > >> it's
> > > >> > optimizer is not smart enough. E.g., Ignite "IN" clauses are not
> > > >> optimized
> > > >> > and hence doesn't use indexes, so we force users to use
> > > >intermediate
> > > >> tables
> > > >> > with very ugly syntax, while we should do that on our own instead.
> > > >> Another
> > > >> > example is common expression elimination - H2 cannot do that even
> > > >for
> > > >> > deterministic functions, what cause performance problems
> > > >frequently.
> > > >> >
> > > >> > I propose to start some work in direction of our own parser. We
> can
> > > >start
> > > >> > with something very simple, e.g. DDL support, which is not that
> > > >complex,
> > > >> > but will improve usability significantly. And then gradually
> extend
> > > >it to
> > > >> > more complex statements where both rich BNF and optimizer is
> > > >necessary.
> > > >> >
> > > >> > Thoughts?
> > > >> >
> > > >> > Vladimir.
> > > >> >
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >> Best regards,
> > > >> Andrey V. Mashenkov
> > > >>
> > >
> >
>

Re: Create own SQL parser

Posted by Dmitriy Setrakyan <ds...@apache.org>.

On Tue, Aug 1, 2017 at 11:08 PM, Vladimir Ozerov <vo...@gridgain.com>
wrote:

> Own parser capable of processing non-SELECT and non-DML statements.
>

And how will it integrate with H2 parser? Or are you suggesting that we get
rid of H2 SQL parser?


>
> On Tue, Aug 1, 2017 at 9:44 PM, <ds...@apache.org> wrote:
>
> > Vova, I am not sure what you are proposing... extending H2 parser with
> new
> > syntax or a brand new parser?
> >
> > ⁣D.
> >
> > On Aug 1, 2017, 4:26 PM, at 4:26 PM, Vladimir Ozerov <
> vozerov@gridgain.com>
> > wrote:
> > >Andrey,
> > >
> > >Note that I am not proposing to remove H2 as a whole. Main point for
> > >now is
> > >to support missing pieces of DDL syntax and possibly and some
> > >extensions.
> > >Several examples:
> > >
> > >1) Currently CREATE TABLE command looks ugly:
> > >CREATE TABLE Person (id LONG, name VARCHAR) WITH "template=PARTITIONED,
> > >backups=1, ..."
> > >
> > >Commas typically treated in a special way in editors and IDEs, so user
> > >will
> > >have to escape them, making usability even worse.
> > >
> > >2) What if I need to introduce new template? Currently you have to
> > >restart
> > >the node with new config. With our own parser you will do:
> > >CREATE TEMPLATE my_template MODE=PARTITIONED, BACKUPS=1;
> > >CREATE TABLE Person (...) TEMPLATE my_template;
> > >
> > >No restarts, everything is done dynamically.
> > >
> > >
> > >
> > >On Tue, Aug 1, 2017 at 4:18 PM, Andrey Mashenkov
> > ><andrey.mashenkov@gmail.com
> > >> wrote:
> > >
> > >> Vovan,
> > >>
> > >> 1. What about ANSI-xx compliant? Will new syntax brake it in some
> > >cases or
> > >> just extend?
> > >>
> > >> 2. This would be great to have more ways for optimization.
> > >>
> > >> Does anyone know or may be have experience with some frameworks or
> > >open
> > >> projects which can be helpful? E.g. Apache Calcite?
> > >>
> > >> On Tue, Aug 1, 2017 at 3:25 PM, Vladimir Ozerov
> > ><vo...@gridgain.com>
> > >> wrote:
> > >>
> > >> > Igniters,
> > >> >
> > >> > As you know, we rely on H2 for SQL query parsing. This has several
> > >> > drawbacks:
> > >> >
> > >> > 1) Limited and ugly syntax
> > >> > Ignite has lot's of unique concepts which are in no way supported
> > >by
> > >> > traditional RDBMS in general, or by H2 in particular. For example:
> > >> > - query hints ("distributedJoins", "replicatedOnly", "colocated")
> > >> > - index hints ("inline size")
> > >> > - cache configuration (memory policy, affinity key, cache mode,
> > >etc)
> > >> > - transaction mode (concurrency, isolation, timeouts) - not needed
> > >now,
> > >> but
> > >> > will be required when transactional SQL is ready
> > >> >
> > >> > 2) Performance implications
> > >> > Typical SQL processing flow looks as follows
> > >> > - Parse String to H2 object form (prepared statement)
> > >> > - Convert it to Ignite object form (AST)
> > >> > - Then convert it back to map and reduce queries in String form
> > >> > - Convert map and reduce queries from String back to H2
> > >PreparedStatement
> > >> > again for final execution
> > >> >
> > >> > This is way too much. Moreover, H2 optimizes query during parsing,
> > >but
> > >> it's
> > >> > optimizer is not smart enough. E.g., Ignite "IN" clauses are not
> > >> optimized
> > >> > and hence doesn't use indexes, so we force users to use
> > >intermediate
> > >> tables
> > >> > with very ugly syntax, while we should do that on our own instead.
> > >> Another
> > >> > example is common expression elimination - H2 cannot do that even
> > >for
> > >> > deterministic functions, what cause performance problems
> > >frequently.
> > >> >
> > >> > I propose to start some work in direction of our own parser. We can
> > >start
> > >> > with something very simple, e.g. DDL support, which is not that
> > >complex,
> > >> > but will improve usability significantly. And then gradually extend
> > >it to
> > >> > more complex statements where both rich BNF and optimizer is
> > >necessary.
> > >> >
> > >> > Thoughts?
> > >> >
> > >> > Vladimir.
> > >> >
> > >>
> > >>
> > >>
> > >> --
> > >> Best regards,
> > >> Andrey V. Mashenkov
> > >>
> >
>

Re: Create own SQL parser

Posted by Vladimir Ozerov <vo...@gridgain.com>.

Own parser capable of processing non-SELECT and non-DML statements.

On Tue, Aug 1, 2017 at 9:44 PM, <ds...@apache.org> wrote:

> Vova, I am not sure what you are proposing... extending H2 parser with new
> syntax or a brand new parser?
>
> ⁣D.
>
> On Aug 1, 2017, 4:26 PM, at 4:26 PM, Vladimir Ozerov <vo...@gridgain.com>
> wrote:
> >Andrey,
> >
> >Note that I am not proposing to remove H2 as a whole. Main point for
> >now is
> >to support missing pieces of DDL syntax and possibly and some
> >extensions.
> >Several examples:
> >
> >1) Currently CREATE TABLE command looks ugly:
> >CREATE TABLE Person (id LONG, name VARCHAR) WITH "template=PARTITIONED,
> >backups=1, ..."
> >
> >Commas typically treated in a special way in editors and IDEs, so user
> >will
> >have to escape them, making usability even worse.
> >
> >2) What if I need to introduce new template? Currently you have to
> >restart
> >the node with new config. With our own parser you will do:
> >CREATE TEMPLATE my_template MODE=PARTITIONED, BACKUPS=1;
> >CREATE TABLE Person (...) TEMPLATE my_template;
> >
> >No restarts, everything is done dynamically.
> >
> >
> >
> >On Tue, Aug 1, 2017 at 4:18 PM, Andrey Mashenkov
> ><andrey.mashenkov@gmail.com
> >> wrote:
> >
> >> Vovan,
> >>
> >> 1. What about ANSI-xx compliant? Will new syntax brake it in some
> >cases or
> >> just extend?
> >>
> >> 2. This would be great to have more ways for optimization.
> >>
> >> Does anyone know or may be have experience with some frameworks or
> >open
> >> projects which can be helpful? E.g. Apache Calcite?
> >>
> >> On Tue, Aug 1, 2017 at 3:25 PM, Vladimir Ozerov
> ><vo...@gridgain.com>
> >> wrote:
> >>
> >> > Igniters,
> >> >
> >> > As you know, we rely on H2 for SQL query parsing. This has several
> >> > drawbacks:
> >> >
> >> > 1) Limited and ugly syntax
> >> > Ignite has lot's of unique concepts which are in no way supported
> >by
> >> > traditional RDBMS in general, or by H2 in particular. For example:
> >> > - query hints ("distributedJoins", "replicatedOnly", "colocated")
> >> > - index hints ("inline size")
> >> > - cache configuration (memory policy, affinity key, cache mode,
> >etc)
> >> > - transaction mode (concurrency, isolation, timeouts) - not needed
> >now,
> >> but
> >> > will be required when transactional SQL is ready
> >> >
> >> > 2) Performance implications
> >> > Typical SQL processing flow looks as follows
> >> > - Parse String to H2 object form (prepared statement)
> >> > - Convert it to Ignite object form (AST)
> >> > - Then convert it back to map and reduce queries in String form
> >> > - Convert map and reduce queries from String back to H2
> >PreparedStatement
> >> > again for final execution
> >> >
> >> > This is way too much. Moreover, H2 optimizes query during parsing,
> >but
> >> it's
> >> > optimizer is not smart enough. E.g., Ignite "IN" clauses are not
> >> optimized
> >> > and hence doesn't use indexes, so we force users to use
> >intermediate
> >> tables
> >> > with very ugly syntax, while we should do that on our own instead.
> >> Another
> >> > example is common expression elimination - H2 cannot do that even
> >for
> >> > deterministic functions, what cause performance problems
> >frequently.
> >> >
> >> > I propose to start some work in direction of our own parser. We can
> >start
> >> > with something very simple, e.g. DDL support, which is not that
> >complex,
> >> > but will improve usability significantly. And then gradually extend
> >it to
> >> > more complex statements where both rich BNF and optimizer is
> >necessary.
> >> >
> >> > Thoughts?
> >> >
> >> > Vladimir.
> >> >
> >>
> >>
> >>
> >> --
> >> Best regards,
> >> Andrey V. Mashenkov
> >>
>

Re: Create own SQL parser

Posted by ds...@apache.org.

Vova, I am not sure what you are proposing... extending H2 parser with new syntax or a brand new parser?

⁣D.

On Aug 1, 2017, 4:26 PM, at 4:26 PM, Vladimir Ozerov <vo...@gridgain.com> wrote:
>Andrey,
>
>Note that I am not proposing to remove H2 as a whole. Main point for
>now is
>to support missing pieces of DDL syntax and possibly and some
>extensions.
>Several examples:
>
>1) Currently CREATE TABLE command looks ugly:
>CREATE TABLE Person (id LONG, name VARCHAR) WITH "template=PARTITIONED,
>backups=1, ..."
>
>Commas typically treated in a special way in editors and IDEs, so user
>will
>have to escape them, making usability even worse.
>
>2) What if I need to introduce new template? Currently you have to
>restart
>the node with new config. With our own parser you will do:
>CREATE TEMPLATE my_template MODE=PARTITIONED, BACKUPS=1;
>CREATE TABLE Person (...) TEMPLATE my_template;
>
>No restarts, everything is done dynamically.
>
>
>
>On Tue, Aug 1, 2017 at 4:18 PM, Andrey Mashenkov
><andrey.mashenkov@gmail.com
>> wrote:
>
>> Vovan,
>>
>> 1. What about ANSI-xx compliant? Will new syntax brake it in some
>cases or
>> just extend?
>>
>> 2. This would be great to have more ways for optimization.
>>
>> Does anyone know or may be have experience with some frameworks or
>open
>> projects which can be helpful? E.g. Apache Calcite?
>>
>> On Tue, Aug 1, 2017 at 3:25 PM, Vladimir Ozerov
><vo...@gridgain.com>
>> wrote:
>>
>> > Igniters,
>> >
>> > As you know, we rely on H2 for SQL query parsing. This has several
>> > drawbacks:
>> >
>> > 1) Limited and ugly syntax
>> > Ignite has lot's of unique concepts which are in no way supported
>by
>> > traditional RDBMS in general, or by H2 in particular. For example:
>> > - query hints ("distributedJoins", "replicatedOnly", "colocated")
>> > - index hints ("inline size")
>> > - cache configuration (memory policy, affinity key, cache mode,
>etc)
>> > - transaction mode (concurrency, isolation, timeouts) - not needed
>now,
>> but
>> > will be required when transactional SQL is ready
>> >
>> > 2) Performance implications
>> > Typical SQL processing flow looks as follows
>> > - Parse String to H2 object form (prepared statement)
>> > - Convert it to Ignite object form (AST)
>> > - Then convert it back to map and reduce queries in String form
>> > - Convert map and reduce queries from String back to H2
>PreparedStatement
>> > again for final execution
>> >
>> > This is way too much. Moreover, H2 optimizes query during parsing,
>but
>> it's
>> > optimizer is not smart enough. E.g., Ignite "IN" clauses are not
>> optimized
>> > and hence doesn't use indexes, so we force users to use
>intermediate
>> tables
>> > with very ugly syntax, while we should do that on our own instead.
>> Another
>> > example is common expression elimination - H2 cannot do that even
>for
>> > deterministic functions, what cause performance problems
>frequently.
>> >
>> > I propose to start some work in direction of our own parser. We can
>start
>> > with something very simple, e.g. DDL support, which is not that
>complex,
>> > but will improve usability significantly. And then gradually extend
>it to
>> > more complex statements where both rich BNF and optimizer is
>necessary.
>> >
>> > Thoughts?
>> >
>> > Vladimir.
>> >
>>
>>
>>
>> --
>> Best regards,
>> Andrey V. Mashenkov
>>

Re: Create own SQL parser

Posted by Vladimir Ozerov <vo...@gridgain.com>.

Andrey,

Note that I am not proposing to remove H2 as a whole. Main point for now is
to support missing pieces of DDL syntax and possibly and some extensions.
Several examples:

1) Currently CREATE TABLE command looks ugly:
CREATE TABLE Person (id LONG, name VARCHAR) WITH "template=PARTITIONED,
backups=1, ..."

Commas typically treated in a special way in editors and IDEs, so user will
have to escape them, making usability even worse.

2) What if I need to introduce new template? Currently you have to restart
the node with new config. With our own parser you will do:
CREATE TEMPLATE my_template MODE=PARTITIONED, BACKUPS=1;
CREATE TABLE Person (...) TEMPLATE my_template;

No restarts, everything is done dynamically.



On Tue, Aug 1, 2017 at 4:18 PM, Andrey Mashenkov <andrey.mashenkov@gmail.com
> wrote:

> Vovan,
>
> 1. What about ANSI-xx compliant? Will new syntax brake it in some cases or
> just extend?
>
> 2. This would be great to have more ways for optimization.
>
> Does anyone know or may be have experience with some frameworks or open
> projects which can be helpful? E.g. Apache Calcite?
>
> On Tue, Aug 1, 2017 at 3:25 PM, Vladimir Ozerov <vo...@gridgain.com>
> wrote:
>
> > Igniters,
> >
> > As you know, we rely on H2 for SQL query parsing. This has several
> > drawbacks:
> >
> > 1) Limited and ugly syntax
> > Ignite has lot's of unique concepts which are in no way supported by
> > traditional RDBMS in general, or by H2 in particular. For example:
> > - query hints ("distributedJoins", "replicatedOnly", "colocated")
> > - index hints ("inline size")
> > - cache configuration (memory policy, affinity key, cache mode, etc)
> > - transaction mode (concurrency, isolation, timeouts) - not needed now,
> but
> > will be required when transactional SQL is ready
> >
> > 2) Performance implications
> > Typical SQL processing flow looks as follows
> > - Parse String to H2 object form (prepared statement)
> > - Convert it to Ignite object form (AST)
> > - Then convert it back to map and reduce queries in String form
> > - Convert map and reduce queries from String back to H2 PreparedStatement
> > again for final execution
> >
> > This is way too much. Moreover, H2 optimizes query during parsing, but
> it's
> > optimizer is not smart enough. E.g., Ignite "IN" clauses are not
> optimized
> > and hence doesn't use indexes, so we force users to use intermediate
> tables
> > with very ugly syntax, while we should do that on our own instead.
> Another
> > example is common expression elimination - H2 cannot do that even for
> > deterministic functions, what cause performance problems frequently.
> >
> > I propose to start some work in direction of our own parser. We can start
> > with something very simple, e.g. DDL support, which is not that complex,
> > but will improve usability significantly. And then gradually extend it to
> > more complex statements where both rich BNF and optimizer is necessary.
> >
> > Thoughts?
> >
> > Vladimir.
> >
>
>
>
> --
> Best regards,
> Andrey V. Mashenkov
>

Re: Create own SQL parser

Posted by Andrey Mashenkov <an...@gmail.com>.

Vovan,

1. What about ANSI-xx compliant? Will new syntax brake it in some cases or
just extend?

2. This would be great to have more ways for optimization.

Does anyone know or may be have experience with some frameworks or open
projects which can be helpful? E.g. Apache Calcite?

On Tue, Aug 1, 2017 at 3:25 PM, Vladimir Ozerov <vo...@gridgain.com>
wrote:

> Igniters,
>
> As you know, we rely on H2 for SQL query parsing. This has several
> drawbacks:
>
> 1) Limited and ugly syntax
> Ignite has lot's of unique concepts which are in no way supported by
> traditional RDBMS in general, or by H2 in particular. For example:
> - query hints ("distributedJoins", "replicatedOnly", "colocated")
> - index hints ("inline size")
> - cache configuration (memory policy, affinity key, cache mode, etc)
> - transaction mode (concurrency, isolation, timeouts) - not needed now, but
> will be required when transactional SQL is ready
>
> 2) Performance implications
> Typical SQL processing flow looks as follows
> - Parse String to H2 object form (prepared statement)
> - Convert it to Ignite object form (AST)
> - Then convert it back to map and reduce queries in String form
> - Convert map and reduce queries from String back to H2 PreparedStatement
> again for final execution
>
> This is way too much. Moreover, H2 optimizes query during parsing, but it's
> optimizer is not smart enough. E.g., Ignite "IN" clauses are not optimized
> and hence doesn't use indexes, so we force users to use intermediate tables
> with very ugly syntax, while we should do that on our own instead. Another
> example is common expression elimination - H2 cannot do that even for
> deterministic functions, what cause performance problems frequently.
>
> I propose to start some work in direction of our own parser. We can start
> with something very simple, e.g. DDL support, which is not that complex,
> but will improve usability significantly. And then gradually extend it to
> more complex statements where both rich BNF and optimizer is necessary.
>
> Thoughts?
>
> Vladimir.
>



-- 
Best regards,
Andrey V. Mashenkov