You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@ignite.apache.org by Alexander Paschenko <al...@gmail.com> on 2017/01/11 17:54:03 UTC

DDL implementation details

Hello Igniters,

I would like to start discussion about implementation of SQL DDL commands.

At the first stage, the most important ones seem to be CREATE TABLE
(that will obviously correspond to creation of a cache) and CREATE
INDEX.

Regarding first one: SQL command for CREATE TABLE does not contain any
hints about cache settings (atomicity, replication, etc.), so these
will probably be defined by some configuration properties (like
ignite.ddl.default_cache_atomiticity, etc).

Also it does not allow to distinguish between key and value columns -
currently it is handled by keyFields property of QueryEntity, but it
is unclear how to declare key fields via CREATE TABLE.

So at a first glance it seems like we should either implement some
sort of custom parsing (I believe Sergi will be against it) or
introduce some kind of name prefix that would tell SQL engine that
certain column is a key field column.

Of course, this problem disappears is key is of SQL type.

Regarding CREATE INDEX: probably at first we will have to implement
this in "stop-the-world" manner, i.e. all cache will be blocked during
the index's initial buildup.

Any thoughts?

Currently I'm working on parsing of those commands as that will be
needed anyway and does not affect further implementation.

- Alex

Re: DDL implementation details

Posted by Dmitriy Setrakyan <ds...@apache.org>.

Alexander, not sure what is the point of creating a table without indexes.
I would combine stage 1 and 2.

On Thu, Jan 12, 2017 at 12:57 PM, Denis Magda <dm...@apache.org> wrote:

> Guys,
>
> As for the stages I would propose the following three creating separate
> JIRA tickets for them.
>
> Stage 1:
> - CREATE/DROP SCHEMA.
> - CREATE/DROP TABLE.
>
> Stage 2:
> - CREATE/DROP INDEX.
> - indexes are updated in the ‘lock-the-world mode'
>
> Stage 3:
> - CREATE/DROP INDEX.
> - indexes are updated concurrently.
>
> Going further we might need to make up a command like ‘CREATE CLUSTER’
> that will be mapped to IgniteConfiguration. Using the command the user will
> fill in the whole cluster configuration from DDL without a need to go to
> XML at all.
>
> Thoughts?
>
> —
> Denis
>
> > On Jan 12, 2017, at 12:41 AM, Sergey Kozlov <sk...@gridgain.com>
> wrote:
> >
> > As first stage of DDL we can implement following CREATE TABLE statement
> > support:
> > - CREATE TABLE without cache properties (use default cache properties or
> > cache properties defined in SQL Schema)
> > - CREATE TABLE .. LIKE where we can create a cache based on an another
> > existing cache.
> >
> > On Thu, Jan 12, 2017 at 5:54 AM, Dmitriy Setrakyan <
> dsetrakyan@apache.org>
> > wrote:
> >
> >> Agree with Sergey. We should be able to specify cache properties inside
> of
> >> SQL statements. Does H2 have any support to process SQL hints? Can we
> >> change it?
> >>
> >> Having said that, while we finalize the above, I think we should start
> >> working on DDL implementation to use the default settings, as specified
> in
> >> Alexander's email.
> >>
> >> Also agree with the stop-the-world on the cache for index creation. We
> can
> >> always improve on it in future.
> >>
> >> D.
> >>
> >> On Wed, Jan 11, 2017 at 11:28 AM, Sergey Kozlov <sk...@gridgain.com>
> >> wrote:
> >>
> >>> Hi
> >>>
> >>> I suppose we should put any ignite cache properties as additional
> >>> non-standard attributes after CREATE TABLE () clause as it does
> >> Postgress,
> >>> MySQL and other RDBMS.
> >>> Take a look on CREATE TABLE with using TABLESPACE (Postgess) or for
> >> CREATE
> >>> TABLE with using PARTITIONS (MySQL).
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Wed, Jan 11, 2017 at 10:05 PM, Vladimir Ozerov <
> vozerov@gridgain.com>
> >>> wrote:
> >>>
> >>>> I believe custom synthax and parsing is a *must* for us, as well as
> for
> >>> any
> >>>> distributed database. At the very least we need to specify affinity
> key
> >>>> column somehow. Any cache property can be specified at the very end of
> >>>> table definition. Key columns can be determined as the ones with
> >> PRIMARY
> >>>> KEY constraint (Alex K. idea) + affinity column(s):
> >>>>
> >>>> CREATE TABLE employee (
> >>>>    id BIGINT PRIMARY KEY,
> >>>>    dept_id BIGINT AFFINITY KEY,
> >>>>    name VARCHAR(128),
> >>>>    address VARCHAR(256)
> >>>>    BACKUPS 2,
> >>>>    ATOMICITY_MODE ATOMIC,
> >>>> );
> >>>>
> >>>> "id" and "dept_id" form key type, "name" and "address" form value
> type.
> >>>>
> >>>> Vladimir.
> >>>>
> >>>> On Wed, Jan 11, 2017 at 9:08 PM, Alexey Kuznetsov <
> >> akuznetsov@apache.org
> >>>>
> >>>> wrote:
> >>>>
> >>>>> Hi, Alex!
> >>>>>
> >>>>> As far as I know most RDBMS allow something like: create table t1 (id
> >>>>> integer primary key, ....)
> >>>>> How about to take as key field that marked as "primary key"?
> >>>>>
> >>>>> As of atomicity and replication - I think it is a cache properties
> >> and
> >>>> with
> >>>>> create table we will create "types" in cache. No?
> >>>>> I thought that cache it is a kind of "schema" in RDBMS.
> >>>>>
> >>>>> Could you describe what will be created with CREATE TABLE?
> >>>>>
> >>>>> On Thu, Jan 12, 2017 at 12:54 AM, Alexander Paschenko <
> >>>>> alexander.a.paschenko@gmail.com> wrote:
> >>>>>
> >>>>>> Hello Igniters,
> >>>>>>
> >>>>>> I would like to start discussion about implementation of SQL DDL
> >>>>> commands.
> >>>>>>
> >>>>>> At the first stage, the most important ones seem to be CREATE TABLE
> >>>>>> (that will obviously correspond to creation of a cache) and CREATE
> >>>>>> INDEX.
> >>>>>>
> >>>>>> Regarding first one: SQL command for CREATE TABLE does not contain
> >>> any
> >>>>>> hints about cache settings (atomicity, replication, etc.), so these
> >>>>>> will probably be defined by some configuration properties (like
> >>>>>> ignite.ddl.default_cache_atomiticity, etc).
> >>>>>>
> >>>>>> Also it does not allow to distinguish between key and value
> >> columns -
> >>>>>> currently it is handled by keyFields property of QueryEntity, but
> >> it
> >>>>>> is unclear how to declare key fields via CREATE TABLE.
> >>>>>>
> >>>>>> So at a first glance it seems like we should either implement some
> >>>>>> sort of custom parsing (I believe Sergi will be against it) or
> >>>>>> introduce some kind of name prefix that would tell SQL engine that
> >>>>>> certain column is a key field column.
> >>>>>>
> >>>>>> Of course, this problem disappears is key is of SQL type.
> >>>>>>
> >>>>>> Regarding CREATE INDEX: probably at first we will have to implement
> >>>>>> this in "stop-the-world" manner, i.e. all cache will be blocked
> >>> during
> >>>>>> the index's initial buildup.
> >>>>>>
> >>>>>> Any thoughts?
> >>>>>>
> >>>>>> Currently I'm working on parsing of those commands as that will be
> >>>>>> needed anyway and does not affect further implementation.
> >>>>>>
> >>>>>> - Alex
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Alexey Kuznetsov
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Sergey Kozlov
> >>> GridGain Systems
> >>> www.gridgain.com
> >>>
> >>
> >
> >
> >
> > --
> > Sergey Kozlov
> > GridGain Systems
> > www.gridgain.com
>
>

Re: DDL implementation details

Posted by Denis Magda <dm...@apache.org>.

Guys,

As for the stages I would propose the following three creating separate JIRA tickets for them.

Stage 1:
- CREATE/DROP SCHEMA.
- CREATE/DROP TABLE.

Stage 2:
- CREATE/DROP INDEX.
- indexes are updated in the ‘lock-the-world mode'

Stage 3:
- CREATE/DROP INDEX.
- indexes are updated concurrently.

Going further we might need to make up a command like ‘CREATE CLUSTER’ that will be mapped to IgniteConfiguration. Using the command the user will fill in the whole cluster configuration from DDL without a need to go to XML at all.

Thoughts?

—
Denis

> On Jan 12, 2017, at 12:41 AM, Sergey Kozlov <sk...@gridgain.com> wrote:
> 
> As first stage of DDL we can implement following CREATE TABLE statement
> support:
> - CREATE TABLE without cache properties (use default cache properties or
> cache properties defined in SQL Schema)
> - CREATE TABLE .. LIKE where we can create a cache based on an another
> existing cache.
> 
> On Thu, Jan 12, 2017 at 5:54 AM, Dmitriy Setrakyan <ds...@apache.org>
> wrote:
> 
>> Agree with Sergey. We should be able to specify cache properties inside of
>> SQL statements. Does H2 have any support to process SQL hints? Can we
>> change it?
>> 
>> Having said that, while we finalize the above, I think we should start
>> working on DDL implementation to use the default settings, as specified in
>> Alexander's email.
>> 
>> Also agree with the stop-the-world on the cache for index creation. We can
>> always improve on it in future.
>> 
>> D.
>> 
>> On Wed, Jan 11, 2017 at 11:28 AM, Sergey Kozlov <sk...@gridgain.com>
>> wrote:
>> 
>>> Hi
>>> 
>>> I suppose we should put any ignite cache properties as additional
>>> non-standard attributes after CREATE TABLE () clause as it does
>> Postgress,
>>> MySQL and other RDBMS.
>>> Take a look on CREATE TABLE with using TABLESPACE (Postgess) or for
>> CREATE
>>> TABLE with using PARTITIONS (MySQL).
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Wed, Jan 11, 2017 at 10:05 PM, Vladimir Ozerov <vo...@gridgain.com>
>>> wrote:
>>> 
>>>> I believe custom synthax and parsing is a *must* for us, as well as for
>>> any
>>>> distributed database. At the very least we need to specify affinity key
>>>> column somehow. Any cache property can be specified at the very end of
>>>> table definition. Key columns can be determined as the ones with
>> PRIMARY
>>>> KEY constraint (Alex K. idea) + affinity column(s):
>>>> 
>>>> CREATE TABLE employee (
>>>>    id BIGINT PRIMARY KEY,
>>>>    dept_id BIGINT AFFINITY KEY,
>>>>    name VARCHAR(128),
>>>>    address VARCHAR(256)
>>>>    BACKUPS 2,
>>>>    ATOMICITY_MODE ATOMIC,
>>>> );
>>>> 
>>>> "id" and "dept_id" form key type, "name" and "address" form value type.
>>>> 
>>>> Vladimir.
>>>> 
>>>> On Wed, Jan 11, 2017 at 9:08 PM, Alexey Kuznetsov <
>> akuznetsov@apache.org
>>>> 
>>>> wrote:
>>>> 
>>>>> Hi, Alex!
>>>>> 
>>>>> As far as I know most RDBMS allow something like: create table t1 (id
>>>>> integer primary key, ....)
>>>>> How about to take as key field that marked as "primary key"?
>>>>> 
>>>>> As of atomicity and replication - I think it is a cache properties
>> and
>>>> with
>>>>> create table we will create "types" in cache. No?
>>>>> I thought that cache it is a kind of "schema" in RDBMS.
>>>>> 
>>>>> Could you describe what will be created with CREATE TABLE?
>>>>> 
>>>>> On Thu, Jan 12, 2017 at 12:54 AM, Alexander Paschenko <
>>>>> alexander.a.paschenko@gmail.com> wrote:
>>>>> 
>>>>>> Hello Igniters,
>>>>>> 
>>>>>> I would like to start discussion about implementation of SQL DDL
>>>>> commands.
>>>>>> 
>>>>>> At the first stage, the most important ones seem to be CREATE TABLE
>>>>>> (that will obviously correspond to creation of a cache) and CREATE
>>>>>> INDEX.
>>>>>> 
>>>>>> Regarding first one: SQL command for CREATE TABLE does not contain
>>> any
>>>>>> hints about cache settings (atomicity, replication, etc.), so these
>>>>>> will probably be defined by some configuration properties (like
>>>>>> ignite.ddl.default_cache_atomiticity, etc).
>>>>>> 
>>>>>> Also it does not allow to distinguish between key and value
>> columns -
>>>>>> currently it is handled by keyFields property of QueryEntity, but
>> it
>>>>>> is unclear how to declare key fields via CREATE TABLE.
>>>>>> 
>>>>>> So at a first glance it seems like we should either implement some
>>>>>> sort of custom parsing (I believe Sergi will be against it) or
>>>>>> introduce some kind of name prefix that would tell SQL engine that
>>>>>> certain column is a key field column.
>>>>>> 
>>>>>> Of course, this problem disappears is key is of SQL type.
>>>>>> 
>>>>>> Regarding CREATE INDEX: probably at first we will have to implement
>>>>>> this in "stop-the-world" manner, i.e. all cache will be blocked
>>> during
>>>>>> the index's initial buildup.
>>>>>> 
>>>>>> Any thoughts?
>>>>>> 
>>>>>> Currently I'm working on parsing of those commands as that will be
>>>>>> needed anyway and does not affect further implementation.
>>>>>> 
>>>>>> - Alex
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Alexey Kuznetsov
>>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Sergey Kozlov
>>> GridGain Systems
>>> www.gridgain.com
>>> 
>> 
> 
> 
> 
> -- 
> Sergey Kozlov
> GridGain Systems
> www.gridgain.com

Re: DDL implementation details

Posted by Alexander Paschenko <al...@gmail.com>.

BTW, I have also did some H2 guts inspection, and, as I see it now,
it's not impossible to do custom parsing without (an awful lot) of
ugliness. What we would have to do is basically spoof private
singleton org.h2.engine.Engine#INSTANCE on node start via reflection
with our custom implementation that could invoke custom parsing when
needed.

Not that we need it now - just in case.

- Alex

2017-01-13 2:16 GMT+08:00 Alexander Paschenko <al...@gmail.com>:
> Sergi,
>
> OK, great. Still, what's up with CREATE TABLE? After a bit of code
> digging, I currently don't see major obstacles against registering
> query entities (i.e. type descriptors) on the fly - CREATE TABLE will
> essentially boil down to *GridQueryIndexing#registerType* call.
>
> But, as you have justly noted, we have to keep nodes joining the
> cluster up-to-date about what schemas and tables need to be created in
> order for those nodes to participate in distributed queries. And,
> correct me if I'm wrong, but this is relevant even outside of context
> of 2.0 and page memory and persistent stores, amirite?
>
> - Alex
>
> 2017-01-13 1:47 GMT+08:00 Sergi Vladykin <se...@gmail.com>:
>> The xml config was only for example. We can put in this configuration
>> string cache config parameters directly like this:
>>
>> CREATE SCHEMA "MyCacheName" WITH
>> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
>>
>> Sergi
>>
>> 2017-01-12 20:21 GMT+03:00 Alexander Paschenko <
>> alexander.a.paschenko@gmail.com>:
>>
>>> Sergi, Alexey G.,
>>>
>>> I see your point and am rather inclined to agree that we should let
>>> current notion of "single schema - multiple tables" live.
>>>
>>> Still, if we create schema with cache config file, what's the whole
>>> point of SQL then if the user anyway has to write XML? This probably
>>> could be useful to propagate configuration to all cluster nodes tho.
>>>
>>> And if we skip CREATE TABLE now, it means that we leave user facing
>>> the need to write XML configuration, no other options. Is this what we
>>> want?
>>>
>>> Still I must admit that leaving user with his familiar XML stuff looks
>>> attractive - no messing with bunch of unknown new params, just write
>>> your XML and go. Also it's portable and allows to re-use
>>> configurations easily, so undoubtedly is a good approach from some
>>> point.
>>>
>>> - Alex
>>>
>>> 2017-01-12 23:51 GMT+08:00 Sergi Vladykin <se...@gmail.com>:
>>> > Hi,
>>> >
>>> > 1. For now I'm against inventing any custom SQL syntax and implementing
>>> > parsing.
>>> > Currently H2 supports the following syntax:
>>> >
>>> > CREATE TABLE test(...) WITH "myCustomParamString"
>>> >
>>> > This is enough for us to pass the needed parameters.
>>> >
>>> > 2. Agree with AG, we have to separate cache creation from table creation.
>>> > Cache == SQL schema for us. We just have to add the same WITH syntax in
>>> H2
>>> > for schema creation like this:
>>> >
>>> > CREATE SCHEMA "MyCacheName" WITH "cacheConfig=myCache.xml"
>>> >
>>> > 3. If we want to create tables then I suggest to put this functionality
>>> to
>>> > 2.0+PageMemory right away and think where and how we are going to store
>>> all
>>> > the related metadata.This is especially important for persistent
>>> storages.
>>> >
>>> > Sergi
>>> >
>>> >
>>> > 2017-01-12 16:56 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:
>>> >
>>> >> I am afraid in this case user will have to define too much schemes -
>>> >> boilerplate.
>>> >> Does it make sense at all to pack multiple tuples into a single cache
>>> from
>>> >> user perspective?
>>> >>
>>> >> On Thu, Jan 12, 2017 at 4:40 PM, Alexey Goncharuk <
>>> >> alexey.goncharuk@gmail.com> wrote:
>>> >>
>>> >> > Alexander,
>>> >> >
>>> >> > Will we keep the old option to have multiple tables in one cache? If
>>> so,
>>> >> > how will create table statement know which cache to choose?
>>> >> >
>>> >> > It seems to me that to be consistent with the current DML
>>> implementation
>>> >> we
>>> >> > should have a CREATE SCHEMA statement which will define the cache and
>>> >> cache
>>> >> > configuration, and CREATE TABLE should specify the schema name.
>>> >> >
>>> >> > Otherwise, we should enforce the single type per cache rule at the
>>> >> > configuration level and in runtime.
>>> >> >
>>> >> > As for affinity and primary key - agree with Vladimir.
>>> >> >
>>> >> > --
>>> >> > AG
>>> >> >
>>> >> > 2017-01-12 11:41 GMT+03:00 Sergey Kozlov <sk...@gridgain.com>:
>>> >> >
>>> >> > > As first stage of DDL we can implement following CREATE TABLE
>>> statement
>>> >> > > support:
>>> >> > >  - CREATE TABLE without cache properties (use default cache
>>> properties
>>> >> or
>>> >> > > cache properties defined in SQL Schema)
>>> >> > >  - CREATE TABLE .. LIKE where we can create a cache based on an
>>> another
>>> >> > > existing cache.
>>> >> > >
>>> >> > > On Thu, Jan 12, 2017 at 5:54 AM, Dmitriy Setrakyan <
>>> >> > dsetrakyan@apache.org>
>>> >> > > wrote:
>>> >> > >
>>> >> > > > Agree with Sergey. We should be able to specify cache properties
>>> >> inside
>>> >> > > of
>>> >> > > > SQL statements. Does H2 have any support to process SQL hints?
>>> Can we
>>> >> > > > change it?
>>> >> > > >
>>> >> > > > Having said that, while we finalize the above, I think we should
>>> >> start
>>> >> > > > working on DDL implementation to use the default settings, as
>>> >> specified
>>> >> > > in
>>> >> > > > Alexander's email.
>>> >> > > >
>>> >> > > > Also agree with the stop-the-world on the cache for index
>>> creation.
>>> >> We
>>> >> > > can
>>> >> > > > always improve on it in future.
>>> >> > > >
>>> >> > > > D.
>>> >> > > >
>>> >> > > > On Wed, Jan 11, 2017 at 11:28 AM, Sergey Kozlov <
>>> >> skozlov@gridgain.com>
>>> >> > > > wrote:
>>> >> > > >
>>> >> > > > > Hi
>>> >> > > > >
>>> >> > > > > I suppose we should put any ignite cache properties as
>>> additional
>>> >> > > > > non-standard attributes after CREATE TABLE () clause as it does
>>> >> > > > Postgress,
>>> >> > > > > MySQL and other RDBMS.
>>> >> > > > > Take a look on CREATE TABLE with using TABLESPACE (Postgess) or
>>> for
>>> >> > > > CREATE
>>> >> > > > > TABLE with using PARTITIONS (MySQL).
>>> >> > > > >
>>> >> > > > >
>>> >> > > > >
>>> >> > > > >
>>> >> > > > >
>>> >> > > > > On Wed, Jan 11, 2017 at 10:05 PM, Vladimir Ozerov <
>>> >> > > vozerov@gridgain.com>
>>> >> > > > > wrote:
>>> >> > > > >
>>> >> > > > > > I believe custom synthax and parsing is a *must* for us, as
>>> well
>>> >> as
>>> >> > > for
>>> >> > > > > any
>>> >> > > > > > distributed database. At the very least we need to specify
>>> >> affinity
>>> >> > > key
>>> >> > > > > > column somehow. Any cache property can be specified at the
>>> very
>>> >> end
>>> >> > > of
>>> >> > > > > > table definition. Key columns can be determined as the ones
>>> with
>>> >> > > > PRIMARY
>>> >> > > > > > KEY constraint (Alex K. idea) + affinity column(s):
>>> >> > > > > >
>>> >> > > > > > CREATE TABLE employee (
>>> >> > > > > >     id BIGINT PRIMARY KEY,
>>> >> > > > > >     dept_id BIGINT AFFINITY KEY,
>>> >> > > > > >     name VARCHAR(128),
>>> >> > > > > >     address VARCHAR(256)
>>> >> > > > > >     BACKUPS 2,
>>> >> > > > > >     ATOMICITY_MODE ATOMIC,
>>> >> > > > > > );
>>> >> > > > > >
>>> >> > > > > > "id" and "dept_id" form key type, "name" and "address" form
>>> value
>>> >> > > type.
>>> >> > > > > >
>>> >> > > > > > Vladimir.
>>> >> > > > > >
>>> >> > > > > > On Wed, Jan 11, 2017 at 9:08 PM, Alexey Kuznetsov <
>>> >> > > > akuznetsov@apache.org
>>> >> > > > > >
>>> >> > > > > > wrote:
>>> >> > > > > >
>>> >> > > > > > > Hi, Alex!
>>> >> > > > > > >
>>> >> > > > > > > As far as I know most RDBMS allow something like: create
>>> table
>>> >> t1
>>> >> > > (id
>>> >> > > > > > > integer primary key, ....)
>>> >> > > > > > > How about to take as key field that marked as "primary key"?
>>> >> > > > > > >
>>> >> > > > > > > As of atomicity and replication - I think it is a cache
>>> >> > properties
>>> >> > > > and
>>> >> > > > > > with
>>> >> > > > > > > create table we will create "types" in cache. No?
>>> >> > > > > > > I thought that cache it is a kind of "schema" in RDBMS.
>>> >> > > > > > >
>>> >> > > > > > > Could you describe what will be created with CREATE TABLE?
>>> >> > > > > > >
>>> >> > > > > > > On Thu, Jan 12, 2017 at 12:54 AM, Alexander Paschenko <
>>> >> > > > > > > alexander.a.paschenko@gmail.com> wrote:
>>> >> > > > > > >
>>> >> > > > > > > > Hello Igniters,
>>> >> > > > > > > >
>>> >> > > > > > > > I would like to start discussion about implementation of
>>> SQL
>>> >> > DDL
>>> >> > > > > > > commands.
>>> >> > > > > > > >
>>> >> > > > > > > > At the first stage, the most important ones seem to be
>>> CREATE
>>> >> > > TABLE
>>> >> > > > > > > > (that will obviously correspond to creation of a cache)
>>> and
>>> >> > > CREATE
>>> >> > > > > > > > INDEX.
>>> >> > > > > > > >
>>> >> > > > > > > > Regarding first one: SQL command for CREATE TABLE does not
>>> >> > > contain
>>> >> > > > > any
>>> >> > > > > > > > hints about cache settings (atomicity, replication,
>>> etc.), so
>>> >> > > these
>>> >> > > > > > > > will probably be defined by some configuration properties
>>> >> (like
>>> >> > > > > > > > ignite.ddl.default_cache_atomiticity, etc).
>>> >> > > > > > > >
>>> >> > > > > > > > Also it does not allow to distinguish between key and
>>> value
>>> >> > > > columns -
>>> >> > > > > > > > currently it is handled by keyFields property of
>>> QueryEntity,
>>> >> > but
>>> >> > > > it
>>> >> > > > > > > > is unclear how to declare key fields via CREATE TABLE.
>>> >> > > > > > > >
>>> >> > > > > > > > So at a first glance it seems like we should either
>>> implement
>>> >> > > some
>>> >> > > > > > > > sort of custom parsing (I believe Sergi will be against
>>> it)
>>> >> or
>>> >> > > > > > > > introduce some kind of name prefix that would tell SQL
>>> engine
>>> >> > > that
>>> >> > > > > > > > certain column is a key field column.
>>> >> > > > > > > >
>>> >> > > > > > > > Of course, this problem disappears is key is of SQL type.
>>> >> > > > > > > >
>>> >> > > > > > > > Regarding CREATE INDEX: probably at first we will have to
>>> >> > > implement
>>> >> > > > > > > > this in "stop-the-world" manner, i.e. all cache will be
>>> >> blocked
>>> >> > > > > during
>>> >> > > > > > > > the index's initial buildup.
>>> >> > > > > > > >
>>> >> > > > > > > > Any thoughts?
>>> >> > > > > > > >
>>> >> > > > > > > > Currently I'm working on parsing of those commands as that
>>> >> will
>>> >> > > be
>>> >> > > > > > > > needed anyway and does not affect further implementation.
>>> >> > > > > > > >
>>> >> > > > > > > > - Alex
>>> >> > > > > > > >
>>> >> > > > > > >
>>> >> > > > > > >
>>> >> > > > > > >
>>> >> > > > > > > --
>>> >> > > > > > > Alexey Kuznetsov
>>> >> > > > > > >
>>> >> > > > > >
>>> >> > > > >
>>> >> > > > >
>>> >> > > > >
>>> >> > > > > --
>>> >> > > > > Sergey Kozlov
>>> >> > > > > GridGain Systems
>>> >> > > > > www.gridgain.com
>>> >> > > > >
>>> >> > > >
>>> >> > >
>>> >> > >
>>> >> > >
>>> >> > > --
>>> >> > > Sergey Kozlov
>>> >> > > GridGain Systems
>>> >> > > www.gridgain.com
>>> >> > >
>>> >> >
>>> >>
>>>

Re: DDL implementation details

Posted by Alexey Goncharuk <al...@gmail.com>.

>
> Right, however, this section of the docs
>
> https://apacheignite.readme.io/docs/jcache#section-dynamic-cache
>
> does not clarify Ignite's behavior regarding dynamic caches when a new
> node joins the cluster. Will it just pick up dynamic cache
> configurations from the peers without being explicitly configured? If
> yes, I suggest that we at least mention it in the docs. (Sorry if I'm
> missing something but I skimmed through the documentation and have not
> found a clear answer to this.) Could someone clarify please?
>

Alex,

Yes, newly joined nodes will fetch all started caches configuration and
create local data structured accordingly. If we use schema=cache approach,
nothing beyond cache creation should be done to create the schema. I've
updated the doc you mentioned.

--AG

Re: DDL implementation details

Posted by Alexander Paschenko <al...@gmail.com>.

Dima,

2017-01-13 21:20 GMT+03:00 Dmitriy Setrakyan <ds...@apache.org>:
> How about dynamic caches? Isn't the metadata already maintained across all
> nodes every time a new cache is created? What am I missing here?

Right, however, this section of the docs

https://apacheignite.readme.io/docs/jcache#section-dynamic-cache

does not clarify Ignite's behavior regarding dynamic caches when a new
node joins the cluster. Will it just pick up dynamic cache
configurations from the peers without being explicitly configured? If
yes, I suggest that we at least mention it in the docs. (Sorry if I'm
missing something but I skimmed through the documentation and have not
found a clear answer to this.) Could someone clarify please?

- Alex

Re: DDL implementation details

Posted by Dmitriy Setrakyan <ds...@apache.org>.

On Fri, Jan 13, 2017 at 12:26 AM, Alexander Paschenko <
alexander.a.paschenko@gmail.com> wrote:

> Dima,
>
> >> But, as you have justly noted, we have to keep nodes joining the
> >> cluster up-to-date about what schemas and tables need to be created in
> >> order for those nodes to participate in distributed queries. And,
> >> correct me if I'm wrong, but this is relevant even outside of context
> >> of 2.0 and page memory and persistent stores, amirite?
> >>
> >
> > This should already be supported in Ignite. Otherwise, how are we able to
> > run queries today?
> >
>
> Currently SQL structures identity is maintained between nodes via
> configuration identity. Nodes participating in distributed queries
> must be started w/identical configs, there's no mechanism to propagate
> SQL related configuration between peer nodes. This works as long as
> SQL structures are created currently just once, at node startup.
> However, if we wish to create those structures dynamically, we have to
> maintain some sort of that metadata delivery to new nodes.
>

How about dynamic caches? Isn't the metadata already maintained across all
nodes every time a new cache is created? What am I missing here?

Re: DDL implementation details

Posted by Alexander Paschenko <al...@gmail.com>.

Dima,

>> But, as you have justly noted, we have to keep nodes joining the
>> cluster up-to-date about what schemas and tables need to be created in
>> order for those nodes to participate in distributed queries. And,
>> correct me if I'm wrong, but this is relevant even outside of context
>> of 2.0 and page memory and persistent stores, amirite?
>>
>
> This should already be supported in Ignite. Otherwise, how are we able to
> run queries today?
>

Currently SQL structures identity is maintained between nodes via
configuration identity. Nodes participating in distributed queries
must be started w/identical configs, there's no mechanism to propagate
SQL related configuration between peer nodes. This works as long as
SQL structures are created currently just once, at node startup.
However, if we wish to create those structures dynamically, we have to
maintain some sort of that metadata delivery to new nodes.

- Alex

>>
>> 2017-01-13 1:47 GMT+08:00 Sergi Vladykin <se...@gmail.com>:
>> > The xml config was only for example. We can put in this configuration
>> > string cache config parameters directly like this:
>> >
>> > CREATE SCHEMA "MyCacheName" WITH
>> > "cacheMode=REPLICATED;atomicityMode=ATOMIC"
>> >
>> > Sergi
>> >
>> > 2017-01-12 20:21 GMT+03:00 Alexander Paschenko <
>> > alexander.a.paschenko@gmail.com>:
>> >
>> >> Sergi, Alexey G.,
>> >>
>> >> I see your point and am rather inclined to agree that we should let
>> >> current notion of "single schema - multiple tables" live.
>> >>
>> >> Still, if we create schema with cache config file, what's the whole
>> >> point of SQL then if the user anyway has to write XML? This probably
>> >> could be useful to propagate configuration to all cluster nodes tho.
>> >>
>> >> And if we skip CREATE TABLE now, it means that we leave user facing
>> >> the need to write XML configuration, no other options. Is this what we
>> >> want?
>> >>
>> >> Still I must admit that leaving user with his familiar XML stuff looks
>> >> attractive - no messing with bunch of unknown new params, just write
>> >> your XML and go. Also it's portable and allows to re-use
>> >> configurations easily, so undoubtedly is a good approach from some
>> >> point.
>> >>
>> >> - Alex
>> >>
>> >> 2017-01-12 23:51 GMT+08:00 Sergi Vladykin <se...@gmail.com>:
>> >> > Hi,
>> >> >
>> >> > 1. For now I'm against inventing any custom SQL syntax and
>> implementing
>> >> > parsing.
>> >> > Currently H2 supports the following syntax:
>> >> >
>> >> > CREATE TABLE test(...) WITH "myCustomParamString"
>> >> >
>> >> > This is enough for us to pass the needed parameters.
>> >> >
>> >> > 2. Agree with AG, we have to separate cache creation from table
>> creation.
>> >> > Cache == SQL schema for us. We just have to add the same WITH syntax
>> in
>> >> H2
>> >> > for schema creation like this:
>> >> >
>> >> > CREATE SCHEMA "MyCacheName" WITH "cacheConfig=myCache.xml"
>> >> >
>> >> > 3. If we want to create tables then I suggest to put this
>> functionality
>> >> to
>> >> > 2.0+PageMemory right away and think where and how we are going to
>> store
>> >> all
>> >> > the related metadata.This is especially important for persistent
>> >> storages.
>> >> >
>> >> > Sergi
>> >> >
>> >> >
>> >> > 2017-01-12 16:56 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:
>> >> >
>> >> >> I am afraid in this case user will have to define too much schemes -
>> >> >> boilerplate.
>> >> >> Does it make sense at all to pack multiple tuples into a single cache
>> >> from
>> >> >> user perspective?
>> >> >>
>> >> >> On Thu, Jan 12, 2017 at 4:40 PM, Alexey Goncharuk <
>> >> >> alexey.goncharuk@gmail.com> wrote:
>> >> >>
>> >> >> > Alexander,
>> >> >> >
>> >> >> > Will we keep the old option to have multiple tables in one cache?
>> If
>> >> so,
>> >> >> > how will create table statement know which cache to choose?
>> >> >> >
>> >> >> > It seems to me that to be consistent with the current DML
>> >> implementation
>> >> >> we
>> >> >> > should have a CREATE SCHEMA statement which will define the cache
>> and
>> >> >> cache
>> >> >> > configuration, and CREATE TABLE should specify the schema name.
>> >> >> >
>> >> >> > Otherwise, we should enforce the single type per cache rule at the
>> >> >> > configuration level and in runtime.
>> >> >> >
>> >> >> > As for affinity and primary key - agree with Vladimir.
>> >> >> >
>> >> >> > --
>> >> >> > AG
>> >> >> >
>> >> >> > 2017-01-12 11:41 GMT+03:00 Sergey Kozlov <sk...@gridgain.com>:
>> >> >> >
>> >> >> > > As first stage of DDL we can implement following CREATE TABLE
>> >> statement
>> >> >> > > support:
>> >> >> > >  - CREATE TABLE without cache properties (use default cache
>> >> properties
>> >> >> or
>> >> >> > > cache properties defined in SQL Schema)
>> >> >> > >  - CREATE TABLE .. LIKE where we can create a cache based on an
>> >> another
>> >> >> > > existing cache.
>> >> >> > >
>> >> >> > > On Thu, Jan 12, 2017 at 5:54 AM, Dmitriy Setrakyan <
>> >> >> > dsetrakyan@apache.org>
>> >> >> > > wrote:
>> >> >> > >
>> >> >> > > > Agree with Sergey. We should be able to specify cache
>> properties
>> >> >> inside
>> >> >> > > of
>> >> >> > > > SQL statements. Does H2 have any support to process SQL hints?
>> >> Can we
>> >> >> > > > change it?
>> >> >> > > >
>> >> >> > > > Having said that, while we finalize the above, I think we
>> should
>> >> >> start
>> >> >> > > > working on DDL implementation to use the default settings, as
>> >> >> specified
>> >> >> > > in
>> >> >> > > > Alexander's email.
>> >> >> > > >
>> >> >> > > > Also agree with the stop-the-world on the cache for index
>> >> creation.
>> >> >> We
>> >> >> > > can
>> >> >> > > > always improve on it in future.
>> >> >> > > >
>> >> >> > > > D.
>> >> >> > > >
>> >> >> > > > On Wed, Jan 11, 2017 at 11:28 AM, Sergey Kozlov <
>> >> >> skozlov@gridgain.com>
>> >> >> > > > wrote:
>> >> >> > > >
>> >> >> > > > > Hi
>> >> >> > > > >
>> >> >> > > > > I suppose we should put any ignite cache properties as
>> >> additional
>> >> >> > > > > non-standard attributes after CREATE TABLE () clause as it
>> does
>> >> >> > > > Postgress,
>> >> >> > > > > MySQL and other RDBMS.
>> >> >> > > > > Take a look on CREATE TABLE with using TABLESPACE (Postgess)
>> or
>> >> for
>> >> >> > > > CREATE
>> >> >> > > > > TABLE with using PARTITIONS (MySQL).
>> >> >> > > > >
>> >> >> > > > >
>> >> >> > > > >
>> >> >> > > > >
>> >> >> > > > >
>> >> >> > > > > On Wed, Jan 11, 2017 at 10:05 PM, Vladimir Ozerov <
>> >> >> > > vozerov@gridgain.com>
>> >> >> > > > > wrote:
>> >> >> > > > >
>> >> >> > > > > > I believe custom synthax and parsing is a *must* for us, as
>> >> well
>> >> >> as
>> >> >> > > for
>> >> >> > > > > any
>> >> >> > > > > > distributed database. At the very least we need to specify
>> >> >> affinity
>> >> >> > > key
>> >> >> > > > > > column somehow. Any cache property can be specified at the
>> >> very
>> >> >> end
>> >> >> > > of
>> >> >> > > > > > table definition. Key columns can be determined as the ones
>> >> with
>> >> >> > > > PRIMARY
>> >> >> > > > > > KEY constraint (Alex K. idea) + affinity column(s):
>> >> >> > > > > >
>> >> >> > > > > > CREATE TABLE employee (
>> >> >> > > > > >     id BIGINT PRIMARY KEY,
>> >> >> > > > > >     dept_id BIGINT AFFINITY KEY,
>> >> >> > > > > >     name VARCHAR(128),
>> >> >> > > > > >     address VARCHAR(256)
>> >> >> > > > > >     BACKUPS 2,
>> >> >> > > > > >     ATOMICITY_MODE ATOMIC,
>> >> >> > > > > > );
>> >> >> > > > > >
>> >> >> > > > > > "id" and "dept_id" form key type, "name" and "address" form
>> >> value
>> >> >> > > type.
>> >> >> > > > > >
>> >> >> > > > > > Vladimir.
>> >> >> > > > > >
>> >> >> > > > > > On Wed, Jan 11, 2017 at 9:08 PM, Alexey Kuznetsov <
>> >> >> > > > akuznetsov@apache.org
>> >> >> > > > > >
>> >> >> > > > > > wrote:
>> >> >> > > > > >
>> >> >> > > > > > > Hi, Alex!
>> >> >> > > > > > >
>> >> >> > > > > > > As far as I know most RDBMS allow something like: create
>> >> table
>> >> >> t1
>> >> >> > > (id
>> >> >> > > > > > > integer primary key, ....)
>> >> >> > > > > > > How about to take as key field that marked as "primary
>> key"?
>> >> >> > > > > > >
>> >> >> > > > > > > As of atomicity and replication - I think it is a cache
>> >> >> > properties
>> >> >> > > > and
>> >> >> > > > > > with
>> >> >> > > > > > > create table we will create "types" in cache. No?
>> >> >> > > > > > > I thought that cache it is a kind of "schema" in RDBMS.
>> >> >> > > > > > >
>> >> >> > > > > > > Could you describe what will be created with CREATE
>> TABLE?
>> >> >> > > > > > >
>> >> >> > > > > > > On Thu, Jan 12, 2017 at 12:54 AM, Alexander Paschenko <
>> >> >> > > > > > > alexander.a.paschenko@gmail.com> wrote:
>> >> >> > > > > > >
>> >> >> > > > > > > > Hello Igniters,
>> >> >> > > > > > > >
>> >> >> > > > > > > > I would like to start discussion about implementation
>> of
>> >> SQL
>> >> >> > DDL
>> >> >> > > > > > > commands.
>> >> >> > > > > > > >
>> >> >> > > > > > > > At the first stage, the most important ones seem to be
>> >> CREATE
>> >> >> > > TABLE
>> >> >> > > > > > > > (that will obviously correspond to creation of a cache)
>> >> and
>> >> >> > > CREATE
>> >> >> > > > > > > > INDEX.
>> >> >> > > > > > > >
>> >> >> > > > > > > > Regarding first one: SQL command for CREATE TABLE does
>> not
>> >> >> > > contain
>> >> >> > > > > any
>> >> >> > > > > > > > hints about cache settings (atomicity, replication,
>> >> etc.), so
>> >> >> > > these
>> >> >> > > > > > > > will probably be defined by some configuration
>> properties
>> >> >> (like
>> >> >> > > > > > > > ignite.ddl.default_cache_atomiticity, etc).
>> >> >> > > > > > > >
>> >> >> > > > > > > > Also it does not allow to distinguish between key and
>> >> value
>> >> >> > > > columns -
>> >> >> > > > > > > > currently it is handled by keyFields property of
>> >> QueryEntity,
>> >> >> > but
>> >> >> > > > it
>> >> >> > > > > > > > is unclear how to declare key fields via CREATE TABLE.
>> >> >> > > > > > > >
>> >> >> > > > > > > > So at a first glance it seems like we should either
>> >> implement
>> >> >> > > some
>> >> >> > > > > > > > sort of custom parsing (I believe Sergi will be against
>> >> it)
>> >> >> or
>> >> >> > > > > > > > introduce some kind of name prefix that would tell SQL
>> >> engine
>> >> >> > > that
>> >> >> > > > > > > > certain column is a key field column.
>> >> >> > > > > > > >
>> >> >> > > > > > > > Of course, this problem disappears is key is of SQL
>> type.
>> >> >> > > > > > > >
>> >> >> > > > > > > > Regarding CREATE INDEX: probably at first we will have
>> to
>> >> >> > > implement
>> >> >> > > > > > > > this in "stop-the-world" manner, i.e. all cache will be
>> >> >> blocked
>> >> >> > > > > during
>> >> >> > > > > > > > the index's initial buildup.
>> >> >> > > > > > > >
>> >> >> > > > > > > > Any thoughts?
>> >> >> > > > > > > >
>> >> >> > > > > > > > Currently I'm working on parsing of those commands as
>> that
>> >> >> will
>> >> >> > > be
>> >> >> > > > > > > > needed anyway and does not affect further
>> implementation.
>> >> >> > > > > > > >
>> >> >> > > > > > > > - Alex
>> >> >> > > > > > > >
>> >> >> > > > > > >
>> >> >> > > > > > >
>> >> >> > > > > > >
>> >> >> > > > > > > --
>> >> >> > > > > > > Alexey Kuznetsov
>> >> >> > > > > > >
>> >> >> > > > > >
>> >> >> > > > >
>> >> >> > > > >
>> >> >> > > > >
>> >> >> > > > > --
>> >> >> > > > > Sergey Kozlov
>> >> >> > > > > GridGain Systems
>> >> >> > > > > www.gridgain.com
>> >> >> > > > >
>> >> >> > > >
>> >> >> > >
>> >> >> > >
>> >> >> > >
>> >> >> > > --
>> >> >> > > Sergey Kozlov
>> >> >> > > GridGain Systems
>> >> >> > > www.gridgain.com
>> >> >> > >
>> >> >> >
>> >> >>
>> >>
>>

Re: DDL implementation details

Posted by Dmitriy Setrakyan <ds...@apache.org>.

On Thu, Jan 12, 2017 at 10:16 AM, Alexander Paschenko <
alexander.a.paschenko@gmail.com> wrote:

> Sergi,
>
> OK, great. Still, what's up with CREATE TABLE? After a bit of code
> digging, I currently don't see major obstacles against registering
> query entities (i.e. type descriptors) on the fly - CREATE TABLE will
> essentially boil down to *GridQueryIndexing#registerType* call.
>

I think this makes sense.


>
> But, as you have justly noted, we have to keep nodes joining the
> cluster up-to-date about what schemas and tables need to be created in
> order for those nodes to participate in distributed queries. And,
> correct me if I'm wrong, but this is relevant even outside of context
> of 2.0 and page memory and persistent stores, amirite?
>

This should already be supported in Ignite. Otherwise, how are we able to
run queries today?


>
> - Alex
>
> 2017-01-13 1:47 GMT+08:00 Sergi Vladykin <se...@gmail.com>:
> > The xml config was only for example. We can put in this configuration
> > string cache config parameters directly like this:
> >
> > CREATE SCHEMA "MyCacheName" WITH
> > "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> >
> > Sergi
> >
> > 2017-01-12 20:21 GMT+03:00 Alexander Paschenko <
> > alexander.a.paschenko@gmail.com>:
> >
> >> Sergi, Alexey G.,
> >>
> >> I see your point and am rather inclined to agree that we should let
> >> current notion of "single schema - multiple tables" live.
> >>
> >> Still, if we create schema with cache config file, what's the whole
> >> point of SQL then if the user anyway has to write XML? This probably
> >> could be useful to propagate configuration to all cluster nodes tho.
> >>
> >> And if we skip CREATE TABLE now, it means that we leave user facing
> >> the need to write XML configuration, no other options. Is this what we
> >> want?
> >>
> >> Still I must admit that leaving user with his familiar XML stuff looks
> >> attractive - no messing with bunch of unknown new params, just write
> >> your XML and go. Also it's portable and allows to re-use
> >> configurations easily, so undoubtedly is a good approach from some
> >> point.
> >>
> >> - Alex
> >>
> >> 2017-01-12 23:51 GMT+08:00 Sergi Vladykin <se...@gmail.com>:
> >> > Hi,
> >> >
> >> > 1. For now I'm against inventing any custom SQL syntax and
> implementing
> >> > parsing.
> >> > Currently H2 supports the following syntax:
> >> >
> >> > CREATE TABLE test(...) WITH "myCustomParamString"
> >> >
> >> > This is enough for us to pass the needed parameters.
> >> >
> >> > 2. Agree with AG, we have to separate cache creation from table
> creation.
> >> > Cache == SQL schema for us. We just have to add the same WITH syntax
> in
> >> H2
> >> > for schema creation like this:
> >> >
> >> > CREATE SCHEMA "MyCacheName" WITH "cacheConfig=myCache.xml"
> >> >
> >> > 3. If we want to create tables then I suggest to put this
> functionality
> >> to
> >> > 2.0+PageMemory right away and think where and how we are going to
> store
> >> all
> >> > the related metadata.This is especially important for persistent
> >> storages.
> >> >
> >> > Sergi
> >> >
> >> >
> >> > 2017-01-12 16:56 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:
> >> >
> >> >> I am afraid in this case user will have to define too much schemes -
> >> >> boilerplate.
> >> >> Does it make sense at all to pack multiple tuples into a single cache
> >> from
> >> >> user perspective?
> >> >>
> >> >> On Thu, Jan 12, 2017 at 4:40 PM, Alexey Goncharuk <
> >> >> alexey.goncharuk@gmail.com> wrote:
> >> >>
> >> >> > Alexander,
> >> >> >
> >> >> > Will we keep the old option to have multiple tables in one cache?
> If
> >> so,
> >> >> > how will create table statement know which cache to choose?
> >> >> >
> >> >> > It seems to me that to be consistent with the current DML
> >> implementation
> >> >> we
> >> >> > should have a CREATE SCHEMA statement which will define the cache
> and
> >> >> cache
> >> >> > configuration, and CREATE TABLE should specify the schema name.
> >> >> >
> >> >> > Otherwise, we should enforce the single type per cache rule at the
> >> >> > configuration level and in runtime.
> >> >> >
> >> >> > As for affinity and primary key - agree with Vladimir.
> >> >> >
> >> >> > --
> >> >> > AG
> >> >> >
> >> >> > 2017-01-12 11:41 GMT+03:00 Sergey Kozlov <sk...@gridgain.com>:
> >> >> >
> >> >> > > As first stage of DDL we can implement following CREATE TABLE
> >> statement
> >> >> > > support:
> >> >> > >  - CREATE TABLE without cache properties (use default cache
> >> properties
> >> >> or
> >> >> > > cache properties defined in SQL Schema)
> >> >> > >  - CREATE TABLE .. LIKE where we can create a cache based on an
> >> another
> >> >> > > existing cache.
> >> >> > >
> >> >> > > On Thu, Jan 12, 2017 at 5:54 AM, Dmitriy Setrakyan <
> >> >> > dsetrakyan@apache.org>
> >> >> > > wrote:
> >> >> > >
> >> >> > > > Agree with Sergey. We should be able to specify cache
> properties
> >> >> inside
> >> >> > > of
> >> >> > > > SQL statements. Does H2 have any support to process SQL hints?
> >> Can we
> >> >> > > > change it?
> >> >> > > >
> >> >> > > > Having said that, while we finalize the above, I think we
> should
> >> >> start
> >> >> > > > working on DDL implementation to use the default settings, as
> >> >> specified
> >> >> > > in
> >> >> > > > Alexander's email.
> >> >> > > >
> >> >> > > > Also agree with the stop-the-world on the cache for index
> >> creation.
> >> >> We
> >> >> > > can
> >> >> > > > always improve on it in future.
> >> >> > > >
> >> >> > > > D.
> >> >> > > >
> >> >> > > > On Wed, Jan 11, 2017 at 11:28 AM, Sergey Kozlov <
> >> >> skozlov@gridgain.com>
> >> >> > > > wrote:
> >> >> > > >
> >> >> > > > > Hi
> >> >> > > > >
> >> >> > > > > I suppose we should put any ignite cache properties as
> >> additional
> >> >> > > > > non-standard attributes after CREATE TABLE () clause as it
> does
> >> >> > > > Postgress,
> >> >> > > > > MySQL and other RDBMS.
> >> >> > > > > Take a look on CREATE TABLE with using TABLESPACE (Postgess)
> or
> >> for
> >> >> > > > CREATE
> >> >> > > > > TABLE with using PARTITIONS (MySQL).
> >> >> > > > >
> >> >> > > > >
> >> >> > > > >
> >> >> > > > >
> >> >> > > > >
> >> >> > > > > On Wed, Jan 11, 2017 at 10:05 PM, Vladimir Ozerov <
> >> >> > > vozerov@gridgain.com>
> >> >> > > > > wrote:
> >> >> > > > >
> >> >> > > > > > I believe custom synthax and parsing is a *must* for us, as
> >> well
> >> >> as
> >> >> > > for
> >> >> > > > > any
> >> >> > > > > > distributed database. At the very least we need to specify
> >> >> affinity
> >> >> > > key
> >> >> > > > > > column somehow. Any cache property can be specified at the
> >> very
> >> >> end
> >> >> > > of
> >> >> > > > > > table definition. Key columns can be determined as the ones
> >> with
> >> >> > > > PRIMARY
> >> >> > > > > > KEY constraint (Alex K. idea) + affinity column(s):
> >> >> > > > > >
> >> >> > > > > > CREATE TABLE employee (
> >> >> > > > > >     id BIGINT PRIMARY KEY,
> >> >> > > > > >     dept_id BIGINT AFFINITY KEY,
> >> >> > > > > >     name VARCHAR(128),
> >> >> > > > > >     address VARCHAR(256)
> >> >> > > > > >     BACKUPS 2,
> >> >> > > > > >     ATOMICITY_MODE ATOMIC,
> >> >> > > > > > );
> >> >> > > > > >
> >> >> > > > > > "id" and "dept_id" form key type, "name" and "address" form
> >> value
> >> >> > > type.
> >> >> > > > > >
> >> >> > > > > > Vladimir.
> >> >> > > > > >
> >> >> > > > > > On Wed, Jan 11, 2017 at 9:08 PM, Alexey Kuznetsov <
> >> >> > > > akuznetsov@apache.org
> >> >> > > > > >
> >> >> > > > > > wrote:
> >> >> > > > > >
> >> >> > > > > > > Hi, Alex!
> >> >> > > > > > >
> >> >> > > > > > > As far as I know most RDBMS allow something like: create
> >> table
> >> >> t1
> >> >> > > (id
> >> >> > > > > > > integer primary key, ....)
> >> >> > > > > > > How about to take as key field that marked as "primary
> key"?
> >> >> > > > > > >
> >> >> > > > > > > As of atomicity and replication - I think it is a cache
> >> >> > properties
> >> >> > > > and
> >> >> > > > > > with
> >> >> > > > > > > create table we will create "types" in cache. No?
> >> >> > > > > > > I thought that cache it is a kind of "schema" in RDBMS.
> >> >> > > > > > >
> >> >> > > > > > > Could you describe what will be created with CREATE
> TABLE?
> >> >> > > > > > >
> >> >> > > > > > > On Thu, Jan 12, 2017 at 12:54 AM, Alexander Paschenko <
> >> >> > > > > > > alexander.a.paschenko@gmail.com> wrote:
> >> >> > > > > > >
> >> >> > > > > > > > Hello Igniters,
> >> >> > > > > > > >
> >> >> > > > > > > > I would like to start discussion about implementation
> of
> >> SQL
> >> >> > DDL
> >> >> > > > > > > commands.
> >> >> > > > > > > >
> >> >> > > > > > > > At the first stage, the most important ones seem to be
> >> CREATE
> >> >> > > TABLE
> >> >> > > > > > > > (that will obviously correspond to creation of a cache)
> >> and
> >> >> > > CREATE
> >> >> > > > > > > > INDEX.
> >> >> > > > > > > >
> >> >> > > > > > > > Regarding first one: SQL command for CREATE TABLE does
> not
> >> >> > > contain
> >> >> > > > > any
> >> >> > > > > > > > hints about cache settings (atomicity, replication,
> >> etc.), so
> >> >> > > these
> >> >> > > > > > > > will probably be defined by some configuration
> properties
> >> >> (like
> >> >> > > > > > > > ignite.ddl.default_cache_atomiticity, etc).
> >> >> > > > > > > >
> >> >> > > > > > > > Also it does not allow to distinguish between key and
> >> value
> >> >> > > > columns -
> >> >> > > > > > > > currently it is handled by keyFields property of
> >> QueryEntity,
> >> >> > but
> >> >> > > > it
> >> >> > > > > > > > is unclear how to declare key fields via CREATE TABLE.
> >> >> > > > > > > >
> >> >> > > > > > > > So at a first glance it seems like we should either
> >> implement
> >> >> > > some
> >> >> > > > > > > > sort of custom parsing (I believe Sergi will be against
> >> it)
> >> >> or
> >> >> > > > > > > > introduce some kind of name prefix that would tell SQL
> >> engine
> >> >> > > that
> >> >> > > > > > > > certain column is a key field column.
> >> >> > > > > > > >
> >> >> > > > > > > > Of course, this problem disappears is key is of SQL
> type.
> >> >> > > > > > > >
> >> >> > > > > > > > Regarding CREATE INDEX: probably at first we will have
> to
> >> >> > > implement
> >> >> > > > > > > > this in "stop-the-world" manner, i.e. all cache will be
> >> >> blocked
> >> >> > > > > during
> >> >> > > > > > > > the index's initial buildup.
> >> >> > > > > > > >
> >> >> > > > > > > > Any thoughts?
> >> >> > > > > > > >
> >> >> > > > > > > > Currently I'm working on parsing of those commands as
> that
> >> >> will
> >> >> > > be
> >> >> > > > > > > > needed anyway and does not affect further
> implementation.
> >> >> > > > > > > >
> >> >> > > > > > > > - Alex
> >> >> > > > > > > >
> >> >> > > > > > >
> >> >> > > > > > >
> >> >> > > > > > >
> >> >> > > > > > > --
> >> >> > > > > > > Alexey Kuznetsov
> >> >> > > > > > >
> >> >> > > > > >
> >> >> > > > >
> >> >> > > > >
> >> >> > > > >
> >> >> > > > > --
> >> >> > > > > Sergey Kozlov
> >> >> > > > > GridGain Systems
> >> >> > > > > www.gridgain.com
> >> >> > > > >
> >> >> > > >
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > > --
> >> >> > > Sergey Kozlov
> >> >> > > GridGain Systems
> >> >> > > www.gridgain.com
> >> >> > >
> >> >> >
> >> >>
> >>
>

Re: DDL implementation details

Posted by Alexander Paschenko <al...@gmail.com>.

Sergi,

OK, great. Still, what's up with CREATE TABLE? After a bit of code
digging, I currently don't see major obstacles against registering
query entities (i.e. type descriptors) on the fly - CREATE TABLE will
essentially boil down to *GridQueryIndexing#registerType* call.

But, as you have justly noted, we have to keep nodes joining the
cluster up-to-date about what schemas and tables need to be created in
order for those nodes to participate in distributed queries. And,
correct me if I'm wrong, but this is relevant even outside of context
of 2.0 and page memory and persistent stores, amirite?

- Alex

2017-01-13 1:47 GMT+08:00 Sergi Vladykin <se...@gmail.com>:
> The xml config was only for example. We can put in this configuration
> string cache config parameters directly like this:
>
> CREATE SCHEMA "MyCacheName" WITH
> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
>
> Sergi
>
> 2017-01-12 20:21 GMT+03:00 Alexander Paschenko <
> alexander.a.paschenko@gmail.com>:
>
>> Sergi, Alexey G.,
>>
>> I see your point and am rather inclined to agree that we should let
>> current notion of "single schema - multiple tables" live.
>>
>> Still, if we create schema with cache config file, what's the whole
>> point of SQL then if the user anyway has to write XML? This probably
>> could be useful to propagate configuration to all cluster nodes tho.
>>
>> And if we skip CREATE TABLE now, it means that we leave user facing
>> the need to write XML configuration, no other options. Is this what we
>> want?
>>
>> Still I must admit that leaving user with his familiar XML stuff looks
>> attractive - no messing with bunch of unknown new params, just write
>> your XML and go. Also it's portable and allows to re-use
>> configurations easily, so undoubtedly is a good approach from some
>> point.
>>
>> - Alex
>>
>> 2017-01-12 23:51 GMT+08:00 Sergi Vladykin <se...@gmail.com>:
>> > Hi,
>> >
>> > 1. For now I'm against inventing any custom SQL syntax and implementing
>> > parsing.
>> > Currently H2 supports the following syntax:
>> >
>> > CREATE TABLE test(...) WITH "myCustomParamString"
>> >
>> > This is enough for us to pass the needed parameters.
>> >
>> > 2. Agree with AG, we have to separate cache creation from table creation.
>> > Cache == SQL schema for us. We just have to add the same WITH syntax in
>> H2
>> > for schema creation like this:
>> >
>> > CREATE SCHEMA "MyCacheName" WITH "cacheConfig=myCache.xml"
>> >
>> > 3. If we want to create tables then I suggest to put this functionality
>> to
>> > 2.0+PageMemory right away and think where and how we are going to store
>> all
>> > the related metadata.This is especially important for persistent
>> storages.
>> >
>> > Sergi
>> >
>> >
>> > 2017-01-12 16:56 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:
>> >
>> >> I am afraid in this case user will have to define too much schemes -
>> >> boilerplate.
>> >> Does it make sense at all to pack multiple tuples into a single cache
>> from
>> >> user perspective?
>> >>
>> >> On Thu, Jan 12, 2017 at 4:40 PM, Alexey Goncharuk <
>> >> alexey.goncharuk@gmail.com> wrote:
>> >>
>> >> > Alexander,
>> >> >
>> >> > Will we keep the old option to have multiple tables in one cache? If
>> so,
>> >> > how will create table statement know which cache to choose?
>> >> >
>> >> > It seems to me that to be consistent with the current DML
>> implementation
>> >> we
>> >> > should have a CREATE SCHEMA statement which will define the cache and
>> >> cache
>> >> > configuration, and CREATE TABLE should specify the schema name.
>> >> >
>> >> > Otherwise, we should enforce the single type per cache rule at the
>> >> > configuration level and in runtime.
>> >> >
>> >> > As for affinity and primary key - agree with Vladimir.
>> >> >
>> >> > --
>> >> > AG
>> >> >
>> >> > 2017-01-12 11:41 GMT+03:00 Sergey Kozlov <sk...@gridgain.com>:
>> >> >
>> >> > > As first stage of DDL we can implement following CREATE TABLE
>> statement
>> >> > > support:
>> >> > >  - CREATE TABLE without cache properties (use default cache
>> properties
>> >> or
>> >> > > cache properties defined in SQL Schema)
>> >> > >  - CREATE TABLE .. LIKE where we can create a cache based on an
>> another
>> >> > > existing cache.
>> >> > >
>> >> > > On Thu, Jan 12, 2017 at 5:54 AM, Dmitriy Setrakyan <
>> >> > dsetrakyan@apache.org>
>> >> > > wrote:
>> >> > >
>> >> > > > Agree with Sergey. We should be able to specify cache properties
>> >> inside
>> >> > > of
>> >> > > > SQL statements. Does H2 have any support to process SQL hints?
>> Can we
>> >> > > > change it?
>> >> > > >
>> >> > > > Having said that, while we finalize the above, I think we should
>> >> start
>> >> > > > working on DDL implementation to use the default settings, as
>> >> specified
>> >> > > in
>> >> > > > Alexander's email.
>> >> > > >
>> >> > > > Also agree with the stop-the-world on the cache for index
>> creation.
>> >> We
>> >> > > can
>> >> > > > always improve on it in future.
>> >> > > >
>> >> > > > D.
>> >> > > >
>> >> > > > On Wed, Jan 11, 2017 at 11:28 AM, Sergey Kozlov <
>> >> skozlov@gridgain.com>
>> >> > > > wrote:
>> >> > > >
>> >> > > > > Hi
>> >> > > > >
>> >> > > > > I suppose we should put any ignite cache properties as
>> additional
>> >> > > > > non-standard attributes after CREATE TABLE () clause as it does
>> >> > > > Postgress,
>> >> > > > > MySQL and other RDBMS.
>> >> > > > > Take a look on CREATE TABLE with using TABLESPACE (Postgess) or
>> for
>> >> > > > CREATE
>> >> > > > > TABLE with using PARTITIONS (MySQL).
>> >> > > > >
>> >> > > > >
>> >> > > > >
>> >> > > > >
>> >> > > > >
>> >> > > > > On Wed, Jan 11, 2017 at 10:05 PM, Vladimir Ozerov <
>> >> > > vozerov@gridgain.com>
>> >> > > > > wrote:
>> >> > > > >
>> >> > > > > > I believe custom synthax and parsing is a *must* for us, as
>> well
>> >> as
>> >> > > for
>> >> > > > > any
>> >> > > > > > distributed database. At the very least we need to specify
>> >> affinity
>> >> > > key
>> >> > > > > > column somehow. Any cache property can be specified at the
>> very
>> >> end
>> >> > > of
>> >> > > > > > table definition. Key columns can be determined as the ones
>> with
>> >> > > > PRIMARY
>> >> > > > > > KEY constraint (Alex K. idea) + affinity column(s):
>> >> > > > > >
>> >> > > > > > CREATE TABLE employee (
>> >> > > > > >     id BIGINT PRIMARY KEY,
>> >> > > > > >     dept_id BIGINT AFFINITY KEY,
>> >> > > > > >     name VARCHAR(128),
>> >> > > > > >     address VARCHAR(256)
>> >> > > > > >     BACKUPS 2,
>> >> > > > > >     ATOMICITY_MODE ATOMIC,
>> >> > > > > > );
>> >> > > > > >
>> >> > > > > > "id" and "dept_id" form key type, "name" and "address" form
>> value
>> >> > > type.
>> >> > > > > >
>> >> > > > > > Vladimir.
>> >> > > > > >
>> >> > > > > > On Wed, Jan 11, 2017 at 9:08 PM, Alexey Kuznetsov <
>> >> > > > akuznetsov@apache.org
>> >> > > > > >
>> >> > > > > > wrote:
>> >> > > > > >
>> >> > > > > > > Hi, Alex!
>> >> > > > > > >
>> >> > > > > > > As far as I know most RDBMS allow something like: create
>> table
>> >> t1
>> >> > > (id
>> >> > > > > > > integer primary key, ....)
>> >> > > > > > > How about to take as key field that marked as "primary key"?
>> >> > > > > > >
>> >> > > > > > > As of atomicity and replication - I think it is a cache
>> >> > properties
>> >> > > > and
>> >> > > > > > with
>> >> > > > > > > create table we will create "types" in cache. No?
>> >> > > > > > > I thought that cache it is a kind of "schema" in RDBMS.
>> >> > > > > > >
>> >> > > > > > > Could you describe what will be created with CREATE TABLE?
>> >> > > > > > >
>> >> > > > > > > On Thu, Jan 12, 2017 at 12:54 AM, Alexander Paschenko <
>> >> > > > > > > alexander.a.paschenko@gmail.com> wrote:
>> >> > > > > > >
>> >> > > > > > > > Hello Igniters,
>> >> > > > > > > >
>> >> > > > > > > > I would like to start discussion about implementation of
>> SQL
>> >> > DDL
>> >> > > > > > > commands.
>> >> > > > > > > >
>> >> > > > > > > > At the first stage, the most important ones seem to be
>> CREATE
>> >> > > TABLE
>> >> > > > > > > > (that will obviously correspond to creation of a cache)
>> and
>> >> > > CREATE
>> >> > > > > > > > INDEX.
>> >> > > > > > > >
>> >> > > > > > > > Regarding first one: SQL command for CREATE TABLE does not
>> >> > > contain
>> >> > > > > any
>> >> > > > > > > > hints about cache settings (atomicity, replication,
>> etc.), so
>> >> > > these
>> >> > > > > > > > will probably be defined by some configuration properties
>> >> (like
>> >> > > > > > > > ignite.ddl.default_cache_atomiticity, etc).
>> >> > > > > > > >
>> >> > > > > > > > Also it does not allow to distinguish between key and
>> value
>> >> > > > columns -
>> >> > > > > > > > currently it is handled by keyFields property of
>> QueryEntity,
>> >> > but
>> >> > > > it
>> >> > > > > > > > is unclear how to declare key fields via CREATE TABLE.
>> >> > > > > > > >
>> >> > > > > > > > So at a first glance it seems like we should either
>> implement
>> >> > > some
>> >> > > > > > > > sort of custom parsing (I believe Sergi will be against
>> it)
>> >> or
>> >> > > > > > > > introduce some kind of name prefix that would tell SQL
>> engine
>> >> > > that
>> >> > > > > > > > certain column is a key field column.
>> >> > > > > > > >
>> >> > > > > > > > Of course, this problem disappears is key is of SQL type.
>> >> > > > > > > >
>> >> > > > > > > > Regarding CREATE INDEX: probably at first we will have to
>> >> > > implement
>> >> > > > > > > > this in "stop-the-world" manner, i.e. all cache will be
>> >> blocked
>> >> > > > > during
>> >> > > > > > > > the index's initial buildup.
>> >> > > > > > > >
>> >> > > > > > > > Any thoughts?
>> >> > > > > > > >
>> >> > > > > > > > Currently I'm working on parsing of those commands as that
>> >> will
>> >> > > be
>> >> > > > > > > > needed anyway and does not affect further implementation.
>> >> > > > > > > >
>> >> > > > > > > > - Alex
>> >> > > > > > > >
>> >> > > > > > >
>> >> > > > > > >
>> >> > > > > > >
>> >> > > > > > > --
>> >> > > > > > > Alexey Kuznetsov
>> >> > > > > > >
>> >> > > > > >
>> >> > > > >
>> >> > > > >
>> >> > > > >
>> >> > > > > --
>> >> > > > > Sergey Kozlov
>> >> > > > > GridGain Systems
>> >> > > > > www.gridgain.com
>> >> > > > >
>> >> > > >
>> >> > >
>> >> > >
>> >> > >
>> >> > > --
>> >> > > Sergey Kozlov
>> >> > > GridGain Systems
>> >> > > www.gridgain.com
>> >> > >
>> >> >
>> >>
>>

Re: DDL implementation details

Posted by Vladimir Ozerov <vo...@gridgain.com>.

Denis,

>>> Inventing custom DDL statements is also a dead end.
Questionable. Every database has custom SQL, including DDL, are they all
"dead-ended"? Each database has unique set of feature and unique
architecture. The very goal of custom SQL is to give user control over
these differences. Without it users might easily get stuck with some
architectural limitation or performance issue, which cannot be resolved
with ANSI SQL. For this reason our very goal is not to make migration
strictly transparent - "without a single SQL line change!". It is
impossible (recall widespread enterprise myth that major Hibernate
advantage is ability to "switch" between databases transparently -
bullshit). Our goal is to make migration *smooth *on the one hand, without
loosing Ignite advantages on the other.

>>> Most of us suggest the users never store multiple types (queries
entities) per cache consider this as an anti-pattern.
Personally, I am absolutely comfortable with having multiple entities per
cache. As Sergi mentioned before, it worked fine for years. I am against
*forcing* users to put everything into single cache at the risk of total
SQL rewrite otherwise.

Having said that I do not support idea of having cache-per-table rule
either. This way we are moving from one extreme to another. Both would add
limitations to the product. For instance, as Alex mentioned, empty cache
have serious memory overhead at the moment, and it is not resolved still.
Another problem is memory pools which we expect in AI 2.0 - it will be
difficult to manage memory if you always create separate caches.

I would design it as follows:
1) SCHEMA - logical entity. This is metadata stored somewhere (e.g. in
INFORMATIONAL_SCHEMA cache) which brings database objects together (tables,
views, stored procedures, sequences, security rights, etc.);
2) CREATE CACHE - creates cache;
3) Define several default caches for the most common cases. I see two for
now - PARTITIONED for normal data, REPLICATED for reference data. This way
user will not bother with CREATE CACHE in usual scenarios.

Thoughts?

Vladimir.


On Fri, Jan 27, 2017 at 5:51 AM, Denis Magda <dm...@apache.org> wrote:

> Agree that schema-per-cache approach is natural for Ignite but will be a
> nightmare for those who migrate from a RDBMS. Inventing  custom DDL
> statements is also a dead end. This won’t make users life and transitions
> from a RDBMS easier.
>
> Let’s to look at this differently. Most of us suggest the users never
> store multiple types (queries entities) per cache consider this as an
> anti-pattern. I do remember that we even brought up an idea to apply
> query-entity-per-cache rule.
>
> Conjuring that Ignite follows query-entity-per-cache rule, DDL statements
> might have the following effect:
>
> CREATE TABLE blablabla {…} - creates both cache and query entity with name
> blablabla.
>
> In SQL queries we don’t need to specify a cache name as a schema name at
> all even during joins thanks to cache-per-query-entity rule. We can figure
> this out automatically.
>
> Executing `CREATE SCHEME myScheme` and using it later in DDL statements
> (USING SCHEME myScheme) we will add the scheme name as an attribute to
> respective query entities or caches. When the scheme name is used in a SQL
> query the engine will look for a valid query entity/cache that has the
> attribute set to this name.
>
> I do realize that this will break the compatibility but we’re approaching
> 2.0. Good chance to make not only DDL but SQL Grid more user friendly.
>
> —
> Denis
>
> > On Jan 17, 2017, at 7:29 PM, Dmitriy Setrakyan <ds...@apache.org>
> wrote:
> >
> > After looking at all the restrictions of "cache==schema" approach, I
> still
> > think that it would be the most natural match for Ignite. All other
> > suggestions are either too limiting, too hard, or not applicable for
> > Ignite.
> >
> > My vote would be to start with "cache==schema" and get some user
> feedback.
> > We can always add "tablespaces" in future, if our users demand it.
> >
> > D.
> >
> > On Tue, Jan 17, 2017 at 1:29 AM, Sergi Vladykin <
> sergi.vladykin@gmail.com>
> > wrote:
> >
> >> Sergey,
> >>
> >> See inline:
> >>
> >> 2017-01-17 0:50 GMT+03:00 Sergey Kozlov <sk...@gridgain.com>:
> >>
> >>> For the approach schema==cache it is not clear how it works for some
> >> cases
> >>> (especialy if SQL and regular cache operations are mixed):
> >>>
> >>> 1. I want to store two tables in same cache (same schema) and both
> tables
> >>> have identical structure like id Integer, name String.
> >>> How they will be processed for regular cache operations
> >>> put/putall/get/getall etc where we get cache instance by its name (on
> SQL
> >>> level we use the table name) ?
> >>>
> >>
> >> This already works. We can have multiple QueryEntities configured for a
> >> single cache. The only restriction here is that cache key must be unique
> >> across all the tables inside the same schema.
> >>
> >>
> >>>
> >>> 2. What's about ALTER TABLE statement? Will the changes for one table
> >> lock
> >>> all tables in the cache?
> >>>
> >>
> >> It will not be needed, any table structure modification will be
> performed
> >> in the table scope.
> >>
> >>
> >>> 3. In the future if we will introduce table-based features like SQL
> roles
> >>> (as next step of SQL implementation) "many tables one cache" rule looks
> >>> like more difficult for its implementation.
> >>>
> >>
> >> I don't think so, must be the same thing.
> >>
> >>
> >>>
> >>> p.s. may be we need to think about new entity called "table" (based on
> >> type
> >>> descriptor) inside cache to better support of SQL DDL
> >>>
> >>
> >> We already have it: QueryEntity. It has exactly that semantics.
> >>
> >> Sergi
> >>
> >>
> >>>
> >>>
> >>> On Mon, Jan 16, 2017 at 11:06 PM, Sergi Vladykin <
> >> sergi.vladykin@gmail.com
> >>>>
> >>> wrote:
> >>>
> >>>> Dima,
> >>>>
> >>>> I agree that cache==table is definitely a wrong choice, but as far as
> I
> >>> see
> >>>> Vova suggests having cache==tablespace instead of cache==schema. I
> tend
> >>> to
> >>>> agree with this decoupling of physical and logical grouping, but the
> >>>> concern is that it will require much more work to do.
> >>>>
> >>>> Sergi
> >>>>
> >>>> 2017-01-16 21:35 GMT+03:00 Dmitriy Setrakyan <ds...@apache.org>:
> >>>>
> >>>>> Vova,
> >>>>>
> >>>>> Currently I see only 2 ways we can proceed here:
> >>>>>
> >>>>>   1. cache == table
> >>>>>   2. cache == schema
> >>>>>
> >>>>> I agree that "cache==table" may be more flexible, but I don't think
> >> it
> >>>> will
> >>>>> work in Ignite.
> >>>>> We may end up with 1,000s of caches, which will carry significant
> >>>> overhead
> >>>>> on memory and cluster overall. I think that we have no choice but to
> >>> take
> >>>>> "cache==schema" approach.
> >>>>>
> >>>>> D.
> >>>>>
> >>>>> On Mon, Jan 16, 2017 at 1:00 AM, Vladimir Ozerov <
> >> vozerov@gridgain.com
> >>>>
> >>>>> wrote:
> >>>>>
> >>>>>> Sergi, Dima,
> >>>>>>
> >>>>>> In the scope of Ignite 1.x it is perfectly fine to have "schema =
> >>>> cache".
> >>>>>> Nobody suffers from it because nobody use Ignite as database. But
> >> in
> >>>>>> future, thanks to page memory, we are going to target real database
> >>> use
> >>>>>> cases. Users will have multiple tables in Ignite. Plus views,
> >>> triggers,
> >>>>>> constraints, etc.. All these features are very useful and easy to
> >>>>> implement
> >>>>>> provided that we already have table and index implementations. And
> >> in
> >>>>>> databases all related objects are *logically *grouped in a
> >> "schema".
> >>>> This
> >>>>>> is convenient for users: less boilerplate in SQL, better
> >>> manageability
> >>>>>> (remember that database users will definitely need some console
> >>> and/or
> >>>> UI
> >>>>>> tools to manage Ignite as a database).
> >>>>>>
> >>>>>> What you offer is to group database objects *physically *rather
> >> than
> >>>>>> logically. It will lead to:
> >>>>>> - Boilerplate in queries
> >>>>>> - Inconvenient database management. All the things database users
> >> are
> >>>>> used
> >>>>>> to - import/export tools, UIs, "USING" keyword, etc, will look
> >> weird
> >>> in
> >>>>>> Ignite as there will be no way to group arbitrary objects
> >> logically.
> >>>>>>
> >>>>>> With this approach almost every user will have to use two schemes
> >>>> instead
> >>>>>> of one - one for operational data (PARTITIONED) and one for
> >> reference
> >>>>> data
> >>>>>> (REPLICATED). No conventional database works this way.
> >>>>>>
> >>>>>> Vladimir.
> >>>>>>
> >>>>>> On Fri, Jan 13, 2017 at 9:18 PM, Dmitriy Setrakyan <
> >>>>> dsetrakyan@apache.org>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Vova,
> >>>>>>>
> >>>>>>> I will join Sergi here. It seems like "schema = cache" will take
> >>> care
> >>>>> of
> >>>>>>> all different configuration properties required for different
> >>> groups
> >>>> of
> >>>>>>> caches. In addition, it cleanly maps into current Ignite
> >>>> architecture.
> >>>>> We
> >>>>>>> will need to have a very strong reason to move away from it.
> >>>>>>>
> >>>>>>> D.
> >>>>>>>
> >>>>>>> On Fri, Jan 13, 2017 at 2:39 AM, Vladimir Ozerov <
> >>>> vozerov@gridgain.com
> >>>>>>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Correct, it worked, because Ignite has never had real database
> >>> use
> >>>>> case
> >>>>>>> in
> >>>>>>>> mind. Unfortunately, if our global plans go as expected, it
> >> will
> >>>> not
> >>>>>> work
> >>>>>>>> for Ignite 2.x+.
> >>>>>>>>
> >>>>>>>> On Fri, Jan 13, 2017 at 11:53 AM, Sergi Vladykin <
> >>>>>>> sergi.vladykin@gmail.com
> >>>>>>>>>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Lets move on with SQL schema == Ignite cache. It worked
> >> always
> >>>> like
> >>>>>>>> this, I
> >>>>>>>>> see no reasons to change this.
> >>>>>>>>>
> >>>>>>>>> Sergi
> >>>>>>>>>
> >>>>>>>>> 2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <
> >>> vozerov@gridgain.com
> >>>>> :
> >>>>>>>>>
> >>>>>>>>>> "Tablespace" (Oracle, PostgreSQL) is what maps better than
> >>>>> "schema"
> >>>>>>> to
> >>>>>>>>> our
> >>>>>>>>>> cache. But not ideally still.
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov <
> >>>>>>>> vozerov@gridgain.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Alex,
> >>>>>>>>>>>
> >>>>>>>>>>> Currently Ignite is not used as database. It is used as
> >>>> search
> >>>>>>>> engine -
> >>>>>>>>>>> several types, several tables, several joins. This is why
> >>>>> having
> >>>>>>>>> "SCHEMA
> >>>>>>>>>> ==
> >>>>>>>>>>> cache" was never a problem. Users have never build
> >> complex
> >>>> SQL
> >>>>>>>>>> applications
> >>>>>>>>>>> on top of Ignite. But we are going towards database. And
> >> my
> >>>>>>> question
> >>>>>>>>>> stands
> >>>>>>>>>>> still - suppose it is Y2019, how is user going to migrate
> >>> his
> >>>>>>>> database
> >>>>>>>>>>> containing 20-30-50-100 tables in a single schema in
> >> Oracle
> >>>> to
> >>>>>>>> Ignite?
> >>>>>>>>>>>
> >>>>>>>>>>> Single cache for all tables? Doens't work - not flexible.
> >>>> Users
> >>>>>>> will
> >>>>>>>>>>> definitely require different cache modes, different
> >>>> co-location
> >>>>>>>> rules,
> >>>>>>>>>>> different number of backups, etc..
> >>>>>>>>>>> Schema per table? Doesn't work either - unmanageable and
> >>> not
> >>>>>>>> convenient
> >>>>>>>>>>> for users even for relatively small databases.
> >>>>>>>>>>>
> >>>>>>>>>>> From user perspective schema is logical grouping of
> >>> database
> >>>>>>> objects,
> >>>>>>>>>>> nothing more.
> >>>>>>>>>>>
> >>>>>>>>>>> For Ignite schema could be a logical group of resources
> >>>> (nodes,
> >>>>>>>> memory
> >>>>>>>>>>> pools, caches, etc.). And multiple tables over multiple
> >>>> caches
> >>>>>>> should
> >>>>>>>>>>> reside in it. To the contrast, table definition governs
> >> how
> >>>>> data
> >>>>>> is
> >>>>>>>>>> stored.
> >>>>>>>>>>> This is similar to, for example, MySQL approach, where
> >> you
> >>>>> define
> >>>>>>> how
> >>>>>>>>> you
> >>>>>>>>>>> store data on per-table level, and on schema level you
> >>> define
> >>>>>> only
> >>>>>>>>> minor
> >>>>>>>>>>> things like collation.
> >>>>>>>>>>>
> >>>>>>>>>>> Vladimir.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
> >>>>>>>>>>> alexander.a.paschenko@gmail.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Vova,
> >>>>>>>>>>>>
> >>>>>>>>>>>> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <
> >>>>> vozerov@gridgain.com
> >>>>>>> :
> >>>>>>>>>>>>> I am not quite sure I understand the idea of "SCHEMA
> >> ==
> >>>>>> cache".
> >>>>>>>>>> Consider
> >>>>>>>>>>>>> some small database with, say, ~30 tables. And user
> >>> wants
> >>>> to
> >>>>>>>> migrate
> >>>>>>>>>> to
> >>>>>>>>>>>>> Ignite. How is he supposed to do so? 30 schemas
> >> leading
> >>> to
> >>>>>>> rewrite
> >>>>>>>>> of
> >>>>>>>>>>>> all
> >>>>>>>>>>>>> his SQL scripts? Or 30 key-value pairs in a single
> >> cache
> >>>>>> leading
> >>>>>>>> to
> >>>>>>>>>>>> lack of
> >>>>>>>>>>>>> flexibility and performance problems?
> >>>>>>>>>>>>
> >>>>>>>>>>>> But currently schema *is* semantically equal to cache
> >>> while
> >>>>>> table
> >>>>>>> is
> >>>>>>>>>>>> equal to type descriptor (i.e. type of stored entities),
> >>>>> nothing
> >>>>>>> new
> >>>>>>>>>>>> here.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Say, in single cache we may have entities of types
> >> Person
> >>>> and
> >>>>>>>>>>>> Organization, those map to two tables with same names,
> >> and
> >>>> can
> >>>>>> be
> >>>>>>>>>>>> accessed within the same cache (i.e. schema).
> >>>>>>>>>>>>
> >>>>>>>>>>>> If we want to limit the user with having single type
> >>>>> descriptor
> >>>>>>> per
> >>>>>>>>>>>> cache (i.e. cache has only one type of stored entities -
> >>>> BTW,
> >>>>>>> where
> >>>>>>>> we
> >>>>>>>>>>>> are with this 2.0-wise?), then this notion could change.
> >>> But
> >>>>>>>> currently
> >>>>>>>>>>>> what has been suggested already fits quite good with
> >> what
> >>> we
> >>>>> do
> >>>>>>> have
> >>>>>>>>>>>> at the moment regarding semantic of SQL objects.
> >>>>>>>>>>>>
> >>>>>>>>>>>> - Alex
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Another example is how to deal with referene tables?
> >>> Lots
> >>>>>>> database
> >>>>>>>>> has
> >>>>>>>>>>>>> small reference tables which is best to fit REPLICATED
> >>>>> cache,
> >>>>>>>> while
> >>>>>>>>>>>> others
> >>>>>>>>>>>>> are usually bound to PARTITIONED mode. "SCHEMA ==
> >> cache"
> >>>>> will
> >>>>>>>> force
> >>>>>>>>>>>> users
> >>>>>>>>>>>>> to split them into separate schemes leading to poor
> >> user
> >>>>>>>> experience.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I understand that we may have some implementation
> >>> details
> >>>>>> around
> >>>>>>>> it
> >>>>>>>>> at
> >>>>>>>>>>>> the
> >>>>>>>>>>>>> moment. But from user perspective "SCHEMA == cache"
> >>>> doesn't
> >>>>>> make
> >>>>>>>>>> sense.
> >>>>>>>>>>>> As
> >>>>>>>>>>>>> we are going towards AI 2.0 we'd better to rethink
> >> this
> >>>>>>> approach.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <
> >>>>>>> dmagda@apache.org>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <
> >>>>>>>>>>>> dsetrakyan@apache.org>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
> >>>>>>>>>>>>>> sergi.vladykin@gmail.com>
> >>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> The xml config was only for example. We can put in
> >>>> this
> >>>>>>>>>>>> configuration
> >>>>>>>>>>>>>>>> string cache config parameters directly like this:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> CREATE SCHEMA "MyCacheName" WITH
> >>>>>>>>>>>>>>>> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> This approach makes sense, if it can be easily
> >>>> supported
> >>>>>> with
> >>>>>>>> H2.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> What’s for affinity keys? Can we make an exception
> >> for
> >>>> them
> >>>>>> by
> >>>>>>>>>>>> defining in
> >>>>>>>>>>>>>> this part of the statement
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> CREATE TABLE employee (
> >>>>>>>>>>>>>>   id BIGINT PRIMARY KEY,
> >>>>>>>>>>>>>>   dept_id BIGINT AFFINITY KEY,
> >>>>>>>>>>>>>>   name VARCHAR(128),
> >>>>>>>>>>>>>> );
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> or that l
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> CREATE TABLE employee (
> >>>>>>>>>>>>>>   id BIGINT PRIMARY KEY,
> >>>>>>>>>>>>>>   dept_id BIGINT,
> >>>>>>>>>>>>>>   name VARCHAR(128),
> >>>>>>>>>>>>>>   CONSTRAINT affKey AFFINITY KEY(dept_id)
> >>>>>>>>>>>>>> );
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> ?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> —
> >>>>>>>>>>>>>> Denis
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Sergey Kozlov
> >>> GridGain Systems
> >>> www.gridgain.com
> >>>
> >>
>
>

Re: DDL implementation details

Posted by Denis Magda <dm...@apache.org>.

Agree that schema-per-cache approach is natural for Ignite but will be a nightmare for those who migrate from a RDBMS. Inventing  custom DDL statements is also a dead end. This won’t make users life and transitions from a RDBMS easier.

Let’s to look at this differently. Most of us suggest the users never store multiple types (queries entities) per cache consider this as an anti-pattern. I do remember that we even brought up an idea to apply query-entity-per-cache rule.

Conjuring that Ignite follows query-entity-per-cache rule, DDL statements might have the following effect:

CREATE TABLE blablabla {…} - creates both cache and query entity with name blablabla.

In SQL queries we don’t need to specify a cache name as a schema name at all even during joins thanks to cache-per-query-entity rule. We can figure this out automatically.

Executing `CREATE SCHEME myScheme` and using it later in DDL statements (USING SCHEME myScheme) we will add the scheme name as an attribute to respective query entities or caches. When the scheme name is used in a SQL query the engine will look for a valid query entity/cache that has the attribute set to this name.

I do realize that this will break the compatibility but we’re approaching 2.0. Good chance to make not only DDL but SQL Grid more user friendly.

—
Denis

> On Jan 17, 2017, at 7:29 PM, Dmitriy Setrakyan <ds...@apache.org> wrote:
> 
> After looking at all the restrictions of "cache==schema" approach, I still
> think that it would be the most natural match for Ignite. All other
> suggestions are either too limiting, too hard, or not applicable for
> Ignite.
> 
> My vote would be to start with "cache==schema" and get some user feedback.
> We can always add "tablespaces" in future, if our users demand it.
> 
> D.
> 
> On Tue, Jan 17, 2017 at 1:29 AM, Sergi Vladykin <se...@gmail.com>
> wrote:
> 
>> Sergey,
>> 
>> See inline:
>> 
>> 2017-01-17 0:50 GMT+03:00 Sergey Kozlov <sk...@gridgain.com>:
>> 
>>> For the approach schema==cache it is not clear how it works for some
>> cases
>>> (especialy if SQL and regular cache operations are mixed):
>>> 
>>> 1. I want to store two tables in same cache (same schema) and both tables
>>> have identical structure like id Integer, name String.
>>> How they will be processed for regular cache operations
>>> put/putall/get/getall etc where we get cache instance by its name (on SQL
>>> level we use the table name) ?
>>> 
>> 
>> This already works. We can have multiple QueryEntities configured for a
>> single cache. The only restriction here is that cache key must be unique
>> across all the tables inside the same schema.
>> 
>> 
>>> 
>>> 2. What's about ALTER TABLE statement? Will the changes for one table
>> lock
>>> all tables in the cache?
>>> 
>> 
>> It will not be needed, any table structure modification will be performed
>> in the table scope.
>> 
>> 
>>> 3. In the future if we will introduce table-based features like SQL roles
>>> (as next step of SQL implementation) "many tables one cache" rule looks
>>> like more difficult for its implementation.
>>> 
>> 
>> I don't think so, must be the same thing.
>> 
>> 
>>> 
>>> p.s. may be we need to think about new entity called "table" (based on
>> type
>>> descriptor) inside cache to better support of SQL DDL
>>> 
>> 
>> We already have it: QueryEntity. It has exactly that semantics.
>> 
>> Sergi
>> 
>> 
>>> 
>>> 
>>> On Mon, Jan 16, 2017 at 11:06 PM, Sergi Vladykin <
>> sergi.vladykin@gmail.com
>>>> 
>>> wrote:
>>> 
>>>> Dima,
>>>> 
>>>> I agree that cache==table is definitely a wrong choice, but as far as I
>>> see
>>>> Vova suggests having cache==tablespace instead of cache==schema. I tend
>>> to
>>>> agree with this decoupling of physical and logical grouping, but the
>>>> concern is that it will require much more work to do.
>>>> 
>>>> Sergi
>>>> 
>>>> 2017-01-16 21:35 GMT+03:00 Dmitriy Setrakyan <ds...@apache.org>:
>>>> 
>>>>> Vova,
>>>>> 
>>>>> Currently I see only 2 ways we can proceed here:
>>>>> 
>>>>>   1. cache == table
>>>>>   2. cache == schema
>>>>> 
>>>>> I agree that "cache==table" may be more flexible, but I don't think
>> it
>>>> will
>>>>> work in Ignite.
>>>>> We may end up with 1,000s of caches, which will carry significant
>>>> overhead
>>>>> on memory and cluster overall. I think that we have no choice but to
>>> take
>>>>> "cache==schema" approach.
>>>>> 
>>>>> D.
>>>>> 
>>>>> On Mon, Jan 16, 2017 at 1:00 AM, Vladimir Ozerov <
>> vozerov@gridgain.com
>>>> 
>>>>> wrote:
>>>>> 
>>>>>> Sergi, Dima,
>>>>>> 
>>>>>> In the scope of Ignite 1.x it is perfectly fine to have "schema =
>>>> cache".
>>>>>> Nobody suffers from it because nobody use Ignite as database. But
>> in
>>>>>> future, thanks to page memory, we are going to target real database
>>> use
>>>>>> cases. Users will have multiple tables in Ignite. Plus views,
>>> triggers,
>>>>>> constraints, etc.. All these features are very useful and easy to
>>>>> implement
>>>>>> provided that we already have table and index implementations. And
>> in
>>>>>> databases all related objects are *logically *grouped in a
>> "schema".
>>>> This
>>>>>> is convenient for users: less boilerplate in SQL, better
>>> manageability
>>>>>> (remember that database users will definitely need some console
>>> and/or
>>>> UI
>>>>>> tools to manage Ignite as a database).
>>>>>> 
>>>>>> What you offer is to group database objects *physically *rather
>> than
>>>>>> logically. It will lead to:
>>>>>> - Boilerplate in queries
>>>>>> - Inconvenient database management. All the things database users
>> are
>>>>> used
>>>>>> to - import/export tools, UIs, "USING" keyword, etc, will look
>> weird
>>> in
>>>>>> Ignite as there will be no way to group arbitrary objects
>> logically.
>>>>>> 
>>>>>> With this approach almost every user will have to use two schemes
>>>> instead
>>>>>> of one - one for operational data (PARTITIONED) and one for
>> reference
>>>>> data
>>>>>> (REPLICATED). No conventional database works this way.
>>>>>> 
>>>>>> Vladimir.
>>>>>> 
>>>>>> On Fri, Jan 13, 2017 at 9:18 PM, Dmitriy Setrakyan <
>>>>> dsetrakyan@apache.org>
>>>>>> wrote:
>>>>>> 
>>>>>>> Vova,
>>>>>>> 
>>>>>>> I will join Sergi here. It seems like "schema = cache" will take
>>> care
>>>>> of
>>>>>>> all different configuration properties required for different
>>> groups
>>>> of
>>>>>>> caches. In addition, it cleanly maps into current Ignite
>>>> architecture.
>>>>> We
>>>>>>> will need to have a very strong reason to move away from it.
>>>>>>> 
>>>>>>> D.
>>>>>>> 
>>>>>>> On Fri, Jan 13, 2017 at 2:39 AM, Vladimir Ozerov <
>>>> vozerov@gridgain.com
>>>>>> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Correct, it worked, because Ignite has never had real database
>>> use
>>>>> case
>>>>>>> in
>>>>>>>> mind. Unfortunately, if our global plans go as expected, it
>> will
>>>> not
>>>>>> work
>>>>>>>> for Ignite 2.x+.
>>>>>>>> 
>>>>>>>> On Fri, Jan 13, 2017 at 11:53 AM, Sergi Vladykin <
>>>>>>> sergi.vladykin@gmail.com
>>>>>>>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Lets move on with SQL schema == Ignite cache. It worked
>> always
>>>> like
>>>>>>>> this, I
>>>>>>>>> see no reasons to change this.
>>>>>>>>> 
>>>>>>>>> Sergi
>>>>>>>>> 
>>>>>>>>> 2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <
>>> vozerov@gridgain.com
>>>>> :
>>>>>>>>> 
>>>>>>>>>> "Tablespace" (Oracle, PostgreSQL) is what maps better than
>>>>> "schema"
>>>>>>> to
>>>>>>>>> our
>>>>>>>>>> cache. But not ideally still.
>>>>>>>>>> 
>>>>>>>>>> On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov <
>>>>>>>> vozerov@gridgain.com>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Alex,
>>>>>>>>>>> 
>>>>>>>>>>> Currently Ignite is not used as database. It is used as
>>>> search
>>>>>>>> engine -
>>>>>>>>>>> several types, several tables, several joins. This is why
>>>>> having
>>>>>>>>> "SCHEMA
>>>>>>>>>> ==
>>>>>>>>>>> cache" was never a problem. Users have never build
>> complex
>>>> SQL
>>>>>>>>>> applications
>>>>>>>>>>> on top of Ignite. But we are going towards database. And
>> my
>>>>>>> question
>>>>>>>>>> stands
>>>>>>>>>>> still - suppose it is Y2019, how is user going to migrate
>>> his
>>>>>>>> database
>>>>>>>>>>> containing 20-30-50-100 tables in a single schema in
>> Oracle
>>>> to
>>>>>>>> Ignite?
>>>>>>>>>>> 
>>>>>>>>>>> Single cache for all tables? Doens't work - not flexible.
>>>> Users
>>>>>>> will
>>>>>>>>>>> definitely require different cache modes, different
>>>> co-location
>>>>>>>> rules,
>>>>>>>>>>> different number of backups, etc..
>>>>>>>>>>> Schema per table? Doesn't work either - unmanageable and
>>> not
>>>>>>>> convenient
>>>>>>>>>>> for users even for relatively small databases.
>>>>>>>>>>> 
>>>>>>>>>>> From user perspective schema is logical grouping of
>>> database
>>>>>>> objects,
>>>>>>>>>>> nothing more.
>>>>>>>>>>> 
>>>>>>>>>>> For Ignite schema could be a logical group of resources
>>>> (nodes,
>>>>>>>> memory
>>>>>>>>>>> pools, caches, etc.). And multiple tables over multiple
>>>> caches
>>>>>>> should
>>>>>>>>>>> reside in it. To the contrast, table definition governs
>> how
>>>>> data
>>>>>> is
>>>>>>>>>> stored.
>>>>>>>>>>> This is similar to, for example, MySQL approach, where
>> you
>>>>> define
>>>>>>> how
>>>>>>>>> you
>>>>>>>>>>> store data on per-table level, and on schema level you
>>> define
>>>>>> only
>>>>>>>>> minor
>>>>>>>>>>> things like collation.
>>>>>>>>>>> 
>>>>>>>>>>> Vladimir.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
>>>>>>>>>>> alexander.a.paschenko@gmail.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Vova,
>>>>>>>>>>>> 
>>>>>>>>>>>> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <
>>>>> vozerov@gridgain.com
>>>>>>> :
>>>>>>>>>>>>> I am not quite sure I understand the idea of "SCHEMA
>> ==
>>>>>> cache".
>>>>>>>>>> Consider
>>>>>>>>>>>>> some small database with, say, ~30 tables. And user
>>> wants
>>>> to
>>>>>>>> migrate
>>>>>>>>>> to
>>>>>>>>>>>>> Ignite. How is he supposed to do so? 30 schemas
>> leading
>>> to
>>>>>>> rewrite
>>>>>>>>> of
>>>>>>>>>>>> all
>>>>>>>>>>>>> his SQL scripts? Or 30 key-value pairs in a single
>> cache
>>>>>> leading
>>>>>>>> to
>>>>>>>>>>>> lack of
>>>>>>>>>>>>> flexibility and performance problems?
>>>>>>>>>>>> 
>>>>>>>>>>>> But currently schema *is* semantically equal to cache
>>> while
>>>>>> table
>>>>>>> is
>>>>>>>>>>>> equal to type descriptor (i.e. type of stored entities),
>>>>> nothing
>>>>>>> new
>>>>>>>>>>>> here.
>>>>>>>>>>>> 
>>>>>>>>>>>> Say, in single cache we may have entities of types
>> Person
>>>> and
>>>>>>>>>>>> Organization, those map to two tables with same names,
>> and
>>>> can
>>>>>> be
>>>>>>>>>>>> accessed within the same cache (i.e. schema).
>>>>>>>>>>>> 
>>>>>>>>>>>> If we want to limit the user with having single type
>>>>> descriptor
>>>>>>> per
>>>>>>>>>>>> cache (i.e. cache has only one type of stored entities -
>>>> BTW,
>>>>>>> where
>>>>>>>> we
>>>>>>>>>>>> are with this 2.0-wise?), then this notion could change.
>>> But
>>>>>>>> currently
>>>>>>>>>>>> what has been suggested already fits quite good with
>> what
>>> we
>>>>> do
>>>>>>> have
>>>>>>>>>>>> at the moment regarding semantic of SQL objects.
>>>>>>>>>>>> 
>>>>>>>>>>>> - Alex
>>>>>>>>>>>> 
>>>>>>>>>>>>> Another example is how to deal with referene tables?
>>> Lots
>>>>>>> database
>>>>>>>>> has
>>>>>>>>>>>>> small reference tables which is best to fit REPLICATED
>>>>> cache,
>>>>>>>> while
>>>>>>>>>>>> others
>>>>>>>>>>>>> are usually bound to PARTITIONED mode. "SCHEMA ==
>> cache"
>>>>> will
>>>>>>>> force
>>>>>>>>>>>> users
>>>>>>>>>>>>> to split them into separate schemes leading to poor
>> user
>>>>>>>> experience.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I understand that we may have some implementation
>>> details
>>>>>> around
>>>>>>>> it
>>>>>>>>> at
>>>>>>>>>>>> the
>>>>>>>>>>>>> moment. But from user perspective "SCHEMA == cache"
>>>> doesn't
>>>>>> make
>>>>>>>>>> sense.
>>>>>>>>>>>> As
>>>>>>>>>>>>> we are going towards AI 2.0 we'd better to rethink
>> this
>>>>>>> approach.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <
>>>>>>> dmagda@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <
>>>>>>>>>>>> dsetrakyan@apache.org>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
>>>>>>>>>>>>>> sergi.vladykin@gmail.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> The xml config was only for example. We can put in
>>>> this
>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>> string cache config parameters directly like this:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> CREATE SCHEMA "MyCacheName" WITH
>>>>>>>>>>>>>>>> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> This approach makes sense, if it can be easily
>>>> supported
>>>>>> with
>>>>>>>> H2.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> What’s for affinity keys? Can we make an exception
>> for
>>>> them
>>>>>> by
>>>>>>>>>>>> defining in
>>>>>>>>>>>>>> this part of the statement
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> CREATE TABLE employee (
>>>>>>>>>>>>>>   id BIGINT PRIMARY KEY,
>>>>>>>>>>>>>>   dept_id BIGINT AFFINITY KEY,
>>>>>>>>>>>>>>   name VARCHAR(128),
>>>>>>>>>>>>>> );
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> or that l
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> CREATE TABLE employee (
>>>>>>>>>>>>>>   id BIGINT PRIMARY KEY,
>>>>>>>>>>>>>>   dept_id BIGINT,
>>>>>>>>>>>>>>   name VARCHAR(128),
>>>>>>>>>>>>>>   CONSTRAINT affKey AFFINITY KEY(dept_id)
>>>>>>>>>>>>>> );
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> ?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> —
>>>>>>>>>>>>>> Denis
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Sergey Kozlov
>>> GridGain Systems
>>> www.gridgain.com
>>> 
>>

Re: DDL implementation details

Posted by Dmitriy Setrakyan <ds...@apache.org>.

After looking at all the restrictions of "cache==schema" approach, I still
think that it would be the most natural match for Ignite. All other
suggestions are either too limiting, too hard, or not applicable for
Ignite.

My vote would be to start with "cache==schema" and get some user feedback.
We can always add "tablespaces" in future, if our users demand it.

D.

On Tue, Jan 17, 2017 at 1:29 AM, Sergi Vladykin <se...@gmail.com>
wrote:

> Sergey,
>
> See inline:
>
> 2017-01-17 0:50 GMT+03:00 Sergey Kozlov <sk...@gridgain.com>:
>
> > For the approach schema==cache it is not clear how it works for some
> cases
> > (especialy if SQL and regular cache operations are mixed):
> >
> > 1. I want to store two tables in same cache (same schema) and both tables
> > have identical structure like id Integer, name String.
> > How they will be processed for regular cache operations
> > put/putall/get/getall etc where we get cache instance by its name (on SQL
> > level we use the table name) ?
> >
>
> This already works. We can have multiple QueryEntities configured for a
> single cache. The only restriction here is that cache key must be unique
> across all the tables inside the same schema.
>
>
> >
> > 2. What's about ALTER TABLE statement? Will the changes for one table
> lock
> > all tables in the cache?
> >
>
> It will not be needed, any table structure modification will be performed
> in the table scope.
>
>
> > 3. In the future if we will introduce table-based features like SQL roles
> > (as next step of SQL implementation) "many tables one cache" rule looks
> > like more difficult for its implementation.
> >
>
> I don't think so, must be the same thing.
>
>
> >
> > p.s. may be we need to think about new entity called "table" (based on
> type
> > descriptor) inside cache to better support of SQL DDL
> >
>
> We already have it: QueryEntity. It has exactly that semantics.
>
> Sergi
>
>
> >
> >
> > On Mon, Jan 16, 2017 at 11:06 PM, Sergi Vladykin <
> sergi.vladykin@gmail.com
> > >
> > wrote:
> >
> > > Dima,
> > >
> > > I agree that cache==table is definitely a wrong choice, but as far as I
> > see
> > > Vova suggests having cache==tablespace instead of cache==schema. I tend
> > to
> > > agree with this decoupling of physical and logical grouping, but the
> > > concern is that it will require much more work to do.
> > >
> > > Sergi
> > >
> > > 2017-01-16 21:35 GMT+03:00 Dmitriy Setrakyan <ds...@apache.org>:
> > >
> > > > Vova,
> > > >
> > > > Currently I see only 2 ways we can proceed here:
> > > >
> > > >    1. cache == table
> > > >    2. cache == schema
> > > >
> > > > I agree that "cache==table" may be more flexible, but I don't think
> it
> > > will
> > > > work in Ignite.
> > > > We may end up with 1,000s of caches, which will carry significant
> > > overhead
> > > > on memory and cluster overall. I think that we have no choice but to
> > take
> > > > "cache==schema" approach.
> > > >
> > > > D.
> > > >
> > > > On Mon, Jan 16, 2017 at 1:00 AM, Vladimir Ozerov <
> vozerov@gridgain.com
> > >
> > > > wrote:
> > > >
> > > > > Sergi, Dima,
> > > > >
> > > > > In the scope of Ignite 1.x it is perfectly fine to have "schema =
> > > cache".
> > > > > Nobody suffers from it because nobody use Ignite as database. But
> in
> > > > > future, thanks to page memory, we are going to target real database
> > use
> > > > > cases. Users will have multiple tables in Ignite. Plus views,
> > triggers,
> > > > > constraints, etc.. All these features are very useful and easy to
> > > > implement
> > > > > provided that we already have table and index implementations. And
> in
> > > > > databases all related objects are *logically *grouped in a
> "schema".
> > > This
> > > > > is convenient for users: less boilerplate in SQL, better
> > manageability
> > > > > (remember that database users will definitely need some console
> > and/or
> > > UI
> > > > > tools to manage Ignite as a database).
> > > > >
> > > > > What you offer is to group database objects *physically *rather
> than
> > > > > logically. It will lead to:
> > > > > - Boilerplate in queries
> > > > > - Inconvenient database management. All the things database users
> are
> > > > used
> > > > > to - import/export tools, UIs, "USING" keyword, etc, will look
> weird
> > in
> > > > > Ignite as there will be no way to group arbitrary objects
> logically.
> > > > >
> > > > > With this approach almost every user will have to use two schemes
> > > instead
> > > > > of one - one for operational data (PARTITIONED) and one for
> reference
> > > > data
> > > > > (REPLICATED). No conventional database works this way.
> > > > >
> > > > > Vladimir.
> > > > >
> > > > > On Fri, Jan 13, 2017 at 9:18 PM, Dmitriy Setrakyan <
> > > > dsetrakyan@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Vova,
> > > > > >
> > > > > > I will join Sergi here. It seems like "schema = cache" will take
> > care
> > > > of
> > > > > > all different configuration properties required for different
> > groups
> > > of
> > > > > > caches. In addition, it cleanly maps into current Ignite
> > > architecture.
> > > > We
> > > > > > will need to have a very strong reason to move away from it.
> > > > > >
> > > > > > D.
> > > > > >
> > > > > > On Fri, Jan 13, 2017 at 2:39 AM, Vladimir Ozerov <
> > > vozerov@gridgain.com
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Correct, it worked, because Ignite has never had real database
> > use
> > > > case
> > > > > > in
> > > > > > > mind. Unfortunately, if our global plans go as expected, it
> will
> > > not
> > > > > work
> > > > > > > for Ignite 2.x+.
> > > > > > >
> > > > > > > On Fri, Jan 13, 2017 at 11:53 AM, Sergi Vladykin <
> > > > > > sergi.vladykin@gmail.com
> > > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Lets move on with SQL schema == Ignite cache. It worked
> always
> > > like
> > > > > > > this, I
> > > > > > > > see no reasons to change this.
> > > > > > > >
> > > > > > > > Sergi
> > > > > > > >
> > > > > > > > 2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <
> > vozerov@gridgain.com
> > > >:
> > > > > > > >
> > > > > > > > > "Tablespace" (Oracle, PostgreSQL) is what maps better than
> > > > "schema"
> > > > > > to
> > > > > > > > our
> > > > > > > > > cache. But not ideally still.
> > > > > > > > >
> > > > > > > > > On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov <
> > > > > > > vozerov@gridgain.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Alex,
> > > > > > > > > >
> > > > > > > > > > Currently Ignite is not used as database. It is used as
> > > search
> > > > > > > engine -
> > > > > > > > > > several types, several tables, several joins. This is why
> > > > having
> > > > > > > > "SCHEMA
> > > > > > > > > ==
> > > > > > > > > > cache" was never a problem. Users have never build
> complex
> > > SQL
> > > > > > > > > applications
> > > > > > > > > > on top of Ignite. But we are going towards database. And
> my
> > > > > > question
> > > > > > > > > stands
> > > > > > > > > > still - suppose it is Y2019, how is user going to migrate
> > his
> > > > > > > database
> > > > > > > > > > containing 20-30-50-100 tables in a single schema in
> Oracle
> > > to
> > > > > > > Ignite?
> > > > > > > > > >
> > > > > > > > > > Single cache for all tables? Doens't work - not flexible.
> > > Users
> > > > > > will
> > > > > > > > > > definitely require different cache modes, different
> > > co-location
> > > > > > > rules,
> > > > > > > > > > different number of backups, etc..
> > > > > > > > > > Schema per table? Doesn't work either - unmanageable and
> > not
> > > > > > > convenient
> > > > > > > > > > for users even for relatively small databases.
> > > > > > > > > >
> > > > > > > > > > From user perspective schema is logical grouping of
> > database
> > > > > > objects,
> > > > > > > > > > nothing more.
> > > > > > > > > >
> > > > > > > > > > For Ignite schema could be a logical group of resources
> > > (nodes,
> > > > > > > memory
> > > > > > > > > > pools, caches, etc.). And multiple tables over multiple
> > > caches
> > > > > > should
> > > > > > > > > > reside in it. To the contrast, table definition governs
> how
> > > > data
> > > > > is
> > > > > > > > > stored.
> > > > > > > > > > This is similar to, for example, MySQL approach, where
> you
> > > > define
> > > > > > how
> > > > > > > > you
> > > > > > > > > > store data on per-table level, and on schema level you
> > define
> > > > > only
> > > > > > > > minor
> > > > > > > > > > things like collation.
> > > > > > > > > >
> > > > > > > > > > Vladimir.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
> > > > > > > > > > alexander.a.paschenko@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > >> Vova,
> > > > > > > > > >>
> > > > > > > > > >> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <
> > > > vozerov@gridgain.com
> > > > > >:
> > > > > > > > > >> > I am not quite sure I understand the idea of "SCHEMA
> ==
> > > > > cache".
> > > > > > > > > Consider
> > > > > > > > > >> > some small database with, say, ~30 tables. And user
> > wants
> > > to
> > > > > > > migrate
> > > > > > > > > to
> > > > > > > > > >> > Ignite. How is he supposed to do so? 30 schemas
> leading
> > to
> > > > > > rewrite
> > > > > > > > of
> > > > > > > > > >> all
> > > > > > > > > >> > his SQL scripts? Or 30 key-value pairs in a single
> cache
> > > > > leading
> > > > > > > to
> > > > > > > > > >> lack of
> > > > > > > > > >> > flexibility and performance problems?
> > > > > > > > > >>
> > > > > > > > > >> But currently schema *is* semantically equal to cache
> > while
> > > > > table
> > > > > > is
> > > > > > > > > >> equal to type descriptor (i.e. type of stored entities),
> > > > nothing
> > > > > > new
> > > > > > > > > >> here.
> > > > > > > > > >>
> > > > > > > > > >> Say, in single cache we may have entities of types
> Person
> > > and
> > > > > > > > > >> Organization, those map to two tables with same names,
> and
> > > can
> > > > > be
> > > > > > > > > >> accessed within the same cache (i.e. schema).
> > > > > > > > > >>
> > > > > > > > > >> If we want to limit the user with having single type
> > > > descriptor
> > > > > > per
> > > > > > > > > >> cache (i.e. cache has only one type of stored entities -
> > > BTW,
> > > > > > where
> > > > > > > we
> > > > > > > > > >> are with this 2.0-wise?), then this notion could change.
> > But
> > > > > > > currently
> > > > > > > > > >> what has been suggested already fits quite good with
> what
> > we
> > > > do
> > > > > > have
> > > > > > > > > >> at the moment regarding semantic of SQL objects.
> > > > > > > > > >>
> > > > > > > > > >> - Alex
> > > > > > > > > >>
> > > > > > > > > >> > Another example is how to deal with referene tables?
> > Lots
> > > > > > database
> > > > > > > > has
> > > > > > > > > >> > small reference tables which is best to fit REPLICATED
> > > > cache,
> > > > > > > while
> > > > > > > > > >> others
> > > > > > > > > >> > are usually bound to PARTITIONED mode. "SCHEMA ==
> cache"
> > > > will
> > > > > > > force
> > > > > > > > > >> users
> > > > > > > > > >> > to split them into separate schemes leading to poor
> user
> > > > > > > experience.
> > > > > > > > > >> >
> > > > > > > > > >> > I understand that we may have some implementation
> > details
> > > > > around
> > > > > > > it
> > > > > > > > at
> > > > > > > > > >> the
> > > > > > > > > >> > moment. But from user perspective "SCHEMA == cache"
> > > doesn't
> > > > > make
> > > > > > > > > sense.
> > > > > > > > > >> As
> > > > > > > > > >> > we are going towards AI 2.0 we'd better to rethink
> this
> > > > > > approach.
> > > > > > > > > >> >
> > > > > > > > > >> > On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <
> > > > > > dmagda@apache.org>
> > > > > > > > > >> wrote:
> > > > > > > > > >> >
> > > > > > > > > >> >>
> > > > > > > > > >> >> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <
> > > > > > > > > >> dsetrakyan@apache.org>
> > > > > > > > > >> >> wrote:
> > > > > > > > > >> >> >
> > > > > > > > > >> >> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
> > > > > > > > > >> >> sergi.vladykin@gmail.com>
> > > > > > > > > >> >> > wrote:
> > > > > > > > > >> >> >
> > > > > > > > > >> >> >> The xml config was only for example. We can put in
> > > this
> > > > > > > > > >> configuration
> > > > > > > > > >> >> >> string cache config parameters directly like this:
> > > > > > > > > >> >> >>
> > > > > > > > > >> >> >> CREATE SCHEMA "MyCacheName" WITH
> > > > > > > > > >> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> > > > > > > > > >> >> >>
> > > > > > > > > >> >> >
> > > > > > > > > >> >> > This approach makes sense, if it can be easily
> > > supported
> > > > > with
> > > > > > > H2.
> > > > > > > > > >> >>
> > > > > > > > > >> >> What’s for affinity keys? Can we make an exception
> for
> > > them
> > > > > by
> > > > > > > > > >> defining in
> > > > > > > > > >> >> this part of the statement
> > > > > > > > > >> >>
> > > > > > > > > >> >> CREATE TABLE employee (
> > > > > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > > > > >> >>    dept_id BIGINT AFFINITY KEY,
> > > > > > > > > >> >>    name VARCHAR(128),
> > > > > > > > > >> >> );
> > > > > > > > > >> >>
> > > > > > > > > >> >> or that l
> > > > > > > > > >> >>
> > > > > > > > > >> >> CREATE TABLE employee (
> > > > > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > > > > >> >>    dept_id BIGINT,
> > > > > > > > > >> >>    name VARCHAR(128),
> > > > > > > > > >> >>    CONSTRAINT affKey AFFINITY KEY(dept_id)
> > > > > > > > > >> >> );
> > > > > > > > > >> >>
> > > > > > > > > >> >> ?
> > > > > > > > > >> >>
> > > > > > > > > >> >> —
> > > > > > > > > >> >> Denis
> > > > > > > > > >> >>
> > > > > > > > > >> >>
> > > > > > > > > >>
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > Sergey Kozlov
> > GridGain Systems
> > www.gridgain.com
> >
>

Re: DDL implementation details

Posted by Sergi Vladykin <se...@gmail.com>.

Sergey,

See inline:

2017-01-17 0:50 GMT+03:00 Sergey Kozlov <sk...@gridgain.com>:

> For the approach schema==cache it is not clear how it works for some cases
> (especialy if SQL and regular cache operations are mixed):
>
> 1. I want to store two tables in same cache (same schema) and both tables
> have identical structure like id Integer, name String.
> How they will be processed for regular cache operations
> put/putall/get/getall etc where we get cache instance by its name (on SQL
> level we use the table name) ?
>

This already works. We can have multiple QueryEntities configured for a
single cache. The only restriction here is that cache key must be unique
across all the tables inside the same schema.


>
> 2. What's about ALTER TABLE statement? Will the changes for one table lock
> all tables in the cache?
>

It will not be needed, any table structure modification will be performed
in the table scope.


> 3. In the future if we will introduce table-based features like SQL roles
> (as next step of SQL implementation) "many tables one cache" rule looks
> like more difficult for its implementation.
>

I don't think so, must be the same thing.


>
> p.s. may be we need to think about new entity called "table" (based on type
> descriptor) inside cache to better support of SQL DDL
>

We already have it: QueryEntity. It has exactly that semantics.

Sergi


>
>
> On Mon, Jan 16, 2017 at 11:06 PM, Sergi Vladykin <sergi.vladykin@gmail.com
> >
> wrote:
>
> > Dima,
> >
> > I agree that cache==table is definitely a wrong choice, but as far as I
> see
> > Vova suggests having cache==tablespace instead of cache==schema. I tend
> to
> > agree with this decoupling of physical and logical grouping, but the
> > concern is that it will require much more work to do.
> >
> > Sergi
> >
> > 2017-01-16 21:35 GMT+03:00 Dmitriy Setrakyan <ds...@apache.org>:
> >
> > > Vova,
> > >
> > > Currently I see only 2 ways we can proceed here:
> > >
> > >    1. cache == table
> > >    2. cache == schema
> > >
> > > I agree that "cache==table" may be more flexible, but I don't think it
> > will
> > > work in Ignite.
> > > We may end up with 1,000s of caches, which will carry significant
> > overhead
> > > on memory and cluster overall. I think that we have no choice but to
> take
> > > "cache==schema" approach.
> > >
> > > D.
> > >
> > > On Mon, Jan 16, 2017 at 1:00 AM, Vladimir Ozerov <vozerov@gridgain.com
> >
> > > wrote:
> > >
> > > > Sergi, Dima,
> > > >
> > > > In the scope of Ignite 1.x it is perfectly fine to have "schema =
> > cache".
> > > > Nobody suffers from it because nobody use Ignite as database. But in
> > > > future, thanks to page memory, we are going to target real database
> use
> > > > cases. Users will have multiple tables in Ignite. Plus views,
> triggers,
> > > > constraints, etc.. All these features are very useful and easy to
> > > implement
> > > > provided that we already have table and index implementations. And in
> > > > databases all related objects are *logically *grouped in a "schema".
> > This
> > > > is convenient for users: less boilerplate in SQL, better
> manageability
> > > > (remember that database users will definitely need some console
> and/or
> > UI
> > > > tools to manage Ignite as a database).
> > > >
> > > > What you offer is to group database objects *physically *rather than
> > > > logically. It will lead to:
> > > > - Boilerplate in queries
> > > > - Inconvenient database management. All the things database users are
> > > used
> > > > to - import/export tools, UIs, "USING" keyword, etc, will look weird
> in
> > > > Ignite as there will be no way to group arbitrary objects logically.
> > > >
> > > > With this approach almost every user will have to use two schemes
> > instead
> > > > of one - one for operational data (PARTITIONED) and one for reference
> > > data
> > > > (REPLICATED). No conventional database works this way.
> > > >
> > > > Vladimir.
> > > >
> > > > On Fri, Jan 13, 2017 at 9:18 PM, Dmitriy Setrakyan <
> > > dsetrakyan@apache.org>
> > > > wrote:
> > > >
> > > > > Vova,
> > > > >
> > > > > I will join Sergi here. It seems like "schema = cache" will take
> care
> > > of
> > > > > all different configuration properties required for different
> groups
> > of
> > > > > caches. In addition, it cleanly maps into current Ignite
> > architecture.
> > > We
> > > > > will need to have a very strong reason to move away from it.
> > > > >
> > > > > D.
> > > > >
> > > > > On Fri, Jan 13, 2017 at 2:39 AM, Vladimir Ozerov <
> > vozerov@gridgain.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > Correct, it worked, because Ignite has never had real database
> use
> > > case
> > > > > in
> > > > > > mind. Unfortunately, if our global plans go as expected, it will
> > not
> > > > work
> > > > > > for Ignite 2.x+.
> > > > > >
> > > > > > On Fri, Jan 13, 2017 at 11:53 AM, Sergi Vladykin <
> > > > > sergi.vladykin@gmail.com
> > > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Lets move on with SQL schema == Ignite cache. It worked always
> > like
> > > > > > this, I
> > > > > > > see no reasons to change this.
> > > > > > >
> > > > > > > Sergi
> > > > > > >
> > > > > > > 2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <
> vozerov@gridgain.com
> > >:
> > > > > > >
> > > > > > > > "Tablespace" (Oracle, PostgreSQL) is what maps better than
> > > "schema"
> > > > > to
> > > > > > > our
> > > > > > > > cache. But not ideally still.
> > > > > > > >
> > > > > > > > On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov <
> > > > > > vozerov@gridgain.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Alex,
> > > > > > > > >
> > > > > > > > > Currently Ignite is not used as database. It is used as
> > search
> > > > > > engine -
> > > > > > > > > several types, several tables, several joins. This is why
> > > having
> > > > > > > "SCHEMA
> > > > > > > > ==
> > > > > > > > > cache" was never a problem. Users have never build complex
> > SQL
> > > > > > > > applications
> > > > > > > > > on top of Ignite. But we are going towards database. And my
> > > > > question
> > > > > > > > stands
> > > > > > > > > still - suppose it is Y2019, how is user going to migrate
> his
> > > > > > database
> > > > > > > > > containing 20-30-50-100 tables in a single schema in Oracle
> > to
> > > > > > Ignite?
> > > > > > > > >
> > > > > > > > > Single cache for all tables? Doens't work - not flexible.
> > Users
> > > > > will
> > > > > > > > > definitely require different cache modes, different
> > co-location
> > > > > > rules,
> > > > > > > > > different number of backups, etc..
> > > > > > > > > Schema per table? Doesn't work either - unmanageable and
> not
> > > > > > convenient
> > > > > > > > > for users even for relatively small databases.
> > > > > > > > >
> > > > > > > > > From user perspective schema is logical grouping of
> database
> > > > > objects,
> > > > > > > > > nothing more.
> > > > > > > > >
> > > > > > > > > For Ignite schema could be a logical group of resources
> > (nodes,
> > > > > > memory
> > > > > > > > > pools, caches, etc.). And multiple tables over multiple
> > caches
> > > > > should
> > > > > > > > > reside in it. To the contrast, table definition governs how
> > > data
> > > > is
> > > > > > > > stored.
> > > > > > > > > This is similar to, for example, MySQL approach, where you
> > > define
> > > > > how
> > > > > > > you
> > > > > > > > > store data on per-table level, and on schema level you
> define
> > > > only
> > > > > > > minor
> > > > > > > > > things like collation.
> > > > > > > > >
> > > > > > > > > Vladimir.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
> > > > > > > > > alexander.a.paschenko@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > >> Vova,
> > > > > > > > >>
> > > > > > > > >> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <
> > > vozerov@gridgain.com
> > > > >:
> > > > > > > > >> > I am not quite sure I understand the idea of "SCHEMA ==
> > > > cache".
> > > > > > > > Consider
> > > > > > > > >> > some small database with, say, ~30 tables. And user
> wants
> > to
> > > > > > migrate
> > > > > > > > to
> > > > > > > > >> > Ignite. How is he supposed to do so? 30 schemas leading
> to
> > > > > rewrite
> > > > > > > of
> > > > > > > > >> all
> > > > > > > > >> > his SQL scripts? Or 30 key-value pairs in a single cache
> > > > leading
> > > > > > to
> > > > > > > > >> lack of
> > > > > > > > >> > flexibility and performance problems?
> > > > > > > > >>
> > > > > > > > >> But currently schema *is* semantically equal to cache
> while
> > > > table
> > > > > is
> > > > > > > > >> equal to type descriptor (i.e. type of stored entities),
> > > nothing
> > > > > new
> > > > > > > > >> here.
> > > > > > > > >>
> > > > > > > > >> Say, in single cache we may have entities of types Person
> > and
> > > > > > > > >> Organization, those map to two tables with same names, and
> > can
> > > > be
> > > > > > > > >> accessed within the same cache (i.e. schema).
> > > > > > > > >>
> > > > > > > > >> If we want to limit the user with having single type
> > > descriptor
> > > > > per
> > > > > > > > >> cache (i.e. cache has only one type of stored entities -
> > BTW,
> > > > > where
> > > > > > we
> > > > > > > > >> are with this 2.0-wise?), then this notion could change.
> But
> > > > > > currently
> > > > > > > > >> what has been suggested already fits quite good with what
> we
> > > do
> > > > > have
> > > > > > > > >> at the moment regarding semantic of SQL objects.
> > > > > > > > >>
> > > > > > > > >> - Alex
> > > > > > > > >>
> > > > > > > > >> > Another example is how to deal with referene tables?
> Lots
> > > > > database
> > > > > > > has
> > > > > > > > >> > small reference tables which is best to fit REPLICATED
> > > cache,
> > > > > > while
> > > > > > > > >> others
> > > > > > > > >> > are usually bound to PARTITIONED mode. "SCHEMA == cache"
> > > will
> > > > > > force
> > > > > > > > >> users
> > > > > > > > >> > to split them into separate schemes leading to poor user
> > > > > > experience.
> > > > > > > > >> >
> > > > > > > > >> > I understand that we may have some implementation
> details
> > > > around
> > > > > > it
> > > > > > > at
> > > > > > > > >> the
> > > > > > > > >> > moment. But from user perspective "SCHEMA == cache"
> > doesn't
> > > > make
> > > > > > > > sense.
> > > > > > > > >> As
> > > > > > > > >> > we are going towards AI 2.0 we'd better to rethink this
> > > > > approach.
> > > > > > > > >> >
> > > > > > > > >> > On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <
> > > > > dmagda@apache.org>
> > > > > > > > >> wrote:
> > > > > > > > >> >
> > > > > > > > >> >>
> > > > > > > > >> >> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <
> > > > > > > > >> dsetrakyan@apache.org>
> > > > > > > > >> >> wrote:
> > > > > > > > >> >> >
> > > > > > > > >> >> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
> > > > > > > > >> >> sergi.vladykin@gmail.com>
> > > > > > > > >> >> > wrote:
> > > > > > > > >> >> >
> > > > > > > > >> >> >> The xml config was only for example. We can put in
> > this
> > > > > > > > >> configuration
> > > > > > > > >> >> >> string cache config parameters directly like this:
> > > > > > > > >> >> >>
> > > > > > > > >> >> >> CREATE SCHEMA "MyCacheName" WITH
> > > > > > > > >> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> > > > > > > > >> >> >>
> > > > > > > > >> >> >
> > > > > > > > >> >> > This approach makes sense, if it can be easily
> > supported
> > > > with
> > > > > > H2.
> > > > > > > > >> >>
> > > > > > > > >> >> What’s for affinity keys? Can we make an exception for
> > them
> > > > by
> > > > > > > > >> defining in
> > > > > > > > >> >> this part of the statement
> > > > > > > > >> >>
> > > > > > > > >> >> CREATE TABLE employee (
> > > > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > > > >> >>    dept_id BIGINT AFFINITY KEY,
> > > > > > > > >> >>    name VARCHAR(128),
> > > > > > > > >> >> );
> > > > > > > > >> >>
> > > > > > > > >> >> or that l
> > > > > > > > >> >>
> > > > > > > > >> >> CREATE TABLE employee (
> > > > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > > > >> >>    dept_id BIGINT,
> > > > > > > > >> >>    name VARCHAR(128),
> > > > > > > > >> >>    CONSTRAINT affKey AFFINITY KEY(dept_id)
> > > > > > > > >> >> );
> > > > > > > > >> >>
> > > > > > > > >> >> ?
> > > > > > > > >> >>
> > > > > > > > >> >> —
> > > > > > > > >> >> Denis
> > > > > > > > >> >>
> > > > > > > > >> >>
> > > > > > > > >>
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
>
> --
> Sergey Kozlov
> GridGain Systems
> www.gridgain.com
>

Re: DDL implementation details

Posted by Sergey Kozlov <sk...@gridgain.com>.

For the approach schema==cache it is not clear how it works for some cases
(especialy if SQL and regular cache operations are mixed):

1. I want to store two tables in same cache (same schema) and both tables
have identical structure like id Integer, name String.
How they will be processed for regular cache operations
put/putall/get/getall etc where we get cache instance by its name (on SQL
level we use the table name) ?

2. What's about ALTER TABLE statement? Will the changes for one table lock
all tables in the cache?

3. In the future if we will introduce table-based features like SQL roles
(as next step of SQL implementation) "many tables one cache" rule looks
like more difficult for its implementation.

p.s. may be we need to think about new entity called "table" (based on type
descriptor) inside cache to better support of SQL DDL


On Mon, Jan 16, 2017 at 11:06 PM, Sergi Vladykin <se...@gmail.com>
wrote:

> Dima,
>
> I agree that cache==table is definitely a wrong choice, but as far as I see
> Vova suggests having cache==tablespace instead of cache==schema. I tend to
> agree with this decoupling of physical and logical grouping, but the
> concern is that it will require much more work to do.
>
> Sergi
>
> 2017-01-16 21:35 GMT+03:00 Dmitriy Setrakyan <ds...@apache.org>:
>
> > Vova,
> >
> > Currently I see only 2 ways we can proceed here:
> >
> >    1. cache == table
> >    2. cache == schema
> >
> > I agree that "cache==table" may be more flexible, but I don't think it
> will
> > work in Ignite.
> > We may end up with 1,000s of caches, which will carry significant
> overhead
> > on memory and cluster overall. I think that we have no choice but to take
> > "cache==schema" approach.
> >
> > D.
> >
> > On Mon, Jan 16, 2017 at 1:00 AM, Vladimir Ozerov <vo...@gridgain.com>
> > wrote:
> >
> > > Sergi, Dima,
> > >
> > > In the scope of Ignite 1.x it is perfectly fine to have "schema =
> cache".
> > > Nobody suffers from it because nobody use Ignite as database. But in
> > > future, thanks to page memory, we are going to target real database use
> > > cases. Users will have multiple tables in Ignite. Plus views, triggers,
> > > constraints, etc.. All these features are very useful and easy to
> > implement
> > > provided that we already have table and index implementations. And in
> > > databases all related objects are *logically *grouped in a "schema".
> This
> > > is convenient for users: less boilerplate in SQL, better manageability
> > > (remember that database users will definitely need some console and/or
> UI
> > > tools to manage Ignite as a database).
> > >
> > > What you offer is to group database objects *physically *rather than
> > > logically. It will lead to:
> > > - Boilerplate in queries
> > > - Inconvenient database management. All the things database users are
> > used
> > > to - import/export tools, UIs, "USING" keyword, etc, will look weird in
> > > Ignite as there will be no way to group arbitrary objects logically.
> > >
> > > With this approach almost every user will have to use two schemes
> instead
> > > of one - one for operational data (PARTITIONED) and one for reference
> > data
> > > (REPLICATED). No conventional database works this way.
> > >
> > > Vladimir.
> > >
> > > On Fri, Jan 13, 2017 at 9:18 PM, Dmitriy Setrakyan <
> > dsetrakyan@apache.org>
> > > wrote:
> > >
> > > > Vova,
> > > >
> > > > I will join Sergi here. It seems like "schema = cache" will take care
> > of
> > > > all different configuration properties required for different groups
> of
> > > > caches. In addition, it cleanly maps into current Ignite
> architecture.
> > We
> > > > will need to have a very strong reason to move away from it.
> > > >
> > > > D.
> > > >
> > > > On Fri, Jan 13, 2017 at 2:39 AM, Vladimir Ozerov <
> vozerov@gridgain.com
> > >
> > > > wrote:
> > > >
> > > > > Correct, it worked, because Ignite has never had real database use
> > case
> > > > in
> > > > > mind. Unfortunately, if our global plans go as expected, it will
> not
> > > work
> > > > > for Ignite 2.x+.
> > > > >
> > > > > On Fri, Jan 13, 2017 at 11:53 AM, Sergi Vladykin <
> > > > sergi.vladykin@gmail.com
> > > > > >
> > > > > wrote:
> > > > >
> > > > > > Lets move on with SQL schema == Ignite cache. It worked always
> like
> > > > > this, I
> > > > > > see no reasons to change this.
> > > > > >
> > > > > > Sergi
> > > > > >
> > > > > > 2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <vozerov@gridgain.com
> >:
> > > > > >
> > > > > > > "Tablespace" (Oracle, PostgreSQL) is what maps better than
> > "schema"
> > > > to
> > > > > > our
> > > > > > > cache. But not ideally still.
> > > > > > >
> > > > > > > On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov <
> > > > > vozerov@gridgain.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Alex,
> > > > > > > >
> > > > > > > > Currently Ignite is not used as database. It is used as
> search
> > > > > engine -
> > > > > > > > several types, several tables, several joins. This is why
> > having
> > > > > > "SCHEMA
> > > > > > > ==
> > > > > > > > cache" was never a problem. Users have never build complex
> SQL
> > > > > > > applications
> > > > > > > > on top of Ignite. But we are going towards database. And my
> > > > question
> > > > > > > stands
> > > > > > > > still - suppose it is Y2019, how is user going to migrate his
> > > > > database
> > > > > > > > containing 20-30-50-100 tables in a single schema in Oracle
> to
> > > > > Ignite?
> > > > > > > >
> > > > > > > > Single cache for all tables? Doens't work - not flexible.
> Users
> > > > will
> > > > > > > > definitely require different cache modes, different
> co-location
> > > > > rules,
> > > > > > > > different number of backups, etc..
> > > > > > > > Schema per table? Doesn't work either - unmanageable and not
> > > > > convenient
> > > > > > > > for users even for relatively small databases.
> > > > > > > >
> > > > > > > > From user perspective schema is logical grouping of database
> > > > objects,
> > > > > > > > nothing more.
> > > > > > > >
> > > > > > > > For Ignite schema could be a logical group of resources
> (nodes,
> > > > > memory
> > > > > > > > pools, caches, etc.). And multiple tables over multiple
> caches
> > > > should
> > > > > > > > reside in it. To the contrast, table definition governs how
> > data
> > > is
> > > > > > > stored.
> > > > > > > > This is similar to, for example, MySQL approach, where you
> > define
> > > > how
> > > > > > you
> > > > > > > > store data on per-table level, and on schema level you define
> > > only
> > > > > > minor
> > > > > > > > things like collation.
> > > > > > > >
> > > > > > > > Vladimir.
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
> > > > > > > > alexander.a.paschenko@gmail.com> wrote:
> > > > > > > >
> > > > > > > >> Vova,
> > > > > > > >>
> > > > > > > >> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <
> > vozerov@gridgain.com
> > > >:
> > > > > > > >> > I am not quite sure I understand the idea of "SCHEMA ==
> > > cache".
> > > > > > > Consider
> > > > > > > >> > some small database with, say, ~30 tables. And user wants
> to
> > > > > migrate
> > > > > > > to
> > > > > > > >> > Ignite. How is he supposed to do so? 30 schemas leading to
> > > > rewrite
> > > > > > of
> > > > > > > >> all
> > > > > > > >> > his SQL scripts? Or 30 key-value pairs in a single cache
> > > leading
> > > > > to
> > > > > > > >> lack of
> > > > > > > >> > flexibility and performance problems?
> > > > > > > >>
> > > > > > > >> But currently schema *is* semantically equal to cache while
> > > table
> > > > is
> > > > > > > >> equal to type descriptor (i.e. type of stored entities),
> > nothing
> > > > new
> > > > > > > >> here.
> > > > > > > >>
> > > > > > > >> Say, in single cache we may have entities of types Person
> and
> > > > > > > >> Organization, those map to two tables with same names, and
> can
> > > be
> > > > > > > >> accessed within the same cache (i.e. schema).
> > > > > > > >>
> > > > > > > >> If we want to limit the user with having single type
> > descriptor
> > > > per
> > > > > > > >> cache (i.e. cache has only one type of stored entities -
> BTW,
> > > > where
> > > > > we
> > > > > > > >> are with this 2.0-wise?), then this notion could change. But
> > > > > currently
> > > > > > > >> what has been suggested already fits quite good with what we
> > do
> > > > have
> > > > > > > >> at the moment regarding semantic of SQL objects.
> > > > > > > >>
> > > > > > > >> - Alex
> > > > > > > >>
> > > > > > > >> > Another example is how to deal with referene tables? Lots
> > > > database
> > > > > > has
> > > > > > > >> > small reference tables which is best to fit REPLICATED
> > cache,
> > > > > while
> > > > > > > >> others
> > > > > > > >> > are usually bound to PARTITIONED mode. "SCHEMA == cache"
> > will
> > > > > force
> > > > > > > >> users
> > > > > > > >> > to split them into separate schemes leading to poor user
> > > > > experience.
> > > > > > > >> >
> > > > > > > >> > I understand that we may have some implementation details
> > > around
> > > > > it
> > > > > > at
> > > > > > > >> the
> > > > > > > >> > moment. But from user perspective "SCHEMA == cache"
> doesn't
> > > make
> > > > > > > sense.
> > > > > > > >> As
> > > > > > > >> > we are going towards AI 2.0 we'd better to rethink this
> > > > approach.
> > > > > > > >> >
> > > > > > > >> > On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <
> > > > dmagda@apache.org>
> > > > > > > >> wrote:
> > > > > > > >> >
> > > > > > > >> >>
> > > > > > > >> >> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <
> > > > > > > >> dsetrakyan@apache.org>
> > > > > > > >> >> wrote:
> > > > > > > >> >> >
> > > > > > > >> >> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
> > > > > > > >> >> sergi.vladykin@gmail.com>
> > > > > > > >> >> > wrote:
> > > > > > > >> >> >
> > > > > > > >> >> >> The xml config was only for example. We can put in
> this
> > > > > > > >> configuration
> > > > > > > >> >> >> string cache config parameters directly like this:
> > > > > > > >> >> >>
> > > > > > > >> >> >> CREATE SCHEMA "MyCacheName" WITH
> > > > > > > >> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> > > > > > > >> >> >>
> > > > > > > >> >> >
> > > > > > > >> >> > This approach makes sense, if it can be easily
> supported
> > > with
> > > > > H2.
> > > > > > > >> >>
> > > > > > > >> >> What’s for affinity keys? Can we make an exception for
> them
> > > by
> > > > > > > >> defining in
> > > > > > > >> >> this part of the statement
> > > > > > > >> >>
> > > > > > > >> >> CREATE TABLE employee (
> > > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > > >> >>    dept_id BIGINT AFFINITY KEY,
> > > > > > > >> >>    name VARCHAR(128),
> > > > > > > >> >> );
> > > > > > > >> >>
> > > > > > > >> >> or that l
> > > > > > > >> >>
> > > > > > > >> >> CREATE TABLE employee (
> > > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > > >> >>    dept_id BIGINT,
> > > > > > > >> >>    name VARCHAR(128),
> > > > > > > >> >>    CONSTRAINT affKey AFFINITY KEY(dept_id)
> > > > > > > >> >> );
> > > > > > > >> >>
> > > > > > > >> >> ?
> > > > > > > >> >>
> > > > > > > >> >> —
> > > > > > > >> >> Denis
> > > > > > > >> >>
> > > > > > > >> >>
> > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>



-- 
Sergey Kozlov
GridGain Systems
www.gridgain.com

Re: DDL implementation details

Posted by Dmitriy Setrakyan <ds...@apache.org>.

On Mon, Jan 16, 2017 at 12:06 PM, Sergi Vladykin <se...@gmail.com>
wrote:

> Dima,
>
> I agree that cache==table is definitely a wrong choice, but as far as I see
> Vova suggests having cache==tablespace instead of cache==schema. I tend to
> agree with this decoupling of physical and logical grouping, but the
> concern is that it will require much more work to do.
>

I am still unclear. What would a tablespace map to? Would it be an Ignite
cache?


>
> Sergi
>
> 2017-01-16 21:35 GMT+03:00 Dmitriy Setrakyan <ds...@apache.org>:
>
> > Vova,
> >
> > Currently I see only 2 ways we can proceed here:
> >
> >    1. cache == table
> >    2. cache == schema
> >
> > I agree that "cache==table" may be more flexible, but I don't think it
> will
> > work in Ignite.
> > We may end up with 1,000s of caches, which will carry significant
> overhead
> > on memory and cluster overall. I think that we have no choice but to take
> > "cache==schema" approach.
> >
> > D.
> >
> > On Mon, Jan 16, 2017 at 1:00 AM, Vladimir Ozerov <vo...@gridgain.com>
> > wrote:
> >
> > > Sergi, Dima,
> > >
> > > In the scope of Ignite 1.x it is perfectly fine to have "schema =
> cache".
> > > Nobody suffers from it because nobody use Ignite as database. But in
> > > future, thanks to page memory, we are going to target real database use
> > > cases. Users will have multiple tables in Ignite. Plus views, triggers,
> > > constraints, etc.. All these features are very useful and easy to
> > implement
> > > provided that we already have table and index implementations. And in
> > > databases all related objects are *logically *grouped in a "schema".
> This
> > > is convenient for users: less boilerplate in SQL, better manageability
> > > (remember that database users will definitely need some console and/or
> UI
> > > tools to manage Ignite as a database).
> > >
> > > What you offer is to group database objects *physically *rather than
> > > logically. It will lead to:
> > > - Boilerplate in queries
> > > - Inconvenient database management. All the things database users are
> > used
> > > to - import/export tools, UIs, "USING" keyword, etc, will look weird in
> > > Ignite as there will be no way to group arbitrary objects logically.
> > >
> > > With this approach almost every user will have to use two schemes
> instead
> > > of one - one for operational data (PARTITIONED) and one for reference
> > data
> > > (REPLICATED). No conventional database works this way.
> > >
> > > Vladimir.
> > >
> > > On Fri, Jan 13, 2017 at 9:18 PM, Dmitriy Setrakyan <
> > dsetrakyan@apache.org>
> > > wrote:
> > >
> > > > Vova,
> > > >
> > > > I will join Sergi here. It seems like "schema = cache" will take care
> > of
> > > > all different configuration properties required for different groups
> of
> > > > caches. In addition, it cleanly maps into current Ignite
> architecture.
> > We
> > > > will need to have a very strong reason to move away from it.
> > > >
> > > > D.
> > > >
> > > > On Fri, Jan 13, 2017 at 2:39 AM, Vladimir Ozerov <
> vozerov@gridgain.com
> > >
> > > > wrote:
> > > >
> > > > > Correct, it worked, because Ignite has never had real database use
> > case
> > > > in
> > > > > mind. Unfortunately, if our global plans go as expected, it will
> not
> > > work
> > > > > for Ignite 2.x+.
> > > > >
> > > > > On Fri, Jan 13, 2017 at 11:53 AM, Sergi Vladykin <
> > > > sergi.vladykin@gmail.com
> > > > > >
> > > > > wrote:
> > > > >
> > > > > > Lets move on with SQL schema == Ignite cache. It worked always
> like
> > > > > this, I
> > > > > > see no reasons to change this.
> > > > > >
> > > > > > Sergi
> > > > > >
> > > > > > 2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <vozerov@gridgain.com
> >:
> > > > > >
> > > > > > > "Tablespace" (Oracle, PostgreSQL) is what maps better than
> > "schema"
> > > > to
> > > > > > our
> > > > > > > cache. But not ideally still.
> > > > > > >
> > > > > > > On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov <
> > > > > vozerov@gridgain.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Alex,
> > > > > > > >
> > > > > > > > Currently Ignite is not used as database. It is used as
> search
> > > > > engine -
> > > > > > > > several types, several tables, several joins. This is why
> > having
> > > > > > "SCHEMA
> > > > > > > ==
> > > > > > > > cache" was never a problem. Users have never build complex
> SQL
> > > > > > > applications
> > > > > > > > on top of Ignite. But we are going towards database. And my
> > > > question
> > > > > > > stands
> > > > > > > > still - suppose it is Y2019, how is user going to migrate his
> > > > > database
> > > > > > > > containing 20-30-50-100 tables in a single schema in Oracle
> to
> > > > > Ignite?
> > > > > > > >
> > > > > > > > Single cache for all tables? Doens't work - not flexible.
> Users
> > > > will
> > > > > > > > definitely require different cache modes, different
> co-location
> > > > > rules,
> > > > > > > > different number of backups, etc..
> > > > > > > > Schema per table? Doesn't work either - unmanageable and not
> > > > > convenient
> > > > > > > > for users even for relatively small databases.
> > > > > > > >
> > > > > > > > From user perspective schema is logical grouping of database
> > > > objects,
> > > > > > > > nothing more.
> > > > > > > >
> > > > > > > > For Ignite schema could be a logical group of resources
> (nodes,
> > > > > memory
> > > > > > > > pools, caches, etc.). And multiple tables over multiple
> caches
> > > > should
> > > > > > > > reside in it. To the contrast, table definition governs how
> > data
> > > is
> > > > > > > stored.
> > > > > > > > This is similar to, for example, MySQL approach, where you
> > define
> > > > how
> > > > > > you
> > > > > > > > store data on per-table level, and on schema level you define
> > > only
> > > > > > minor
> > > > > > > > things like collation.
> > > > > > > >
> > > > > > > > Vladimir.
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
> > > > > > > > alexander.a.paschenko@gmail.com> wrote:
> > > > > > > >
> > > > > > > >> Vova,
> > > > > > > >>
> > > > > > > >> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <
> > vozerov@gridgain.com
> > > >:
> > > > > > > >> > I am not quite sure I understand the idea of "SCHEMA ==
> > > cache".
> > > > > > > Consider
> > > > > > > >> > some small database with, say, ~30 tables. And user wants
> to
> > > > > migrate
> > > > > > > to
> > > > > > > >> > Ignite. How is he supposed to do so? 30 schemas leading to
> > > > rewrite
> > > > > > of
> > > > > > > >> all
> > > > > > > >> > his SQL scripts? Or 30 key-value pairs in a single cache
> > > leading
> > > > > to
> > > > > > > >> lack of
> > > > > > > >> > flexibility and performance problems?
> > > > > > > >>
> > > > > > > >> But currently schema *is* semantically equal to cache while
> > > table
> > > > is
> > > > > > > >> equal to type descriptor (i.e. type of stored entities),
> > nothing
> > > > new
> > > > > > > >> here.
> > > > > > > >>
> > > > > > > >> Say, in single cache we may have entities of types Person
> and
> > > > > > > >> Organization, those map to two tables with same names, and
> can
> > > be
> > > > > > > >> accessed within the same cache (i.e. schema).
> > > > > > > >>
> > > > > > > >> If we want to limit the user with having single type
> > descriptor
> > > > per
> > > > > > > >> cache (i.e. cache has only one type of stored entities -
> BTW,
> > > > where
> > > > > we
> > > > > > > >> are with this 2.0-wise?), then this notion could change. But
> > > > > currently
> > > > > > > >> what has been suggested already fits quite good with what we
> > do
> > > > have
> > > > > > > >> at the moment regarding semantic of SQL objects.
> > > > > > > >>
> > > > > > > >> - Alex
> > > > > > > >>
> > > > > > > >> > Another example is how to deal with referene tables? Lots
> > > > database
> > > > > > has
> > > > > > > >> > small reference tables which is best to fit REPLICATED
> > cache,
> > > > > while
> > > > > > > >> others
> > > > > > > >> > are usually bound to PARTITIONED mode. "SCHEMA == cache"
> > will
> > > > > force
> > > > > > > >> users
> > > > > > > >> > to split them into separate schemes leading to poor user
> > > > > experience.
> > > > > > > >> >
> > > > > > > >> > I understand that we may have some implementation details
> > > around
> > > > > it
> > > > > > at
> > > > > > > >> the
> > > > > > > >> > moment. But from user perspective "SCHEMA == cache"
> doesn't
> > > make
> > > > > > > sense.
> > > > > > > >> As
> > > > > > > >> > we are going towards AI 2.0 we'd better to rethink this
> > > > approach.
> > > > > > > >> >
> > > > > > > >> > On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <
> > > > dmagda@apache.org>
> > > > > > > >> wrote:
> > > > > > > >> >
> > > > > > > >> >>
> > > > > > > >> >> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <
> > > > > > > >> dsetrakyan@apache.org>
> > > > > > > >> >> wrote:
> > > > > > > >> >> >
> > > > > > > >> >> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
> > > > > > > >> >> sergi.vladykin@gmail.com>
> > > > > > > >> >> > wrote:
> > > > > > > >> >> >
> > > > > > > >> >> >> The xml config was only for example. We can put in
> this
> > > > > > > >> configuration
> > > > > > > >> >> >> string cache config parameters directly like this:
> > > > > > > >> >> >>
> > > > > > > >> >> >> CREATE SCHEMA "MyCacheName" WITH
> > > > > > > >> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> > > > > > > >> >> >>
> > > > > > > >> >> >
> > > > > > > >> >> > This approach makes sense, if it can be easily
> supported
> > > with
> > > > > H2.
> > > > > > > >> >>
> > > > > > > >> >> What’s for affinity keys? Can we make an exception for
> them
> > > by
> > > > > > > >> defining in
> > > > > > > >> >> this part of the statement
> > > > > > > >> >>
> > > > > > > >> >> CREATE TABLE employee (
> > > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > > >> >>    dept_id BIGINT AFFINITY KEY,
> > > > > > > >> >>    name VARCHAR(128),
> > > > > > > >> >> );
> > > > > > > >> >>
> > > > > > > >> >> or that l
> > > > > > > >> >>
> > > > > > > >> >> CREATE TABLE employee (
> > > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > > >> >>    dept_id BIGINT,
> > > > > > > >> >>    name VARCHAR(128),
> > > > > > > >> >>    CONSTRAINT affKey AFFINITY KEY(dept_id)
> > > > > > > >> >> );
> > > > > > > >> >>
> > > > > > > >> >> ?
> > > > > > > >> >>
> > > > > > > >> >> —
> > > > > > > >> >> Denis
> > > > > > > >> >>
> > > > > > > >> >>
> > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: DDL implementation details

Posted by Sergi Vladykin <se...@gmail.com>.

Dima,

I agree that cache==table is definitely a wrong choice, but as far as I see
Vova suggests having cache==tablespace instead of cache==schema. I tend to
agree with this decoupling of physical and logical grouping, but the
concern is that it will require much more work to do.

Sergi

2017-01-16 21:35 GMT+03:00 Dmitriy Setrakyan <ds...@apache.org>:

> Vova,
>
> Currently I see only 2 ways we can proceed here:
>
>    1. cache == table
>    2. cache == schema
>
> I agree that "cache==table" may be more flexible, but I don't think it will
> work in Ignite.
> We may end up with 1,000s of caches, which will carry significant overhead
> on memory and cluster overall. I think that we have no choice but to take
> "cache==schema" approach.
>
> D.
>
> On Mon, Jan 16, 2017 at 1:00 AM, Vladimir Ozerov <vo...@gridgain.com>
> wrote:
>
> > Sergi, Dima,
> >
> > In the scope of Ignite 1.x it is perfectly fine to have "schema = cache".
> > Nobody suffers from it because nobody use Ignite as database. But in
> > future, thanks to page memory, we are going to target real database use
> > cases. Users will have multiple tables in Ignite. Plus views, triggers,
> > constraints, etc.. All these features are very useful and easy to
> implement
> > provided that we already have table and index implementations. And in
> > databases all related objects are *logically *grouped in a "schema". This
> > is convenient for users: less boilerplate in SQL, better manageability
> > (remember that database users will definitely need some console and/or UI
> > tools to manage Ignite as a database).
> >
> > What you offer is to group database objects *physically *rather than
> > logically. It will lead to:
> > - Boilerplate in queries
> > - Inconvenient database management. All the things database users are
> used
> > to - import/export tools, UIs, "USING" keyword, etc, will look weird in
> > Ignite as there will be no way to group arbitrary objects logically.
> >
> > With this approach almost every user will have to use two schemes instead
> > of one - one for operational data (PARTITIONED) and one for reference
> data
> > (REPLICATED). No conventional database works this way.
> >
> > Vladimir.
> >
> > On Fri, Jan 13, 2017 at 9:18 PM, Dmitriy Setrakyan <
> dsetrakyan@apache.org>
> > wrote:
> >
> > > Vova,
> > >
> > > I will join Sergi here. It seems like "schema = cache" will take care
> of
> > > all different configuration properties required for different groups of
> > > caches. In addition, it cleanly maps into current Ignite architecture.
> We
> > > will need to have a very strong reason to move away from it.
> > >
> > > D.
> > >
> > > On Fri, Jan 13, 2017 at 2:39 AM, Vladimir Ozerov <vozerov@gridgain.com
> >
> > > wrote:
> > >
> > > > Correct, it worked, because Ignite has never had real database use
> case
> > > in
> > > > mind. Unfortunately, if our global plans go as expected, it will not
> > work
> > > > for Ignite 2.x+.
> > > >
> > > > On Fri, Jan 13, 2017 at 11:53 AM, Sergi Vladykin <
> > > sergi.vladykin@gmail.com
> > > > >
> > > > wrote:
> > > >
> > > > > Lets move on with SQL schema == Ignite cache. It worked always like
> > > > this, I
> > > > > see no reasons to change this.
> > > > >
> > > > > Sergi
> > > > >
> > > > > 2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:
> > > > >
> > > > > > "Tablespace" (Oracle, PostgreSQL) is what maps better than
> "schema"
> > > to
> > > > > our
> > > > > > cache. But not ideally still.
> > > > > >
> > > > > > On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov <
> > > > vozerov@gridgain.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Alex,
> > > > > > >
> > > > > > > Currently Ignite is not used as database. It is used as search
> > > > engine -
> > > > > > > several types, several tables, several joins. This is why
> having
> > > > > "SCHEMA
> > > > > > ==
> > > > > > > cache" was never a problem. Users have never build complex SQL
> > > > > > applications
> > > > > > > on top of Ignite. But we are going towards database. And my
> > > question
> > > > > > stands
> > > > > > > still - suppose it is Y2019, how is user going to migrate his
> > > > database
> > > > > > > containing 20-30-50-100 tables in a single schema in Oracle to
> > > > Ignite?
> > > > > > >
> > > > > > > Single cache for all tables? Doens't work - not flexible. Users
> > > will
> > > > > > > definitely require different cache modes, different co-location
> > > > rules,
> > > > > > > different number of backups, etc..
> > > > > > > Schema per table? Doesn't work either - unmanageable and not
> > > > convenient
> > > > > > > for users even for relatively small databases.
> > > > > > >
> > > > > > > From user perspective schema is logical grouping of database
> > > objects,
> > > > > > > nothing more.
> > > > > > >
> > > > > > > For Ignite schema could be a logical group of resources (nodes,
> > > > memory
> > > > > > > pools, caches, etc.). And multiple tables over multiple caches
> > > should
> > > > > > > reside in it. To the contrast, table definition governs how
> data
> > is
> > > > > > stored.
> > > > > > > This is similar to, for example, MySQL approach, where you
> define
> > > how
> > > > > you
> > > > > > > store data on per-table level, and on schema level you define
> > only
> > > > > minor
> > > > > > > things like collation.
> > > > > > >
> > > > > > > Vladimir.
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
> > > > > > > alexander.a.paschenko@gmail.com> wrote:
> > > > > > >
> > > > > > >> Vova,
> > > > > > >>
> > > > > > >> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <
> vozerov@gridgain.com
> > >:
> > > > > > >> > I am not quite sure I understand the idea of "SCHEMA ==
> > cache".
> > > > > > Consider
> > > > > > >> > some small database with, say, ~30 tables. And user wants to
> > > > migrate
> > > > > > to
> > > > > > >> > Ignite. How is he supposed to do so? 30 schemas leading to
> > > rewrite
> > > > > of
> > > > > > >> all
> > > > > > >> > his SQL scripts? Or 30 key-value pairs in a single cache
> > leading
> > > > to
> > > > > > >> lack of
> > > > > > >> > flexibility and performance problems?
> > > > > > >>
> > > > > > >> But currently schema *is* semantically equal to cache while
> > table
> > > is
> > > > > > >> equal to type descriptor (i.e. type of stored entities),
> nothing
> > > new
> > > > > > >> here.
> > > > > > >>
> > > > > > >> Say, in single cache we may have entities of types Person and
> > > > > > >> Organization, those map to two tables with same names, and can
> > be
> > > > > > >> accessed within the same cache (i.e. schema).
> > > > > > >>
> > > > > > >> If we want to limit the user with having single type
> descriptor
> > > per
> > > > > > >> cache (i.e. cache has only one type of stored entities - BTW,
> > > where
> > > > we
> > > > > > >> are with this 2.0-wise?), then this notion could change. But
> > > > currently
> > > > > > >> what has been suggested already fits quite good with what we
> do
> > > have
> > > > > > >> at the moment regarding semantic of SQL objects.
> > > > > > >>
> > > > > > >> - Alex
> > > > > > >>
> > > > > > >> > Another example is how to deal with referene tables? Lots
> > > database
> > > > > has
> > > > > > >> > small reference tables which is best to fit REPLICATED
> cache,
> > > > while
> > > > > > >> others
> > > > > > >> > are usually bound to PARTITIONED mode. "SCHEMA == cache"
> will
> > > > force
> > > > > > >> users
> > > > > > >> > to split them into separate schemes leading to poor user
> > > > experience.
> > > > > > >> >
> > > > > > >> > I understand that we may have some implementation details
> > around
> > > > it
> > > > > at
> > > > > > >> the
> > > > > > >> > moment. But from user perspective "SCHEMA == cache" doesn't
> > make
> > > > > > sense.
> > > > > > >> As
> > > > > > >> > we are going towards AI 2.0 we'd better to rethink this
> > > approach.
> > > > > > >> >
> > > > > > >> > On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <
> > > dmagda@apache.org>
> > > > > > >> wrote:
> > > > > > >> >
> > > > > > >> >>
> > > > > > >> >> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <
> > > > > > >> dsetrakyan@apache.org>
> > > > > > >> >> wrote:
> > > > > > >> >> >
> > > > > > >> >> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
> > > > > > >> >> sergi.vladykin@gmail.com>
> > > > > > >> >> > wrote:
> > > > > > >> >> >
> > > > > > >> >> >> The xml config was only for example. We can put in this
> > > > > > >> configuration
> > > > > > >> >> >> string cache config parameters directly like this:
> > > > > > >> >> >>
> > > > > > >> >> >> CREATE SCHEMA "MyCacheName" WITH
> > > > > > >> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> > > > > > >> >> >>
> > > > > > >> >> >
> > > > > > >> >> > This approach makes sense, if it can be easily supported
> > with
> > > > H2.
> > > > > > >> >>
> > > > > > >> >> What’s for affinity keys? Can we make an exception for them
> > by
> > > > > > >> defining in
> > > > > > >> >> this part of the statement
> > > > > > >> >>
> > > > > > >> >> CREATE TABLE employee (
> > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > >> >>    dept_id BIGINT AFFINITY KEY,
> > > > > > >> >>    name VARCHAR(128),
> > > > > > >> >> );
> > > > > > >> >>
> > > > > > >> >> or that l
> > > > > > >> >>
> > > > > > >> >> CREATE TABLE employee (
> > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > >> >>    dept_id BIGINT,
> > > > > > >> >>    name VARCHAR(128),
> > > > > > >> >>    CONSTRAINT affKey AFFINITY KEY(dept_id)
> > > > > > >> >> );
> > > > > > >> >>
> > > > > > >> >> ?
> > > > > > >> >>
> > > > > > >> >> —
> > > > > > >> >> Denis
> > > > > > >> >>
> > > > > > >> >>
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: DDL implementation details

Posted by Dmitriy Setrakyan <ds...@apache.org>.

Vova,

Currently I see only 2 ways we can proceed here:

   1. cache == table
   2. cache == schema

I agree that "cache==table" may be more flexible, but I don't think it will
work in Ignite.
We may end up with 1,000s of caches, which will carry significant overhead
on memory and cluster overall. I think that we have no choice but to take
"cache==schema" approach.

D.

On Mon, Jan 16, 2017 at 1:00 AM, Vladimir Ozerov <vo...@gridgain.com>
wrote:

> Sergi, Dima,
>
> In the scope of Ignite 1.x it is perfectly fine to have "schema = cache".
> Nobody suffers from it because nobody use Ignite as database. But in
> future, thanks to page memory, we are going to target real database use
> cases. Users will have multiple tables in Ignite. Plus views, triggers,
> constraints, etc.. All these features are very useful and easy to implement
> provided that we already have table and index implementations. And in
> databases all related objects are *logically *grouped in a "schema". This
> is convenient for users: less boilerplate in SQL, better manageability
> (remember that database users will definitely need some console and/or UI
> tools to manage Ignite as a database).
>
> What you offer is to group database objects *physically *rather than
> logically. It will lead to:
> - Boilerplate in queries
> - Inconvenient database management. All the things database users are used
> to - import/export tools, UIs, "USING" keyword, etc, will look weird in
> Ignite as there will be no way to group arbitrary objects logically.
>
> With this approach almost every user will have to use two schemes instead
> of one - one for operational data (PARTITIONED) and one for reference data
> (REPLICATED). No conventional database works this way.
>
> Vladimir.
>
> On Fri, Jan 13, 2017 at 9:18 PM, Dmitriy Setrakyan <ds...@apache.org>
> wrote:
>
> > Vova,
> >
> > I will join Sergi here. It seems like "schema = cache" will take care of
> > all different configuration properties required for different groups of
> > caches. In addition, it cleanly maps into current Ignite architecture. We
> > will need to have a very strong reason to move away from it.
> >
> > D.
> >
> > On Fri, Jan 13, 2017 at 2:39 AM, Vladimir Ozerov <vo...@gridgain.com>
> > wrote:
> >
> > > Correct, it worked, because Ignite has never had real database use case
> > in
> > > mind. Unfortunately, if our global plans go as expected, it will not
> work
> > > for Ignite 2.x+.
> > >
> > > On Fri, Jan 13, 2017 at 11:53 AM, Sergi Vladykin <
> > sergi.vladykin@gmail.com
> > > >
> > > wrote:
> > >
> > > > Lets move on with SQL schema == Ignite cache. It worked always like
> > > this, I
> > > > see no reasons to change this.
> > > >
> > > > Sergi
> > > >
> > > > 2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:
> > > >
> > > > > "Tablespace" (Oracle, PostgreSQL) is what maps better than "schema"
> > to
> > > > our
> > > > > cache. But not ideally still.
> > > > >
> > > > > On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov <
> > > vozerov@gridgain.com>
> > > > > wrote:
> > > > >
> > > > > > Alex,
> > > > > >
> > > > > > Currently Ignite is not used as database. It is used as search
> > > engine -
> > > > > > several types, several tables, several joins. This is why having
> > > > "SCHEMA
> > > > > ==
> > > > > > cache" was never a problem. Users have never build complex SQL
> > > > > applications
> > > > > > on top of Ignite. But we are going towards database. And my
> > question
> > > > > stands
> > > > > > still - suppose it is Y2019, how is user going to migrate his
> > > database
> > > > > > containing 20-30-50-100 tables in a single schema in Oracle to
> > > Ignite?
> > > > > >
> > > > > > Single cache for all tables? Doens't work - not flexible. Users
> > will
> > > > > > definitely require different cache modes, different co-location
> > > rules,
> > > > > > different number of backups, etc..
> > > > > > Schema per table? Doesn't work either - unmanageable and not
> > > convenient
> > > > > > for users even for relatively small databases.
> > > > > >
> > > > > > From user perspective schema is logical grouping of database
> > objects,
> > > > > > nothing more.
> > > > > >
> > > > > > For Ignite schema could be a logical group of resources (nodes,
> > > memory
> > > > > > pools, caches, etc.). And multiple tables over multiple caches
> > should
> > > > > > reside in it. To the contrast, table definition governs how data
> is
> > > > > stored.
> > > > > > This is similar to, for example, MySQL approach, where you define
> > how
> > > > you
> > > > > > store data on per-table level, and on schema level you define
> only
> > > > minor
> > > > > > things like collation.
> > > > > >
> > > > > > Vladimir.
> > > > > >
> > > > > >
> > > > > > On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
> > > > > > alexander.a.paschenko@gmail.com> wrote:
> > > > > >
> > > > > >> Vova,
> > > > > >>
> > > > > >> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <vozerov@gridgain.com
> >:
> > > > > >> > I am not quite sure I understand the idea of "SCHEMA ==
> cache".
> > > > > Consider
> > > > > >> > some small database with, say, ~30 tables. And user wants to
> > > migrate
> > > > > to
> > > > > >> > Ignite. How is he supposed to do so? 30 schemas leading to
> > rewrite
> > > > of
> > > > > >> all
> > > > > >> > his SQL scripts? Or 30 key-value pairs in a single cache
> leading
> > > to
> > > > > >> lack of
> > > > > >> > flexibility and performance problems?
> > > > > >>
> > > > > >> But currently schema *is* semantically equal to cache while
> table
> > is
> > > > > >> equal to type descriptor (i.e. type of stored entities), nothing
> > new
> > > > > >> here.
> > > > > >>
> > > > > >> Say, in single cache we may have entities of types Person and
> > > > > >> Organization, those map to two tables with same names, and can
> be
> > > > > >> accessed within the same cache (i.e. schema).
> > > > > >>
> > > > > >> If we want to limit the user with having single type descriptor
> > per
> > > > > >> cache (i.e. cache has only one type of stored entities - BTW,
> > where
> > > we
> > > > > >> are with this 2.0-wise?), then this notion could change. But
> > > currently
> > > > > >> what has been suggested already fits quite good with what we do
> > have
> > > > > >> at the moment regarding semantic of SQL objects.
> > > > > >>
> > > > > >> - Alex
> > > > > >>
> > > > > >> > Another example is how to deal with referene tables? Lots
> > database
> > > > has
> > > > > >> > small reference tables which is best to fit REPLICATED cache,
> > > while
> > > > > >> others
> > > > > >> > are usually bound to PARTITIONED mode. "SCHEMA == cache" will
> > > force
> > > > > >> users
> > > > > >> > to split them into separate schemes leading to poor user
> > > experience.
> > > > > >> >
> > > > > >> > I understand that we may have some implementation details
> around
> > > it
> > > > at
> > > > > >> the
> > > > > >> > moment. But from user perspective "SCHEMA == cache" doesn't
> make
> > > > > sense.
> > > > > >> As
> > > > > >> > we are going towards AI 2.0 we'd better to rethink this
> > approach.
> > > > > >> >
> > > > > >> > On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <
> > dmagda@apache.org>
> > > > > >> wrote:
> > > > > >> >
> > > > > >> >>
> > > > > >> >> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <
> > > > > >> dsetrakyan@apache.org>
> > > > > >> >> wrote:
> > > > > >> >> >
> > > > > >> >> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
> > > > > >> >> sergi.vladykin@gmail.com>
> > > > > >> >> > wrote:
> > > > > >> >> >
> > > > > >> >> >> The xml config was only for example. We can put in this
> > > > > >> configuration
> > > > > >> >> >> string cache config parameters directly like this:
> > > > > >> >> >>
> > > > > >> >> >> CREATE SCHEMA "MyCacheName" WITH
> > > > > >> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> > > > > >> >> >>
> > > > > >> >> >
> > > > > >> >> > This approach makes sense, if it can be easily supported
> with
> > > H2.
> > > > > >> >>
> > > > > >> >> What’s for affinity keys? Can we make an exception for them
> by
> > > > > >> defining in
> > > > > >> >> this part of the statement
> > > > > >> >>
> > > > > >> >> CREATE TABLE employee (
> > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > >> >>    dept_id BIGINT AFFINITY KEY,
> > > > > >> >>    name VARCHAR(128),
> > > > > >> >> );
> > > > > >> >>
> > > > > >> >> or that l
> > > > > >> >>
> > > > > >> >> CREATE TABLE employee (
> > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > >> >>    dept_id BIGINT,
> > > > > >> >>    name VARCHAR(128),
> > > > > >> >>    CONSTRAINT affKey AFFINITY KEY(dept_id)
> > > > > >> >> );
> > > > > >> >>
> > > > > >> >> ?
> > > > > >> >>
> > > > > >> >> —
> > > > > >> >> Denis
> > > > > >> >>
> > > > > >> >>
> > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: DDL implementation details

Posted by Vladimir Ozerov <vo...@gridgain.com>.

Sergi, Dima,

In the scope of Ignite 1.x it is perfectly fine to have "schema = cache".
Nobody suffers from it because nobody use Ignite as database. But in
future, thanks to page memory, we are going to target real database use
cases. Users will have multiple tables in Ignite. Plus views, triggers,
constraints, etc.. All these features are very useful and easy to implement
provided that we already have table and index implementations. And in
databases all related objects are *logically *grouped in a "schema". This
is convenient for users: less boilerplate in SQL, better manageability
(remember that database users will definitely need some console and/or UI
tools to manage Ignite as a database).

What you offer is to group database objects *physically *rather than
logically. It will lead to:
- Boilerplate in queries
- Inconvenient database management. All the things database users are used
to - import/export tools, UIs, "USING" keyword, etc, will look weird in
Ignite as there will be no way to group arbitrary objects logically.

With this approach almost every user will have to use two schemes instead
of one - one for operational data (PARTITIONED) and one for reference data
(REPLICATED). No conventional database works this way.

Vladimir.

On Fri, Jan 13, 2017 at 9:18 PM, Dmitriy Setrakyan <ds...@apache.org>
wrote:

> Vova,
>
> I will join Sergi here. It seems like "schema = cache" will take care of
> all different configuration properties required for different groups of
> caches. In addition, it cleanly maps into current Ignite architecture. We
> will need to have a very strong reason to move away from it.
>
> D.
>
> On Fri, Jan 13, 2017 at 2:39 AM, Vladimir Ozerov <vo...@gridgain.com>
> wrote:
>
> > Correct, it worked, because Ignite has never had real database use case
> in
> > mind. Unfortunately, if our global plans go as expected, it will not work
> > for Ignite 2.x+.
> >
> > On Fri, Jan 13, 2017 at 11:53 AM, Sergi Vladykin <
> sergi.vladykin@gmail.com
> > >
> > wrote:
> >
> > > Lets move on with SQL schema == Ignite cache. It worked always like
> > this, I
> > > see no reasons to change this.
> > >
> > > Sergi
> > >
> > > 2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:
> > >
> > > > "Tablespace" (Oracle, PostgreSQL) is what maps better than "schema"
> to
> > > our
> > > > cache. But not ideally still.
> > > >
> > > > On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov <
> > vozerov@gridgain.com>
> > > > wrote:
> > > >
> > > > > Alex,
> > > > >
> > > > > Currently Ignite is not used as database. It is used as search
> > engine -
> > > > > several types, several tables, several joins. This is why having
> > > "SCHEMA
> > > > ==
> > > > > cache" was never a problem. Users have never build complex SQL
> > > > applications
> > > > > on top of Ignite. But we are going towards database. And my
> question
> > > > stands
> > > > > still - suppose it is Y2019, how is user going to migrate his
> > database
> > > > > containing 20-30-50-100 tables in a single schema in Oracle to
> > Ignite?
> > > > >
> > > > > Single cache for all tables? Doens't work - not flexible. Users
> will
> > > > > definitely require different cache modes, different co-location
> > rules,
> > > > > different number of backups, etc..
> > > > > Schema per table? Doesn't work either - unmanageable and not
> > convenient
> > > > > for users even for relatively small databases.
> > > > >
> > > > > From user perspective schema is logical grouping of database
> objects,
> > > > > nothing more.
> > > > >
> > > > > For Ignite schema could be a logical group of resources (nodes,
> > memory
> > > > > pools, caches, etc.). And multiple tables over multiple caches
> should
> > > > > reside in it. To the contrast, table definition governs how data is
> > > > stored.
> > > > > This is similar to, for example, MySQL approach, where you define
> how
> > > you
> > > > > store data on per-table level, and on schema level you define only
> > > minor
> > > > > things like collation.
> > > > >
> > > > > Vladimir.
> > > > >
> > > > >
> > > > > On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
> > > > > alexander.a.paschenko@gmail.com> wrote:
> > > > >
> > > > >> Vova,
> > > > >>
> > > > >> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <vo...@gridgain.com>:
> > > > >> > I am not quite sure I understand the idea of "SCHEMA == cache".
> > > > Consider
> > > > >> > some small database with, say, ~30 tables. And user wants to
> > migrate
> > > > to
> > > > >> > Ignite. How is he supposed to do so? 30 schemas leading to
> rewrite
> > > of
> > > > >> all
> > > > >> > his SQL scripts? Or 30 key-value pairs in a single cache leading
> > to
> > > > >> lack of
> > > > >> > flexibility and performance problems?
> > > > >>
> > > > >> But currently schema *is* semantically equal to cache while table
> is
> > > > >> equal to type descriptor (i.e. type of stored entities), nothing
> new
> > > > >> here.
> > > > >>
> > > > >> Say, in single cache we may have entities of types Person and
> > > > >> Organization, those map to two tables with same names, and can be
> > > > >> accessed within the same cache (i.e. schema).
> > > > >>
> > > > >> If we want to limit the user with having single type descriptor
> per
> > > > >> cache (i.e. cache has only one type of stored entities - BTW,
> where
> > we
> > > > >> are with this 2.0-wise?), then this notion could change. But
> > currently
> > > > >> what has been suggested already fits quite good with what we do
> have
> > > > >> at the moment regarding semantic of SQL objects.
> > > > >>
> > > > >> - Alex
> > > > >>
> > > > >> > Another example is how to deal with referene tables? Lots
> database
> > > has
> > > > >> > small reference tables which is best to fit REPLICATED cache,
> > while
> > > > >> others
> > > > >> > are usually bound to PARTITIONED mode. "SCHEMA == cache" will
> > force
> > > > >> users
> > > > >> > to split them into separate schemes leading to poor user
> > experience.
> > > > >> >
> > > > >> > I understand that we may have some implementation details around
> > it
> > > at
> > > > >> the
> > > > >> > moment. But from user perspective "SCHEMA == cache" doesn't make
> > > > sense.
> > > > >> As
> > > > >> > we are going towards AI 2.0 we'd better to rethink this
> approach.
> > > > >> >
> > > > >> > On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <
> dmagda@apache.org>
> > > > >> wrote:
> > > > >> >
> > > > >> >>
> > > > >> >> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <
> > > > >> dsetrakyan@apache.org>
> > > > >> >> wrote:
> > > > >> >> >
> > > > >> >> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
> > > > >> >> sergi.vladykin@gmail.com>
> > > > >> >> > wrote:
> > > > >> >> >
> > > > >> >> >> The xml config was only for example. We can put in this
> > > > >> configuration
> > > > >> >> >> string cache config parameters directly like this:
> > > > >> >> >>
> > > > >> >> >> CREATE SCHEMA "MyCacheName" WITH
> > > > >> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> > > > >> >> >>
> > > > >> >> >
> > > > >> >> > This approach makes sense, if it can be easily supported with
> > H2.
> > > > >> >>
> > > > >> >> What’s for affinity keys? Can we make an exception for them by
> > > > >> defining in
> > > > >> >> this part of the statement
> > > > >> >>
> > > > >> >> CREATE TABLE employee (
> > > > >> >>    id BIGINT PRIMARY KEY,
> > > > >> >>    dept_id BIGINT AFFINITY KEY,
> > > > >> >>    name VARCHAR(128),
> > > > >> >> );
> > > > >> >>
> > > > >> >> or that l
> > > > >> >>
> > > > >> >> CREATE TABLE employee (
> > > > >> >>    id BIGINT PRIMARY KEY,
> > > > >> >>    dept_id BIGINT,
> > > > >> >>    name VARCHAR(128),
> > > > >> >>    CONSTRAINT affKey AFFINITY KEY(dept_id)
> > > > >> >> );
> > > > >> >>
> > > > >> >> ?
> > > > >> >>
> > > > >> >> —
> > > > >> >> Denis
> > > > >> >>
> > > > >> >>
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>

Re: DDL implementation details

Posted by Dmitriy Setrakyan <ds...@apache.org>.

Vova,

I will join Sergi here. It seems like "schema = cache" will take care of
all different configuration properties required for different groups of
caches. In addition, it cleanly maps into current Ignite architecture. We
will need to have a very strong reason to move away from it.

D.

On Fri, Jan 13, 2017 at 2:39 AM, Vladimir Ozerov <vo...@gridgain.com>
wrote:

> Correct, it worked, because Ignite has never had real database use case in
> mind. Unfortunately, if our global plans go as expected, it will not work
> for Ignite 2.x+.
>
> On Fri, Jan 13, 2017 at 11:53 AM, Sergi Vladykin <sergi.vladykin@gmail.com
> >
> wrote:
>
> > Lets move on with SQL schema == Ignite cache. It worked always like
> this, I
> > see no reasons to change this.
> >
> > Sergi
> >
> > 2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:
> >
> > > "Tablespace" (Oracle, PostgreSQL) is what maps better than "schema" to
> > our
> > > cache. But not ideally still.
> > >
> > > On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov <
> vozerov@gridgain.com>
> > > wrote:
> > >
> > > > Alex,
> > > >
> > > > Currently Ignite is not used as database. It is used as search
> engine -
> > > > several types, several tables, several joins. This is why having
> > "SCHEMA
> > > ==
> > > > cache" was never a problem. Users have never build complex SQL
> > > applications
> > > > on top of Ignite. But we are going towards database. And my question
> > > stands
> > > > still - suppose it is Y2019, how is user going to migrate his
> database
> > > > containing 20-30-50-100 tables in a single schema in Oracle to
> Ignite?
> > > >
> > > > Single cache for all tables? Doens't work - not flexible. Users will
> > > > definitely require different cache modes, different co-location
> rules,
> > > > different number of backups, etc..
> > > > Schema per table? Doesn't work either - unmanageable and not
> convenient
> > > > for users even for relatively small databases.
> > > >
> > > > From user perspective schema is logical grouping of database objects,
> > > > nothing more.
> > > >
> > > > For Ignite schema could be a logical group of resources (nodes,
> memory
> > > > pools, caches, etc.). And multiple tables over multiple caches should
> > > > reside in it. To the contrast, table definition governs how data is
> > > stored.
> > > > This is similar to, for example, MySQL approach, where you define how
> > you
> > > > store data on per-table level, and on schema level you define only
> > minor
> > > > things like collation.
> > > >
> > > > Vladimir.
> > > >
> > > >
> > > > On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
> > > > alexander.a.paschenko@gmail.com> wrote:
> > > >
> > > >> Vova,
> > > >>
> > > >> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <vo...@gridgain.com>:
> > > >> > I am not quite sure I understand the idea of "SCHEMA == cache".
> > > Consider
> > > >> > some small database with, say, ~30 tables. And user wants to
> migrate
> > > to
> > > >> > Ignite. How is he supposed to do so? 30 schemas leading to rewrite
> > of
> > > >> all
> > > >> > his SQL scripts? Or 30 key-value pairs in a single cache leading
> to
> > > >> lack of
> > > >> > flexibility and performance problems?
> > > >>
> > > >> But currently schema *is* semantically equal to cache while table is
> > > >> equal to type descriptor (i.e. type of stored entities), nothing new
> > > >> here.
> > > >>
> > > >> Say, in single cache we may have entities of types Person and
> > > >> Organization, those map to two tables with same names, and can be
> > > >> accessed within the same cache (i.e. schema).
> > > >>
> > > >> If we want to limit the user with having single type descriptor per
> > > >> cache (i.e. cache has only one type of stored entities - BTW, where
> we
> > > >> are with this 2.0-wise?), then this notion could change. But
> currently
> > > >> what has been suggested already fits quite good with what we do have
> > > >> at the moment regarding semantic of SQL objects.
> > > >>
> > > >> - Alex
> > > >>
> > > >> > Another example is how to deal with referene tables? Lots database
> > has
> > > >> > small reference tables which is best to fit REPLICATED cache,
> while
> > > >> others
> > > >> > are usually bound to PARTITIONED mode. "SCHEMA == cache" will
> force
> > > >> users
> > > >> > to split them into separate schemes leading to poor user
> experience.
> > > >> >
> > > >> > I understand that we may have some implementation details around
> it
> > at
> > > >> the
> > > >> > moment. But from user perspective "SCHEMA == cache" doesn't make
> > > sense.
> > > >> As
> > > >> > we are going towards AI 2.0 we'd better to rethink this approach.
> > > >> >
> > > >> > On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <dm...@apache.org>
> > > >> wrote:
> > > >> >
> > > >> >>
> > > >> >> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <
> > > >> dsetrakyan@apache.org>
> > > >> >> wrote:
> > > >> >> >
> > > >> >> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
> > > >> >> sergi.vladykin@gmail.com>
> > > >> >> > wrote:
> > > >> >> >
> > > >> >> >> The xml config was only for example. We can put in this
> > > >> configuration
> > > >> >> >> string cache config parameters directly like this:
> > > >> >> >>
> > > >> >> >> CREATE SCHEMA "MyCacheName" WITH
> > > >> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> > > >> >> >>
> > > >> >> >
> > > >> >> > This approach makes sense, if it can be easily supported with
> H2.
> > > >> >>
> > > >> >> What’s for affinity keys? Can we make an exception for them by
> > > >> defining in
> > > >> >> this part of the statement
> > > >> >>
> > > >> >> CREATE TABLE employee (
> > > >> >>    id BIGINT PRIMARY KEY,
> > > >> >>    dept_id BIGINT AFFINITY KEY,
> > > >> >>    name VARCHAR(128),
> > > >> >> );
> > > >> >>
> > > >> >> or that l
> > > >> >>
> > > >> >> CREATE TABLE employee (
> > > >> >>    id BIGINT PRIMARY KEY,
> > > >> >>    dept_id BIGINT,
> > > >> >>    name VARCHAR(128),
> > > >> >>    CONSTRAINT affKey AFFINITY KEY(dept_id)
> > > >> >> );
> > > >> >>
> > > >> >> ?
> > > >> >>
> > > >> >> —
> > > >> >> Denis
> > > >> >>
> > > >> >>
> > > >>
> > > >
> > > >
> > >
> >
>

Re: DDL implementation details

Posted by Sergi Vladykin <se...@gmail.com>.

Could you please elaborate why it will not work this way?

Sergi

2017-01-13 12:39 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:

> Correct, it worked, because Ignite has never had real database use case in
> mind. Unfortunately, if our global plans go as expected, it will not work
> for Ignite 2.x+.
>
> On Fri, Jan 13, 2017 at 11:53 AM, Sergi Vladykin <sergi.vladykin@gmail.com
> >
> wrote:
>
> > Lets move on with SQL schema == Ignite cache. It worked always like
> this, I
> > see no reasons to change this.
> >
> > Sergi
> >
> > 2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:
> >
> > > "Tablespace" (Oracle, PostgreSQL) is what maps better than "schema" to
> > our
> > > cache. But not ideally still.
> > >
> > > On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov <
> vozerov@gridgain.com>
> > > wrote:
> > >
> > > > Alex,
> > > >
> > > > Currently Ignite is not used as database. It is used as search
> engine -
> > > > several types, several tables, several joins. This is why having
> > "SCHEMA
> > > ==
> > > > cache" was never a problem. Users have never build complex SQL
> > > applications
> > > > on top of Ignite. But we are going towards database. And my question
> > > stands
> > > > still - suppose it is Y2019, how is user going to migrate his
> database
> > > > containing 20-30-50-100 tables in a single schema in Oracle to
> Ignite?
> > > >
> > > > Single cache for all tables? Doens't work - not flexible. Users will
> > > > definitely require different cache modes, different co-location
> rules,
> > > > different number of backups, etc..
> > > > Schema per table? Doesn't work either - unmanageable and not
> convenient
> > > > for users even for relatively small databases.
> > > >
> > > > From user perspective schema is logical grouping of database objects,
> > > > nothing more.
> > > >
> > > > For Ignite schema could be a logical group of resources (nodes,
> memory
> > > > pools, caches, etc.). And multiple tables over multiple caches should
> > > > reside in it. To the contrast, table definition governs how data is
> > > stored.
> > > > This is similar to, for example, MySQL approach, where you define how
> > you
> > > > store data on per-table level, and on schema level you define only
> > minor
> > > > things like collation.
> > > >
> > > > Vladimir.
> > > >
> > > >
> > > > On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
> > > > alexander.a.paschenko@gmail.com> wrote:
> > > >
> > > >> Vova,
> > > >>
> > > >> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <vo...@gridgain.com>:
> > > >> > I am not quite sure I understand the idea of "SCHEMA == cache".
> > > Consider
> > > >> > some small database with, say, ~30 tables. And user wants to
> migrate
> > > to
> > > >> > Ignite. How is he supposed to do so? 30 schemas leading to rewrite
> > of
> > > >> all
> > > >> > his SQL scripts? Or 30 key-value pairs in a single cache leading
> to
> > > >> lack of
> > > >> > flexibility and performance problems?
> > > >>
> > > >> But currently schema *is* semantically equal to cache while table is
> > > >> equal to type descriptor (i.e. type of stored entities), nothing new
> > > >> here.
> > > >>
> > > >> Say, in single cache we may have entities of types Person and
> > > >> Organization, those map to two tables with same names, and can be
> > > >> accessed within the same cache (i.e. schema).
> > > >>
> > > >> If we want to limit the user with having single type descriptor per
> > > >> cache (i.e. cache has only one type of stored entities - BTW, where
> we
> > > >> are with this 2.0-wise?), then this notion could change. But
> currently
> > > >> what has been suggested already fits quite good with what we do have
> > > >> at the moment regarding semantic of SQL objects.
> > > >>
> > > >> - Alex
> > > >>
> > > >> > Another example is how to deal with referene tables? Lots database
> > has
> > > >> > small reference tables which is best to fit REPLICATED cache,
> while
> > > >> others
> > > >> > are usually bound to PARTITIONED mode. "SCHEMA == cache" will
> force
> > > >> users
> > > >> > to split them into separate schemes leading to poor user
> experience.
> > > >> >
> > > >> > I understand that we may have some implementation details around
> it
> > at
> > > >> the
> > > >> > moment. But from user perspective "SCHEMA == cache" doesn't make
> > > sense.
> > > >> As
> > > >> > we are going towards AI 2.0 we'd better to rethink this approach.
> > > >> >
> > > >> > On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <dm...@apache.org>
> > > >> wrote:
> > > >> >
> > > >> >>
> > > >> >> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <
> > > >> dsetrakyan@apache.org>
> > > >> >> wrote:
> > > >> >> >
> > > >> >> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
> > > >> >> sergi.vladykin@gmail.com>
> > > >> >> > wrote:
> > > >> >> >
> > > >> >> >> The xml config was only for example. We can put in this
> > > >> configuration
> > > >> >> >> string cache config parameters directly like this:
> > > >> >> >>
> > > >> >> >> CREATE SCHEMA "MyCacheName" WITH
> > > >> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> > > >> >> >>
> > > >> >> >
> > > >> >> > This approach makes sense, if it can be easily supported with
> H2.
> > > >> >>
> > > >> >> What’s for affinity keys? Can we make an exception for them by
> > > >> defining in
> > > >> >> this part of the statement
> > > >> >>
> > > >> >> CREATE TABLE employee (
> > > >> >>    id BIGINT PRIMARY KEY,
> > > >> >>    dept_id BIGINT AFFINITY KEY,
> > > >> >>    name VARCHAR(128),
> > > >> >> );
> > > >> >>
> > > >> >> or that l
> > > >> >>
> > > >> >> CREATE TABLE employee (
> > > >> >>    id BIGINT PRIMARY KEY,
> > > >> >>    dept_id BIGINT,
> > > >> >>    name VARCHAR(128),
> > > >> >>    CONSTRAINT affKey AFFINITY KEY(dept_id)
> > > >> >> );
> > > >> >>
> > > >> >> ?
> > > >> >>
> > > >> >> —
> > > >> >> Denis
> > > >> >>
> > > >> >>
> > > >>
> > > >
> > > >
> > >
> >
>

Re: DDL implementation details

Posted by Vladimir Ozerov <vo...@gridgain.com>.

Correct, it worked, because Ignite has never had real database use case in
mind. Unfortunately, if our global plans go as expected, it will not work
for Ignite 2.x+.

On Fri, Jan 13, 2017 at 11:53 AM, Sergi Vladykin <se...@gmail.com>
wrote:

> Lets move on with SQL schema == Ignite cache. It worked always like this, I
> see no reasons to change this.
>
> Sergi
>
> 2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:
>
> > "Tablespace" (Oracle, PostgreSQL) is what maps better than "schema" to
> our
> > cache. But not ideally still.
> >
> > On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov <vo...@gridgain.com>
> > wrote:
> >
> > > Alex,
> > >
> > > Currently Ignite is not used as database. It is used as search engine -
> > > several types, several tables, several joins. This is why having
> "SCHEMA
> > ==
> > > cache" was never a problem. Users have never build complex SQL
> > applications
> > > on top of Ignite. But we are going towards database. And my question
> > stands
> > > still - suppose it is Y2019, how is user going to migrate his database
> > > containing 20-30-50-100 tables in a single schema in Oracle to Ignite?
> > >
> > > Single cache for all tables? Doens't work - not flexible. Users will
> > > definitely require different cache modes, different co-location rules,
> > > different number of backups, etc..
> > > Schema per table? Doesn't work either - unmanageable and not convenient
> > > for users even for relatively small databases.
> > >
> > > From user perspective schema is logical grouping of database objects,
> > > nothing more.
> > >
> > > For Ignite schema could be a logical group of resources (nodes, memory
> > > pools, caches, etc.). And multiple tables over multiple caches should
> > > reside in it. To the contrast, table definition governs how data is
> > stored.
> > > This is similar to, for example, MySQL approach, where you define how
> you
> > > store data on per-table level, and on schema level you define only
> minor
> > > things like collation.
> > >
> > > Vladimir.
> > >
> > >
> > > On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
> > > alexander.a.paschenko@gmail.com> wrote:
> > >
> > >> Vova,
> > >>
> > >> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <vo...@gridgain.com>:
> > >> > I am not quite sure I understand the idea of "SCHEMA == cache".
> > Consider
> > >> > some small database with, say, ~30 tables. And user wants to migrate
> > to
> > >> > Ignite. How is he supposed to do so? 30 schemas leading to rewrite
> of
> > >> all
> > >> > his SQL scripts? Or 30 key-value pairs in a single cache leading to
> > >> lack of
> > >> > flexibility and performance problems?
> > >>
> > >> But currently schema *is* semantically equal to cache while table is
> > >> equal to type descriptor (i.e. type of stored entities), nothing new
> > >> here.
> > >>
> > >> Say, in single cache we may have entities of types Person and
> > >> Organization, those map to two tables with same names, and can be
> > >> accessed within the same cache (i.e. schema).
> > >>
> > >> If we want to limit the user with having single type descriptor per
> > >> cache (i.e. cache has only one type of stored entities - BTW, where we
> > >> are with this 2.0-wise?), then this notion could change. But currently
> > >> what has been suggested already fits quite good with what we do have
> > >> at the moment regarding semantic of SQL objects.
> > >>
> > >> - Alex
> > >>
> > >> > Another example is how to deal with referene tables? Lots database
> has
> > >> > small reference tables which is best to fit REPLICATED cache, while
> > >> others
> > >> > are usually bound to PARTITIONED mode. "SCHEMA == cache" will force
> > >> users
> > >> > to split them into separate schemes leading to poor user experience.
> > >> >
> > >> > I understand that we may have some implementation details around it
> at
> > >> the
> > >> > moment. But from user perspective "SCHEMA == cache" doesn't make
> > sense.
> > >> As
> > >> > we are going towards AI 2.0 we'd better to rethink this approach.
> > >> >
> > >> > On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <dm...@apache.org>
> > >> wrote:
> > >> >
> > >> >>
> > >> >> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <
> > >> dsetrakyan@apache.org>
> > >> >> wrote:
> > >> >> >
> > >> >> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
> > >> >> sergi.vladykin@gmail.com>
> > >> >> > wrote:
> > >> >> >
> > >> >> >> The xml config was only for example. We can put in this
> > >> configuration
> > >> >> >> string cache config parameters directly like this:
> > >> >> >>
> > >> >> >> CREATE SCHEMA "MyCacheName" WITH
> > >> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> > >> >> >>
> > >> >> >
> > >> >> > This approach makes sense, if it can be easily supported with H2.
> > >> >>
> > >> >> What’s for affinity keys? Can we make an exception for them by
> > >> defining in
> > >> >> this part of the statement
> > >> >>
> > >> >> CREATE TABLE employee (
> > >> >>    id BIGINT PRIMARY KEY,
> > >> >>    dept_id BIGINT AFFINITY KEY,
> > >> >>    name VARCHAR(128),
> > >> >> );
> > >> >>
> > >> >> or that l
> > >> >>
> > >> >> CREATE TABLE employee (
> > >> >>    id BIGINT PRIMARY KEY,
> > >> >>    dept_id BIGINT,
> > >> >>    name VARCHAR(128),
> > >> >>    CONSTRAINT affKey AFFINITY KEY(dept_id)
> > >> >> );
> > >> >>
> > >> >> ?
> > >> >>
> > >> >> —
> > >> >> Denis
> > >> >>
> > >> >>
> > >>
> > >
> > >
> >
>

Re: DDL implementation details

Posted by Sergi Vladykin <se...@gmail.com>.

Lets move on with SQL schema == Ignite cache. It worked always like this, I
see no reasons to change this.

Sergi

2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:

> "Tablespace" (Oracle, PostgreSQL) is what maps better than "schema" to our
> cache. But not ideally still.
>
> On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov <vo...@gridgain.com>
> wrote:
>
> > Alex,
> >
> > Currently Ignite is not used as database. It is used as search engine -
> > several types, several tables, several joins. This is why having "SCHEMA
> ==
> > cache" was never a problem. Users have never build complex SQL
> applications
> > on top of Ignite. But we are going towards database. And my question
> stands
> > still - suppose it is Y2019, how is user going to migrate his database
> > containing 20-30-50-100 tables in a single schema in Oracle to Ignite?
> >
> > Single cache for all tables? Doens't work - not flexible. Users will
> > definitely require different cache modes, different co-location rules,
> > different number of backups, etc..
> > Schema per table? Doesn't work either - unmanageable and not convenient
> > for users even for relatively small databases.
> >
> > From user perspective schema is logical grouping of database objects,
> > nothing more.
> >
> > For Ignite schema could be a logical group of resources (nodes, memory
> > pools, caches, etc.). And multiple tables over multiple caches should
> > reside in it. To the contrast, table definition governs how data is
> stored.
> > This is similar to, for example, MySQL approach, where you define how you
> > store data on per-table level, and on schema level you define only minor
> > things like collation.
> >
> > Vladimir.
> >
> >
> > On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
> > alexander.a.paschenko@gmail.com> wrote:
> >
> >> Vova,
> >>
> >> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <vo...@gridgain.com>:
> >> > I am not quite sure I understand the idea of "SCHEMA == cache".
> Consider
> >> > some small database with, say, ~30 tables. And user wants to migrate
> to
> >> > Ignite. How is he supposed to do so? 30 schemas leading to rewrite of
> >> all
> >> > his SQL scripts? Or 30 key-value pairs in a single cache leading to
> >> lack of
> >> > flexibility and performance problems?
> >>
> >> But currently schema *is* semantically equal to cache while table is
> >> equal to type descriptor (i.e. type of stored entities), nothing new
> >> here.
> >>
> >> Say, in single cache we may have entities of types Person and
> >> Organization, those map to two tables with same names, and can be
> >> accessed within the same cache (i.e. schema).
> >>
> >> If we want to limit the user with having single type descriptor per
> >> cache (i.e. cache has only one type of stored entities - BTW, where we
> >> are with this 2.0-wise?), then this notion could change. But currently
> >> what has been suggested already fits quite good with what we do have
> >> at the moment regarding semantic of SQL objects.
> >>
> >> - Alex
> >>
> >> > Another example is how to deal with referene tables? Lots database has
> >> > small reference tables which is best to fit REPLICATED cache, while
> >> others
> >> > are usually bound to PARTITIONED mode. "SCHEMA == cache" will force
> >> users
> >> > to split them into separate schemes leading to poor user experience.
> >> >
> >> > I understand that we may have some implementation details around it at
> >> the
> >> > moment. But from user perspective "SCHEMA == cache" doesn't make
> sense.
> >> As
> >> > we are going towards AI 2.0 we'd better to rethink this approach.
> >> >
> >> > On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <dm...@apache.org>
> >> wrote:
> >> >
> >> >>
> >> >> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <
> >> dsetrakyan@apache.org>
> >> >> wrote:
> >> >> >
> >> >> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
> >> >> sergi.vladykin@gmail.com>
> >> >> > wrote:
> >> >> >
> >> >> >> The xml config was only for example. We can put in this
> >> configuration
> >> >> >> string cache config parameters directly like this:
> >> >> >>
> >> >> >> CREATE SCHEMA "MyCacheName" WITH
> >> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> >> >> >>
> >> >> >
> >> >> > This approach makes sense, if it can be easily supported with H2.
> >> >>
> >> >> What’s for affinity keys? Can we make an exception for them by
> >> defining in
> >> >> this part of the statement
> >> >>
> >> >> CREATE TABLE employee (
> >> >>    id BIGINT PRIMARY KEY,
> >> >>    dept_id BIGINT AFFINITY KEY,
> >> >>    name VARCHAR(128),
> >> >> );
> >> >>
> >> >> or that l
> >> >>
> >> >> CREATE TABLE employee (
> >> >>    id BIGINT PRIMARY KEY,
> >> >>    dept_id BIGINT,
> >> >>    name VARCHAR(128),
> >> >>    CONSTRAINT affKey AFFINITY KEY(dept_id)
> >> >> );
> >> >>
> >> >> ?
> >> >>
> >> >> —
> >> >> Denis
> >> >>
> >> >>
> >>
> >
> >
>

Re: DDL implementation details

Posted by Vladimir Ozerov <vo...@gridgain.com>.

"Tablespace" (Oracle, PostgreSQL) is what maps better than "schema" to our
cache. But not ideally still.

On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov <vo...@gridgain.com>
wrote:

> Alex,
>
> Currently Ignite is not used as database. It is used as search engine -
> several types, several tables, several joins. This is why having "SCHEMA ==
> cache" was never a problem. Users have never build complex SQL applications
> on top of Ignite. But we are going towards database. And my question stands
> still - suppose it is Y2019, how is user going to migrate his database
> containing 20-30-50-100 tables in a single schema in Oracle to Ignite?
>
> Single cache for all tables? Doens't work - not flexible. Users will
> definitely require different cache modes, different co-location rules,
> different number of backups, etc..
> Schema per table? Doesn't work either - unmanageable and not convenient
> for users even for relatively small databases.
>
> From user perspective schema is logical grouping of database objects,
> nothing more.
>
> For Ignite schema could be a logical group of resources (nodes, memory
> pools, caches, etc.). And multiple tables over multiple caches should
> reside in it. To the contrast, table definition governs how data is stored.
> This is similar to, for example, MySQL approach, where you define how you
> store data on per-table level, and on schema level you define only minor
> things like collation.
>
> Vladimir.
>
>
> On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
> alexander.a.paschenko@gmail.com> wrote:
>
>> Vova,
>>
>> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <vo...@gridgain.com>:
>> > I am not quite sure I understand the idea of "SCHEMA == cache". Consider
>> > some small database with, say, ~30 tables. And user wants to migrate to
>> > Ignite. How is he supposed to do so? 30 schemas leading to rewrite of
>> all
>> > his SQL scripts? Or 30 key-value pairs in a single cache leading to
>> lack of
>> > flexibility and performance problems?
>>
>> But currently schema *is* semantically equal to cache while table is
>> equal to type descriptor (i.e. type of stored entities), nothing new
>> here.
>>
>> Say, in single cache we may have entities of types Person and
>> Organization, those map to two tables with same names, and can be
>> accessed within the same cache (i.e. schema).
>>
>> If we want to limit the user with having single type descriptor per
>> cache (i.e. cache has only one type of stored entities - BTW, where we
>> are with this 2.0-wise?), then this notion could change. But currently
>> what has been suggested already fits quite good with what we do have
>> at the moment regarding semantic of SQL objects.
>>
>> - Alex
>>
>> > Another example is how to deal with referene tables? Lots database has
>> > small reference tables which is best to fit REPLICATED cache, while
>> others
>> > are usually bound to PARTITIONED mode. "SCHEMA == cache" will force
>> users
>> > to split them into separate schemes leading to poor user experience.
>> >
>> > I understand that we may have some implementation details around it at
>> the
>> > moment. But from user perspective "SCHEMA == cache" doesn't make sense.
>> As
>> > we are going towards AI 2.0 we'd better to rethink this approach.
>> >
>> > On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <dm...@apache.org>
>> wrote:
>> >
>> >>
>> >> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <
>> dsetrakyan@apache.org>
>> >> wrote:
>> >> >
>> >> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
>> >> sergi.vladykin@gmail.com>
>> >> > wrote:
>> >> >
>> >> >> The xml config was only for example. We can put in this
>> configuration
>> >> >> string cache config parameters directly like this:
>> >> >>
>> >> >> CREATE SCHEMA "MyCacheName" WITH
>> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
>> >> >>
>> >> >
>> >> > This approach makes sense, if it can be easily supported with H2.
>> >>
>> >> What’s for affinity keys? Can we make an exception for them by
>> defining in
>> >> this part of the statement
>> >>
>> >> CREATE TABLE employee (
>> >>    id BIGINT PRIMARY KEY,
>> >>    dept_id BIGINT AFFINITY KEY,
>> >>    name VARCHAR(128),
>> >> );
>> >>
>> >> or that l
>> >>
>> >> CREATE TABLE employee (
>> >>    id BIGINT PRIMARY KEY,
>> >>    dept_id BIGINT,
>> >>    name VARCHAR(128),
>> >>    CONSTRAINT affKey AFFINITY KEY(dept_id)
>> >> );
>> >>
>> >> ?
>> >>
>> >> —
>> >> Denis
>> >>
>> >>
>>
>
>

Re: DDL implementation details

Posted by Vladimir Ozerov <vo...@gridgain.com>.

Alex,

Currently Ignite is not used as database. It is used as search engine -
several types, several tables, several joins. This is why having "SCHEMA ==
cache" was never a problem. Users have never build complex SQL applications
on top of Ignite. But we are going towards database. And my question stands
still - suppose it is Y2019, how is user going to migrate his database
containing 20-30-50-100 tables in a single schema in Oracle to Ignite?

Single cache for all tables? Doens't work - not flexible. Users will
definitely require different cache modes, different co-location rules,
different number of backups, etc..
Schema per table? Doesn't work either - unmanageable and not convenient for
users even for relatively small databases.

From user perspective schema is logical grouping of database objects,
nothing more.

For Ignite schema could be a logical group of resources (nodes, memory
pools, caches, etc.). And multiple tables over multiple caches should
reside in it. To the contrast, table definition governs how data is stored.
This is similar to, for example, MySQL approach, where you define how you
store data on per-table level, and on schema level you define only minor
things like collation.

Vladimir.


On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
alexander.a.paschenko@gmail.com> wrote:

> Vova,
>
> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <vo...@gridgain.com>:
> > I am not quite sure I understand the idea of "SCHEMA == cache". Consider
> > some small database with, say, ~30 tables. And user wants to migrate to
> > Ignite. How is he supposed to do so? 30 schemas leading to rewrite of all
> > his SQL scripts? Or 30 key-value pairs in a single cache leading to lack
> of
> > flexibility and performance problems?
>
> But currently schema *is* semantically equal to cache while table is
> equal to type descriptor (i.e. type of stored entities), nothing new
> here.
>
> Say, in single cache we may have entities of types Person and
> Organization, those map to two tables with same names, and can be
> accessed within the same cache (i.e. schema).
>
> If we want to limit the user with having single type descriptor per
> cache (i.e. cache has only one type of stored entities - BTW, where we
> are with this 2.0-wise?), then this notion could change. But currently
> what has been suggested already fits quite good with what we do have
> at the moment regarding semantic of SQL objects.
>
> - Alex
>
> > Another example is how to deal with referene tables? Lots database has
> > small reference tables which is best to fit REPLICATED cache, while
> others
> > are usually bound to PARTITIONED mode. "SCHEMA == cache" will force users
> > to split them into separate schemes leading to poor user experience.
> >
> > I understand that we may have some implementation details around it at
> the
> > moment. But from user perspective "SCHEMA == cache" doesn't make sense.
> As
> > we are going towards AI 2.0 we'd better to rethink this approach.
> >
> > On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <dm...@apache.org> wrote:
> >
> >>
> >> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <
> dsetrakyan@apache.org>
> >> wrote:
> >> >
> >> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
> >> sergi.vladykin@gmail.com>
> >> > wrote:
> >> >
> >> >> The xml config was only for example. We can put in this configuration
> >> >> string cache config parameters directly like this:
> >> >>
> >> >> CREATE SCHEMA "MyCacheName" WITH
> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> >> >>
> >> >
> >> > This approach makes sense, if it can be easily supported with H2.
> >>
> >> What’s for affinity keys? Can we make an exception for them by defining
> in
> >> this part of the statement
> >>
> >> CREATE TABLE employee (
> >>    id BIGINT PRIMARY KEY,
> >>    dept_id BIGINT AFFINITY KEY,
> >>    name VARCHAR(128),
> >> );
> >>
> >> or that l
> >>
> >> CREATE TABLE employee (
> >>    id BIGINT PRIMARY KEY,
> >>    dept_id BIGINT,
> >>    name VARCHAR(128),
> >>    CONSTRAINT affKey AFFINITY KEY(dept_id)
> >> );
> >>
> >> ?
> >>
> >> —
> >> Denis
> >>
> >>
>

Re: DDL implementation details

Posted by Alexander Paschenko <al...@gmail.com>.

Vova,

2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <vo...@gridgain.com>:
> I am not quite sure I understand the idea of "SCHEMA == cache". Consider
> some small database with, say, ~30 tables. And user wants to migrate to
> Ignite. How is he supposed to do so? 30 schemas leading to rewrite of all
> his SQL scripts? Or 30 key-value pairs in a single cache leading to lack of
> flexibility and performance problems?

But currently schema *is* semantically equal to cache while table is
equal to type descriptor (i.e. type of stored entities), nothing new
here.

Say, in single cache we may have entities of types Person and
Organization, those map to two tables with same names, and can be
accessed within the same cache (i.e. schema).

If we want to limit the user with having single type descriptor per
cache (i.e. cache has only one type of stored entities - BTW, where we
are with this 2.0-wise?), then this notion could change. But currently
what has been suggested already fits quite good with what we do have
at the moment regarding semantic of SQL objects.

- Alex

> Another example is how to deal with referene tables? Lots database has
> small reference tables which is best to fit REPLICATED cache, while others
> are usually bound to PARTITIONED mode. "SCHEMA == cache" will force users
> to split them into separate schemes leading to poor user experience.
>
> I understand that we may have some implementation details around it at the
> moment. But from user perspective "SCHEMA == cache" doesn't make sense. As
> we are going towards AI 2.0 we'd better to rethink this approach.
>
> On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <dm...@apache.org> wrote:
>
>>
>> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <ds...@apache.org>
>> wrote:
>> >
>> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
>> sergi.vladykin@gmail.com>
>> > wrote:
>> >
>> >> The xml config was only for example. We can put in this configuration
>> >> string cache config parameters directly like this:
>> >>
>> >> CREATE SCHEMA "MyCacheName" WITH
>> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
>> >>
>> >
>> > This approach makes sense, if it can be easily supported with H2.
>>
>> What’s for affinity keys? Can we make an exception for them by defining in
>> this part of the statement
>>
>> CREATE TABLE employee (
>>    id BIGINT PRIMARY KEY,
>>    dept_id BIGINT AFFINITY KEY,
>>    name VARCHAR(128),
>> );
>>
>> or that l
>>
>> CREATE TABLE employee (
>>    id BIGINT PRIMARY KEY,
>>    dept_id BIGINT,
>>    name VARCHAR(128),
>>    CONSTRAINT affKey AFFINITY KEY(dept_id)
>> );
>>
>> ?
>>
>> —
>> Denis
>>
>>

Re: DDL implementation details

Posted by Vladimir Ozerov <vo...@gridgain.com>.

I am not quite sure I understand the idea of "SCHEMA == cache". Consider
some small database with, say, ~30 tables. And user wants to migrate to
Ignite. How is he supposed to do so? 30 schemas leading to rewrite of all
his SQL scripts? Or 30 key-value pairs in a single cache leading to lack of
flexibility and performance problems?

Another example is how to deal with referene tables? Lots database has
small reference tables which is best to fit REPLICATED cache, while others
are usually bound to PARTITIONED mode. "SCHEMA == cache" will force users
to split them into separate schemes leading to poor user experience.

I understand that we may have some implementation details around it at the
moment. But from user perspective "SCHEMA == cache" doesn't make sense. As
we are going towards AI 2.0 we'd better to rethink this approach.

On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <dm...@apache.org> wrote:

>
> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <ds...@apache.org>
> wrote:
> >
> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
> sergi.vladykin@gmail.com>
> > wrote:
> >
> >> The xml config was only for example. We can put in this configuration
> >> string cache config parameters directly like this:
> >>
> >> CREATE SCHEMA "MyCacheName" WITH
> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> >>
> >
> > This approach makes sense, if it can be easily supported with H2.
>
> What’s for affinity keys? Can we make an exception for them by defining in
> this part of the statement
>
> CREATE TABLE employee (
>    id BIGINT PRIMARY KEY,
>    dept_id BIGINT AFFINITY KEY,
>    name VARCHAR(128),
> );
>
> or that l
>
> CREATE TABLE employee (
>    id BIGINT PRIMARY KEY,
>    dept_id BIGINT,
>    name VARCHAR(128),
>    CONSTRAINT affKey AFFINITY KEY(dept_id)
> );
>
> ?
>
> —
> Denis
>
>

Re: DDL implementation details

Posted by Sergey Kozlov <sk...@gridgain.com>.

Denis

The affinity key term is close to the partition key definition for MySQL:
https://dev.mysql.com/doc/refman/5.7/en/partitioning-key.html

On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <dm...@apache.org> wrote:

>
> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <ds...@apache.org>
> wrote:
> >
> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
> sergi.vladykin@gmail.com>
> > wrote:
> >
> >> The xml config was only for example. We can put in this configuration
> >> string cache config parameters directly like this:
> >>
> >> CREATE SCHEMA "MyCacheName" WITH
> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> >>
> >
> > This approach makes sense, if it can be easily supported with H2.
>
> What’s for affinity keys? Can we make an exception for them by defining in
> this part of the statement
>
> CREATE TABLE employee (
>    id BIGINT PRIMARY KEY,
>    dept_id BIGINT AFFINITY KEY,
>    name VARCHAR(128),
> );
>
> or that l
>
> CREATE TABLE employee (
>    id BIGINT PRIMARY KEY,
>    dept_id BIGINT,
>    name VARCHAR(128),
>    CONSTRAINT affKey AFFINITY KEY(dept_id)
> );
>
> ?
>
> —
> Denis
>
>


-- 
Sergey Kozlov
GridGain Systems
www.gridgain.com

Re: DDL implementation details

Posted by Denis Magda <dm...@apache.org>.

> On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <ds...@apache.org> wrote:
> 
> On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <se...@gmail.com>
> wrote:
> 
>> The xml config was only for example. We can put in this configuration
>> string cache config parameters directly like this:
>> 
>> CREATE SCHEMA "MyCacheName" WITH
>> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
>> 
> 
> This approach makes sense, if it can be easily supported with H2.

What’s for affinity keys? Can we make an exception for them by defining in this part of the statement

CREATE TABLE employee (
   id BIGINT PRIMARY KEY,
   dept_id BIGINT AFFINITY KEY,
   name VARCHAR(128),
); 

or that l

CREATE TABLE employee (
   id BIGINT PRIMARY KEY,
   dept_id BIGINT,
   name VARCHAR(128),
   CONSTRAINT affKey AFFINITY KEY(dept_id)
); 

?

—
Denis

Re: DDL implementation details

Posted by Dmitriy Setrakyan <ds...@apache.org>.

On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <se...@gmail.com>
wrote:

> The xml config was only for example. We can put in this configuration
> string cache config parameters directly like this:
>
> CREATE SCHEMA "MyCacheName" WITH
> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
>

This approach makes sense, if it can be easily supported with H2.

Re: DDL implementation details

Posted by Sergi Vladykin <se...@gmail.com>.

The xml config was only for example. We can put in this configuration
string cache config parameters directly like this:

CREATE SCHEMA "MyCacheName" WITH
"cacheMode=REPLICATED;atomicityMode=ATOMIC"

Sergi

2017-01-12 20:21 GMT+03:00 Alexander Paschenko <
alexander.a.paschenko@gmail.com>:

> Sergi, Alexey G.,
>
> I see your point and am rather inclined to agree that we should let
> current notion of "single schema - multiple tables" live.
>
> Still, if we create schema with cache config file, what's the whole
> point of SQL then if the user anyway has to write XML? This probably
> could be useful to propagate configuration to all cluster nodes tho.
>
> And if we skip CREATE TABLE now, it means that we leave user facing
> the need to write XML configuration, no other options. Is this what we
> want?
>
> Still I must admit that leaving user with his familiar XML stuff looks
> attractive - no messing with bunch of unknown new params, just write
> your XML and go. Also it's portable and allows to re-use
> configurations easily, so undoubtedly is a good approach from some
> point.
>
> - Alex
>
> 2017-01-12 23:51 GMT+08:00 Sergi Vladykin <se...@gmail.com>:
> > Hi,
> >
> > 1. For now I'm against inventing any custom SQL syntax and implementing
> > parsing.
> > Currently H2 supports the following syntax:
> >
> > CREATE TABLE test(...) WITH "myCustomParamString"
> >
> > This is enough for us to pass the needed parameters.
> >
> > 2. Agree with AG, we have to separate cache creation from table creation.
> > Cache == SQL schema for us. We just have to add the same WITH syntax in
> H2
> > for schema creation like this:
> >
> > CREATE SCHEMA "MyCacheName" WITH "cacheConfig=myCache.xml"
> >
> > 3. If we want to create tables then I suggest to put this functionality
> to
> > 2.0+PageMemory right away and think where and how we are going to store
> all
> > the related metadata.This is especially important for persistent
> storages.
> >
> > Sergi
> >
> >
> > 2017-01-12 16:56 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:
> >
> >> I am afraid in this case user will have to define too much schemes -
> >> boilerplate.
> >> Does it make sense at all to pack multiple tuples into a single cache
> from
> >> user perspective?
> >>
> >> On Thu, Jan 12, 2017 at 4:40 PM, Alexey Goncharuk <
> >> alexey.goncharuk@gmail.com> wrote:
> >>
> >> > Alexander,
> >> >
> >> > Will we keep the old option to have multiple tables in one cache? If
> so,
> >> > how will create table statement know which cache to choose?
> >> >
> >> > It seems to me that to be consistent with the current DML
> implementation
> >> we
> >> > should have a CREATE SCHEMA statement which will define the cache and
> >> cache
> >> > configuration, and CREATE TABLE should specify the schema name.
> >> >
> >> > Otherwise, we should enforce the single type per cache rule at the
> >> > configuration level and in runtime.
> >> >
> >> > As for affinity and primary key - agree with Vladimir.
> >> >
> >> > --
> >> > AG
> >> >
> >> > 2017-01-12 11:41 GMT+03:00 Sergey Kozlov <sk...@gridgain.com>:
> >> >
> >> > > As first stage of DDL we can implement following CREATE TABLE
> statement
> >> > > support:
> >> > >  - CREATE TABLE without cache properties (use default cache
> properties
> >> or
> >> > > cache properties defined in SQL Schema)
> >> > >  - CREATE TABLE .. LIKE where we can create a cache based on an
> another
> >> > > existing cache.
> >> > >
> >> > > On Thu, Jan 12, 2017 at 5:54 AM, Dmitriy Setrakyan <
> >> > dsetrakyan@apache.org>
> >> > > wrote:
> >> > >
> >> > > > Agree with Sergey. We should be able to specify cache properties
> >> inside
> >> > > of
> >> > > > SQL statements. Does H2 have any support to process SQL hints?
> Can we
> >> > > > change it?
> >> > > >
> >> > > > Having said that, while we finalize the above, I think we should
> >> start
> >> > > > working on DDL implementation to use the default settings, as
> >> specified
> >> > > in
> >> > > > Alexander's email.
> >> > > >
> >> > > > Also agree with the stop-the-world on the cache for index
> creation.
> >> We
> >> > > can
> >> > > > always improve on it in future.
> >> > > >
> >> > > > D.
> >> > > >
> >> > > > On Wed, Jan 11, 2017 at 11:28 AM, Sergey Kozlov <
> >> skozlov@gridgain.com>
> >> > > > wrote:
> >> > > >
> >> > > > > Hi
> >> > > > >
> >> > > > > I suppose we should put any ignite cache properties as
> additional
> >> > > > > non-standard attributes after CREATE TABLE () clause as it does
> >> > > > Postgress,
> >> > > > > MySQL and other RDBMS.
> >> > > > > Take a look on CREATE TABLE with using TABLESPACE (Postgess) or
> for
> >> > > > CREATE
> >> > > > > TABLE with using PARTITIONS (MySQL).
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > On Wed, Jan 11, 2017 at 10:05 PM, Vladimir Ozerov <
> >> > > vozerov@gridgain.com>
> >> > > > > wrote:
> >> > > > >
> >> > > > > > I believe custom synthax and parsing is a *must* for us, as
> well
> >> as
> >> > > for
> >> > > > > any
> >> > > > > > distributed database. At the very least we need to specify
> >> affinity
> >> > > key
> >> > > > > > column somehow. Any cache property can be specified at the
> very
> >> end
> >> > > of
> >> > > > > > table definition. Key columns can be determined as the ones
> with
> >> > > > PRIMARY
> >> > > > > > KEY constraint (Alex K. idea) + affinity column(s):
> >> > > > > >
> >> > > > > > CREATE TABLE employee (
> >> > > > > >     id BIGINT PRIMARY KEY,
> >> > > > > >     dept_id BIGINT AFFINITY KEY,
> >> > > > > >     name VARCHAR(128),
> >> > > > > >     address VARCHAR(256)
> >> > > > > >     BACKUPS 2,
> >> > > > > >     ATOMICITY_MODE ATOMIC,
> >> > > > > > );
> >> > > > > >
> >> > > > > > "id" and "dept_id" form key type, "name" and "address" form
> value
> >> > > type.
> >> > > > > >
> >> > > > > > Vladimir.
> >> > > > > >
> >> > > > > > On Wed, Jan 11, 2017 at 9:08 PM, Alexey Kuznetsov <
> >> > > > akuznetsov@apache.org
> >> > > > > >
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > > > Hi, Alex!
> >> > > > > > >
> >> > > > > > > As far as I know most RDBMS allow something like: create
> table
> >> t1
> >> > > (id
> >> > > > > > > integer primary key, ....)
> >> > > > > > > How about to take as key field that marked as "primary key"?
> >> > > > > > >
> >> > > > > > > As of atomicity and replication - I think it is a cache
> >> > properties
> >> > > > and
> >> > > > > > with
> >> > > > > > > create table we will create "types" in cache. No?
> >> > > > > > > I thought that cache it is a kind of "schema" in RDBMS.
> >> > > > > > >
> >> > > > > > > Could you describe what will be created with CREATE TABLE?
> >> > > > > > >
> >> > > > > > > On Thu, Jan 12, 2017 at 12:54 AM, Alexander Paschenko <
> >> > > > > > > alexander.a.paschenko@gmail.com> wrote:
> >> > > > > > >
> >> > > > > > > > Hello Igniters,
> >> > > > > > > >
> >> > > > > > > > I would like to start discussion about implementation of
> SQL
> >> > DDL
> >> > > > > > > commands.
> >> > > > > > > >
> >> > > > > > > > At the first stage, the most important ones seem to be
> CREATE
> >> > > TABLE
> >> > > > > > > > (that will obviously correspond to creation of a cache)
> and
> >> > > CREATE
> >> > > > > > > > INDEX.
> >> > > > > > > >
> >> > > > > > > > Regarding first one: SQL command for CREATE TABLE does not
> >> > > contain
> >> > > > > any
> >> > > > > > > > hints about cache settings (atomicity, replication,
> etc.), so
> >> > > these
> >> > > > > > > > will probably be defined by some configuration properties
> >> (like
> >> > > > > > > > ignite.ddl.default_cache_atomiticity, etc).
> >> > > > > > > >
> >> > > > > > > > Also it does not allow to distinguish between key and
> value
> >> > > > columns -
> >> > > > > > > > currently it is handled by keyFields property of
> QueryEntity,
> >> > but
> >> > > > it
> >> > > > > > > > is unclear how to declare key fields via CREATE TABLE.
> >> > > > > > > >
> >> > > > > > > > So at a first glance it seems like we should either
> implement
> >> > > some
> >> > > > > > > > sort of custom parsing (I believe Sergi will be against
> it)
> >> or
> >> > > > > > > > introduce some kind of name prefix that would tell SQL
> engine
> >> > > that
> >> > > > > > > > certain column is a key field column.
> >> > > > > > > >
> >> > > > > > > > Of course, this problem disappears is key is of SQL type.
> >> > > > > > > >
> >> > > > > > > > Regarding CREATE INDEX: probably at first we will have to
> >> > > implement
> >> > > > > > > > this in "stop-the-world" manner, i.e. all cache will be
> >> blocked
> >> > > > > during
> >> > > > > > > > the index's initial buildup.
> >> > > > > > > >
> >> > > > > > > > Any thoughts?
> >> > > > > > > >
> >> > > > > > > > Currently I'm working on parsing of those commands as that
> >> will
> >> > > be
> >> > > > > > > > needed anyway and does not affect further implementation.
> >> > > > > > > >
> >> > > > > > > > - Alex
> >> > > > > > > >
> >> > > > > > >
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > --
> >> > > > > > > Alexey Kuznetsov
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > --
> >> > > > > Sergey Kozlov
> >> > > > > GridGain Systems
> >> > > > > www.gridgain.com
> >> > > > >
> >> > > >
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > Sergey Kozlov
> >> > > GridGain Systems
> >> > > www.gridgain.com
> >> > >
> >> >
> >>
>

Re: DDL implementation details

Posted by Alexander Paschenko <al...@gmail.com>.

Sergi, Alexey G.,

I see your point and am rather inclined to agree that we should let
current notion of "single schema - multiple tables" live.

Still, if we create schema with cache config file, what's the whole
point of SQL then if the user anyway has to write XML? This probably
could be useful to propagate configuration to all cluster nodes tho.

And if we skip CREATE TABLE now, it means that we leave user facing
the need to write XML configuration, no other options. Is this what we
want?

Still I must admit that leaving user with his familiar XML stuff looks
attractive - no messing with bunch of unknown new params, just write
your XML and go. Also it's portable and allows to re-use
configurations easily, so undoubtedly is a good approach from some
point.

- Alex

2017-01-12 23:51 GMT+08:00 Sergi Vladykin <se...@gmail.com>:
> Hi,
>
> 1. For now I'm against inventing any custom SQL syntax and implementing
> parsing.
> Currently H2 supports the following syntax:
>
> CREATE TABLE test(...) WITH "myCustomParamString"
>
> This is enough for us to pass the needed parameters.
>
> 2. Agree with AG, we have to separate cache creation from table creation.
> Cache == SQL schema for us. We just have to add the same WITH syntax in H2
> for schema creation like this:
>
> CREATE SCHEMA "MyCacheName" WITH "cacheConfig=myCache.xml"
>
> 3. If we want to create tables then I suggest to put this functionality to
> 2.0+PageMemory right away and think where and how we are going to store all
> the related metadata.This is especially important for persistent storages.
>
> Sergi
>
>
> 2017-01-12 16:56 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:
>
>> I am afraid in this case user will have to define too much schemes -
>> boilerplate.
>> Does it make sense at all to pack multiple tuples into a single cache from
>> user perspective?
>>
>> On Thu, Jan 12, 2017 at 4:40 PM, Alexey Goncharuk <
>> alexey.goncharuk@gmail.com> wrote:
>>
>> > Alexander,
>> >
>> > Will we keep the old option to have multiple tables in one cache? If so,
>> > how will create table statement know which cache to choose?
>> >
>> > It seems to me that to be consistent with the current DML implementation
>> we
>> > should have a CREATE SCHEMA statement which will define the cache and
>> cache
>> > configuration, and CREATE TABLE should specify the schema name.
>> >
>> > Otherwise, we should enforce the single type per cache rule at the
>> > configuration level and in runtime.
>> >
>> > As for affinity and primary key - agree with Vladimir.
>> >
>> > --
>> > AG
>> >
>> > 2017-01-12 11:41 GMT+03:00 Sergey Kozlov <sk...@gridgain.com>:
>> >
>> > > As first stage of DDL we can implement following CREATE TABLE statement
>> > > support:
>> > >  - CREATE TABLE without cache properties (use default cache properties
>> or
>> > > cache properties defined in SQL Schema)
>> > >  - CREATE TABLE .. LIKE where we can create a cache based on an another
>> > > existing cache.
>> > >
>> > > On Thu, Jan 12, 2017 at 5:54 AM, Dmitriy Setrakyan <
>> > dsetrakyan@apache.org>
>> > > wrote:
>> > >
>> > > > Agree with Sergey. We should be able to specify cache properties
>> inside
>> > > of
>> > > > SQL statements. Does H2 have any support to process SQL hints? Can we
>> > > > change it?
>> > > >
>> > > > Having said that, while we finalize the above, I think we should
>> start
>> > > > working on DDL implementation to use the default settings, as
>> specified
>> > > in
>> > > > Alexander's email.
>> > > >
>> > > > Also agree with the stop-the-world on the cache for index creation.
>> We
>> > > can
>> > > > always improve on it in future.
>> > > >
>> > > > D.
>> > > >
>> > > > On Wed, Jan 11, 2017 at 11:28 AM, Sergey Kozlov <
>> skozlov@gridgain.com>
>> > > > wrote:
>> > > >
>> > > > > Hi
>> > > > >
>> > > > > I suppose we should put any ignite cache properties as additional
>> > > > > non-standard attributes after CREATE TABLE () clause as it does
>> > > > Postgress,
>> > > > > MySQL and other RDBMS.
>> > > > > Take a look on CREATE TABLE with using TABLESPACE (Postgess) or for
>> > > > CREATE
>> > > > > TABLE with using PARTITIONS (MySQL).
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Wed, Jan 11, 2017 at 10:05 PM, Vladimir Ozerov <
>> > > vozerov@gridgain.com>
>> > > > > wrote:
>> > > > >
>> > > > > > I believe custom synthax and parsing is a *must* for us, as well
>> as
>> > > for
>> > > > > any
>> > > > > > distributed database. At the very least we need to specify
>> affinity
>> > > key
>> > > > > > column somehow. Any cache property can be specified at the very
>> end
>> > > of
>> > > > > > table definition. Key columns can be determined as the ones with
>> > > > PRIMARY
>> > > > > > KEY constraint (Alex K. idea) + affinity column(s):
>> > > > > >
>> > > > > > CREATE TABLE employee (
>> > > > > >     id BIGINT PRIMARY KEY,
>> > > > > >     dept_id BIGINT AFFINITY KEY,
>> > > > > >     name VARCHAR(128),
>> > > > > >     address VARCHAR(256)
>> > > > > >     BACKUPS 2,
>> > > > > >     ATOMICITY_MODE ATOMIC,
>> > > > > > );
>> > > > > >
>> > > > > > "id" and "dept_id" form key type, "name" and "address" form value
>> > > type.
>> > > > > >
>> > > > > > Vladimir.
>> > > > > >
>> > > > > > On Wed, Jan 11, 2017 at 9:08 PM, Alexey Kuznetsov <
>> > > > akuznetsov@apache.org
>> > > > > >
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Hi, Alex!
>> > > > > > >
>> > > > > > > As far as I know most RDBMS allow something like: create table
>> t1
>> > > (id
>> > > > > > > integer primary key, ....)
>> > > > > > > How about to take as key field that marked as "primary key"?
>> > > > > > >
>> > > > > > > As of atomicity and replication - I think it is a cache
>> > properties
>> > > > and
>> > > > > > with
>> > > > > > > create table we will create "types" in cache. No?
>> > > > > > > I thought that cache it is a kind of "schema" in RDBMS.
>> > > > > > >
>> > > > > > > Could you describe what will be created with CREATE TABLE?
>> > > > > > >
>> > > > > > > On Thu, Jan 12, 2017 at 12:54 AM, Alexander Paschenko <
>> > > > > > > alexander.a.paschenko@gmail.com> wrote:
>> > > > > > >
>> > > > > > > > Hello Igniters,
>> > > > > > > >
>> > > > > > > > I would like to start discussion about implementation of SQL
>> > DDL
>> > > > > > > commands.
>> > > > > > > >
>> > > > > > > > At the first stage, the most important ones seem to be CREATE
>> > > TABLE
>> > > > > > > > (that will obviously correspond to creation of a cache) and
>> > > CREATE
>> > > > > > > > INDEX.
>> > > > > > > >
>> > > > > > > > Regarding first one: SQL command for CREATE TABLE does not
>> > > contain
>> > > > > any
>> > > > > > > > hints about cache settings (atomicity, replication, etc.), so
>> > > these
>> > > > > > > > will probably be defined by some configuration properties
>> (like
>> > > > > > > > ignite.ddl.default_cache_atomiticity, etc).
>> > > > > > > >
>> > > > > > > > Also it does not allow to distinguish between key and value
>> > > > columns -
>> > > > > > > > currently it is handled by keyFields property of QueryEntity,
>> > but
>> > > > it
>> > > > > > > > is unclear how to declare key fields via CREATE TABLE.
>> > > > > > > >
>> > > > > > > > So at a first glance it seems like we should either implement
>> > > some
>> > > > > > > > sort of custom parsing (I believe Sergi will be against it)
>> or
>> > > > > > > > introduce some kind of name prefix that would tell SQL engine
>> > > that
>> > > > > > > > certain column is a key field column.
>> > > > > > > >
>> > > > > > > > Of course, this problem disappears is key is of SQL type.
>> > > > > > > >
>> > > > > > > > Regarding CREATE INDEX: probably at first we will have to
>> > > implement
>> > > > > > > > this in "stop-the-world" manner, i.e. all cache will be
>> blocked
>> > > > > during
>> > > > > > > > the index's initial buildup.
>> > > > > > > >
>> > > > > > > > Any thoughts?
>> > > > > > > >
>> > > > > > > > Currently I'm working on parsing of those commands as that
>> will
>> > > be
>> > > > > > > > needed anyway and does not affect further implementation.
>> > > > > > > >
>> > > > > > > > - Alex
>> > > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > --
>> > > > > > > Alexey Kuznetsov
>> > > > > > >
>> > > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > > Sergey Kozlov
>> > > > > GridGain Systems
>> > > > > www.gridgain.com
>> > > > >
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Sergey Kozlov
>> > > GridGain Systems
>> > > www.gridgain.com
>> > >
>> >
>>

Re: DDL implementation details

Posted by Alexander Paschenko <al...@gmail.com>.

Sergi,

Regarding CREATE SCHEMA/TABLE parsing:

2017-01-12 18:51 GMT+03:00 Sergi Vladykin <se...@gmail.com>:
> Hi,
>
> 1. For now I'm against inventing any custom SQL syntax and implementing
> parsing.
> Currently H2 supports the following syntax:
>
> CREATE TABLE test(...) WITH "myCustomParamString"
>
> This is enough for us to pass the needed parameters.

H2's grammar requires to supply table engine class name as well, you
can't just go with WITH:

http://www.h2database.com/html/grammar.html#create_table

And code of H2's org.h2.command.Parser works exactly like this - you
can't just set default engine in connection string, in order to use
WITH, you have to specify both engine and WITH.

So we have either to force users to specify engine in their queries
(ugly and hardly acceptable), or have to somehow analyze CREATE TABLE
and sneak ENGINE param into the query ourselves.

But this again smells custom parsing, if you ask me.

>
> 2. Agree with AG, we have to separate cache creation from table creation.
> Cache == SQL schema for us. We just have to add the same WITH syntax in H2
> for schema creation like this:
>
> CREATE SCHEMA "MyCacheName" WITH "cacheConfig=myCache.xml"
>

Currently H2 does not allow using WITH with CREATE SCHEMA, proof:
http://www.h2database.com/html/grammar.html#create_schema

Do you suggest that we should somehow introduce this syntax to H2 engine first?

In the light of this, CREATE CACHE might seem legit. Thus, if we
depart from CREATE SCHEMA == create cache and introduce custom syntax
for CREATE CACHE, then we'll end up closer to what Vlad suggest –
schema is a logical object, caches exist on their own right.

Also, to add to Vlad's thoughts on Ignite becoming a database, I would
like to point out that for now we have talked only about schemas,
tables, and caches, but there are also *databases* as organizational
entities. Shouldn't we start thinking about those as well?

- Alex

Re: DDL implementation details

Posted by Sergi Vladykin <se...@gmail.com>.

Hi,

1. For now I'm against inventing any custom SQL syntax and implementing
parsing.
Currently H2 supports the following syntax:

CREATE TABLE test(...) WITH "myCustomParamString"

This is enough for us to pass the needed parameters.

2. Agree with AG, we have to separate cache creation from table creation.
Cache == SQL schema for us. We just have to add the same WITH syntax in H2
for schema creation like this:

CREATE SCHEMA "MyCacheName" WITH "cacheConfig=myCache.xml"

3. If we want to create tables then I suggest to put this functionality to
2.0+PageMemory right away and think where and how we are going to store all
the related metadata.This is especially important for persistent storages.

Sergi


2017-01-12 16:56 GMT+03:00 Vladimir Ozerov <vo...@gridgain.com>:

> I am afraid in this case user will have to define too much schemes -
> boilerplate.
> Does it make sense at all to pack multiple tuples into a single cache from
> user perspective?
>
> On Thu, Jan 12, 2017 at 4:40 PM, Alexey Goncharuk <
> alexey.goncharuk@gmail.com> wrote:
>
> > Alexander,
> >
> > Will we keep the old option to have multiple tables in one cache? If so,
> > how will create table statement know which cache to choose?
> >
> > It seems to me that to be consistent with the current DML implementation
> we
> > should have a CREATE SCHEMA statement which will define the cache and
> cache
> > configuration, and CREATE TABLE should specify the schema name.
> >
> > Otherwise, we should enforce the single type per cache rule at the
> > configuration level and in runtime.
> >
> > As for affinity and primary key - agree with Vladimir.
> >
> > --
> > AG
> >
> > 2017-01-12 11:41 GMT+03:00 Sergey Kozlov <sk...@gridgain.com>:
> >
> > > As first stage of DDL we can implement following CREATE TABLE statement
> > > support:
> > >  - CREATE TABLE without cache properties (use default cache properties
> or
> > > cache properties defined in SQL Schema)
> > >  - CREATE TABLE .. LIKE where we can create a cache based on an another
> > > existing cache.
> > >
> > > On Thu, Jan 12, 2017 at 5:54 AM, Dmitriy Setrakyan <
> > dsetrakyan@apache.org>
> > > wrote:
> > >
> > > > Agree with Sergey. We should be able to specify cache properties
> inside
> > > of
> > > > SQL statements. Does H2 have any support to process SQL hints? Can we
> > > > change it?
> > > >
> > > > Having said that, while we finalize the above, I think we should
> start
> > > > working on DDL implementation to use the default settings, as
> specified
> > > in
> > > > Alexander's email.
> > > >
> > > > Also agree with the stop-the-world on the cache for index creation.
> We
> > > can
> > > > always improve on it in future.
> > > >
> > > > D.
> > > >
> > > > On Wed, Jan 11, 2017 at 11:28 AM, Sergey Kozlov <
> skozlov@gridgain.com>
> > > > wrote:
> > > >
> > > > > Hi
> > > > >
> > > > > I suppose we should put any ignite cache properties as additional
> > > > > non-standard attributes after CREATE TABLE () clause as it does
> > > > Postgress,
> > > > > MySQL and other RDBMS.
> > > > > Take a look on CREATE TABLE with using TABLESPACE (Postgess) or for
> > > > CREATE
> > > > > TABLE with using PARTITIONS (MySQL).
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Jan 11, 2017 at 10:05 PM, Vladimir Ozerov <
> > > vozerov@gridgain.com>
> > > > > wrote:
> > > > >
> > > > > > I believe custom synthax and parsing is a *must* for us, as well
> as
> > > for
> > > > > any
> > > > > > distributed database. At the very least we need to specify
> affinity
> > > key
> > > > > > column somehow. Any cache property can be specified at the very
> end
> > > of
> > > > > > table definition. Key columns can be determined as the ones with
> > > > PRIMARY
> > > > > > KEY constraint (Alex K. idea) + affinity column(s):
> > > > > >
> > > > > > CREATE TABLE employee (
> > > > > >     id BIGINT PRIMARY KEY,
> > > > > >     dept_id BIGINT AFFINITY KEY,
> > > > > >     name VARCHAR(128),
> > > > > >     address VARCHAR(256)
> > > > > >     BACKUPS 2,
> > > > > >     ATOMICITY_MODE ATOMIC,
> > > > > > );
> > > > > >
> > > > > > "id" and "dept_id" form key type, "name" and "address" form value
> > > type.
> > > > > >
> > > > > > Vladimir.
> > > > > >
> > > > > > On Wed, Jan 11, 2017 at 9:08 PM, Alexey Kuznetsov <
> > > > akuznetsov@apache.org
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi, Alex!
> > > > > > >
> > > > > > > As far as I know most RDBMS allow something like: create table
> t1
> > > (id
> > > > > > > integer primary key, ....)
> > > > > > > How about to take as key field that marked as "primary key"?
> > > > > > >
> > > > > > > As of atomicity and replication - I think it is a cache
> > properties
> > > > and
> > > > > > with
> > > > > > > create table we will create "types" in cache. No?
> > > > > > > I thought that cache it is a kind of "schema" in RDBMS.
> > > > > > >
> > > > > > > Could you describe what will be created with CREATE TABLE?
> > > > > > >
> > > > > > > On Thu, Jan 12, 2017 at 12:54 AM, Alexander Paschenko <
> > > > > > > alexander.a.paschenko@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hello Igniters,
> > > > > > > >
> > > > > > > > I would like to start discussion about implementation of SQL
> > DDL
> > > > > > > commands.
> > > > > > > >
> > > > > > > > At the first stage, the most important ones seem to be CREATE
> > > TABLE
> > > > > > > > (that will obviously correspond to creation of a cache) and
> > > CREATE
> > > > > > > > INDEX.
> > > > > > > >
> > > > > > > > Regarding first one: SQL command for CREATE TABLE does not
> > > contain
> > > > > any
> > > > > > > > hints about cache settings (atomicity, replication, etc.), so
> > > these
> > > > > > > > will probably be defined by some configuration properties
> (like
> > > > > > > > ignite.ddl.default_cache_atomiticity, etc).
> > > > > > > >
> > > > > > > > Also it does not allow to distinguish between key and value
> > > > columns -
> > > > > > > > currently it is handled by keyFields property of QueryEntity,
> > but
> > > > it
> > > > > > > > is unclear how to declare key fields via CREATE TABLE.
> > > > > > > >
> > > > > > > > So at a first glance it seems like we should either implement
> > > some
> > > > > > > > sort of custom parsing (I believe Sergi will be against it)
> or
> > > > > > > > introduce some kind of name prefix that would tell SQL engine
> > > that
> > > > > > > > certain column is a key field column.
> > > > > > > >
> > > > > > > > Of course, this problem disappears is key is of SQL type.
> > > > > > > >
> > > > > > > > Regarding CREATE INDEX: probably at first we will have to
> > > implement
> > > > > > > > this in "stop-the-world" manner, i.e. all cache will be
> blocked
> > > > > during
> > > > > > > > the index's initial buildup.
> > > > > > > >
> > > > > > > > Any thoughts?
> > > > > > > >
> > > > > > > > Currently I'm working on parsing of those commands as that
> will
> > > be
> > > > > > > > needed anyway and does not affect further implementation.
> > > > > > > >
> > > > > > > > - Alex
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Alexey Kuznetsov
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Sergey Kozlov
> > > > > GridGain Systems
> > > > > www.gridgain.com
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Sergey Kozlov
> > > GridGain Systems
> > > www.gridgain.com
> > >
> >
>

Re: DDL implementation details

Posted by Vladimir Ozerov <vo...@gridgain.com>.

I am afraid in this case user will have to define too much schemes -
boilerplate.
Does it make sense at all to pack multiple tuples into a single cache from
user perspective?

On Thu, Jan 12, 2017 at 4:40 PM, Alexey Goncharuk <
alexey.goncharuk@gmail.com> wrote:

> Alexander,
>
> Will we keep the old option to have multiple tables in one cache? If so,
> how will create table statement know which cache to choose?
>
> It seems to me that to be consistent with the current DML implementation we
> should have a CREATE SCHEMA statement which will define the cache and cache
> configuration, and CREATE TABLE should specify the schema name.
>
> Otherwise, we should enforce the single type per cache rule at the
> configuration level and in runtime.
>
> As for affinity and primary key - agree with Vladimir.
>
> --
> AG
>
> 2017-01-12 11:41 GMT+03:00 Sergey Kozlov <sk...@gridgain.com>:
>
> > As first stage of DDL we can implement following CREATE TABLE statement
> > support:
> >  - CREATE TABLE without cache properties (use default cache properties or
> > cache properties defined in SQL Schema)
> >  - CREATE TABLE .. LIKE where we can create a cache based on an another
> > existing cache.
> >
> > On Thu, Jan 12, 2017 at 5:54 AM, Dmitriy Setrakyan <
> dsetrakyan@apache.org>
> > wrote:
> >
> > > Agree with Sergey. We should be able to specify cache properties inside
> > of
> > > SQL statements. Does H2 have any support to process SQL hints? Can we
> > > change it?
> > >
> > > Having said that, while we finalize the above, I think we should start
> > > working on DDL implementation to use the default settings, as specified
> > in
> > > Alexander's email.
> > >
> > > Also agree with the stop-the-world on the cache for index creation. We
> > can
> > > always improve on it in future.
> > >
> > > D.
> > >
> > > On Wed, Jan 11, 2017 at 11:28 AM, Sergey Kozlov <sk...@gridgain.com>
> > > wrote:
> > >
> > > > Hi
> > > >
> > > > I suppose we should put any ignite cache properties as additional
> > > > non-standard attributes after CREATE TABLE () clause as it does
> > > Postgress,
> > > > MySQL and other RDBMS.
> > > > Take a look on CREATE TABLE with using TABLESPACE (Postgess) or for
> > > CREATE
> > > > TABLE with using PARTITIONS (MySQL).
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Wed, Jan 11, 2017 at 10:05 PM, Vladimir Ozerov <
> > vozerov@gridgain.com>
> > > > wrote:
> > > >
> > > > > I believe custom synthax and parsing is a *must* for us, as well as
> > for
> > > > any
> > > > > distributed database. At the very least we need to specify affinity
> > key
> > > > > column somehow. Any cache property can be specified at the very end
> > of
> > > > > table definition. Key columns can be determined as the ones with
> > > PRIMARY
> > > > > KEY constraint (Alex K. idea) + affinity column(s):
> > > > >
> > > > > CREATE TABLE employee (
> > > > >     id BIGINT PRIMARY KEY,
> > > > >     dept_id BIGINT AFFINITY KEY,
> > > > >     name VARCHAR(128),
> > > > >     address VARCHAR(256)
> > > > >     BACKUPS 2,
> > > > >     ATOMICITY_MODE ATOMIC,
> > > > > );
> > > > >
> > > > > "id" and "dept_id" form key type, "name" and "address" form value
> > type.
> > > > >
> > > > > Vladimir.
> > > > >
> > > > > On Wed, Jan 11, 2017 at 9:08 PM, Alexey Kuznetsov <
> > > akuznetsov@apache.org
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi, Alex!
> > > > > >
> > > > > > As far as I know most RDBMS allow something like: create table t1
> > (id
> > > > > > integer primary key, ....)
> > > > > > How about to take as key field that marked as "primary key"?
> > > > > >
> > > > > > As of atomicity and replication - I think it is a cache
> properties
> > > and
> > > > > with
> > > > > > create table we will create "types" in cache. No?
> > > > > > I thought that cache it is a kind of "schema" in RDBMS.
> > > > > >
> > > > > > Could you describe what will be created with CREATE TABLE?
> > > > > >
> > > > > > On Thu, Jan 12, 2017 at 12:54 AM, Alexander Paschenko <
> > > > > > alexander.a.paschenko@gmail.com> wrote:
> > > > > >
> > > > > > > Hello Igniters,
> > > > > > >
> > > > > > > I would like to start discussion about implementation of SQL
> DDL
> > > > > > commands.
> > > > > > >
> > > > > > > At the first stage, the most important ones seem to be CREATE
> > TABLE
> > > > > > > (that will obviously correspond to creation of a cache) and
> > CREATE
> > > > > > > INDEX.
> > > > > > >
> > > > > > > Regarding first one: SQL command for CREATE TABLE does not
> > contain
> > > > any
> > > > > > > hints about cache settings (atomicity, replication, etc.), so
> > these
> > > > > > > will probably be defined by some configuration properties (like
> > > > > > > ignite.ddl.default_cache_atomiticity, etc).
> > > > > > >
> > > > > > > Also it does not allow to distinguish between key and value
> > > columns -
> > > > > > > currently it is handled by keyFields property of QueryEntity,
> but
> > > it
> > > > > > > is unclear how to declare key fields via CREATE TABLE.
> > > > > > >
> > > > > > > So at a first glance it seems like we should either implement
> > some
> > > > > > > sort of custom parsing (I believe Sergi will be against it) or
> > > > > > > introduce some kind of name prefix that would tell SQL engine
> > that
> > > > > > > certain column is a key field column.
> > > > > > >
> > > > > > > Of course, this problem disappears is key is of SQL type.
> > > > > > >
> > > > > > > Regarding CREATE INDEX: probably at first we will have to
> > implement
> > > > > > > this in "stop-the-world" manner, i.e. all cache will be blocked
> > > > during
> > > > > > > the index's initial buildup.
> > > > > > >
> > > > > > > Any thoughts?
> > > > > > >
> > > > > > > Currently I'm working on parsing of those commands as that will
> > be
> > > > > > > needed anyway and does not affect further implementation.
> > > > > > >
> > > > > > > - Alex
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Alexey Kuznetsov
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Sergey Kozlov
> > > > GridGain Systems
> > > > www.gridgain.com
> > > >
> > >
> >
> >
> >
> > --
> > Sergey Kozlov
> > GridGain Systems
> > www.gridgain.com
> >
>

Re: DDL implementation details

Posted by Alexey Goncharuk <al...@gmail.com>.

Alexander,

Will we keep the old option to have multiple tables in one cache? If so,
how will create table statement know which cache to choose?

It seems to me that to be consistent with the current DML implementation we
should have a CREATE SCHEMA statement which will define the cache and cache
configuration, and CREATE TABLE should specify the schema name.

Otherwise, we should enforce the single type per cache rule at the
configuration level and in runtime.

As for affinity and primary key - agree with Vladimir.

--
AG

2017-01-12 11:41 GMT+03:00 Sergey Kozlov <sk...@gridgain.com>:

> As first stage of DDL we can implement following CREATE TABLE statement
> support:
>  - CREATE TABLE without cache properties (use default cache properties or
> cache properties defined in SQL Schema)
>  - CREATE TABLE .. LIKE where we can create a cache based on an another
> existing cache.
>
> On Thu, Jan 12, 2017 at 5:54 AM, Dmitriy Setrakyan <ds...@apache.org>
> wrote:
>
> > Agree with Sergey. We should be able to specify cache properties inside
> of
> > SQL statements. Does H2 have any support to process SQL hints? Can we
> > change it?
> >
> > Having said that, while we finalize the above, I think we should start
> > working on DDL implementation to use the default settings, as specified
> in
> > Alexander's email.
> >
> > Also agree with the stop-the-world on the cache for index creation. We
> can
> > always improve on it in future.
> >
> > D.
> >
> > On Wed, Jan 11, 2017 at 11:28 AM, Sergey Kozlov <sk...@gridgain.com>
> > wrote:
> >
> > > Hi
> > >
> > > I suppose we should put any ignite cache properties as additional
> > > non-standard attributes after CREATE TABLE () clause as it does
> > Postgress,
> > > MySQL and other RDBMS.
> > > Take a look on CREATE TABLE with using TABLESPACE (Postgess) or for
> > CREATE
> > > TABLE with using PARTITIONS (MySQL).
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Jan 11, 2017 at 10:05 PM, Vladimir Ozerov <
> vozerov@gridgain.com>
> > > wrote:
> > >
> > > > I believe custom synthax and parsing is a *must* for us, as well as
> for
> > > any
> > > > distributed database. At the very least we need to specify affinity
> key
> > > > column somehow. Any cache property can be specified at the very end
> of
> > > > table definition. Key columns can be determined as the ones with
> > PRIMARY
> > > > KEY constraint (Alex K. idea) + affinity column(s):
> > > >
> > > > CREATE TABLE employee (
> > > >     id BIGINT PRIMARY KEY,
> > > >     dept_id BIGINT AFFINITY KEY,
> > > >     name VARCHAR(128),
> > > >     address VARCHAR(256)
> > > >     BACKUPS 2,
> > > >     ATOMICITY_MODE ATOMIC,
> > > > );
> > > >
> > > > "id" and "dept_id" form key type, "name" and "address" form value
> type.
> > > >
> > > > Vladimir.
> > > >
> > > > On Wed, Jan 11, 2017 at 9:08 PM, Alexey Kuznetsov <
> > akuznetsov@apache.org
> > > >
> > > > wrote:
> > > >
> > > > > Hi, Alex!
> > > > >
> > > > > As far as I know most RDBMS allow something like: create table t1
> (id
> > > > > integer primary key, ....)
> > > > > How about to take as key field that marked as "primary key"?
> > > > >
> > > > > As of atomicity and replication - I think it is a cache properties
> > and
> > > > with
> > > > > create table we will create "types" in cache. No?
> > > > > I thought that cache it is a kind of "schema" in RDBMS.
> > > > >
> > > > > Could you describe what will be created with CREATE TABLE?
> > > > >
> > > > > On Thu, Jan 12, 2017 at 12:54 AM, Alexander Paschenko <
> > > > > alexander.a.paschenko@gmail.com> wrote:
> > > > >
> > > > > > Hello Igniters,
> > > > > >
> > > > > > I would like to start discussion about implementation of SQL DDL
> > > > > commands.
> > > > > >
> > > > > > At the first stage, the most important ones seem to be CREATE
> TABLE
> > > > > > (that will obviously correspond to creation of a cache) and
> CREATE
> > > > > > INDEX.
> > > > > >
> > > > > > Regarding first one: SQL command for CREATE TABLE does not
> contain
> > > any
> > > > > > hints about cache settings (atomicity, replication, etc.), so
> these
> > > > > > will probably be defined by some configuration properties (like
> > > > > > ignite.ddl.default_cache_atomiticity, etc).
> > > > > >
> > > > > > Also it does not allow to distinguish between key and value
> > columns -
> > > > > > currently it is handled by keyFields property of QueryEntity, but
> > it
> > > > > > is unclear how to declare key fields via CREATE TABLE.
> > > > > >
> > > > > > So at a first glance it seems like we should either implement
> some
> > > > > > sort of custom parsing (I believe Sergi will be against it) or
> > > > > > introduce some kind of name prefix that would tell SQL engine
> that
> > > > > > certain column is a key field column.
> > > > > >
> > > > > > Of course, this problem disappears is key is of SQL type.
> > > > > >
> > > > > > Regarding CREATE INDEX: probably at first we will have to
> implement
> > > > > > this in "stop-the-world" manner, i.e. all cache will be blocked
> > > during
> > > > > > the index's initial buildup.
> > > > > >
> > > > > > Any thoughts?
> > > > > >
> > > > > > Currently I'm working on parsing of those commands as that will
> be
> > > > > > needed anyway and does not affect further implementation.
> > > > > >
> > > > > > - Alex
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Alexey Kuznetsov
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Sergey Kozlov
> > > GridGain Systems
> > > www.gridgain.com
> > >
> >
>
>
>
> --
> Sergey Kozlov
> GridGain Systems
> www.gridgain.com
>

Re: DDL implementation details

Posted by Sergey Kozlov <sk...@gridgain.com>.

As first stage of DDL we can implement following CREATE TABLE statement
support:
 - CREATE TABLE without cache properties (use default cache properties or
cache properties defined in SQL Schema)
 - CREATE TABLE .. LIKE where we can create a cache based on an another
existing cache.

On Thu, Jan 12, 2017 at 5:54 AM, Dmitriy Setrakyan <ds...@apache.org>
wrote:

> Agree with Sergey. We should be able to specify cache properties inside of
> SQL statements. Does H2 have any support to process SQL hints? Can we
> change it?
>
> Having said that, while we finalize the above, I think we should start
> working on DDL implementation to use the default settings, as specified in
> Alexander's email.
>
> Also agree with the stop-the-world on the cache for index creation. We can
> always improve on it in future.
>
> D.
>
> On Wed, Jan 11, 2017 at 11:28 AM, Sergey Kozlov <sk...@gridgain.com>
> wrote:
>
> > Hi
> >
> > I suppose we should put any ignite cache properties as additional
> > non-standard attributes after CREATE TABLE () clause as it does
> Postgress,
> > MySQL and other RDBMS.
> > Take a look on CREATE TABLE with using TABLESPACE (Postgess) or for
> CREATE
> > TABLE with using PARTITIONS (MySQL).
> >
> >
> >
> >
> >
> > On Wed, Jan 11, 2017 at 10:05 PM, Vladimir Ozerov <vo...@gridgain.com>
> > wrote:
> >
> > > I believe custom synthax and parsing is a *must* for us, as well as for
> > any
> > > distributed database. At the very least we need to specify affinity key
> > > column somehow. Any cache property can be specified at the very end of
> > > table definition. Key columns can be determined as the ones with
> PRIMARY
> > > KEY constraint (Alex K. idea) + affinity column(s):
> > >
> > > CREATE TABLE employee (
> > >     id BIGINT PRIMARY KEY,
> > >     dept_id BIGINT AFFINITY KEY,
> > >     name VARCHAR(128),
> > >     address VARCHAR(256)
> > >     BACKUPS 2,
> > >     ATOMICITY_MODE ATOMIC,
> > > );
> > >
> > > "id" and "dept_id" form key type, "name" and "address" form value type.
> > >
> > > Vladimir.
> > >
> > > On Wed, Jan 11, 2017 at 9:08 PM, Alexey Kuznetsov <
> akuznetsov@apache.org
> > >
> > > wrote:
> > >
> > > > Hi, Alex!
> > > >
> > > > As far as I know most RDBMS allow something like: create table t1 (id
> > > > integer primary key, ....)
> > > > How about to take as key field that marked as "primary key"?
> > > >
> > > > As of atomicity and replication - I think it is a cache properties
> and
> > > with
> > > > create table we will create "types" in cache. No?
> > > > I thought that cache it is a kind of "schema" in RDBMS.
> > > >
> > > > Could you describe what will be created with CREATE TABLE?
> > > >
> > > > On Thu, Jan 12, 2017 at 12:54 AM, Alexander Paschenko <
> > > > alexander.a.paschenko@gmail.com> wrote:
> > > >
> > > > > Hello Igniters,
> > > > >
> > > > > I would like to start discussion about implementation of SQL DDL
> > > > commands.
> > > > >
> > > > > At the first stage, the most important ones seem to be CREATE TABLE
> > > > > (that will obviously correspond to creation of a cache) and CREATE
> > > > > INDEX.
> > > > >
> > > > > Regarding first one: SQL command for CREATE TABLE does not contain
> > any
> > > > > hints about cache settings (atomicity, replication, etc.), so these
> > > > > will probably be defined by some configuration properties (like
> > > > > ignite.ddl.default_cache_atomiticity, etc).
> > > > >
> > > > > Also it does not allow to distinguish between key and value
> columns -
> > > > > currently it is handled by keyFields property of QueryEntity, but
> it
> > > > > is unclear how to declare key fields via CREATE TABLE.
> > > > >
> > > > > So at a first glance it seems like we should either implement some
> > > > > sort of custom parsing (I believe Sergi will be against it) or
> > > > > introduce some kind of name prefix that would tell SQL engine that
> > > > > certain column is a key field column.
> > > > >
> > > > > Of course, this problem disappears is key is of SQL type.
> > > > >
> > > > > Regarding CREATE INDEX: probably at first we will have to implement
> > > > > this in "stop-the-world" manner, i.e. all cache will be blocked
> > during
> > > > > the index's initial buildup.
> > > > >
> > > > > Any thoughts?
> > > > >
> > > > > Currently I'm working on parsing of those commands as that will be
> > > > > needed anyway and does not affect further implementation.
> > > > >
> > > > > - Alex
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Alexey Kuznetsov
> > > >
> > >
> >
> >
> >
> > --
> > Sergey Kozlov
> > GridGain Systems
> > www.gridgain.com
> >
>



-- 
Sergey Kozlov
GridGain Systems
www.gridgain.com

Re: DDL implementation details

Posted by Dmitriy Setrakyan <ds...@apache.org>.

Agree with Sergey. We should be able to specify cache properties inside of
SQL statements. Does H2 have any support to process SQL hints? Can we
change it?

Having said that, while we finalize the above, I think we should start
working on DDL implementation to use the default settings, as specified in
Alexander's email.

Also agree with the stop-the-world on the cache for index creation. We can
always improve on it in future.

D.

On Wed, Jan 11, 2017 at 11:28 AM, Sergey Kozlov <sk...@gridgain.com>
wrote:

> Hi
>
> I suppose we should put any ignite cache properties as additional
> non-standard attributes after CREATE TABLE () clause as it does Postgress,
> MySQL and other RDBMS.
> Take a look on CREATE TABLE with using TABLESPACE (Postgess) or for CREATE
> TABLE with using PARTITIONS (MySQL).
>
>
>
>
>
> On Wed, Jan 11, 2017 at 10:05 PM, Vladimir Ozerov <vo...@gridgain.com>
> wrote:
>
> > I believe custom synthax and parsing is a *must* for us, as well as for
> any
> > distributed database. At the very least we need to specify affinity key
> > column somehow. Any cache property can be specified at the very end of
> > table definition. Key columns can be determined as the ones with PRIMARY
> > KEY constraint (Alex K. idea) + affinity column(s):
> >
> > CREATE TABLE employee (
> >     id BIGINT PRIMARY KEY,
> >     dept_id BIGINT AFFINITY KEY,
> >     name VARCHAR(128),
> >     address VARCHAR(256)
> >     BACKUPS 2,
> >     ATOMICITY_MODE ATOMIC,
> > );
> >
> > "id" and "dept_id" form key type, "name" and "address" form value type.
> >
> > Vladimir.
> >
> > On Wed, Jan 11, 2017 at 9:08 PM, Alexey Kuznetsov <akuznetsov@apache.org
> >
> > wrote:
> >
> > > Hi, Alex!
> > >
> > > As far as I know most RDBMS allow something like: create table t1 (id
> > > integer primary key, ....)
> > > How about to take as key field that marked as "primary key"?
> > >
> > > As of atomicity and replication - I think it is a cache properties and
> > with
> > > create table we will create "types" in cache. No?
> > > I thought that cache it is a kind of "schema" in RDBMS.
> > >
> > > Could you describe what will be created with CREATE TABLE?
> > >
> > > On Thu, Jan 12, 2017 at 12:54 AM, Alexander Paschenko <
> > > alexander.a.paschenko@gmail.com> wrote:
> > >
> > > > Hello Igniters,
> > > >
> > > > I would like to start discussion about implementation of SQL DDL
> > > commands.
> > > >
> > > > At the first stage, the most important ones seem to be CREATE TABLE
> > > > (that will obviously correspond to creation of a cache) and CREATE
> > > > INDEX.
> > > >
> > > > Regarding first one: SQL command for CREATE TABLE does not contain
> any
> > > > hints about cache settings (atomicity, replication, etc.), so these
> > > > will probably be defined by some configuration properties (like
> > > > ignite.ddl.default_cache_atomiticity, etc).
> > > >
> > > > Also it does not allow to distinguish between key and value columns -
> > > > currently it is handled by keyFields property of QueryEntity, but it
> > > > is unclear how to declare key fields via CREATE TABLE.
> > > >
> > > > So at a first glance it seems like we should either implement some
> > > > sort of custom parsing (I believe Sergi will be against it) or
> > > > introduce some kind of name prefix that would tell SQL engine that
> > > > certain column is a key field column.
> > > >
> > > > Of course, this problem disappears is key is of SQL type.
> > > >
> > > > Regarding CREATE INDEX: probably at first we will have to implement
> > > > this in "stop-the-world" manner, i.e. all cache will be blocked
> during
> > > > the index's initial buildup.
> > > >
> > > > Any thoughts?
> > > >
> > > > Currently I'm working on parsing of those commands as that will be
> > > > needed anyway and does not affect further implementation.
> > > >
> > > > - Alex
> > > >
> > >
> > >
> > >
> > > --
> > > Alexey Kuznetsov
> > >
> >
>
>
>
> --
> Sergey Kozlov
> GridGain Systems
> www.gridgain.com
>

Re: DDL implementation details

Posted by Sergey Kozlov <sk...@gridgain.com>.

Hi

I suppose we should put any ignite cache properties as additional
non-standard attributes after CREATE TABLE () clause as it does Postgress,
MySQL and other RDBMS.
Take a look on CREATE TABLE with using TABLESPACE (Postgess) or for CREATE
TABLE with using PARTITIONS (MySQL).





On Wed, Jan 11, 2017 at 10:05 PM, Vladimir Ozerov <vo...@gridgain.com>
wrote:

> I believe custom synthax and parsing is a *must* for us, as well as for any
> distributed database. At the very least we need to specify affinity key
> column somehow. Any cache property can be specified at the very end of
> table definition. Key columns can be determined as the ones with PRIMARY
> KEY constraint (Alex K. idea) + affinity column(s):
>
> CREATE TABLE employee (
>     id BIGINT PRIMARY KEY,
>     dept_id BIGINT AFFINITY KEY,
>     name VARCHAR(128),
>     address VARCHAR(256)
>     BACKUPS 2,
>     ATOMICITY_MODE ATOMIC,
> );
>
> "id" and "dept_id" form key type, "name" and "address" form value type.
>
> Vladimir.
>
> On Wed, Jan 11, 2017 at 9:08 PM, Alexey Kuznetsov <ak...@apache.org>
> wrote:
>
> > Hi, Alex!
> >
> > As far as I know most RDBMS allow something like: create table t1 (id
> > integer primary key, ....)
> > How about to take as key field that marked as "primary key"?
> >
> > As of atomicity and replication - I think it is a cache properties and
> with
> > create table we will create "types" in cache. No?
> > I thought that cache it is a kind of "schema" in RDBMS.
> >
> > Could you describe what will be created with CREATE TABLE?
> >
> > On Thu, Jan 12, 2017 at 12:54 AM, Alexander Paschenko <
> > alexander.a.paschenko@gmail.com> wrote:
> >
> > > Hello Igniters,
> > >
> > > I would like to start discussion about implementation of SQL DDL
> > commands.
> > >
> > > At the first stage, the most important ones seem to be CREATE TABLE
> > > (that will obviously correspond to creation of a cache) and CREATE
> > > INDEX.
> > >
> > > Regarding first one: SQL command for CREATE TABLE does not contain any
> > > hints about cache settings (atomicity, replication, etc.), so these
> > > will probably be defined by some configuration properties (like
> > > ignite.ddl.default_cache_atomiticity, etc).
> > >
> > > Also it does not allow to distinguish between key and value columns -
> > > currently it is handled by keyFields property of QueryEntity, but it
> > > is unclear how to declare key fields via CREATE TABLE.
> > >
> > > So at a first glance it seems like we should either implement some
> > > sort of custom parsing (I believe Sergi will be against it) or
> > > introduce some kind of name prefix that would tell SQL engine that
> > > certain column is a key field column.
> > >
> > > Of course, this problem disappears is key is of SQL type.
> > >
> > > Regarding CREATE INDEX: probably at first we will have to implement
> > > this in "stop-the-world" manner, i.e. all cache will be blocked during
> > > the index's initial buildup.
> > >
> > > Any thoughts?
> > >
> > > Currently I'm working on parsing of those commands as that will be
> > > needed anyway and does not affect further implementation.
> > >
> > > - Alex
> > >
> >
> >
> >
> > --
> > Alexey Kuznetsov
> >
>



-- 
Sergey Kozlov
GridGain Systems
www.gridgain.com

Re: DDL implementation details

Posted by Vladimir Ozerov <vo...@gridgain.com>.

I believe custom synthax and parsing is a *must* for us, as well as for any
distributed database. At the very least we need to specify affinity key
column somehow. Any cache property can be specified at the very end of
table definition. Key columns can be determined as the ones with PRIMARY
KEY constraint (Alex K. idea) + affinity column(s):

CREATE TABLE employee (
    id BIGINT PRIMARY KEY,
    dept_id BIGINT AFFINITY KEY,
    name VARCHAR(128),
    address VARCHAR(256)
    BACKUPS 2,
    ATOMICITY_MODE ATOMIC,
);

"id" and "dept_id" form key type, "name" and "address" form value type.

Vladimir.

On Wed, Jan 11, 2017 at 9:08 PM, Alexey Kuznetsov <ak...@apache.org>
wrote:

> Hi, Alex!
>
> As far as I know most RDBMS allow something like: create table t1 (id
> integer primary key, ....)
> How about to take as key field that marked as "primary key"?
>
> As of atomicity and replication - I think it is a cache properties and with
> create table we will create "types" in cache. No?
> I thought that cache it is a kind of "schema" in RDBMS.
>
> Could you describe what will be created with CREATE TABLE?
>
> On Thu, Jan 12, 2017 at 12:54 AM, Alexander Paschenko <
> alexander.a.paschenko@gmail.com> wrote:
>
> > Hello Igniters,
> >
> > I would like to start discussion about implementation of SQL DDL
> commands.
> >
> > At the first stage, the most important ones seem to be CREATE TABLE
> > (that will obviously correspond to creation of a cache) and CREATE
> > INDEX.
> >
> > Regarding first one: SQL command for CREATE TABLE does not contain any
> > hints about cache settings (atomicity, replication, etc.), so these
> > will probably be defined by some configuration properties (like
> > ignite.ddl.default_cache_atomiticity, etc).
> >
> > Also it does not allow to distinguish between key and value columns -
> > currently it is handled by keyFields property of QueryEntity, but it
> > is unclear how to declare key fields via CREATE TABLE.
> >
> > So at a first glance it seems like we should either implement some
> > sort of custom parsing (I believe Sergi will be against it) or
> > introduce some kind of name prefix that would tell SQL engine that
> > certain column is a key field column.
> >
> > Of course, this problem disappears is key is of SQL type.
> >
> > Regarding CREATE INDEX: probably at first we will have to implement
> > this in "stop-the-world" manner, i.e. all cache will be blocked during
> > the index's initial buildup.
> >
> > Any thoughts?
> >
> > Currently I'm working on parsing of those commands as that will be
> > needed anyway and does not affect further implementation.
> >
> > - Alex
> >
>
>
>
> --
> Alexey Kuznetsov
>

Re: DDL implementation details

Posted by Alexey Kuznetsov <ak...@apache.org>.

Hi, Alex!

As far as I know most RDBMS allow something like: create table t1 (id
integer primary key, ....)
How about to take as key field that marked as "primary key"?

As of atomicity and replication - I think it is a cache properties and with
create table we will create "types" in cache. No?
I thought that cache it is a kind of "schema" in RDBMS.

Could you describe what will be created with CREATE TABLE?

On Thu, Jan 12, 2017 at 12:54 AM, Alexander Paschenko <
alexander.a.paschenko@gmail.com> wrote:

> Hello Igniters,
>
> I would like to start discussion about implementation of SQL DDL commands.
>
> At the first stage, the most important ones seem to be CREATE TABLE
> (that will obviously correspond to creation of a cache) and CREATE
> INDEX.
>
> Regarding first one: SQL command for CREATE TABLE does not contain any
> hints about cache settings (atomicity, replication, etc.), so these
> will probably be defined by some configuration properties (like
> ignite.ddl.default_cache_atomiticity, etc).
>
> Also it does not allow to distinguish between key and value columns -
> currently it is handled by keyFields property of QueryEntity, but it
> is unclear how to declare key fields via CREATE TABLE.
>
> So at a first glance it seems like we should either implement some
> sort of custom parsing (I believe Sergi will be against it) or
> introduce some kind of name prefix that would tell SQL engine that
> certain column is a key field column.
>
> Of course, this problem disappears is key is of SQL type.
>
> Regarding CREATE INDEX: probably at first we will have to implement
> this in "stop-the-world" manner, i.e. all cache will be blocked during
> the index's initial buildup.
>
> Any thoughts?
>
> Currently I'm working on parsing of those commands as that will be
> needed anyway and does not affect further implementation.
>
> - Alex
>



-- 
Alexey Kuznetsov