Posted to dev@calcite.apache.org by Andrei Sereda <an...@sereda.cc> on 2018/06/28 16:57:22 UTC

Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Hello,

Elastic announced
<https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html>
that they will be deprecating mapping types in ES6 and indexes will be
single-typed only.

The historical analogy <https://www.elastic.co/blog/index-vs-type> between
an RDBMS and Elasticsearch was that an index is equivalent to a database and
a type corresponds to a table in that database. Within a couple of releases
(ES6-8) this will no longer be true.

The recent addition of SQL
<https://www.elastic.co/blog/elasticsearch-6-3-0-released> to Elasticsearch
confirms this trend
<https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html>:
an index is equivalent to a table and there are no more ES types.

I would like to propose including this logic in the Calcite ES adapter, i.e.
exposing each single-typed ES index as a separate table inside a Calcite
schema. This is in contrast to the current integration, where a schema can
only have a single index. The current approach forces you to create multiple
schemas to query single-typed indexes (on the same ES cluster).

Legacy compatibility can always be controlled with configuration parameters.

Do you agree with such changes? If yes, would you consider a PR?

Regards,
Andrei.

Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Andrei Sereda <an...@sereda.cc>.
Let's assume we have the following indexes / types in ES (same cluster):

(Index with two types. Legacy format. Not supported in ES8+)
- index1: t1 + t2

(Index with single type. Default and most used format)
- index2: t1

(Index with two types that have the same names as those in index1)
- index3: t1 + t2

Technically these are 5 separate tables in the relational world (as of ES2-6).
A user might want to query all of them within the same Calcite schema.

There are a couple of (not ideal) options:
1) Do not support this scenario.
2) Force the user to have unique (type?) names.
3) Somehow merge index and type into a unique Calcite table name.
4) Force the user to have different schemas.
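
To make the name clash concrete, here is a minimal Python sketch. It is not adapter code: the helper name duplicate_names and the "_" separator are assumptions made only for illustration. It lists the five (index, type) pairs above and checks which naming choices yield duplicate table names.

```python
from collections import Counter

# (index, type) pairs from the example cluster above.
MAPPINGS = [
    ("index1", "t1"), ("index1", "t2"),
    ("index2", "t1"),
    ("index3", "t1"), ("index3", "t2"),
]


def duplicate_names(names):
    """Return the sorted names that occur more than once (hypothetical helper)."""
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


# Table named after the type alone: clashes across indexes.
by_type = [type_ for _, type_ in MAPPINGS]
# Index and type merged into one name (option 3); "_" is an arbitrary choice.
merged = [f"{index}_{type_}" for index, type_ in MAPPINGS]

print(duplicate_names(by_type))  # ['t1', 't2']
print(duplicate_names(merged))   # []
```

For this particular cluster the merged names are unique; that does not hold in general once index or type names themselves contain the separator.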





On Fri, Jun 29, 2018 at 2:28 PM Christian Beikov <ch...@gmail.com>
wrote:

> I'm not sure what the benefit of allowing users to specify this scheme
> would be. We'd have to parse it, interpret it, make sure the expressions
> don't result in conflicting names etc.
>
> IMO a simple mode configuration would be way easier to implement and
> probably cover 99% of the use cases.

Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Christian Beikov <ch...@gmail.com>.
I'm not sure what the benefit of allowing users to specify this scheme
would be. We'd have to parse it, interpret it, make sure the expressions
don't result in conflicting names etc.
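
As a rough illustration of what "parse, interpret, check for conflicts" would involve, here is a small Python sketch built on the standard library's string.Template, which happens to use the same ${name} placeholder syntax. The function name expand_table_names is hypothetical and nothing here is taken from the adapter.

```python
from string import Template


def expand_table_names(pattern, mappings):
    """Expand a naming pattern for each (index, type) pair.

    Returns a dict of table name -> (index, type) and raises ValueError
    when two pairs expand to the same table name.
    """
    template = Template(pattern)
    names = {}
    for index, type_ in mappings:
        # Unused keyword arguments are ignored, so "${type}" alone works too.
        name = template.substitute(index=index, type=type_)
        if name in names:
            raise ValueError(
                f"{names[name]} and {(index, type_)} both map to {name!r}"
            )
        names[name] = (index, type_)
    return names


tables = expand_table_names(
    "${index}_${type}", [("index1", "t1"), ("index1", "t2")]
)
print(sorted(tables))  # ['index1_t1', 'index1_t2']
```

With the same pattern, the pairs ("x_y", "z") and ("x", "y_z") both expand to "x_y_z", so the conflict check is needed even though each (index, type) pair is unique.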

IMO a simple mode configuration would be way easier to implement and 
probably cover 99% of the use cases.
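
A minimal sketch of such a mode, in Python for brevity. The setting name "table_mapping" and its values "index" and "type" come from the proposal quoted below; the function table_names and its error handling are assumptions, not existing adapter behaviour.

```python
def table_names(mappings, table_mapping):
    """Derive table names from (index, type) pairs for a given mode."""
    if table_mapping == "index":
        # Every index becomes one table; its types are ignored.
        return sorted({index for index, _ in mappings})
    if table_mapping == "type":
        owner = {}
        for index, type_ in mappings:
            # Remember the first index that declared this type name.
            if owner.setdefault(type_, index) != index:
                raise ValueError(
                    f"type {type_!r} exists in {owner[type_]!r} and {index!r}"
                )
        return sorted(owner)
    raise ValueError(f"unknown table_mapping: {table_mapping!r}")


# Several single-typed indexes: the "index" mode fits.
print(table_names([("orders", "doc"), ("users", "doc")], "index"))
# ['orders', 'users']

# One legacy index with two types: the "type" mode fits.
print(table_names([("shop", "order"), ("shop", "user")], "type"))
# ['order', 'user']
```

In this sketch the "type" mode fails when the same type name appears in two indexes, which is the case where the user would need unique names or separate schemas.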


Kind regards,
------------------------------------------------------------------------
*Christian Beikov*
On 29.06.2018 at 20:19, Julian Hyde wrote:
> Andrei,
>
> I'm not an ES user so I don't fully understand this issue, but my two
> cents anyway...
>
> Can you show how those examples affect SQL against the ES adapter
> and/or how they affect JSON models?
>
> You seem to be using '_' as a separator character. Are we sure that
> people will never use it in index or type name? Separator characters
> often cause problems.
>
> Julian
>
>
>
>
> On Fri, Jun 29, 2018 at 10:58 AM, Andrei Sereda <an...@sereda.cc> wrote:
>> I agree there should be a configuration option. How about the following
>> approach.
>>
>> Expose both variables ${index} and ${type} in configuration (JSON) and user
>> will use them to generate table name in calcite schema.
>>
>> Example
>> "table_name": "${type}" // current
>> "table_name": "${index}" // new (default?)
>> "table_name": "${index}_${type}" // most generic. supports multiple types
>> per index
>>
>>
>>
>>
>>
>> On Fri, Jun 29, 2018 at 9:26 AM Michael Mior <mm...@apache.org> wrote:
>>
>>> I think it sounds like you and Andrei are in a good position to tackle this
>>> one so I'm happy to have you both work on whatever solution you think is
>>> best.
>>>
>>> --
>>> Michael Mior
>>> mmior@apache.org
>>>
>>>
>>>
>>> On Fri, Jun 29, 2018 at 04:19, Christian Beikov <christian.beikov@gmail.com>
>>> wrote:
>>>
>>>> IMO the best solution would be to make it configurable by introducing a
>>>> "table_mapping" config with values
>>>>
>>>>    * type - every type in the known indices is mapped as table
>>>>    * index - every known index is mapped as table
>>>>
>>>> We'd probably also need a "type_field" configuration for defining which
>>>> field to use for the type determination as one of the possible future
>>>> ways to do things is to introduce a custom field:
>>>>
>>>> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html#_custom_type_field_2
>>>>
>>>> We already detect the ES version, so we can set a smart default for this
>>>> setting. Let's make the index config param optional.
>>>>
>>>>    * When no index is given, we discover indexes, the default for
>>>>      "table_mapping" then is "index"
>>>>    * When index is given, we only discover types according to the
>>>>      "type_field" configuration and the default for "table_mapping" is
>>>>      "type"
>>>>
>>>> This would also allow to discover indexes but still use "type" as
>>>> "table_mapping".
>>>>
>>>> What do you think?
>>>>
>>>> Kind regards,
>>>> ------------------------------------------------------------------------
>>>> *Christian Beikov*
>>>> On 29.06.2018 at 02:41, Andrei Sereda wrote:
>>>>> Yes. There is an API to list all indexes / types in elastic. They can be
>>>>> automatically imported into a schema.
>>>>>
>>>>> What needs to be agreed upon is how to expose those elements in calcite
>>>>> schema (naming / behaviour).
>>>>>
>>>>> 1) Many (most?) setups are single type per index. The natural way to name
>>>>> them would be "elastic.$index" (elastic being the schema name). Multiple
>>>>> indexes would be under the same schema: "elastic.index1", "elastic.index2" etc.
>>>>>
>>>>> 2) What if an index has several types? Should they be exported as calcite
>>>>> tables "elastic.$index_type1" and "elastic.$index_type2"? Or (current
>>>>> behaviour) as "elastic.type1" and "elastic.type2"? Or as a subschema,
>>>>> "elastic.$index.type1"?
>>>>>
>>>>> Now what if one has a combination of (1) and (2)?
>>>>> Setup (2) is already deprecated (and will be unsupported in the next
>>>>> version).
>>>>>
>>>>> On Thu, Jun 28, 2018 at 7:31 PM Christian Beikov <christian.beikov@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Is there an API to discover indexes? If there is, I'd suggest we allow a
>>>>>> config option to make the adapter discover the possible indexes.
>>>>>> We'd still have to adapt the code a bit, but internally, the schema
>>>>>> could just keep a cache of type name to index name map and be able to
>>>>>> support both scenarios.
>>>>>>
>>>>>>
>>>>>> Kind regards,
>>>>>> ------------------------------------------------------------------------
>>>>>> *Christian Beikov*
>>>>>> On 29.06.2018 at 00:12, Andrei Sereda wrote:
>>>>>>>> 1) What's the time horizon for the current adapter no longer working
>>>>>>>> with these changes to ES?
>>>>>>> Current adapter will be working for a while with existing setup. The
>>>>>>> problem is nomenclature and ease of use.
>>>>>>>
>>>>>>> Their new SQL concepts mapping
>>>>>>> <https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html>
>>>>>>> drops the notion of ES type (which before was equivalent of RDBMS table)
>>>>>>> and uses ES index as new table equivalent (before ES index was equal to
>>>>>>> database).
>>>>>>> Most users use elastic this way (one type, one index): index == table.
>>>>>>> Currently calcite requires schema per index. In RDBMS parlance, database
>>>>>>> per table (I'd like to change that).
>>>>>>>
>>>>>>>> 2) Any guess how complicated it would be to maintain code paths for both
>>>>>>>> behaviours? I know this is probably really challenging to estimate, but I
>>>>>>>> really have no idea of the scope of these changes. Would it mean two
>>>>>>>> different ES adapters?
>>>>>>> One can have just separate calcite schema implementations (same adapter /
>>>>>>> module):
>>>>>>> 1)  LegacySchema (old). Schema can have only one index (but multiple
>>>>>>> types). Type == table in this case.
>>>>>>> 2)  NewSchema (new). Single schema can have multiple indexes (type is
>>>>>>> dropped). Index == table in this case.
>>>>>>>
>>>>>>>> 3) Do we really need compatibility with the current version of the adapter?
>>>>>>>> IMO this depends on what versions of ES we would lose support for and how
>>>>>>>> complex it would be for users of the current ES adapter to make updates for
>>>>>>>> any Calcite API changes.
>>>>>>> The issue is not in adapter but how calcite schema exposes tables. Should
>>>>>>> it expose index as individual table (new), or ES type (old)?
>>>>>>>
>>>>>>> Andrei.
>>>>>>>
>>>>>>> On Thu, Jun 28, 2018 at 5:23 PM Michael Mior <mm...@apache.org> wrote:
>>>>>>>> Unfortunately I know very little about ES so I'm not in a great position to
>>>>>>>> assess the impact of these changes. I will say that legacy
>>>>>>>> compatibility is great, but maintaining two sets of logic is always a
>>>>>>>> challenge. A few follow-up questions:
>>>>>>>>
>>>>>>>> 1) What's the time horizon for the current adapter no longer working with
>>>>>>>> these changes to ES?
>>>>>>>>
>>>>>>>> 2) Any guess how complicated it would be to maintain code paths for both
>>>>>>>> behaviours? I know this is probably really challenging to estimate, but I
>>>>>>>> really have no idea of the scope of these changes. Would it mean two
>>>>>>>> different ES adapters?
>>>>>>>>
>>>>>>>> 3) Do we really need compatibility with the current version of the adapter?
>>>>>>>> IMO this depends on what versions of ES we would lose support for and how
>>>>>>>> complex it would be for users of the current ES adapter to make updates for
>>>>>>>> any Calcite API changes.
>>>>>>>>
>>>>>>>> Thanks for your continued work on the ES adapter Andrei!
>>>>>>>>
>>>>>>>> --
>>>>>>>> Michael Mior
>>>>>>>> mmior@apache.org
>>>>>>>>
>>>>>>>>
>>>>>>>>


Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Christian Beikov <ch...@gmail.com>.
I like the idea of the regex filter, might be cool to have something 
like that in general for all adapters, but it's fine if you do it just 
for ES now. I guess you are considering include and exclude pattern 
parameters?
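
A small sketch of how include and exclude patterns could behave, written in Python for brevity. The parameter names include and exclude and the choice of full-string matching are assumptions, not an agreed design.

```python
import re


def filter_indexes(indexes, include=".*", exclude=None):
    """Keep index names that fully match `include` and do not match `exclude`."""
    inc = re.compile(include)
    exc = re.compile(exclude) if exclude is not None else None
    return [
        name
        for name in indexes
        if inc.fullmatch(name) and (exc is None or not exc.fullmatch(name))
    ]


cluster = ["logs-2018.06", "logs-2018.07", "users", ".kibana"]

print(filter_indexes(cluster, include=r"logs-.*"))
# ['logs-2018.06', 'logs-2018.07']

print(filter_indexes(cluster, exclude=r"\..*"))
# ['logs-2018.06', 'logs-2018.07', 'users']
```

Full-string matching is used here so that a pattern such as "users" does not also select an index named "users_archive"; substring matching would be an equally plausible choice.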

I'm more in favour of a mode parameter than of letting the user decide the
name explicitly. Either the types or the index names will have to have
meaningful unique names, or the user will have to map certain indexes to
a different schema. IMO that's a good solution.


Kind regards,
------------------------------------------------------------------------
*Christian Beikov*
On 30.06.2018 at 16:43, Andrei Sereda wrote:
> Christian / Michael,
>
> Can you please weigh in with your preferred solution and I'll implement it.
>
> One more question. Sometimes it is nice to be able to filter (limit) the
> indexes (tables) exposed by calcite. Say my cluster has 10 indexes but I
> want the user to query only one. Would you be opposed if I add a configuration
> parameter that allows specifying a filter (e.g. a regexp) for ES indexes?
>
>
> On Fri, Jun 29, 2018 at 11:17 PM Andrei Sereda <an...@sereda.cc> wrote:
>
>> That's a reasonable alternative.
>>
>> On Fri, Jun 29, 2018 at 7:57 PM Julian Hyde <jh...@apache.org> wrote:
>>
>>> Maybe there could be a separator char as one of the adapter’s parameters.
>>> People should choose a value, say ‘$’ or ‘#’, that is legal in an unquoted
>>> SQL identifier but does not occur in any of their index or type names.
>>>
>>> If not specified, the adapter would end up in a simple mode, say looking
>>> for indexes first, then looking for types, and people would need to make
>>> sure indexes and types have distinct names. After the transition to
>>> single-type indexes, people could stop using the parameter.
>>>
>>> Julian
>>>
>>>
>>>> On Jun 29, 2018, at 4:43 PM, Andrei Sereda <an...@sereda.cc> wrote:
>>>>
>>>> That's a valid point. Then user would define a different pattern like
>>>> "i$index_t$type" for his cluster.
>>>>
>>>> I think we should first answer whether such scenarios should be supported
>>>> by calcite (given that they're already deprecated by the vendor). If yes,
>>>> what should the collision strategy be? A user-defined pattern like above,
>>>> failure, or an auto-generated name?
>>>>
>>>> On Fri, Jun 29, 2018, 19:14 Julian Hyde <jh...@apache.org> wrote:
>>>>
>>>>>> In elastic (index/type) pair is guaranteed to be unique therefore
>>>>>> "${index}_${type}" will be also unique (as string). This is only necessary
>>>>>> when we have several types per index. Valid question is whether user
>>>>>> should be allowed such flexibility.
>>>>> Uniqueness is not my concern.
>>>>>
>>>>> Suppose there is an index called "x_y" with a type called "z", and
>>>>> another index called "x" with a type called "y_z". If I write "x_y_z"
>>>>> it's not clear how it should be broken into index/type.
>>>>>
>>>>>
>>>>> On Fri, Jun 29, 2018 at 3:15 PM, Andrei Sereda <an...@sereda.cc> wrote:
>>>>>>> Can you show how those examples affect SQL against the ES adapter and/or
>>>>>>> how they affect JSON models?
>>>>>>
>>>>>> The discussion is how to properly bridge the (index/type) concept from ES
>>>>>> into the relational world. The proposal to use placeholders ($index / $type)
>>>>>> affects only how a table is named in calcite. They're not used as SQL
>>>>>> literals, i.e. it affects only the configuration phase of the schema.
>>>>>> Pretty much we're doing a string replace to derive the table name from
>>>>>> ($index/$type).
>>>>>>
>>>>>>> You seem to be using '_' as a separator character. Are we sure that
>>>>>>> people will never use it in index or type name? Separator characters
>>>>>>> often cause problems.
>>>>>> In elastic (index/type) pair is guaranteed to be unique therefore
>>>>>> "${index}_${type}" will be also unique (as string). This is only necessary
>>>>>> when we have several types per index. Valid question is whether user
>>>>>> should be allowed such flexibility.
>>>>>>>>>>>>>> any Calcite API changes.
>>>>>>>>>>>>> The issue is not in adapter but how calcite schema exposes
>>>>> tables.
>>>>>>>>>>>> Should
>>>>>>>>>>>>> it expose index as individual table (new), or ES type (old) ?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Andrei.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Jun 28, 2018 at 5:23 PM Michael Mior <mmior@apache.org
>>>>>>>>> wrote:
>>>>>>>>>>>>>> Unfortunately I know very little about ES so I'm not in a
>>>>> great
>>>>>>>>>>>> position to
>>>>>>>>>>>>>> asses the impact of these changes. I will say that that legacy
>>>>>>>>>>>>>> compatibility is great, but maintaining two sets of logic is
>>>>>>> always
>>>>>>>>> a
>>>>>>>>>>>>>> challenge. A few follow up questions:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1) What's the time horizon for the current adapter no longer
>>>>>>> working
>>>>>>>>>>>> with
>>>>>>>>>>>>>> these changes to ES?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2) Any guess how complicated it would be to maintain code
>>>>> paths
>>>>>>> for
>>>>>>>>>> both
>>>>>>>>>>>>>> behaviours? I know this is probably really challenging to
>>>>>>> estimate,
>>>>>>>>>> but
>>>>>>>>>>>> I
>>>>>>>>>>>>>> really have no idea of the scope of these changes. Would it
>>>>> mean
>>>>>>> two
>>>>>>>>>>>>>> different ES adapters?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 3) Do we really need compatibility with the current version of
>>>>>>> the
>>>>>>>>>>>> adapter?
>>>>>>>>>>>>>> IMO this depends on what versions of ES we would lose support
>>>>> for
>>>>>>>>> and
>>>>>>>>>>>> how
>>>>>>>>>>>>>> complex it would be for users of the current ES adapter to
>>>>> make
>>>>>>>>>> updates
>>>>>>>>>>>> for
>>>>>>>>>>>>>> any Calcite API changes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for your continued work on the ES adapter Andrei!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Michael Mior
>>>>>>>>>>>>>> mmior@apache.org
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Le jeu. 28 juin 2018 à 12:57, Andrei Sereda <andrei@sereda.cc
>>>>> a
>>>>>>>>>> écrit
>>>>>>>>>>>> :
>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Elastic announced
>>>>>>>>>>>>>>> <
>>>>>>>>>>>>>>>
>>> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html
>>>>>>>>>>>>>>> that they will be deprecating mapping types in ES6 and
>>>>> indexes
>>>>>>> will
>>>>>>>>>> be
>>>>>>>>>>>>>>> single-typed only.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Historical analogy <
>>>>> https://www.elastic.co/blog/index-vs-type>
>>>>>>>>>> between
>>>>>>>>>>>>>>> RDBMS and elastic was that index is equivalent to a database
>>>>> and
>>>>>>>>> type
>>>>>>>>>>>>>>> corresponds to table in that database. In a couple of
>>>>> releases
>>>>>>>>>> (ES6-8)
>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>> shall not longer be true.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Recent SQL addition
>>>>>>>>>>>>>>> <https://www.elastic.co/blog/elasticsearch-6-3-0-released>
>>>>> to
>>>>>>>>>> elastic
>>>>>>>>>>>>>>> confirms
>>>>>>>>>>>>>>> this trend
>>>>>>>>>>>>>>> <
>>>>>>>>>>>>>>>
>>> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
>>>>>>>>>>>>>>>> .
>>>>>>>>>>>>>>> Index is equivalent to a table and there are no more ES
>>>>> types.
>>>>>>>>>>>>>>> I would like to propose to include this logic in Calcite ES
>>>>>>>>> adapter.
>>>>>>>>>>>> IE,
>>>>>>>>>>>>>>> expose each ES single-typed index as a separate table inside
>>>>>>>>> calcite
>>>>>>>>>>>>>>> schema. This is in contrast to  current integration where
>>>>> schema
>>>>>>>>> can
>>>>>>>>>>>> only
>>>>>>>>>>>>>>> have a single index. Current approach forces you to create
>>>>>>> multiple
>>>>>>>>>>>>>> schemas
>>>>>>>>>>>>>>> to query single-typed indexes (on the same ES cluster).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Legacy compatibility can always be controlled with
>>>>> configuration
>>>>>>>>>>>>>>> parameters.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Do you agree with such changes ? If yes, would you consider a
>>>>>>> PR ?
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>> Andrei.
>>>>>>>>>>>>>>>
>>>>>>>>>>
>>>


Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Andrei Sereda <an...@sereda.cc>.
Christian / Michael,

Can you please weigh in on your preferred solution? I'll then implement it.

One more question. Sometimes it is nice to be able to filter (limit) the
indexes (tables) exposed by calcite. Say my cluster has 10 indexes but I
want users to query only one. Would you be opposed if I added a configuration
parameter that lets one specify a filter (e.g. a regexp) for ES indexes?
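
To make the filter idea concrete, here is a minimal sketch of how a
regexp could restrict which discovered index names become tables. The
class name IndexFilter and its method are hypothetical and not part of the
current adapter; how the regex would be read from the schema configuration
is left open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper: keeps only index names matching a user-supplied regex. */
final class IndexFilter {
  private final Pattern pattern;

  IndexFilter(String regex) {
    this.pattern = Pattern.compile(regex);
  }

  /** Returns the index names that fully match the pattern, in input order. */
  List<String> apply(List<String> indexNames) {
    List<String> visible = new ArrayList<>();
    for (String name : indexNames) {
      // matches() requires the whole name to match, not just a substring
      if (pattern.matcher(name).matches()) {
        visible.add(name);
      }
    }
    return visible;
  }
}
```

With a filter such as "logs-.*", only indexes whose full name starts with
"logs-" would be exposed; everything else in the cluster would stay hidden
from the calcite schema.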


On Fri, Jun 29, 2018 at 11:17 PM Andrei Sereda <an...@sereda.cc> wrote:

> That's a reasonable alternative.
>
> On Fri, Jun 29, 2018 at 7:57 PM Julian Hyde <jh...@apache.org> wrote:
>
>> Maybe there could be a separator char as one of the adapter’s parameters.
>> People should choose a value, say ‘$’ or ‘#’, that is legal in an unquoted
>> SQL identifier but does not occur in any of their index or type names.
>>
>> If not specified, the adapter would end up in a simple mode, say looking
>> for indexes first, then looking for types, and people would need to make
>> sure indexes and types have distinct names. After the transition to
>> single-type indexes, people could stop using the parameter.
>>
>> Julian
>>
>>
>> > On Jun 29, 2018, at 4:43 PM, Andrei Sereda <an...@sereda.cc> wrote:
>> >
>> > That's a valid point. Then the user would define a different pattern like
>> > "i$index_t$type" for their cluster.
>> >
>> > I think we should first answer whether such scenarios should be supported
>> > by calcite (given that they're already deprecated by the vendor). If yes,
>> > what should the collision strategy be? A user-defined pattern like above,
>> > a failure, or an auto-generated name?
>> >
>> > On Fri, Jun 29, 2018, 19:14 Julian Hyde <jh...@apache.org> wrote:
>> >
>> >>> In elastic, an (index/type) pair is guaranteed to be unique, therefore
>> >>> "${index}_${type}" will also be unique (as a string). This is only
>> >>> necessary when we have several types per index. A valid question is
>> >>> whether the user should be allowed such flexibility.
>> >>
>> >> Uniqueness is not my concern.
>> >>
>> >> Suppose there is an index called "x_y" with a type called "z", and
>> >> another index called "x" with a type called "y_z". If I write "x_y_z"
>> >> it's not clear how it should be broken into index/type.
>> >>
>> >>
>> >> On Fri, Jun 29, 2018 at 3:15 PM, Andrei Sereda <an...@sereda.cc> wrote:
>> >>>> Can you show how those examples affect SQL against the ES adapter
>> >>>> and/or how they affect JSON models?
>> >>>
>> >>> The discussion is about how to properly bridge the (index/type) concept
>> >>> from ES into the relational world. The proposal to use placeholders
>> >>> ($index / $type) affects only how a table is named in calcite. They're
>> >>> not used as SQL literals, i.e. it affects only the configuration phase
>> >>> of the schema. Pretty much, we're doing a string replace to derive the
>> >>> table name from ($index/$type).
>> >>>
>> >>>> You seem to be using '_' as a separator character. Are we sure that
>> >>>> people will never use it in index or type name? Separator characters
>> >>>> often cause problems.
>> >>> In elastic, an (index/type) pair is guaranteed to be unique, therefore
>> >>> "${index}_${type}" will also be unique (as a string). This is only
>> >>> necessary when we have several types per index. A valid question is
>> >>> whether the user should be allowed such flexibility.
>> >>>
>> >>>
>> >>>
>> >>> On Fri, Jun 29, 2018 at 2:19 PM Julian Hyde <jh...@apache.org> wrote:
>> >>>
>> >>>> Andrei,
>> >>>>
>> >>>> I'm not an ES user so I don't fully understand this issue, but my two
>> >>>> cents anyway...
>> >>>>
>> >>>> Can you show how those examples affect SQL against the ES adapter
>> >>>> and/or how they affect JSON models?
>> >>>>
>> >>>> You seem to be using '_' as a separator character. Are we sure that
>> >>>> people will never use it in index or type name? Separator characters
>> >>>> often cause problems.
>> >>>>
>> >>>> Julian
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Fri, Jun 29, 2018 at 10:58 AM, Andrei Sereda <an...@sereda.cc> wrote:
>> >>>>> I agree there should be a configuration option. How about the following
>> >>>>> approach?
>> >>>>>
>> >>>>> Expose both variables ${index} and ${type} in the configuration (JSON),
>> >>>>> and the user will use them to generate the table name in the calcite
>> >>>>> schema.
>> >>>>>
>> >>>>> Example:
>> >>>>> "table_name": "${type}" // current
>> >>>>> "table_name": "${index}" // new (default?)
>> >>>>> "table_name": "${index}_${type}" // most generic; supports multiple types per index
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> On Fri, Jun 29, 2018 at 9:26 AM Michael Mior <mm...@apache.org> wrote:
>> >>>>>
>> >>>>>> I think it sounds like you and Andrei are in a good position to tackle
>> >>>>>> this one, so I'm happy to have you both work on whatever solution you
>> >>>>>> think is best.
>> >>>>>>
>> >>>>>> --
>> >>>>>> Michael Mior
>> >>>>>> mmior@apache.org
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> On Fri, Jun 29, 2018 at 04:19, Christian Beikov
>> >>>>>> <christian.beikov@gmail.com> wrote:
>> >>>>>>
>> >>>>>>> IMO the best solution would be to make it configurable by introducing
>> >>>>>>> a "table_mapping" config with values
>> >>>>>>>
>> >>>>>>>  * type - every type in the known indices is mapped as a table
>> >>>>>>>  * index - every known index is mapped as a table
>> >>>>>>>
>> >>>>>>> We'd probably also need a "type_field" configuration for defining
>> >>>>>>> which field to use for the type determination, as one of the possible
>> >>>>>>> future ways to do things is to introduce a custom field:
>> >>>>>>>
>> >>>>>>> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html#_custom_type_field_2
>> >>>>>>>
>> >>>>>>> We already detect the ES version, so we can set a smart default for
>> >>>>>>> this setting. Let's make the index config param optional.
>> >>>>>>>
>> >>>>>>>  * When no index is given, we discover indexes; the default for
>> >>>>>>>    "table_mapping" then is "index"
>> >>>>>>>  * When an index is given, we only discover types according to the
>> >>>>>>>    "type_field" configuration, and the default for "table_mapping"
>> >>>>>>>    is "type"
>> >>>>>>>
>> >>>>>>> This would also allow discovering indexes while still using "type" as
>> >>>>>>> "table_mapping".
>> >>>>>>>
>> >>>>>>> What do you think?
>> >>>>>>>
>> >>>>>>> Kind regards,
>> >>>>>>>
>> >>>>>>> ------------------------------------------------------------------------
>> >>>>>>> *Christian Beikov*
>> >>>>>>> On 29.06.2018 at 02:41, Andrei Sereda wrote:
>> >>>>>>>> Yes. There is an API to list all indexes / types in elastic. They can
>> >>>>>>>> be automatically imported into a schema.
>> >>>>>>>>
>> >>>>>>>> What needs to be agreed upon is how to expose those elements in the
>> >>>>>>>> calcite schema (naming / behaviour).
>> >>>>>>>>
>> >>>>>>>> 1) Many (most?) setups are single type per index. The natural way to
>> >>>>>>>> name them would be "elastic.$index" (elastic being the schema name).
>> >>>>>>>> Multiple indexes would be under the same schema: "elastic.index1",
>> >>>>>>>> "elastic.index2", etc.
>> >>>>>>>>
>> >>>>>>>> 2) What if an index has several types? Should they be exported as
>> >>>>>>>> calcite tables "elastic.$index_type1" and "elastic.$index_type2"? Or
>> >>>>>>>> (current behaviour) as "elastic.type1" and "elastic.type2"? Or as a
>> >>>>>>>> subschema "elastic.$index.type1"?
>> >>>>>>>>
>> >>>>>>>> Now what if one has a combination of (1) and (2)?
>> >>>>>>>> Setup (2) is already deprecated (and will be unsupported in the next
>> >>>>>>>> version).
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Thu, Jun 28, 2018 at 7:31 PM Christian Beikov <
>> >>>>>>> christian.beikov@gmail.com>
>> >>>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>>> Is there an API to discover indexes? If there is, I'd suggest we
>> >>>>>> allow a
>> >>>>>>>>> config option that to make the adapter discover the possible
>> >>>> indexes.
>> >>>>>>>>> We'd still have to adapt the code a bit, but internally, the
>> >> schema
>> >>>>>>>>> could just keep a cache of type name to index name map and be
>> >> able
>> >>>> to
>> >>>>>>>>> support both scenarios.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> Mit freundlichen Grüßen,
>> >>>>>>>>>
>> >>>>>>
>> >>
>> ------------------------------------------------------------------------
>> >>>>>>>>> *Christian Beikov*
>> >>>>>>>>> Am 29.06.2018 um 00:12 schrieb Andrei Sereda:
>> >>>>>>>>>>> 1) What's the time horizon for the current adapter no longer
>> >>>> working
>> >>>>>>>>> with these
>> >>>>>>>>>> changes to ES ?
>> >>>>>>>>>> Current adapter will be working for a while with existing
>> >> setup.
>> >>>> The
>> >>>>>>>>>> problem is nomenclature and ease of use.
>> >>>>>>>>>>
>> >>>>>>>>>> Their new SQL concepts mapping
>> >>>>>>>>>> <
>> >>>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>
>> >>
>> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
>> >>>>>>>>>> drops
>> >>>>>>>>>> the notion of ES type (which before was equivalent of RDBMS
>> >> table)
>> >>>>>> and
>> >>>>>>>>> uses
>> >>>>>>>>>> ES index as new table equivalent (before ES index was equal to
>> >>>>>>> database).
>> >>>>>>>>>> Most users use elastic this way (one type , one index) index ==
>> >>>>>> table.
>> >>>>>>>>>>
>> >>>>>>>>>> Currently calcite requires schema per index. In RDBMS parlance
>> >>>>>> database
>> >>>>>>>>> per
>> >>>>>>>>>> table (I'd like to change that).
>> >>>>>>>>>>
>> >>>>>>>>>>> 2) Any guess how complicated it would be to maintain code
>> >> paths
>> >>>> for
>> >>>>>>> both
>> >>>>>>>>>>> behaviours? I know this is probably really challenging to
>> >>>> estimate,
>> >>>>>>> but
>> >>>>>>>>> I
>> >>>>>>>>>>> really have no idea of the scope of these changes. Would it
>> >> mean
>> >>>> two
>> >>>>>>>>>>> different ES adapters?
>> >>>>>>>>>> One can have just a separate calcite schema implementations
>> >> (same
>> >>>>>>>>> adapter /
>> >>>>>>>>>> module) :
>> >>>>>>>>>> 1)  LegacySchema (old). Schema can have only one index (but
>> >>>> multiple
>> >>>>>>>>>> types). Type == table in this case.
>> >>>>>>>>>> 2)  NewSchema (new). Single schema can have multiple indexes
>> >>>> (type is
>> >>>>>>>>>> dropped). Index == table in this case
>> >>>>>>>>>>
>> >>>>>>>>>>> 3) Do we really need compatibility with the current version of
>> >>>> the
>> >>>>>>>>>> adapter?
>> >>>>>>>>>>> IMO this depends on what versions of ES we would lose support
>> >> for
>> >>>>>> and
>> >>>>>>>>> how
>> >>>>>>>>>>> complex it would be for users of the current ES adapter to
>> >> make
>> >>>>>>> updates
>> >>>>>>>>>> for
>> >>>>>>>>>>> any Calcite API changes.
>> >>>>>>>>>> The issue is not in adapter but how calcite schema exposes
>> >> tables.
>> >>>>>>>>> Should
>> >>>>>>>>>> it expose index as individual table (new), or ES type (old) ?
>> >>>>>>>>>>
>> >>>>>>>>>> Andrei.
>> >>>>>>>>>>
>> >>>>>>>>>> On Thu, Jun 28, 2018 at 5:23 PM Michael Mior <mmior@apache.org
>> >>>
>> >>>>>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> Unfortunately I know very little about ES so I'm not in a
>> >> great
>> >>>>>>>>> position to
>> >>>>>>>>>>> asses the impact of these changes. I will say that that legacy
>> >>>>>>>>>>> compatibility is great, but maintaining two sets of logic is
>> >>>> always
>> >>>>>> a
>> >>>>>>>>>>> challenge. A few follow up questions:
>> >>>>>>>>>>>
>> >>>>>>>>>>> 1) What's the time horizon for the current adapter no longer
>> >>>> working
>> >>>>>>>>> with
>> >>>>>>>>>>> these changes to ES?
>> >>>>>>>>>>>
>> >>>>>>>>>>> 2) Any guess how complicated it would be to maintain code
>> >> paths
>> >>>> for
>> >>>>>>> both
>> >>>>>>>>>>> behaviours? I know this is probably really challenging to
>> >>>> estimate,
>> >>>>>>> but
>> >>>>>>>>> I
>> >>>>>>>>>>> really have no idea of the scope of these changes. Would it
>> >> mean
>> >>>> two
>> >>>>>>>>>>> different ES adapters?
>> >>>>>>>>>>>
>> >>>>>>>>>>> 3) Do we really need compatibility with the current version of
>> >>>> the
>> >>>>>>>>> adapter?
>> >>>>>>>>>>> IMO this depends on what versions of ES we would lose support
>> >> for
>> >>>>>> and
>> >>>>>>>>> how
>> >>>>>>>>>>> complex it would be for users of the current ES adapter to
>> >> make
>> >>>>>>> updates
>> >>>>>>>>> for
>> >>>>>>>>>>> any Calcite API changes.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Thanks for your continued work on the ES adapter Andrei!
>> >>>>>>>>>>>
>> >>>>>>>>>>> --
>> >>>>>>>>>>> Michael Mior
>> >>>>>>>>>>> mmior@apache.org
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> Le jeu. 28 juin 2018 à 12:57, Andrei Sereda <andrei@sereda.cc
>> >
>> >> a
>> >>>>>>> écrit
>> >>>>>>>>> :
>> >>>>>>>>>>>> Hello,
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Elastic announced
>> >>>>>>>>>>>> <
>> >>>>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>
>> >>
>> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html
>> >>>>>>>>>>>> that they will be deprecating mapping types in ES6 and
>> >> indexes
>> >>>> will
>> >>>>>>> be
>> >>>>>>>>>>>> single-typed only.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Historical analogy <
>> >> https://www.elastic.co/blog/index-vs-type>
>> >>>>>>> between
>> >>>>>>>>>>>> RDBMS and elastic was that index is equivalent to a database
>> >> and
>> >>>>>> type
>> >>>>>>>>>>>> corresponds to table in that database. In a couple of
>> >> releases
>> >>>>>>> (ES6-8)
>> >>>>>>>>>>> this
>> >>>>>>>>>>>> shall not longer be true.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Recent SQL addition
>> >>>>>>>>>>>> <https://www.elastic.co/blog/elasticsearch-6-3-0-released>
>> >> to
>> >>>>>>> elastic
>> >>>>>>>>>>>> confirms
>> >>>>>>>>>>>> this trend
>> >>>>>>>>>>>> <
>> >>>>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>
>> >>
>> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
>> >>>>>>>>>>>>> .
>> >>>>>>>>>>>> Index is equivalent to a table and there are no more ES
>> >> types.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> I would like to propose to include this logic in Calcite ES
>> >>>>>> adapter.
>> >>>>>>>>> IE,
>> >>>>>>>>>>>> expose each ES single-typed index as a separate table inside
>> >>>>>> calcite
>> >>>>>>>>>>>> schema. This is in contrast to  current integration where
>> >> schema
>> >>>>>> can
>> >>>>>>>>> only
>> >>>>>>>>>>>> have a single index. Current approach forces you to create
>> >>>> multiple
>> >>>>>>>>>>> schemas
>> >>>>>>>>>>>> to query single-typed indexes (on the same ES cluster).
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Legacy compatibility can always be controlled with
>> >> configuration
>> >>>>>>>>>>>> parameters.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Do you agree with such changes ? If yes, would you consider a
>> >>>> PR ?
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Regards,
>> >>>>>>>>>>>> Andrei.
>> >>>>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>
>> >>
>>
>>

Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Andrei Sereda <an...@sereda.cc>.
That's a reasonable alternative.
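
For reference, the alternative quoted below is a configurable separator
character between index and type. A rough sketch of how it could resolve the
"x_y_z" ambiguity raised earlier in the thread follows. The class TableNames
and its methods are hypothetical, and rejecting names that already contain
the separator is an assumption about how collisions would be handled, not
something agreed on the list.

```java
/** Hypothetical helper: builds and splits table names using a configured separator. */
final class TableNames {
  private final char separator;

  TableNames(char separator) {
    this.separator = separator;
  }

  /** Joins index and type; refuses names that already contain the separator. */
  String tableName(String index, String type) {
    if (index.indexOf(separator) >= 0 || type.indexOf(separator) >= 0) {
      throw new IllegalArgumentException(
          "separator '" + separator + "' occurs in index or type name");
    }
    return index + separator + type;
  }

  /** Splits a table name back into {index, type}; expects exactly one separator. */
  String[] split(String tableName) {
    int i = tableName.indexOf(separator);
    if (i < 0 || i != tableName.lastIndexOf(separator)) {
      throw new IllegalArgumentException("cannot split: " + tableName);
    }
    return new String[] {tableName.substring(0, i), tableName.substring(i + 1)};
  }
}
```

With '$' as the separator, index "x_y" with type "z" and index "x" with type
"y_z" map to different table names, whereas with '_' both would collapse
into "x_y_z".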

On Fri, Jun 29, 2018 at 7:57 PM Julian Hyde <jh...@apache.org> wrote:

> Maybe there could be a separator char as one of the adapter’s parameters.
> People should choose a value, say ‘$’ or ‘#’, that is legal in an unquoted
> SQL identifier but does not occur in any of their index or type names.
>
> If not specified, the adapter would end up in a simple mode, say looking
> for indexes first, then looking for types, and people would need to make
> sure indexes and types have distinct names. After the transition to
> single-type indexes, people could stop using the parameter.
>
> Julian
>
>
> > On Jun 29, 2018, at 4:43 PM, Andrei Sereda <an...@sereda.cc> wrote:
> >
> > That's a valid point. Then the user would define a different pattern like
> > "i$index_t$type" for their cluster.
> >
> > I think we should first answer whether such scenarios should be supported
> > by calcite (given that they're already deprecated by the vendor). If yes,
> > what should the collision strategy be? A user-defined pattern like above,
> > a failure, or an auto-generated name?
> >
> > On Fri, Jun 29, 2018, 19:14 Julian Hyde <jh...@apache.org> wrote:
> >
> >>> In elastic, an (index/type) pair is guaranteed to be unique, therefore
> >>> "${index}_${type}" will also be unique (as a string). This is only
> >>> necessary when we have several types per index. A valid question is
> >>> whether the user should be allowed such flexibility.
> >>
> >> Uniqueness is not my concern.
> >>
> >> Suppose there is an index called "x_y" with a type called "z", and
> >> another index called "x" with a type called "y_z". If I write "x_y_z"
> >> it's not clear how it should be broken into index/type.
> >>
> >>
> >> On Fri, Jun 29, 2018 at 3:15 PM, Andrei Sereda <an...@sereda.cc> wrote:
> >>>> Can you show how those examples affect SQL against the ES adapter
> >>>> and/or how they affect JSON models?
> >>>
> >>> The discussion is about how to properly bridge the (index/type) concept
> >>> from ES into the relational world. The proposal to use placeholders
> >>> ($index / $type) affects only how a table is named in calcite. They're
> >>> not used as SQL literals, i.e. it affects only the configuration phase
> >>> of the schema. Pretty much, we're doing a string replace to derive the
> >>> table name from ($index/$type).
> >>>
> >>>> You seem to be using '_' as a separator character. Are we sure that
> >>>> people will never use it in index or type name? Separator characters
> >>>> often cause problems.
> >>> In elastic, an (index/type) pair is guaranteed to be unique, therefore
> >>> "${index}_${type}" will also be unique (as a string). This is only
> >>> necessary when we have several types per index. A valid question is
> >>> whether the user should be allowed such flexibility.
> >>>
> >>>
> >>>
> >>> On Fri, Jun 29, 2018 at 2:19 PM Julian Hyde <jh...@apache.org> wrote:
> >>>
> >>>> Andrei,
> >>>>
> >>>> I'm not an ES user so I don't fully understand this issue, but my two
> >>>> cents anyway...
> >>>>
> >>>> Can you show how those examples affect SQL against the ES adapter
> >>>> and/or how they affect JSON models?
> >>>>
> >>>> You seem to be using '_' as a separator character. Are we sure that
> >>>> people will never use it in index or type name? Separator characters
> >>>> often cause problems.
> >>>>
> >>>> Julian
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Fri, Jun 29, 2018 at 10:58 AM, Andrei Sereda <an...@sereda.cc> wrote:
> >>>>> I agree there should be a configuration option. How about the following
> >>>>> approach?
> >>>>>
> >>>>> Expose both variables ${index} and ${type} in the configuration (JSON),
> >>>>> and the user will use them to generate the table name in the calcite
> >>>>> schema.
> >>>>>
> >>>>> Example:
> >>>>> "table_name": "${type}" // current
> >>>>> "table_name": "${index}" // new (default?)
> >>>>> "table_name": "${index}_${type}" // most generic; supports multiple types per index
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Fri, Jun 29, 2018 at 9:26 AM Michael Mior <mm...@apache.org> wrote:
> >>>>>
> >>>>>> I think it sounds like you and Andrei are in a good position to tackle
> >>>>>> this one, so I'm happy to have you both work on whatever solution you
> >>>>>> think is best.
> >>>>>>
> >>>>>> --
> >>>>>> Michael Mior
> >>>>>> mmior@apache.org
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Fri, Jun 29, 2018 at 04:19, Christian Beikov
> >>>>>> <christian.beikov@gmail.com> wrote:
> >>>>>>
> >>>>>>> IMO the best solution would be to make it configurable by introducing
> >>>>>>> a "table_mapping" config with values
> >>>>>>>
> >>>>>>>  * type - every type in the known indices is mapped as a table
> >>>>>>>  * index - every known index is mapped as a table
> >>>>>>>
> >>>>>>> We'd probably also need a "type_field" configuration for defining
> >>>>>>> which field to use for the type determination, as one of the possible
> >>>>>>> future ways to do things is to introduce a custom field:
> >>>>>>>
> >>>>>>> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html#_custom_type_field_2
> >>>>>>>
> >>>>>>> We already detect the ES version, so we can set a smart default for
> >>>>>>> this setting. Let's make the index config param optional.
> >>>>>>>
> >>>>>>>  * When no index is given, we discover indexes; the default for
> >>>>>>>    "table_mapping" then is "index"
> >>>>>>>  * When an index is given, we only discover types according to the
> >>>>>>>    "type_field" configuration, and the default for "table_mapping"
> >>>>>>>    is "type"
> >>>>>>>
> >>>>>>> This would also allow discovering indexes while still using "type" as
> >>>>>>> "table_mapping".
> >>>>>>>
> >>>>>>> What do you think?
> >>>>>>>
> >>>>>>> Kind regards,
> >>>>>>>
> >>>>>>> ------------------------------------------------------------------------
> >>>>>>> *Christian Beikov*
> >>>>>>> On 29.06.2018 at 02:41, Andrei Sereda wrote:
> >>>>>>>> Yes. There is an API to list all indexes / types in elastic. They can
> >>>>>>>> be automatically imported into a schema.
> >>>>>>>>
> >>>>>>>> What needs to be agreed upon is how to expose those elements in the
> >>>>>>>> calcite schema (naming / behaviour).
> >>>>>>>>
> >>>>>>>> 1) Many (most?) setups are single type per index. The natural way to
> >>>>>>>> name them would be "elastic.$index" (elastic being the schema name).
> >>>>>>>> Multiple indexes would be under the same schema: "elastic.index1",
> >>>>>>>> "elastic.index2", etc.
> >>>>>>>>
> >>>>>>>> 2) What if an index has several types? Should they be exported as
> >>>>>>>> calcite tables "elastic.$index_type1" and "elastic.$index_type2"? Or
> >>>>>>>> (current behaviour) as "elastic.type1" and "elastic.type2"? Or as a
> >>>>>>>> subschema "elastic.$index.type1"?
> >>>>>>>>
> >>>>>>>> Now what if one has a combination of (1) and (2)?
> >>>>>>>> Setup (2) is already deprecated (and will be unsupported in the next
> >>>>>>>> version).
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Thu, Jun 28, 2018 at 7:31 PM Christian Beikov <
> >>>>>>> christian.beikov@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Is there an API to discover indexes? If there is, I'd suggest we
> >>>>>> allow a
> >>>>>>>>> config option that to make the adapter discover the possible
> >>>> indexes.
> >>>>>>>>> We'd still have to adapt the code a bit, but internally, the
> >> schema
> >>>>>>>>> could just keep a cache of type name to index name map and be
> >> able
> >>>> to
> >>>>>>>>> support both scenarios.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Mit freundlichen Grüßen,
> >>>>>>>>>
> >>>>>>
> >> ------------------------------------------------------------------------
> >>>>>>>>> *Christian Beikov*
> >>>>>>>>> Am 29.06.2018 um 00:12 schrieb Andrei Sereda:
> >>>>>>>>>>> 1) What's the time horizon for the current adapter no longer
> >>>> working
> >>>>>>>>> with these
> >>>>>>>>>> changes to ES ?
> >>>>>>>>>> Current adapter will be working for a while with existing
> >> setup.
> >>>> The
> >>>>>>>>>> problem is nomenclature and ease of use.
> >>>>>>>>>>
> >>>>>>>>>> Their new SQL concepts mapping
> >>>>>>>>>> <
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>
> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
> >>>>>>>>>> drops
> >>>>>>>>>> the notion of ES type (which before was equivalent of RDBMS
> >> table)
> >>>>>> and
> >>>>>>>>> uses
> >>>>>>>>>> ES index as new table equivalent (before ES index was equal to
> >>>>>>> database).
> >>>>>>>>>> Most users use elastic this way (one type , one index) index ==
> >>>>>> table.
> >>>>>>>>>>
> >>>>>>>>>> Currently calcite requires schema per index. In RDBMS parlance
> >>>>>> database
> >>>>>>>>> per
> >>>>>>>>>> table (I'd like to change that).
> >>>>>>>>>>
> >>>>>>>>>>> 2) Any guess how complicated it would be to maintain code
> >> paths
> >>>> for
> >>>>>>> both
> >>>>>>>>>>> behaviours? I know this is probably really challenging to
> >>>> estimate,
> >>>>>>> but
> >>>>>>>>> I
> >>>>>>>>>>> really have no idea of the scope of these changes. Would it
> >> mean
> >>>> two
> >>>>>>>>>>> different ES adapters?
> >>>>>>>>>> One can have just a separate calcite schema implementations
> >> (same
> >>>>>>>>> adapter /
> >>>>>>>>>> module) :
> >>>>>>>>>> 1)  LegacySchema (old). Schema can have only one index (but
> >>>> multiple
> >>>>>>>>>> types). Type == table in this case.
> >>>>>>>>>> 2)  NewSchema (new). Single schema can have multiple indexes
> >>>> (type is
> >>>>>>>>>> dropped). Index == table in this case
> >>>>>>>>>>
> >>>>>>>>>>> 3) Do we really need compatibility with the current version of
> >>>> the
> >>>>>>>>>> adapter?
> >>>>>>>>>>> IMO this depends on what versions of ES we would lose support
> >> for
> >>>>>> and
> >>>>>>>>> how
> >>>>>>>>>>> complex it would be for users of the current ES adapter to
> >> make
> >>>>>>> updates
> >>>>>>>>>> for
> >>>>>>>>>>> any Calcite API changes.
> >>>>>>>>>> The issue is not in adapter but how calcite schema exposes
> >> tables.
> >>>>>>>>> Should
> >>>>>>>>>> it expose index as individual table (new), or ES type (old) ?
> >>>>>>>>>>
> >>>>>>>>>> Andrei.
> >>>>>>>>>>
> >>>>>>>>>> On Thu, Jun 28, 2018 at 5:23 PM Michael Mior <mmior@apache.org
> >>>
> >>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Unfortunately I know very little about ES so I'm not in a
> >> great
> >>>>>>>>> position to
> >>>>>>>>>>> asses the impact of these changes. I will say that that legacy
> >>>>>>>>>>> compatibility is great, but maintaining two sets of logic is
> >>>> always
> >>>>>> a
> >>>>>>>>>>> challenge. A few follow up questions:
> >>>>>>>>>>>
> >>>>>>>>>>> 1) What's the time horizon for the current adapter no longer working with
> >>>>>>>>>>> these changes to ES?
> >>>>>>>>>>>
> >>>>>>>>>>> 2) Any guess how complicated it would be to maintain code paths for both
> >>>>>>>>>>> behaviours? I know this is probably really challenging to estimate, but I
> >>>>>>>>>>> really have no idea of the scope of these changes. Would it mean two
> >>>>>>>>>>> different ES adapters?
> >>>>>>>>>>>
> >>>>>>>>>>> 3) Do we really need compatibility with the current version of the adapter?
> >>>>>>>>>>> IMO this depends on what versions of ES we would lose support for and how
> >>>>>>>>>>> complex it would be for users of the current ES adapter to make updates for
> >>>>>>>>>>> any Calcite API changes.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks for your continued work on the ES adapter Andrei!
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Michael Mior
> >>>>>>>>>>> mmior@apache.org
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Thu, Jun 28, 2018 at 12:57, Andrei Sereda <an...@sereda.cc> wrote:
> >>>>>>>>>>>> Hello,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Elastic announced
> >>>>>>>>>>>> <https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html>
> >>>>>>>>>>>> that they will be deprecating mapping types in ES6 and indexes will be
> >>>>>>>>>>>> single-typed only.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Historical analogy <https://www.elastic.co/blog/index-vs-type> between
> >>>>>>>>>>>> RDBMS and elastic was that an index is equivalent to a database and a type
> >>>>>>>>>>>> corresponds to a table in that database. In a couple of releases (ES6-8) this
> >>>>>>>>>>>> will no longer be true.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Recent SQL addition
> >>>>>>>>>>>> <https://www.elastic.co/blog/elasticsearch-6-3-0-released> to elastic
> >>>>>>>>>>>> confirms this trend
> >>>>>>>>>>>> <https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html>.
> >>>>>>>>>>>> Index is equivalent to a table and there are no more ES types.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I would like to propose including this logic in the Calcite ES adapter, i.e.
> >>>>>>>>>>>> exposing each single-typed ES index as a separate table inside the Calcite
> >>>>>>>>>>>> schema. This is in contrast to the current integration, where a schema can only
> >>>>>>>>>>>> have a single index. The current approach forces you to create multiple
> >>>>>>>>>>>> schemas to query single-typed indexes (on the same ES cluster).
> >>>>>>>>>>>>
> >>>>>>>>>>>> Legacy compatibility can always be controlled with configuration
> >>>>>>>>>>>> parameters.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Do you agree with such changes? If yes, would you consider a PR?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>> Andrei.
> >>>>>>>>>>>>

Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Julian Hyde <jh...@apache.org>.
Maybe there could be a separator char as one of the adapter’s parameters. People should choose a value, say ‘$’ or ‘#’, that is legal in an unquoted SQL identifier but does not occur in any of their index or type names.

If the parameter is not specified, the adapter would fall back to a simple mode, say looking for indexes first and then for types, and people would need to make sure that indexes and types have distinct names. After the transition to single-type indexes, people could stop using the parameter.
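
To make the two modes above concrete, here is a minimal sketch in Python (chosen for brevity; the adapter itself is Java). The names `table_name`, `resolve` and the `separator` argument are hypothetical illustrations, not existing adapter parameters, and the cluster layout is assumed to be available as a plain dict from index name to its set of type names.

```python
def table_name(index, type_, separator):
    """Join index and type with the configured separator (hypothetical parameter)."""
    for part in (index, type_):
        if separator in part:
            # The chosen separator must not occur in any index or type name.
            raise ValueError(f"separator {separator!r} occurs in name {part!r}")
    return f"{index}{separator}{type_}"


def resolve(name, indexes, separator=None):
    """Map a table name back to (index, type).

    indexes: dict mapping index name -> set of type names.
    A type of None means the index as a whole.
    """
    if separator is not None and separator in name:
        # The separator never occurs inside an index name, so the first
        # occurrence is the only possible boundary.
        index, type_ = name.split(separator, 1)
        if type_ in indexes.get(index, ()):
            return index, type_
        raise KeyError(name)
    # Simple mode: look for an index first, then for a type.
    if name in indexes:
        return name, None
    for index in sorted(indexes):
        if name in indexes[index]:
            return index, name
    raise KeyError(name)
```

In simple mode a type name that exists in several indexes resolves to whichever index sorts first, which is why index and type names would need to be distinct when no separator is configured.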

Julian


> On Jun 29, 2018, at 4:43 PM, Andrei Sereda <an...@sereda.cc> wrote:
> 
> That's a valid point. Then the user would define a different pattern, such as
> "i$index_t$type", for their cluster.
>
> I think we should first answer whether such scenarios should be supported
> by Calcite (given that they're already deprecated by the vendor). If yes,
> what should the collision strategy be? A user-defined pattern like the one
> above, a failure, or an auto-generated name?
> 


Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Andrei Sereda <an...@sereda.cc>.
That's a valid point. Then the user would define a different pattern, such as
"i$index_t$type", for their cluster.

I think we should first answer whether such scenarios should be supported
by Calcite (given that they're already deprecated by the vendor). If yes,
what should the collision strategy be? A user-defined pattern like the one
above, a failure, or an auto-generated name?
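
As a rough illustration of these options, the sketch below renders a table name for each (index, type) pair from a user-defined pattern and then either fails on a collision or generates a suffixed name. It is written in Python for brevity; `build_table_names` and the `on_collision` argument are hypothetical names, not part of the adapter. Note that Python's `string.Template` would read an unbraced `$index_t` as a placeholder called `index_t`, so the pattern is written here with braces.

```python
from string import Template


def build_table_names(pairs, pattern, on_collision="fail"):
    """Render one table name per (index, type) pair and return name -> pair.

    on_collision: "fail" raises ValueError on a duplicate name,
                  "rename" appends a numeric suffix instead.
    """
    template = Template(pattern)
    tables = {}
    for index, type_ in pairs:
        name = template.substitute(index=index, type=type_)
        if name in tables:
            if on_collision == "fail":
                raise ValueError(
                    f"{name!r} maps to both {tables[name]} and {(index, type_)}"
                )
            # Auto-generate a fresh name by appending the first unused suffix.
            suffix = 2
            while f"{name}_{suffix}" in tables:
                suffix += 1
            name = f"{name}_{suffix}"
        tables[name] = (index, type_)
    return tables
```

With the pairs ("x_y", "z") and ("x", "y_z"), the pattern `${index}_${type}` collides while `i${index}_t${type}` does not. A prefixed pattern only reduces the risk, though: index "a_tb" with type "c" and index "a" with type "b_tc" both render as "ia_tb_tc".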

On Fri, Jun 29, 2018, 19:14 Julian Hyde <jh...@apache.org> wrote:

> > In elastic an (index/type) pair is guaranteed to be unique, therefore
> > "${index}_${type}" will also be unique (as a string). This is only necessary
> > when we have several types per index. A valid question is whether the user
> > should be allowed such flexibility.
>
> Uniqueness is not my concern.
>
> Suppose there is an index called "x_y" with a type called "z", and
> another index called "x" with a type called "y_z". If I write "x_y_z"
> it's not clear how it should be broken into index/type.
>
>

Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Julian Hyde <jh...@apache.org>.
> In elastic an (index/type) pair is guaranteed to be unique, therefore
> "${index}_${type}" will also be unique (as a string). This is only necessary
> when we have several types per index. A valid question is whether the user
> should be allowed such flexibility.

Uniqueness is not my concern.

Suppose there is an index called "x_y" with a type called "z", and
another index called "x" with a type called "y_z". If I write "x_y_z"
it's not clear how it should be broken into index/type.
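
The ambiguity can be shown with a small sketch (Python for brevity; `candidate_splits` and `matching_pairs` are hypothetical helper names used only for this illustration). It enumerates every way of cutting a table name at one separator occurrence and keeps the cuts that correspond to a known (index, type) pair.

```python
def candidate_splits(name, separator="_"):
    """Every way to cut `name` into (index, type) at one separator occurrence."""
    parts = name.split(separator)
    return [
        (separator.join(parts[:i]), separator.join(parts[i:]))
        for i in range(1, len(parts))
    ]


def matching_pairs(name, known_pairs, separator="_"):
    """Candidate cuts of `name` that are actual (index, type) pairs."""
    return [pair for pair in candidate_splits(name, separator) if pair in known_pairs]


known = {("x_y", "z"), ("x", "y_z")}
print(matching_pairs("x_y_z", known))  # two matches, so "x_y_z" is ambiguous
```

A name resolves safely only when exactly one candidate cut matches, which holds automatically when the separator never occurs in index or type names.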


On Fri, Jun 29, 2018 at 3:15 PM, Andrei Sereda <an...@sereda.cc> wrote:
>> Can you show how those examples affect SQL against the ES adapter and/or
>> how they affect JSON models?
>
> The discussion is about how to properly bridge the (index/type) concept from ES
> into the relational world. The proposal to use placeholders ($index / $type) affects
> only how a table is named in Calcite. They're not used as SQL literals, i.e. it
> affects only the configuration phase of the schema.
> Pretty much, we're doing a string replace to derive the table name from
> ($index/$type).
>
>> You seem to be using '_' as a separator character. Are we sure that
>> people will never use it in index or type name? Separator characters
>> often cause problems.
> In elastic an (index/type) pair is guaranteed to be unique, therefore
> "${index}_${type}" will also be unique (as a string). This is only necessary
> when we have several types per index. A valid question is whether the user
> should be allowed such flexibility.
>
>
>
> On Fri, Jun 29, 2018 at 2:19 PM Julian Hyde <jh...@apache.org> wrote:
>
>> Andrei,
>>
>> I'm not an ES user so I don't fully understand this issue, but my two
>> cents anyway...
>>
>> Can you show how those examples affect SQL against the ES adapter
>> and/or how they affect JSON models?
>>
>> You seem to be using '_' as a separator character. Are we sure that
>> people will never use it in index or type name? Separator characters
>> often cause problems.
>>
>> Julian
>>
>>
>>
>>
>> On Fri, Jun 29, 2018 at 10:58 AM, Andrei Sereda <an...@sereda.cc> wrote:
>> > I agree there should be a configuration option. How about the following
>> > approach.
>> >
>> > Expose both variables ${index} and ${type} in configuration (JSON) and
>> user
>> > will use them to generate table name in calcite schema.
>> >
>> > Example
>> > "table_name": "${type}" // current
>> > "table_name": "${index}" // new (default?)
>> > "table_name": "${index}_${type}" // most generic. supports multiple types
>> > per index
>> >
>> >
>> >
>> >
>> >
>> > On Fri, Jun 29, 2018 at 9:26 AM Michael Mior <mm...@apache.org> wrote:
>> >
>> >> I think it sounds like you and Andrei are in a good position to tackle
>> this
>> >> one so I'm happy to have you both work on whatever solution you think is
>> >> best.
>> >>
>> >> --
>> >> Michael Mior
>> >> mmior@apache.org
>> >>
>> >>
>> >>
>> >> On Fri, Jun 29, 2018 at 4:19 AM, Christian Beikov
>> >> <christian.beikov@gmail.com> wrote:
>> >>
>> >> > IMO the best solution would be to make it configurable by introducing
>> a
>> >> > "table_mapping" config with values
>> >> >
>> >> >   * type - every type in the known indices is mapped as table
>> >> >   * index - every known index is mapped as table
>> >> >
>> >> > We'd probably also need a "type_field" configuration for defining
>> which
>> >> > field to use for the type determination as one of the possible future
>> >> > ways to do things is to introduce a custom field:
>> >> >
>> >> >
>> >>
>> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html#_custom_type_field_2
>> >> >
>> >> > We already detect the ES version, so we can set a smart default for
>> this
>> >> > setting. Let's make the index config param optional.
>> >> >
>> >> >   * When no index is given, we discover indexes, the default for
>> >> >     "table_mapping" then is "index"
>> >> >   * When index is given, the we only discover types according to the
>> >> >     "type_field" configuration and the default for "table_mapping" is
>> >> > "type"
>> >> >
>> >> > This would also allow to discover indexes but still use "type" as
>> >> > "table_mapping".
>> >> >
>> >> > What do you think?
>> >> >
>> >> > Kind regards,
>> >> >
>> ------------------------------------------------------------------------
>> >> > *Christian Beikov*
>> >> > On 29.06.2018 at 02:41, Andrei Sereda wrote:
>> >> > > Yes. There is an API to list all indexes / types in elastic. They
>> can
>> >> be
>> >> > > automatically imported into a schema.
>> >> > >
>> >> > > What needs to be agreed upon is how to expose those elements in
>> calcite
>> >> > > schema (naming / behaviour).
>> >> > >
>> >> > > 1) Many (most?) of setups are single type per index. Natural way to
>> >> name
>> >> > > would be  "elastic.$index" (elastic being schema name). Multiple
>> >> indexes
>> >> > > would be under same schema "elastic.index1" "elastic.index2" etc.
>> >> > >
>> >> > > 2) What if index has several types should they exported as calcite
>> >> > tables:
>> >> > > "elastic.$index_type1" "elastic.$index_type2" ?  Or (current
>> behaviour)
>> >> > as
>> >> > > "elastic.type1" and "elastic.type2". Or as subschema
>> >> > > "elastic.$index.type1" ?
>> >> > >
>> >> > > Now what if one has combination of (1) and (2) ?
>> >> > > Setup (2) is already deprecated (and will be unsupported in next
>> >> version)
>> >> > >
>> >> > >
>> >> > > On Thu, Jun 28, 2018 at 7:31 PM Christian Beikov <
>> >> > christian.beikov@gmail.com>
>> >> > > wrote:
>> >> > >
>> >> > >> Is there an API to discover indexes? If there is, I'd suggest we
>> >> allow a
>> >> > >> config option that to make the adapter discover the possible
>> indexes.
>> >> > >> We'd still have to adapt the code a bit, but internally, the schema
>> >> > >> could just keep a cache of type name to index name map and be able
>> to
>> >> > >> support both scenarios.
>> >> > >>
>> >> > >>
>> >> > >> Kind regards,
>> >> > >>
>> >> ------------------------------------------------------------------------
>> >> > >> *Christian Beikov*
>> >> > >> On 29.06.2018 at 00:12, Andrei Sereda wrote:
>> >> > >>>> 1) What's the time horizon for the current adapter no longer
>> working
>> >> > >> with these
>> >> > >>> changes to ES ?
>> >> > >>> Current adapter will be working for a while with existing setup.
>> The
>> >> > >>> problem is nomenclature and ease of use.
>> >> > >>>
>> >> > >>> Their new SQL concepts mapping
>> >> > >>> <
>> >> > >>
>> >> >
>> >>
>> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
>> >> > >>> drops
>> >> > >>> the notion of ES type (which before was equivalent of RDBMS table)
>> >> and
>> >> > >> uses
>> >> > >>> ES index as new table equivalent (before ES index was equal to
>> >> > database).
>> >> > >>> Most users use elastic this way (one type , one index) index ==
>> >> table.
>> >> > >>>
>> >> > >>> Currently calcite requires schema per index. In RDBMS parlance
>> >> database
>> >> > >> per
>> >> > >>> table (I'd like to change that).
>> >> > >>>
>> >> > >>>> 2) Any guess how complicated it would be to maintain code paths
>> for
>> >> > both
>> >> > >>>> behaviours? I know this is probably really challenging to
>> estimate,
>> >> > but
>> >> > >> I
>> >> > >>>> really have no idea of the scope of these changes. Would it mean
>> two
>> >> > >>>> different ES adapters?
>> >> > >>> One can have just a separate calcite schema implementations (same
>> >> > >> adapter /
>> >> > >>> module) :
>> >> > >>> 1)  LegacySchema (old). Schema can have only one index (but
>> multiple
>> >> > >>> types). Type == table in this case.
>> >> > >>> 2)  NewSchema (new). Single schema can have multiple indexes
>> (type is
>> >> > >>> dropped). Index == table in this case
>> >> > >>>
>> >> > >>>> 3) Do we really need compatibility with the current version of
>> the
>> >> > >>> adapter?
>> >> > >>>> IMO this depends on what versions of ES we would lose support for
>> >> and
>> >> > >> how
>> >> > >>>> complex it would be for users of the current ES adapter to make
>> >> > updates
>> >> > >>> for
>> >> > >>>> any Calcite API changes.
>> >> > >>> The issue is not in adapter but how calcite schema exposes tables.
>> >> > >> Should
>> >> > >>> it expose index as individual table (new), or ES type (old) ?
>> >> > >>>
>> >> > >>> Andrei.
>> >> > >>>
>> >> > >>> On Thu, Jun 28, 2018 at 5:23 PM Michael Mior <mm...@apache.org>
>> >> wrote:
>> >> > >>>
>> >> > >>>> Unfortunately I know very little about ES so I'm not in a great
>> >> > >> position to
>> >> > >>>> asses the impact of these changes. I will say that that legacy
>> >> > >>>> compatibility is great, but maintaining two sets of logic is
>> always
>> >> a
>> >> > >>>> challenge. A few follow up questions:
>> >> > >>>>
>> >> > >>>> 1) What's the time horizon for the current adapter no longer
>> working
>> >> > >> with
>> >> > >>>> these changes to ES?
>> >> > >>>>
>> >> > >>>> 2) Any guess how complicated it would be to maintain code paths
>> for
>> >> > both
>> >> > >>>> behaviours? I know this is probably really challenging to
>> estimate,
>> >> > but
>> >> > >> I
>> >> > >>>> really have no idea of the scope of these changes. Would it mean
>> two
>> >> > >>>> different ES adapters?
>> >> > >>>>
>> >> > >>>> 3) Do we really need compatibility with the current version of
>> the
>> >> > >> adapter?
>> >> > >>>> IMO this depends on what versions of ES we would lose support for
>> >> and
>> >> > >> how
>> >> > >>>> complex it would be for users of the current ES adapter to make
>> >> > updates
>> >> > >> for
>> >> > >>>> any Calcite API changes.
>> >> > >>>>
>> >> > >>>> Thanks for your continued work on the ES adapter Andrei!
>> >> > >>>>
>> >> > >>>> --
>> >> > >>>> Michael Mior
>> >> > >>>> mmior@apache.org
>> >> > >>>>
>> >> > >>>>
>> >> > >>>>
>> >> > >>>> On Thu, Jun 28, 2018 at 12:57 PM, Andrei Sereda <an...@sereda.cc> wrote:
>> >> > >>>>> Hello,
>> >> > >>>>>
>> >> > >>>>> Elastic announced
>> >> > >>>>> <
>> >> > >>>>>
>> >> > >>
>> >> >
>> >>
>> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html
>> >> > >>>>> that they will be deprecating mapping types in ES6 and indexes
>> will
>> >> > be
>> >> > >>>>> single-typed only.
>> >> > >>>>>
>> >> > >>>>> Historical analogy <https://www.elastic.co/blog/index-vs-type>
>> >> > between
>> >> > >>>>> RDBMS and elastic was that index is equivalent to a database and
>> >> type
>> >> > >>>>> corresponds to table in that database. In a couple of releases
>> >> > (ES6-8)
>> >> > >>>> this
>> >> > >>>>> shall not longer be true.
>> >> > >>>>>
>> >> > >>>>> Recent SQL addition
>> >> > >>>>> <https://www.elastic.co/blog/elasticsearch-6-3-0-released> to
>> >> > elastic
>> >> > >>>>> confirms
>> >> > >>>>> this trend
>> >> > >>>>> <
>> >> > >>>>>
>> >> > >>
>> >> >
>> >>
>> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
>> >> > >>>>>> .
>> >> > >>>>> Index is equivalent to a table and there are no more ES types.
>> >> > >>>>>
>> >> > >>>>> I would like to propose to include this logic in Calcite ES
>> >> adapter.
>> >> > >> IE,
>> >> > >>>>> expose each ES single-typed index as a separate table inside
>> >> calcite
>> >> > >>>>> schema. This is in contrast to  current integration where schema
>> >> can
>> >> > >> only
>> >> > >>>>> have a single index. Current approach forces you to create
>> multiple
>> >> > >>>> schemas
>> >> > >>>>> to query single-typed indexes (on the same ES cluster).
>> >> > >>>>>
>> >> > >>>>> Legacy compatibility can always be controlled with configuration
>> >> > >>>>> parameters.
>> >> > >>>>>
>> >> > >>>>> Do you agree with such changes ? If yes, would you consider a
>> PR ?
>> >> > >>>>>
>> >> > >>>>> Regards,
>> >> > >>>>> Andrei.
>> >> > >>>>>
>> >> > >>
>> >> >
>> >> >
>> >>
>>

Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Andrei Sereda <an...@sereda.cc>.
> Can you show how those examples affect SQL against the ES adapter and/or
> how they affect JSON models?

The discussion is about how to properly bridge the (index/type) concept
from ES into the relational world. The proposal to use placeholders
($index / $type) affects only how a table is named in Calcite. They are not
used as SQL literals, i.e. they affect only the configuration phase of the
schema. In essence, we do a string replacement to derive the table name
from ($index/$type).
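
As a rough sketch of that substitution in plain Java (the class and method
names are made up for illustration and are not part of the adapter; the
template strings are the ones from the earlier "table_name" example):

```java
final class TableNameTemplate {
  /** Substitutes the ${index} and ${type} placeholders in a table-name template. */
  static String render(String template, String index, String type) {
    // String.replace performs literal (non-regex) replacement, so "$" needs no escaping.
    return template.replace("${index}", index).replace("${type}", type);
  }

  public static void main(String[] args) {
    System.out.println(render("${type}", "index1", "t1"));           // current naming
    System.out.println(render("${index}", "index2", "t1"));          // proposed default
    System.out.println(render("${index}_${type}", "index1", "t2"));  // several types per index
  }
}
```

The substitution happens once, while the schema is being built; SQL queries
only ever see the resulting table names.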

> You seem to be using '_' as a separator character. Are we sure that
> people will never use it in index or type name? Separator characters
> often cause problems.
In Elasticsearch the (index/type) pair is guaranteed to be unique, therefore
"${index}_${type}" will also be unique (as a string). This is only necessary
when we have several types per index. A valid question is whether the user
should be allowed such flexibility.



On Fri, Jun 29, 2018 at 2:19 PM Julian Hyde <jh...@apache.org> wrote:

> Andrei,
>
> I'm not an ES user so I don't fully understand this issue, but my two
> cents anyway...
>
> Can you show how those examples affect SQL against the ES adapter
> and/or how they affect JSON models?
>
> You seem to be using '_' as a separator character. Are we sure that
> people will never use it in index or type name? Separator characters
> often cause problems.
>
> Julian
>
>
>
>
> On Fri, Jun 29, 2018 at 10:58 AM, Andrei Sereda <an...@sereda.cc> wrote:
> > I agree there should be a configuration option. How about the following
> > approach.
> >
> > Expose both variables ${index} and ${type} in configuration (JSON) and
> user
> > will use them to generate table name in calcite schema.
> >
> > Example
> > "table_name": "${type}" // current
> > "table_name": "${index}" // new (default?)
> > "table_name": "${index}_${type}" // most generic. supports multiple types
> > per index
> >
> >
> >
> >
> >
> > On Fri, Jun 29, 2018 at 9:26 AM Michael Mior <mm...@apache.org> wrote:
> >
> >> I think it sounds like you and Andrei are in a good position to tackle
> this
> >> one so I'm happy to have you both work on whatever solution you think is
> >> best.
> >>
> >> --
> >> Michael Mior
> >> mmior@apache.org
> >>
> >>
> >>
> >> On Fri, Jun 29, 2018 at 4:19 AM, Christian Beikov
> >> <christian.beikov@gmail.com> wrote:
> >>
> >> > IMO the best solution would be to make it configurable by introducing
> a
> >> > "table_mapping" config with values
> >> >
> >> >   * type - every type in the known indices is mapped as table
> >> >   * index - every known index is mapped as table
> >> >
> >> > We'd probably also need a "type_field" configuration for defining
> which
> >> > field to use for the type determination as one of the possible future
> >> > ways to do things is to introduce a custom field:
> >> >
> >> >
> >>
> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html#_custom_type_field_2
> >> >
> >> > We already detect the ES version, so we can set a smart default for
> this
> >> > setting. Let's make the index config param optional.
> >> >
> >> >   * When no index is given, we discover indexes, the default for
> >> >     "table_mapping" then is "index"
> >> >   * When index is given, the we only discover types according to the
> >> >     "type_field" configuration and the default for "table_mapping" is
> >> > "type"
> >> >
> >> > This would also allow to discover indexes but still use "type" as
> >> > "table_mapping".
> >> >
> >> > What do you think?
> >> >
> >> > Kind regards,
> >> >
> ------------------------------------------------------------------------
> >> > *Christian Beikov*
> >> > On 29.06.2018 at 02:41, Andrei Sereda wrote:
> >> > > Yes. There is an API to list all indexes / types in elastic. They
> can
> >> be
> >> > > automatically imported into a schema.
> >> > >
> >> > > What needs to be agreed upon is how to expose those elements in
> calcite
> >> > > schema (naming / behaviour).
> >> > >
> >> > > 1) Many (most?) of setups are single type per index. Natural way to
> >> name
> >> > > would be  "elastic.$index" (elastic being schema name). Multiple
> >> indexes
> >> > > would be under same schema "elastic.index1" "elastic.index2" etc.
> >> > >
> >> > > 2) What if index has several types should they exported as calcite
> >> > tables:
> >> > > "elastic.$index_type1" "elastic.$index_type2" ?  Or (current
> behaviour)
> >> > as
> >> > > "elastic.type1" and "elastic.type2". Or as subschema
> >> > > "elastic.$index.type1" ?
> >> > >
> >> > > Now what if one has combination of (1) and (2) ?
> >> > > Setup (2) is already deprecated (and will be unsupported in next
> >> version)
> >> > >
> >> > >
> >> > > On Thu, Jun 28, 2018 at 7:31 PM Christian Beikov <
> >> > christian.beikov@gmail.com>
> >> > > wrote:
> >> > >
> >> > >> Is there an API to discover indexes? If there is, I'd suggest we
> >> allow a
> >> > >> config option that to make the adapter discover the possible
> indexes.
> >> > >> We'd still have to adapt the code a bit, but internally, the schema
> >> > >> could just keep a cache of type name to index name map and be able
> to
> >> > >> support both scenarios.
> >> > >>
> >> > >>
> >> > >> Kind regards,
> >> > >>
> >> ------------------------------------------------------------------------
> >> > >> *Christian Beikov*
> >> > >> On 29.06.2018 at 00:12, Andrei Sereda wrote:
> >> > >>>> 1) What's the time horizon for the current adapter no longer
> working
> >> > >> with these
> >> > >>> changes to ES ?
> >> > >>> Current adapter will be working for a while with existing setup.
> The
> >> > >>> problem is nomenclature and ease of use.
> >> > >>>
> >> > >>> Their new SQL concepts mapping
> >> > >>> <
> >> > >>
> >> >
> >>
> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
> >> > >>> drops
> >> > >>> the notion of ES type (which before was equivalent of RDBMS table)
> >> and
> >> > >> uses
> >> > >>> ES index as new table equivalent (before ES index was equal to
> >> > database).
> >> > >>> Most users use elastic this way (one type , one index) index ==
> >> table.
> >> > >>>
> >> > >>> Currently calcite requires schema per index. In RDBMS parlance
> >> database
> >> > >> per
> >> > >>> table (I'd like to change that).
> >> > >>>
> >> > >>>> 2) Any guess how complicated it would be to maintain code paths
> for
> >> > both
> >> > >>>> behaviours? I know this is probably really challenging to
> estimate,
> >> > but
> >> > >> I
> >> > >>>> really have no idea of the scope of these changes. Would it mean
> two
> >> > >>>> different ES adapters?
> >> > >>> One can have just a separate calcite schema implementations (same
> >> > >> adapter /
> >> > >>> module) :
> >> > >>> 1)  LegacySchema (old). Schema can have only one index (but
> multiple
> >> > >>> types). Type == table in this case.
> >> > >>> 2)  NewSchema (new). Single schema can have multiple indexes
> (type is
> >> > >>> dropped). Index == table in this case
> >> > >>>
> >> > >>>> 3) Do we really need compatibility with the current version of
> the
> >> > >>> adapter?
> >> > >>>> IMO this depends on what versions of ES we would lose support for
> >> and
> >> > >> how
> >> > >>>> complex it would be for users of the current ES adapter to make
> >> > updates
> >> > >>> for
> >> > >>>> any Calcite API changes.
> >> > >>> The issue is not in adapter but how calcite schema exposes tables.
> >> > >> Should
> >> > >>> it expose index as individual table (new), or ES type (old) ?
> >> > >>>
> >> > >>> Andrei.
> >> > >>>
> >> > >>> On Thu, Jun 28, 2018 at 5:23 PM Michael Mior <mm...@apache.org>
> >> wrote:
> >> > >>>
> >> > >>>> Unfortunately I know very little about ES so I'm not in a great
> >> > >> position to
> >> > >>>> asses the impact of these changes. I will say that that legacy
> >> > >>>> compatibility is great, but maintaining two sets of logic is
> always
> >> a
> >> > >>>> challenge. A few follow up questions:
> >> > >>>>
> >> > >>>> 1) What's the time horizon for the current adapter no longer
> working
> >> > >> with
> >> > >>>> these changes to ES?
> >> > >>>>
> >> > >>>> 2) Any guess how complicated it would be to maintain code paths
> for
> >> > both
> >> > >>>> behaviours? I know this is probably really challenging to
> estimate,
> >> > but
> >> > >> I
> >> > >>>> really have no idea of the scope of these changes. Would it mean
> two
> >> > >>>> different ES adapters?
> >> > >>>>
> >> > >>>> 3) Do we really need compatibility with the current version of
> the
> >> > >> adapter?
> >> > >>>> IMO this depends on what versions of ES we would lose support for
> >> and
> >> > >> how
> >> > >>>> complex it would be for users of the current ES adapter to make
> >> > updates
> >> > >> for
> >> > >>>> any Calcite API changes.
> >> > >>>>
> >> > >>>> Thanks for your continued work on the ES adapter Andrei!
> >> > >>>>
> >> > >>>> --
> >> > >>>> Michael Mior
> >> > >>>> mmior@apache.org
> >> > >>>>
> >> > >>>>
> >> > >>>>
> >> > >>>> On Thu, Jun 28, 2018 at 12:57 PM, Andrei Sereda <an...@sereda.cc> wrote:
> >> > >>>>> Hello,
> >> > >>>>>
> >> > >>>>> Elastic announced
> >> > >>>>> <
> >> > >>>>>
> >> > >>
> >> >
> >>
> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html
> >> > >>>>> that they will be deprecating mapping types in ES6 and indexes
> will
> >> > be
> >> > >>>>> single-typed only.
> >> > >>>>>
> >> > >>>>> Historical analogy <https://www.elastic.co/blog/index-vs-type>
> >> > between
> >> > >>>>> RDBMS and elastic was that index is equivalent to a database and
> >> type
> >> > >>>>> corresponds to table in that database. In a couple of releases
> >> > (ES6-8)
> >> > >>>> this
> >> > >>>>> shall not longer be true.
> >> > >>>>>
> >> > >>>>> Recent SQL addition
> >> > >>>>> <https://www.elastic.co/blog/elasticsearch-6-3-0-released> to
> >> > elastic
> >> > >>>>> confirms
> >> > >>>>> this trend
> >> > >>>>> <
> >> > >>>>>
> >> > >>
> >> >
> >>
> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
> >> > >>>>>> .
> >> > >>>>> Index is equivalent to a table and there are no more ES types.
> >> > >>>>>
> >> > >>>>> I would like to propose to include this logic in Calcite ES
> >> adapter.
> >> > >> IE,
> >> > >>>>> expose each ES single-typed index as a separate table inside
> >> calcite
> >> > >>>>> schema. This is in contrast to  current integration where schema
> >> can
> >> > >> only
> >> > >>>>> have a single index. Current approach forces you to create
> multiple
> >> > >>>> schemas
> >> > >>>>> to query single-typed indexes (on the same ES cluster).
> >> > >>>>>
> >> > >>>>> Legacy compatibility can always be controlled with configuration
> >> > >>>>> parameters.
> >> > >>>>>
> >> > >>>>> Do you agree with such changes ? If yes, would you consider a
> PR ?
> >> > >>>>>
> >> > >>>>> Regards,
> >> > >>>>> Andrei.
> >> > >>>>>
> >> > >>
> >> >
> >> >
> >>
>

Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Julian Hyde <jh...@apache.org>.
Andrei,

I'm not an ES user so I don't fully understand this issue, but my two
cents anyway...

Can you show how those examples affect SQL against the ES adapter
and/or how they affect JSON models?

You seem to be using '_' as a separator character. Are we sure that
people will never use it in index or type name? Separator characters
often cause problems.

Julian




On Fri, Jun 29, 2018 at 10:58 AM, Andrei Sereda <an...@sereda.cc> wrote:
> I agree there should be a configuration option. How about the following
> approach.
>
> Expose both variables ${index} and ${type} in configuration (JSON) and user
> will use them to generate table name in calcite schema.
>
> Example
> "table_name": "${type}" // current
> "table_name": "${index}" // new (default?)
> "table_name": "${index}_${type}" // most generic. supports multiple types
> per index
>
>
>
>
>
> On Fri, Jun 29, 2018 at 9:26 AM Michael Mior <mm...@apache.org> wrote:
>
>> I think it sounds like you and Andrei are in a good position to tackle this
>> one so I'm happy to have you both work on whatever solution you think is
>> best.
>>
>> --
>> Michael Mior
>> mmior@apache.org
>>
>>
>>
>> On Fri, Jun 29, 2018 at 4:19 AM, Christian Beikov
>> <christian.beikov@gmail.com> wrote:
>>
>> > IMO the best solution would be to make it configurable by introducing a
>> > "table_mapping" config with values
>> >
>> >   * type - every type in the known indices is mapped as table
>> >   * index - every known index is mapped as table
>> >
>> > We'd probably also need a "type_field" configuration for defining which
>> > field to use for the type determination as one of the possible future
>> > ways to do things is to introduce a custom field:
>> >
>> >
>> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html#_custom_type_field_2
>> >
>> > We already detect the ES version, so we can set a smart default for this
>> > setting. Let's make the index config param optional.
>> >
>> >   * When no index is given, we discover indexes, the default for
>> >     "table_mapping" then is "index"
>> >   * When index is given, the we only discover types according to the
>> >     "type_field" configuration and the default for "table_mapping" is
>> > "type"
>> >
>> > This would also allow to discover indexes but still use "type" as
>> > "table_mapping".
>> >
>> > What do you think?
>> >
>> > Kind regards,
>> > ------------------------------------------------------------------------
>> > *Christian Beikov*
>> > On 29.06.2018 at 02:41, Andrei Sereda wrote:
>> > > Yes. There is an API to list all indexes / types in elastic. They can
>> be
>> > > automatically imported into a schema.
>> > >
>> > > What needs to be agreed upon is how to expose those elements in calcite
>> > > schema (naming / behaviour).
>> > >
>> > > 1) Many (most?) of setups are single type per index. Natural way to
>> name
>> > > would be  "elastic.$index" (elastic being schema name). Multiple
>> indexes
>> > > would be under same schema "elastic.index1" "elastic.index2" etc.
>> > >
>> > > 2) What if index has several types should they exported as calcite
>> > tables:
>> > > "elastic.$index_type1" "elastic.$index_type2" ?  Or (current behaviour)
>> > as
>> > > "elastic.type1" and "elastic.type2". Or as subschema
>> > > "elastic.$index.type1" ?
>> > >
>> > > Now what if one has combination of (1) and (2) ?
>> > > Setup (2) is already deprecated (and will be unsupported in next
>> version)
>> > >
>> > >
>> > > On Thu, Jun 28, 2018 at 7:31 PM Christian Beikov <
>> > christian.beikov@gmail.com>
>> > > wrote:
>> > >
>> > >> Is there an API to discover indexes? If there is, I'd suggest we
>> allow a
>> > >> config option that to make the adapter discover the possible indexes.
>> > >> We'd still have to adapt the code a bit, but internally, the schema
>> > >> could just keep a cache of type name to index name map and be able to
>> > >> support both scenarios.
>> > >>
>> > >>
>> > >> Kind regards,
>> > >>
>> ------------------------------------------------------------------------
>> > >> *Christian Beikov*
>> > >> On 29.06.2018 at 00:12, Andrei Sereda wrote:
>> > >>>> 1) What's the time horizon for the current adapter no longer working
>> > >> with these
>> > >>> changes to ES ?
>> > >>> Current adapter will be working for a while with existing setup. The
>> > >>> problem is nomenclature and ease of use.
>> > >>>
>> > >>> Their new SQL concepts mapping
>> > >>> <
>> > >>
>> >
>> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
>> > >>> drops
>> > >>> the notion of ES type (which before was equivalent of RDBMS table)
>> and
>> > >> uses
>> > >>> ES index as new table equivalent (before ES index was equal to
>> > database).
>> > >>> Most users use elastic this way (one type , one index) index ==
>> table.
>> > >>>
>> > >>> Currently calcite requires schema per index. In RDBMS parlance
>> database
>> > >> per
>> > >>> table (I'd like to change that).
>> > >>>
>> > >>>> 2) Any guess how complicated it would be to maintain code paths for
>> > both
>> > >>>> behaviours? I know this is probably really challenging to estimate,
>> > but
>> > >> I
>> > >>>> really have no idea of the scope of these changes. Would it mean two
>> > >>>> different ES adapters?
>> > >>> One can have just a separate calcite schema implementations (same
>> > >> adapter /
>> > >>> module) :
>> > >>> 1)  LegacySchema (old). Schema can have only one index (but multiple
>> > >>> types). Type == table in this case.
>> > >>> 2)  NewSchema (new). Single schema can have multiple indexes (type is
>> > >>> dropped). Index == table in this case
>> > >>>
>> > >>>> 3) Do we really need compatibility with the current version of the
>> > >>> adapter?
>> > >>>> IMO this depends on what versions of ES we would lose support for
>> and
>> > >> how
>> > >>>> complex it would be for users of the current ES adapter to make
>> > updates
>> > >>> for
>> > >>>> any Calcite API changes.
>> > >>> The issue is not in adapter but how calcite schema exposes tables.
>> > >> Should
>> > >>> it expose index as individual table (new), or ES type (old) ?
>> > >>>
>> > >>> Andrei.
>> > >>>
>> > >>> On Thu, Jun 28, 2018 at 5:23 PM Michael Mior <mm...@apache.org>
>> wrote:
>> > >>>
>> > >>>> Unfortunately I know very little about ES so I'm not in a great
>> > >> position to
>> > >>>> asses the impact of these changes. I will say that that legacy
>> > >>>> compatibility is great, but maintaining two sets of logic is always
>> a
>> > >>>> challenge. A few follow up questions:
>> > >>>>
>> > >>>> 1) What's the time horizon for the current adapter no longer working
>> > >> with
>> > >>>> these changes to ES?
>> > >>>>
>> > >>>> 2) Any guess how complicated it would be to maintain code paths for
>> > both
>> > >>>> behaviours? I know this is probably really challenging to estimate,
>> > but
>> > >> I
>> > >>>> really have no idea of the scope of these changes. Would it mean two
>> > >>>> different ES adapters?
>> > >>>>
>> > >>>> 3) Do we really need compatibility with the current version of the
>> > >> adapter?
>> > >>>> IMO this depends on what versions of ES we would lose support for
>> and
>> > >> how
>> > >>>> complex it would be for users of the current ES adapter to make
>> > updates
>> > >> for
>> > >>>> any Calcite API changes.
>> > >>>>
>> > >>>> Thanks for your continued work on the ES adapter Andrei!
>> > >>>>
>> > >>>> --
>> > >>>> Michael Mior
>> > >>>> mmior@apache.org
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>> On Thu, Jun 28, 2018 at 12:57 PM, Andrei Sereda <an...@sereda.cc> wrote:
>> > >>>>> Hello,
>> > >>>>>
>> > >>>>> Elastic announced
>> > >>>>> <
>> > >>>>>
>> > >>
>> >
>> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html
>> > >>>>> that they will be deprecating mapping types in ES6 and indexes will
>> > be
>> > >>>>> single-typed only.
>> > >>>>>
>> > >>>>> Historical analogy <https://www.elastic.co/blog/index-vs-type>
>> > between
>> > >>>>> RDBMS and elastic was that index is equivalent to a database and
>> type
>> > >>>>> corresponds to table in that database. In a couple of releases
>> > (ES6-8)
>> > >>>> this
>> > >>>>> shall not longer be true.
>> > >>>>>
>> > >>>>> Recent SQL addition
>> > >>>>> <https://www.elastic.co/blog/elasticsearch-6-3-0-released> to
>> > elastic
>> > >>>>> confirms
>> > >>>>> this trend
>> > >>>>> <
>> > >>>>>
>> > >>
>> >
>> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
>> > >>>>>> .
>> > >>>>> Index is equivalent to a table and there are no more ES types.
>> > >>>>>
>> > >>>>> I would like to propose to include this logic in Calcite ES
>> adapter.
>> > >> IE,
>> > >>>>> expose each ES single-typed index as a separate table inside
>> calcite
>> > >>>>> schema. This is in contrast to  current integration where schema
>> can
>> > >> only
>> > >>>>> have a single index. Current approach forces you to create multiple
>> > >>>> schemas
>> > >>>>> to query single-typed indexes (on the same ES cluster).
>> > >>>>>
>> > >>>>> Legacy compatibility can always be controlled with configuration
>> > >>>>> parameters.
>> > >>>>>
>> > >>>>> Do you agree with such changes ? If yes, would you consider a PR ?
>> > >>>>>
>> > >>>>> Regards,
>> > >>>>> Andrei.
>> > >>>>>
>> > >>
>> >
>> >
>>

Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Andrei Sereda <an...@sereda.cc>.
Plus, allow filtering indexes using a regexp?
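
A rough sketch of such a filter, assuming the index names have already been
discovered from the cluster. The class name, the method name and the idea of
a single regexp setting are hypothetical and only illustrate the suggestion.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class IndexFilter {
  /** Keeps only the index names that match the given regular expression in full. */
  static List<String> filter(List<String> indexNames, String regex) {
    Pattern pattern = Pattern.compile(regex);
    return indexNames.stream()
        .filter(name -> pattern.matcher(name).matches())
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> discovered = Arrays.asList("logs-2017", "logs-2018", "users");
    // Only indexes matching the pattern would be exposed as tables.
    System.out.println(filter(discovered, "logs-.*"));
  }
}
```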

On Fri, Jun 29, 2018 at 1:58 PM Andrei Sereda <an...@sereda.cc> wrote:

> I agree there should be a configuration option. How about the following
> approach.
>
> Expose both variables ${index} and ${type} in configuration (JSON) and
> user will use them to generate table name in calcite schema.
>
> Example
> "table_name": "${type}" // current
> "table_name": "${index}" // new (default?)
> "table_name": "${index}_${type}" // most generic. supports multiple types
> per index
>
>
>
>
>
> On Fri, Jun 29, 2018 at 9:26 AM Michael Mior <mm...@apache.org> wrote:
>
>> I think it sounds like you and Andrei are in a good position to tackle
>> this
>> one so I'm happy to have you both work on whatever solution you think is
>> best.
>>
>> --
>> Michael Mior
>> mmior@apache.org
>>
>>
>>
>> On Fri, Jun 29, 2018 at 4:19 AM, Christian Beikov
>> <christian.beikov@gmail.com> wrote:
>>
>> > IMO the best solution would be to make it configurable by introducing a
>> > "table_mapping" config with values
>> >
>> >   * type - every type in the known indices is mapped as table
>> >   * index - every known index is mapped as table
>> >
>> > We'd probably also need a "type_field" configuration for defining which
>> > field to use for the type determination as one of the possible future
>> > ways to do things is to introduce a custom field:
>> >
>> >
>> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html#_custom_type_field_2
>> >
>> > We already detect the ES version, so we can set a smart default for this
>> > setting. Let's make the index config param optional.
>> >
>> >   * When no index is given, we discover indexes, the default for
>> >     "table_mapping" then is "index"
>> >   * When index is given, the we only discover types according to the
>> >     "type_field" configuration and the default for "table_mapping" is
>> > "type"
>> >
>> > This would also allow to discover indexes but still use "type" as
>> > "table_mapping".
>> >
>> > What do you think?
>> >
>> > Mit freundlichen Grüßen,
>> > ------------------------------------------------------------------------
>> > *Christian Beikov*
>> > Am 29.06.2018 um 02:41 schrieb Andrei Sereda:
>> > > Yes. There is an API to list all indexes / types in elastic. They can
>> be
>> > > automatically imported into a schema.
>> > >
>> > > What needs to be agreed upon is how to expose those elements in
>> calcite
>> > > schema (naming / behaviour).
>> > >
>> > > 1) Many (most?) of setups are single type per index. Natural way to
>> name
>> > > would be  "elastic.$index" (elastic being schema name). Multiple
>> indexes
>> > > would be under same schema "elastic.index1" "elastic.index2" etc.
>> > >
>> > > 2) What if index has several types should they exported as calcite
>> > tables:
>> > > "elastic.$index_type1" "elastic.$index_type2" ?  Or (current
>> behaviour)
>> > as
>> > > "elastic.type1" and "elastic.type2". Or as subschema
>> > > "elastic.$index.type1" ?
>> > >
>> > > Now what if one has combination of (1) and (2) ?
>> > > Setup (2) is already deprecated (and will be unsupported in next
>> version)
>> > >
>> > >
>> > > On Thu, Jun 28, 2018 at 7:31 PM Christian Beikov <
>> > christian.beikov@gmail.com>
>> > > wrote:
>> > >
>> > >> Is there an API to discover indexes? If there is, I'd suggest we
>> allow a
>> > >> config option that to make the adapter discover the possible indexes.
>> > >> We'd still have to adapt the code a bit, but internally, the schema
>> > >> could just keep a cache of type name to index name map and be able to
>> > >> support both scenarios.
>> > >>
>> > >>
>> > >> Mit freundlichen Grüßen,
>> > >>
>> ------------------------------------------------------------------------
>> > >> *Christian Beikov*
>> > >> Am 29.06.2018 um 00:12 schrieb Andrei Sereda:
>> > >>>> 1) What's the time horizon for the current adapter no longer
>> working
>> > >> with these
>> > >>> changes to ES ?
>> > >>> Current adapter will be working for a while with existing setup. The
>> > >>> problem is nomenclature and ease of use.
>> > >>>
>> > >>> Their new SQL concepts mapping
>> > >>> <
>> > >>
>> >
>> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
>> > >>> drops
>> > >>> the notion of ES type (which before was equivalent of RDBMS table)
>> and
>> > >> uses
>> > >>> ES index as new table equivalent (before ES index was equal to
>> > database).
>> > >>> Most users use elastic this way (one type , one index) index ==
>> table.
>> > >>>
>> > >>> Currently calcite requires schema per index. In RDBMS parlance
>> database
>> > >> per
>> > >>> table (I'd like to change that).
>> > >>>
>> > >>>> 2) Any guess how complicated it would be to maintain code paths for
>> > both
>> > >>>> behaviours? I know this is probably really challenging to estimate,
>> > but
>> > >> I
>> > >>>> really have no idea of the scope of these changes. Would it mean
>> two
>> > >>>> different ES adapters?
>> > >>> One can have just a separate calcite schema implementations (same
>> > >> adapter /
>> > >>> module) :
>> > >>> 1)  LegacySchema (old). Schema can have only one index (but multiple
>> > >>> types). Type == table in this case.
>> > >>> 2)  NewSchema (new). Single schema can have multiple indexes (type
>> is
>> > >>> dropped). Index == table in this case
>> > >>>
>> > >>>> 3) Do we really need compatibility with the current version of the
>> > >>> adapter?
>> > >>>> IMO this depends on what versions of ES we would lose support for
>> and
>> > >> how
>> > >>>> complex it would be for users of the current ES adapter to make
>> > updates
>> > >>> for
>> > >>>> any Calcite API changes.
>> > >>> The issue is not in adapter but how calcite schema exposes tables.
>> > >> Should
>> > >>> it expose index as individual table (new), or ES type (old) ?
>> > >>>
>> > >>> Andrei.
>> > >>>
>> > >>> On Thu, Jun 28, 2018 at 5:23 PM Michael Mior <mm...@apache.org>
>> wrote:
>> > >>>
>> > >>>> Unfortunately I know very little about ES so I'm not in a great
>> > >> position to
>> > >>>> asses the impact of these changes. I will say that that legacy
>> > >>>> compatibility is great, but maintaining two sets of logic is
>> always a
>> > >>>> challenge. A few follow up questions:
>> > >>>>
>> > >>>> 1) What's the time horizon for the current adapter no longer
>> working
>> > >> with
>> > >>>> these changes to ES?
>> > >>>>
>> > >>>> 2) Any guess how complicated it would be to maintain code paths for
>> > both
>> > >>>> behaviours? I know this is probably really challenging to estimate,
>> > but
>> > >> I
>> > >>>> really have no idea of the scope of these changes. Would it mean
>> two
>> > >>>> different ES adapters?
>> > >>>>
>> > >>>> 3) Do we really need compatibility with the current version of the
>> > >> adapter?
>> > >>>> IMO this depends on what versions of ES we would lose support for
>> and
>> > >> how
>> > >>>> complex it would be for users of the current ES adapter to make
>> > updates
>> > >> for
>> > >>>> any Calcite API changes.
>> > >>>>
>> > >>>> Thanks for your continued work on the ES adapter Andrei!
>> > >>>>
>> > >>>> --
>> > >>>> Michael Mior
>> > >>>> mmior@apache.org
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>> Le jeu. 28 juin 2018 à 12:57, Andrei Sereda <an...@sereda.cc> a
>> > écrit
>> > >> :
>> > >>>>> Hello,
>> > >>>>>
>> > >>>>> Elastic announced
>> > >>>>> <
>> > >>>>>
>> > >>
>> >
>> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html
>> > >>>>> that they will be deprecating mapping types in ES6 and indexes
>> will
>> > be
>> > >>>>> single-typed only.
>> > >>>>>
>> > >>>>> Historical analogy <https://www.elastic.co/blog/index-vs-type>
>> > between
>> > >>>>> RDBMS and elastic was that index is equivalent to a database and
>> type
>> > >>>>> corresponds to table in that database. In a couple of releases
>> > (ES6-8)
>> > >>>> this
>> > >>>>> shall not longer be true.
>> > >>>>>
>> > >>>>> Recent SQL addition
>> > >>>>> <https://www.elastic.co/blog/elasticsearch-6-3-0-released> to
>> > elastic
>> > >>>>> confirms
>> > >>>>> this trend
>> > >>>>> <
>> > >>>>>
>> > >>
>> >
>> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
>> > >>>>>> .
>> > >>>>> Index is equivalent to a table and there are no more ES types.
>> > >>>>>
>> > >>>>> I would like to propose to include this logic in Calcite ES
>> adapter.
>> > >> IE,
>> > >>>>> expose each ES single-typed index as a separate table inside
>> calcite
>> > >>>>> schema. This is in contrast to  current integration where schema
>> can
>> > >> only
>> > >>>>> have a single index. Current approach forces you to create
>> multiple
>> > >>>> schemas
>> > >>>>> to query single-typed indexes (on the same ES cluster).
>> > >>>>>
>> > >>>>> Legacy compatibility can always be controlled with configuration
>> > >>>>> parameters.
>> > >>>>>
>> > >>>>> Do you agree with such changes ? If yes, would you consider a PR ?
>> > >>>>>
>> > >>>>> Regards,
>> > >>>>> Andrei.
>> > >>>>>
>> > >>
>> >
>> >
>>
>

Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Andrei Sereda <an...@sereda.cc>.
I agree there should be a configuration option. How about the following
approach?

Expose both variables ${index} and ${type} in the configuration (JSON); the
user will use them to generate the table name in the calcite schema.

Example:
"table_name": "${type}" // current
"table_name": "${index}" // new (default?)
"table_name": "${index}_${type}" // most generic; supports multiple types per index
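
To make the idea concrete, here is a rough sketch of how such a template
could be expanded for a given index / type pair. The class name
TableNameTemplate is hypothetical; only the placeholder syntax comes from
the example above.

```java
/** Hypothetical helper: expands a "table_name" template for one index/type pair. */
final class TableNameTemplate {
  private final String template;

  TableNameTemplate(String template) {
    this.template = template;
  }

  /** Replaces the literal placeholders ${index} and ${type}; no regexp is involved. */
  String tableName(String index, String type) {
    return template.replace("${index}", index).replace("${type}", type);
  }

  public static void main(String[] args) {
    // The three templates from the example, applied to a made-up index/type pair.
    String[] templates = {"${type}", "${index}", "${index}_${type}"};
    for (String t : templates) {
      System.out.println(t + " -> " + new TableNameTemplate(t).tableName("index1", "t1"));
    }
  }
}
```

With "${index}_${type}" every index / type pair gets its own table name,
which is what makes that variant the most generic one.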





On Fri, Jun 29, 2018 at 9:26 AM Michael Mior <mm...@apache.org> wrote:

> I think it sounds like you and Andrei are in a good position to tackle this
> one so I'm happy to have you both work on whatever solution you think is
> best.
>
> --
> Michael Mior
> mmior@apache.org

Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Michael Mior <mm...@apache.org>.
I think it sounds like you and Andrei are in a good position to tackle this
one so I'm happy to have you both work on whatever solution you think is
best.

--
Michael Mior
mmior@apache.org



On Fri, Jun 29, 2018 at 04:19, Christian Beikov <ch...@gmail.com>
wrote:

> IMO the best solution would be to make it configurable by introducing a
> "table_mapping" config with values
>
>   * type - every type in the known indices is mapped as a table
>   * index - every known index is mapped as a table
>
> We'd probably also need a "type_field" configuration for defining which
> field to use for determining the type, as one of the possible future
> ways to do things is to introduce a custom field:
>
> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html#_custom_type_field_2
>
> We already detect the ES version, so we can set a smart default for this
> setting. Let's make the index config param optional.
>
>   * When no index is given, we discover indexes; the default for
>     "table_mapping" is then "index"
>   * When an index is given, we only discover types according to the
>     "type_field" configuration, and the default for "table_mapping" is
>     "type"
>
> This would also allow discovering indexes while still using "type" as
> "table_mapping".
>
> What do you think?
>
> Kind regards,
> ------------------------------------------------------------------------
> *Christian Beikov*

Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Christian Beikov <ch...@gmail.com>.
IMO the best solution would be to make it configurable by introducing a
"table_mapping" config with values

  * type - every type in the known indices is mapped as a table
  * index - every known index is mapped as a table

We'd probably also need a "type_field" configuration for defining which
field to use for determining the type, as one of the possible future
ways to do things is to introduce a custom field:
https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html#_custom_type_field_2

We already detect the ES version, so we can set a smart default for this
setting. Let's make the index config param optional.

  * When no index is given, we discover indexes; the default for
    "table_mapping" is then "index"
  * When an index is given, we only discover types according to the
    "type_field" configuration, and the default for "table_mapping" is "type"

This would also allow discovering indexes while still using "type" as
"table_mapping".

What do you think?
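
The defaulting rules above could be sketched roughly as follows. This is
only an illustration: the enum, the class and the method are hypothetical
names, and the ES-version-based default is left out.

```java
/** Hypothetical sketch of the default-resolution rules for "table_mapping". */
final class TableMappingDefaults {
  private TableMappingDefaults() {}

  /** The two proposed values of the "table_mapping" option. */
  enum TableMapping { TYPE, INDEX }

  /**
   * Resolves the effective mapping.
   *
   * @param index    value of the optional "index" option, or null when absent
   * @param explicit value of the optional "table_mapping" option, or null when absent
   */
  static TableMapping resolve(String index, TableMapping explicit) {
    if (explicit != null) {
      // An explicit setting always wins, e.g. discover indexes but map by type.
      return explicit;
    }
    // No index configured: discover indexes and map each one as a table.
    // Index configured: map the types of that index as tables.
    return index == null ? TableMapping.INDEX : TableMapping.TYPE;
  }

  public static void main(String[] args) {
    System.out.println(resolve(null, null));
    System.out.println(resolve("index1", null));
    System.out.println(resolve(null, TableMapping.TYPE));
  }
}
```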

Kind regards,
------------------------------------------------------------------------
*Christian Beikov*
On 29.06.2018 at 02:41, Andrei Sereda wrote:
> Yes. There is an API to list all indexes / types in elastic. They can be
> automatically imported into a schema.
>
> What needs to be agreed upon is how to expose those elements in the calcite
> schema (naming / behaviour).
>
> 1) Many (most?) setups are single type per index. The natural way to name
> tables would be "elastic.$index" (elastic being the schema name). Multiple
> indexes would be under the same schema: "elastic.index1", "elastic.index2", etc.
>
> 2) What if an index has several types? Should they be exported as calcite
> tables "elastic.$index_type1", "elastic.$index_type2"? Or (current behaviour)
> as "elastic.type1" and "elastic.type2"? Or as a subschema
> "elastic.$index.type1"?
>
> Now what if one has a combination of (1) and (2)?
> Setup (2) is already deprecated (and will be unsupported in the next version).
>


Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Andrei Sereda <an...@sereda.cc>.
Yes. There is an API to list all indexes / types in elastic. They can be
automatically imported into a schema.
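
As a rough illustration of the discovery step, the sketch below walks a
mapping listing and returns every (index, type) pair. The response shape
is an assumption modelled on the pre-7.x "GET /_mapping" output; the
function name and the sample data are hypothetical and not part of the
Calcite adapter.

```python
# Illustrative only: the input shape is an assumption modelled on the
# pre-7.x "GET /_mapping" response: {index: {"mappings": {type: {...}}}}.
def list_index_types(mapping_response):
    """Return sorted (index, type) pairs found in a mapping response."""
    pairs = []
    for index, body in mapping_response.items():
        # An index without a "mappings" entry contributes no pairs.
        for type_name in body.get("mappings", {}):
            pairs.append((index, type_name))
    return sorted(pairs)


# Hypothetical cluster: one legacy two-type index, one single-typed index.
sample = {
    "index1": {"mappings": {"t1": {"properties": {}}, "t2": {"properties": {}}}},
    "index2": {"mappings": {"t1": {"properties": {}}}},
}
print(list_index_types(sample))
```

The resulting pairs are what a schema would need in order to decide which
tables to expose.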

What needs to be agreed upon is how to expose those elements in the Calcite
schema (naming / behaviour).

1) Many (most?) setups have a single type per index. The natural name would
be "elastic.$index" (elastic being the schema name). Multiple indexes would
live under the same schema: "elastic.index1", "elastic.index2", etc.

2) What if an index has several types? Should they be exported as Calcite
tables "elastic.$index_type1" and "elastic.$index_type2"? Or (current
behaviour) as "elastic.type1" and "elastic.type2"? Or as a subschema,
"elastic.$index.type1"?

Now what if one has a combination of (1) and (2)?
Setup (2) is already deprecated (and will be unsupported in the next version).
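
To make the naming options easier to compare, here is a minimal sketch that
generates the candidate table names for each scheme and checks whether a
mixed setup produces name collisions. The function names, the style labels
and the sample pairs are hypothetical; this is not adapter code.

```python
def table_names(index_types, style):
    """Map (index, type) pairs to candidate table names.

    style: "index"      -> elastic.$index        (option 1)
           "index_type" -> elastic.$index_type   (option 2, first variant)
           "type"       -> elastic.type          (current behaviour)
           "subschema"  -> elastic.$index.type   (option 2, subschema variant)
    """
    templates = {
        "index": "elastic.{index}",
        "index_type": "elastic.{index}_{type}",
        "type": "elastic.{type}",
        "subschema": "elastic.{index}.{type}",
    }
    if style not in templates:
        raise ValueError("unknown naming style: " + style)
    return [templates[style].format(index=i, type=t) for i, t in index_types]


def has_collisions(names):
    """True if two different (index, type) pairs got the same table name."""
    return len(set(names)) != len(names)


# Hypothetical mixed setup: a two-type index next to a single-typed index.
pairs = [("index1", "t1"), ("index1", "t2"), ("index2", "t1")]
for style in ("index", "index_type", "type", "subschema"):
    names = table_names(pairs, style)
    print(style, names, "collision" if has_collisions(names) else "ok")
```

For this sample, naming by index alone or by type alone collides, while the
two schemes that combine index and type do not.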


On Thu, Jun 28, 2018 at 7:31 PM Christian Beikov <ch...@gmail.com>
wrote:

> Is there an API to discover indexes? If there is, I'd suggest we allow a
> config option that to make the adapter discover the possible indexes.
> We'd still have to adapt the code a bit, but internally, the schema
> could just keep a cache of type name to index name map and be able to
> support both scenarios.
>
>
> Kind regards,
> ------------------------------------------------------------------------
> *Christian Beikov*
> On 29.06.2018 at 00:12, Andrei Sereda wrote:
> >> 1) What's the time horizon for the current adapter no longer working
> with these
> > changes to ES ?
> > Current adapter will be working for a while with existing setup. The
> > problem is nomenclature and ease of use.
> >
> > Their new SQL concepts mapping
> > <
> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
> >
> > drops
> > the notion of ES type (which before was equivalent of RDBMS table) and
> uses
> > ES index as new table equivalent (before ES index was equal to database).
> > Most users use elastic this way (one type , one index) index == table.
> >
> > Currently calcite requires schema per index. In RDBMS parlance database
> per
> > table (I'd like to change that).
> >
> >> 2) Any guess how complicated it would be to maintain code paths for both
> >> behaviours? I know this is probably really challenging to estimate, but
> I
> >> really have no idea of the scope of these changes. Would it mean two
> >> different ES adapters?
> > One can have just a separate calcite schema implementations (same
> adapter /
> > module) :
> > 1)  LegacySchema (old). Schema can have only one index (but multiple
> > types). Type == table in this case.
> > 2)  NewSchema (new). Single schema can have multiple indexes (type is
> > dropped). Index == table in this case
> >
> >> 3) Do we really need compatibility with the current version of the
> > adapter?
> >> IMO this depends on what versions of ES we would lose support for and
> how
> >> complex it would be for users of the current ES adapter to make updates
> > for
> >> any Calcite API changes.
> > The issue is not in adapter but how calcite schema exposes tables.
> Should
> > it expose index as individual table (new), or ES type (old) ?
> >
> > Andrei.
> >
> > On Thu, Jun 28, 2018 at 5:23 PM Michael Mior <mm...@apache.org> wrote:
> >
> >> Unfortunately I know very little about ES so I'm not in a great
> position to
> >> asses the impact of these changes. I will say that that legacy
> >> compatibility is great, but maintaining two sets of logic is always a
> >> challenge. A few follow up questions:
> >>
> >> 1) What's the time horizon for the current adapter no longer working
> with
> >> these changes to ES?
> >>
> >> 2) Any guess how complicated it would be to maintain code paths for both
> >> behaviours? I know this is probably really challenging to estimate, but
> I
> >> really have no idea of the scope of these changes. Would it mean two
> >> different ES adapters?
> >>
> >> 3) Do we really need compatibility with the current version of the
> adapter?
> >> IMO this depends on what versions of ES we would lose support for and
> how
> >> complex it would be for users of the current ES adapter to make updates
> for
> >> any Calcite API changes.
> >>
> >> Thanks for your continued work on the ES adapter Andrei!
> >>
> >> --
> >> Michael Mior
> >> mmior@apache.org
> >>
> >>
> >>
> >> On Thu, 28 Jun 2018 at 12:57, Andrei Sereda <an...@sereda.cc> wrote:
> >>
> >>> Hello,
> >>>
> >>> Elastic announced
> >>> <
> >>>
> >>
> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html
> >>> that they will be deprecating mapping types in ES6 and indexes will be
> >>> single-typed only.
> >>>
> >>> Historical analogy <https://www.elastic.co/blog/index-vs-type> between
> >>> RDBMS and elastic was that index is equivalent to a database and type
> >>> corresponds to table in that database. In a couple of releases (ES6-8)
> >> this
> >>> shall not longer be true.
> >>>
> >>> Recent SQL addition
> >>> <https://www.elastic.co/blog/elasticsearch-6-3-0-released> to elastic
> >>> confirms
> >>> this trend
> >>> <
> >>>
> >>
> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
> >>>> .
> >>> Index is equivalent to a table and there are no more ES types.
> >>>
> >>> I would like to propose to include this logic in Calcite ES adapter.
> IE,
> >>> expose each ES single-typed index as a separate table inside calcite
> >>> schema. This is in contrast to  current integration where schema can
> only
> >>> have a single index. Current approach forces you to create multiple
> >> schemas
> >>> to query single-typed indexes (on the same ES cluster).
> >>>
> >>> Legacy compatibility can always be controlled with configuration
> >>> parameters.
> >>>
> >>> Do you agree with such changes ? If yes, would you consider a PR ?
> >>>
> >>> Regards,
> >>> Andrei.
> >>>
>
>

Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Christian Beikov <ch...@gmail.com>.
Is there an API to discover indexes? If there is, I'd suggest we allow a
config option to make the adapter discover the possible indexes.
We'd still have to adapt the code a bit, but internally, the schema
could just keep a cached map of type name to index name and be able to
support both scenarios.
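
A minimal sketch of such a cache, assuming the (index, type) pairs have
already been discovered. The function name and the choice to reject
ambiguous type names are assumptions made for illustration, not existing
adapter behaviour.

```python
def build_type_to_index(index_types):
    """Build a {type name: index name} map from (index, type) pairs.

    Raises ValueError when the same type name appears in two different
    indexes, because the type name alone could not identify a table then.
    """
    mapping = {}
    for index, type_name in index_types:
        known = mapping.get(type_name)
        if known is not None and known != index:
            raise ValueError(
                "type %r exists in both %r and %r" % (type_name, known, index)
            )
        mapping[type_name] = index
    return mapping


# Hypothetical discovered pairs: one two-type index and one single-typed index.
cache = build_type_to_index(
    [("sales", "orders"), ("sales", "customers"), ("logs", "events")]
)
print(cache["orders"])  # index that holds the "orders" type
```

Rejecting duplicates is only one possible policy; qualifying the table name
with the index would be another.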


Kind regards,
------------------------------------------------------------------------
*Christian Beikov*
On 29.06.2018 at 00:12, Andrei Sereda wrote:
>> 1) What's the time horizon for the current adapter no longer working with these
> changes to ES ?
> Current adapter will be working for a while with existing setup. The
> problem is nomenclature and ease of use.
>
> Their new SQL concepts mapping
> <https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html>
> drops
> the notion of ES type (which before was equivalent of RDBMS table) and uses
> ES index as new table equivalent (before ES index was equal to database).
> Most users use elastic this way (one type , one index) index == table.
>
> Currently calcite requires schema per index. In RDBMS parlance database per
> table (I'd like to change that).
>
>> 2) Any guess how complicated it would be to maintain code paths for both
>> behaviours? I know this is probably really challenging to estimate, but I
>> really have no idea of the scope of these changes. Would it mean two
>> different ES adapters?
> One can have just a separate calcite schema implementations (same adapter /
> module) :
> 1)  LegacySchema (old). Schema can have only one index (but multiple
> types). Type == table in this case.
> 2)  NewSchema (new). Single schema can have multiple indexes (type is
> dropped). Index == table in this case
>
>> 3) Do we really need compatibility with the current version of the
> adapter?
>> IMO this depends on what versions of ES we would lose support for and how
>> complex it would be for users of the current ES adapter to make updates
> for
>> any Calcite API changes.
> The issue is not in adapter but how calcite schema exposes tables.  Should
> it expose index as individual table (new), or ES type (old) ?
>
> Andrei.
>
> On Thu, Jun 28, 2018 at 5:23 PM Michael Mior <mm...@apache.org> wrote:
>
>> Unfortunately I know very little about ES so I'm not in a great position to
>> asses the impact of these changes. I will say that that legacy
>> compatibility is great, but maintaining two sets of logic is always a
>> challenge. A few follow up questions:
>>
>> 1) What's the time horizon for the current adapter no longer working with
>> these changes to ES?
>>
>> 2) Any guess how complicated it would be to maintain code paths for both
>> behaviours? I know this is probably really challenging to estimate, but I
>> really have no idea of the scope of these changes. Would it mean two
>> different ES adapters?
>>
>> 3) Do we really need compatibility with the current version of the adapter?
>> IMO this depends on what versions of ES we would lose support for and how
>> complex it would be for users of the current ES adapter to make updates for
>> any Calcite API changes.
>>
>> Thanks for your continued work on the ES adapter Andrei!
>>
>> --
>> Michael Mior
>> mmior@apache.org
>>
>>
>>
>> On Thu, 28 Jun 2018 at 12:57, Andrei Sereda <an...@sereda.cc> wrote:
>>
>>> Hello,
>>>
>>> Elastic announced
>>> <
>>>
>> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html
>>> that they will be deprecating mapping types in ES6 and indexes will be
>>> single-typed only.
>>>
>>> Historical analogy <https://www.elastic.co/blog/index-vs-type> between
>>> RDBMS and elastic was that index is equivalent to a database and type
>>> corresponds to table in that database. In a couple of releases (ES6-8)
>> this
>>> shall not longer be true.
>>>
>>> Recent SQL addition
>>> <https://www.elastic.co/blog/elasticsearch-6-3-0-released> to elastic
>>> confirms
>>> this trend
>>> <
>>>
>> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
>>>> .
>>> Index is equivalent to a table and there are no more ES types.
>>>
>>> I would like to propose to include this logic in Calcite ES adapter. IE,
>>> expose each ES single-typed index as a separate table inside calcite
>>> schema. This is in contrast to  current integration where schema can only
>>> have a single index. Current approach forces you to create multiple
>> schemas
>>> to query single-typed indexes (on the same ES cluster).
>>>
>>> Legacy compatibility can always be controlled with configuration
>>> parameters.
>>>
>>> Do you agree with such changes ? If yes, would you consider a PR ?
>>>
>>> Regards,
>>> Andrei.
>>>


Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Andrei Sereda <an...@sereda.cc>.
> 1) What's the time horizon for the current adapter no longer working with
> these changes to ES?

The current adapter will keep working for a while with existing setups. The
problem is nomenclature and ease of use.

Their new SQL concepts mapping
<https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html>
drops the notion of an ES type (previously the equivalent of an RDBMS table)
and uses the ES index as the new table equivalent (previously an ES index
was equivalent to a database). Most users already use elastic this way (one
type, one index): index == table.

Currently Calcite requires a schema per index; in RDBMS parlance, a database
per table. I'd like to change that.

> 2) Any guess how complicated it would be to maintain code paths for both
> behaviours? I know this is probably really challenging to estimate, but I
> really have no idea of the scope of these changes. Would it mean two
> different ES adapters?

One can just have separate Calcite schema implementations (same adapter /
module):
1)  LegacySchema (old). A schema can have only one index (but multiple
types). Type == table in this case.
2)  NewSchema (new). A single schema can have multiple indexes (type is
dropped). Index == table in this case.
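
The difference between the two implementations can be sketched as follows.
Only the names LegacySchema and NewSchema come from the list above; the
constructors and the table_names method are hypothetical, and a real
implementation would be a Java Calcite schema rather than this Python
outline.

```python
class LegacySchema:
    """One index per schema; every ES type in it is a table (type == table)."""

    def __init__(self, index, types):
        self.index = index
        self.types = list(types)

    def table_names(self):
        return sorted(self.types)


class NewSchema:
    """Many indexes per schema; every index is a table (index == table)."""

    def __init__(self, indexes):
        self.indexes = list(indexes)

    def table_names(self):
        return sorted(self.indexes)


# Legacy: one schema per index, so two indexes need two schemas.
legacy = [LegacySchema("index1", ["t1", "t2"]), LegacySchema("index2", ["t1"])]
# New: a single schema covers every single-typed index on the cluster.
new = NewSchema(["index2", "index4"])

print([s.table_names() for s in legacy])
print(new.table_names())
```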

> 3) Do we really need compatibility with the current version of the adapter?
> IMO this depends on what versions of ES we would lose support for and how
> complex it would be for users of the current ES adapter to make updates for
> any Calcite API changes.

The issue is not in the adapter but in how the Calcite schema exposes tables.
Should it expose an index as an individual table (new), or an ES type (old)?

Andrei.

On Thu, Jun 28, 2018 at 5:23 PM Michael Mior <mm...@apache.org> wrote:

> Unfortunately I know very little about ES so I'm not in a great position to
> asses the impact of these changes. I will say that that legacy
> compatibility is great, but maintaining two sets of logic is always a
> challenge. A few follow up questions:
>
> 1) What's the time horizon for the current adapter no longer working with
> these changes to ES?
>
> 2) Any guess how complicated it would be to maintain code paths for both
> behaviours? I know this is probably really challenging to estimate, but I
> really have no idea of the scope of these changes. Would it mean two
> different ES adapters?
>
> 3) Do we really need compatibility with the current version of the adapter?
> IMO this depends on what versions of ES we would lose support for and how
> complex it would be for users of the current ES adapter to make updates for
> any Calcite API changes.
>
> Thanks for your continued work on the ES adapter Andrei!
>
> --
> Michael Mior
> mmior@apache.org
>
>
>
> On Thu, 28 Jun 2018 at 12:57, Andrei Sereda <an...@sereda.cc> wrote:
>
> > Hello,
> >
> > Elastic announced
> > <
> >
> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html
> > >
> > that they will be deprecating mapping types in ES6 and indexes will be
> > single-typed only.
> >
> > Historical analogy <https://www.elastic.co/blog/index-vs-type> between
> > RDBMS and elastic was that index is equivalent to a database and type
> > corresponds to table in that database. In a couple of releases (ES6-8)
> this
> > shall not longer be true.
> >
> > Recent SQL addition
> > <https://www.elastic.co/blog/elasticsearch-6-3-0-released> to elastic
> > confirms
> > this trend
> > <
> >
> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
> > >.
> > Index is equivalent to a table and there are no more ES types.
> >
> > I would like to propose to include this logic in Calcite ES adapter. IE,
> > expose each ES single-typed index as a separate table inside calcite
> > schema. This is in contrast to  current integration where schema can only
> > have a single index. Current approach forces you to create multiple
> schemas
> > to query single-typed indexes (on the same ES cluster).
> >
> > Legacy compatibility can always be controlled with configuration
> > parameters.
> >
> > Do you agree with such changes ? If yes, would you consider a PR ?
> >
> > Regards,
> > Andrei.
> >
>

Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table

Posted by Michael Mior <mm...@apache.org>.
Unfortunately I know very little about ES so I'm not in a great position to
assess the impact of these changes. I will say that legacy
compatibility is great, but maintaining two sets of logic is always a
challenge. A few follow-up questions:

1) What's the time horizon for the current adapter no longer working with
these changes to ES?

2) Any guess how complicated it would be to maintain code paths for both
behaviours? I know this is probably really challenging to estimate, but I
really have no idea of the scope of these changes. Would it mean two
different ES adapters?

3) Do we really need compatibility with the current version of the adapter?
IMO this depends on what versions of ES we would lose support for and how
complex it would be for users of the current ES adapter to make updates for
any Calcite API changes.

Thanks for your continued work on the ES adapter Andrei!

--
Michael Mior
mmior@apache.org



On Thu, 28 Jun 2018 at 12:57, Andrei Sereda <an...@sereda.cc> wrote:

> Hello,
>
> Elastic announced
> <
> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html
> >
> that they will be deprecating mapping types in ES6 and indexes will be
> single-typed only.
>
> Historical analogy <https://www.elastic.co/blog/index-vs-type> between
> RDBMS and elastic was that index is equivalent to a database and type
> corresponds to table in that database. In a couple of releases (ES6-8) this
> shall not longer be true.
>
> Recent SQL addition
> <https://www.elastic.co/blog/elasticsearch-6-3-0-released> to elastic
> confirms
> this trend
> <
> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
> >.
> Index is equivalent to a table and there are no more ES types.
>
> I would like to propose to include this logic in Calcite ES adapter. IE,
> expose each ES single-typed index as a separate table inside calcite
> schema. This is in contrast to  current integration where schema can only
> have a single index. Current approach forces you to create multiple schemas
> to query single-typed indexes (on the same ES cluster).
>
> Legacy compatibility can always be controlled with configuration
> parameters.
>
> Do you agree with such changes ? If yes, would you consider a PR ?
>
> Regards,
> Andrei.
>