You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@ignite.apache.org by Stéphane Gayet <St...@misterfly.com> on 2018/06/03 08:47:08 UTC

Affinity colocation and sql queries

Hi All,


I'm not sure to have well-understood Ignite affinity colocation. I have several questions.


1/ Does sql queries benefit from affinity colocation?


2/ The use case is an sql query like :

SELECT SUBSET1.KEY AS SET1_KEY, SUBSET2.KEY AS SET2_KEY

FROM WHOLESET AS SUBSET1, WHOLESET AS SUBSET2

WHERE SUBSET1.KEY = "key1" AND SUBSET2.KEY = "key2" AND (some other conditions)


I know that items with keys "key1" and "key2" will be crossed together almost all the time and I plan to define an affinity key so:

- items with "key1" and "key2" => affinity key 1

- items with "key3" and "key4" => another affinity key

and so on...


Does this work and will speed up the request?


3/ If I define an affinity key on an object, should the field marked also as [SqlQueryField] although it will not be part of the sql query? If yes, should it be indexed?

4/ Or, this doesn't work but it will if I split each subset in different caches and define affinity key as above.
cache1 = subset items with key "key1"
cache2 = subset items with key "key2"
...


5/ Or, nope, there is no way to achieve that.


Regards,

Stéphane Gayet

RE: Affinity colocation and sql queries

Posted by Stéphane Gayet <St...@misterfly.com>.

Thanks Ilya, Denis.


Regards,

________________________________
De : ilya.kasnacheev <il...@gmail.com>
Envoyé : mardi 5 juin 2018 18:15:45
À : user@ignite.apache.org
Objet : Re: Affinity colocation and sql queries

Hello!

1) Yes!

2) "almost all the time" is not a very strong constraint here. You might
need to enable distributed joins to get such (partially-collocated) queries
to perform. Your query might still benefit from collocation if query planner
realizes that all keys are located on the same node.

3) No, and no.

4) Having excessive number of caches has performance implications and
unlikely to lead to performance benefits.

Regards,



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Affinity colocation and sql queries

Posted by "ilya.kasnacheev" <il...@gmail.com>.

Hello!

1) Yes!

2) "almost all the time" is not a very strong constraint here. You might
need to enable distributed joins to get such (partially-collocated) queries
to perform. Your query might still benefit from collocation if query planner
realizes that all keys are located on the same node.

3) No, and no.

4) Having excessive number of caches has performance implications and
unlikely to lead to performance benefits.

Regards,



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Affinity colocation and sql queries

Posted by Denis Magda <dm...@apache.org>.

Hi Stephane,

Affinity collocation and affinity keys are crucial concepts/APIs to make
SQL JOINs work really fast. Similar concepts are adopted in other
distributed databases like Google Spanner, MemSQL, VoltDB, etc.

This page should shed more light on your question:
https://apacheignite-sql.readme.io/docs/distributed-joins#collocated-joins

This recording might be useful as well:
https://www.gridgain.com/resources/webinars/in-memory-computing-essentials-architects-and-developers-part-2

You can define the affinity key using CREATE TABLE command or
[SqlQueryField] annotation depending on a way you configure caches/tables.

--
Denis

On Sun, Jun 3, 2018 at 1:47 AM, Stéphane Gayet <Stephane.Gayet@misterfly.com
> wrote:

> Hi All,
>
>
> I'm not sure to have well-understood Ignite affinity colocation. I have
> several questions.
>
>
> 1/ Does sql queries benefit from affinity colocation?
>
>
> 2/ The use case is an sql query like :
>
> SELECT SUBSET1.KEY AS SET1_KEY, SUBSET2.KEY AS SET2_KEY
>
> FROM WHOLESET AS SUBSET1, WHOLESET AS SUBSET2
>
> WHERE SUBSET1.KEY = "key1" AND SUBSET2.KEY = "key2" AND (some other
> conditions)
>
>
> I know that items with keys "key1" and "key2" will be crossed together
> almost all the time and I plan to define an affinity key so:
>
> - items with "key1" and "key2" => affinity key 1
>
> - items with "key3" and "key4" => another affinity key
>
> and so on...
>
>
> Does this work and will speed up the request?
>
>
> 3/ If I define an affinity key on an object, should the field marked also
> as [SqlQueryField] although it will not be part of the sql query? If yes,
> should it be indexed?
>
> 4/ Or, this doesn't work but it will if I split each subset in different
> caches and define affinity key as above.
>
> cache1 = subset items with key "key1"
> cache2 = subset items with key "key2"
> ...
>
>
> 5/ Or, nope, there is no way to achieve that.
>
>
> Regards,
>
> Stéphane Gayet
>
>
>