You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@ignite.apache.org by Shiva Kumar <sh...@gmail.com> on 2019/09/19 13:17:43 UTC

distributed sql join not working as mentioned in documentation

Hi all,
I am trying to do a simple cross join on two tables with non-collocated
data (without affinity key),
This non-collocated distributed join always fails with the error message:

*"java.sql.SQLException: javax.cache.CacheException: Failed to prepare
distributed join query: join condition does not use index "*

If I create one of the tables in replicated mode and another one in
partitioned mode this Join operation works but documentation mentions that
Ignite supports non-collocated joins without any condition.
And we tried with 3 tables and 1 in replicated and other 2 in partitioned
then we observed that it failed.
we are running the Join operations with *distributedJoins=true.*
*We observed that if there are N tables in Join operation then (N-1) should
be in replicated mode, is our understanding right?*
*If our understanding is correct then to do Join operation the dimensioning
of cluster increases by many folds which can't be used in a production
environment.*
*To reproduce:*
*Ignite with 4 node cluster with native persistence enabled.*
*create the following tables*

CREATE TABLE City (

  id LONG PRIMARY KEY, name VARCHAR)

  WITH "backup=1";

CREATE TABLE Person (

  id LONG, name VARCHAR, city_id LONG, PRIMARY KEY (id, city_id))

  WITH "backups=1";

CREATE INDEX idx_city_name ON City (name);

CREATE INDEX idx_person_name ON Person (name);


INSERT INTO City (id, name) VALUES (1, 'Forest Hill');

INSERT INTO City (id, name) VALUES (2, 'Denver');

INSERT INTO City (id, name) VALUES (3, 'St. Petersburg');

INSERT INTO Person (id, name, city_id) VALUES (1, 'John Doe', 3);

INSERT INTO Person (id, name, city_id) VALUES (2, 'Jane Roe', 2);

INSERT INTO Person (id, name, city_id) VALUES (3, 'Mary Major', 1);

INSERT INTO Person (id, name, city_id) VALUES (4, 'Richard Miles', 2);


Query to be run:

select * from City c, Person p;

or
*SELECT* * *FROM* City *AS* c *CROSS* *join* Person *AS* p;

Re: distributed sql join not working as mentioned in documentation

Posted by Ilya Kasnacheev <il...@gmail.com>.

Hello!

This is an interesting situation to consider.

Have you tried *SELECT* * *FROM* City *AS* c *join* (SELECT * FROM Person)
*AS* p; ?

Otherwise, I don't think we support joins without conditions, unless most
of tables are replicated.

Regards,
-- 
Ilya Kasnacheev


чт, 26 сент. 2019 г. в 13:40, Shiva Kumar <sh...@gmail.com>:

> Hi Evgenii,
> Even with *where condition*, I am getting the same error.
> I have some use case where I can't collocate tables data, as Ignite doc
> says non-collocated distributed join or cross join is supported in Ignite I
> am trying to use that but getting this exception when I create tables in
> replicated mode.
> I have filed a bug  https://issues.apache.org/jira/browse/IGNITE-12201
>
> regards,
> shiva
>
> On Mon, Sep 23, 2019 at 3:57 PM Evgenii Zhuravlev <
> e.zhuravlev.wk@gmail.com> wrote:
>
>> Hi,
>>
>> To make work this query, you can add one where clause or join condition
>> in the query, for example: where c.id = city_id;. I don't really
>> understand why do you want to run a fully distributed cross join on these
>> tables - it doesn't make sense, moreover, it will lead to the a lot of data
>> movement between nodes.
>>
>> What are you trying to achieve?
>>
>> Best Regards,
>> Evgenii
>>
>> чт, 19 сент. 2019 г. в 16:18, Shiva Kumar <sh...@gmail.com>:
>>
>>> Hi all,
>>> I am trying to do a simple cross join on two tables with non-collocated
>>> data (without affinity key),
>>> This non-collocated distributed join always fails with the error message:
>>>
>>> *"java.sql.SQLException: javax.cache.CacheException: Failed to prepare
>>> distributed join query: join condition does not use index "*
>>>
>>> If I create one of the tables in replicated mode and another one in
>>> partitioned mode this Join operation works but documentation mentions that
>>> Ignite supports non-collocated joins without any condition.
>>> And we tried with 3 tables and 1 in replicated and other 2 in
>>> partitioned then we observed that it failed.
>>> we are running the Join operations with *distributedJoins=true.*
>>> *We observed that if there are N tables in Join operation then (N-1)
>>> should be in replicated mode, is our understanding right?*
>>> *If our understanding is correct then to do Join operation the
>>> dimensioning of cluster increases by many folds which can't be used in a
>>> production environment.*
>>> *To reproduce:*
>>> *Ignite with 4 node cluster with native persistence enabled.*
>>> *create the following tables*
>>>
>>> CREATE TABLE City (
>>>
>>>   id LONG PRIMARY KEY, name VARCHAR)
>>>
>>>   WITH "backup=1";
>>>
>>> CREATE TABLE Person (
>>>
>>>   id LONG, name VARCHAR, city_id LONG, PRIMARY KEY (id, city_id))
>>>
>>>   WITH "backups=1";
>>>
>>> CREATE INDEX idx_city_name ON City (name);
>>>
>>> CREATE INDEX idx_person_name ON Person (name);
>>>
>>>
>>> INSERT INTO City (id, name) VALUES (1, 'Forest Hill');
>>>
>>> INSERT INTO City (id, name) VALUES (2, 'Denver');
>>>
>>> INSERT INTO City (id, name) VALUES (3, 'St. Petersburg');
>>>
>>> INSERT INTO Person (id, name, city_id) VALUES (1, 'John Doe', 3);
>>>
>>> INSERT INTO Person (id, name, city_id) VALUES (2, 'Jane Roe', 2);
>>>
>>> INSERT INTO Person (id, name, city_id) VALUES (3, 'Mary Major', 1);
>>>
>>> INSERT INTO Person (id, name, city_id) VALUES (4, 'Richard Miles', 2);
>>>
>>>
>>> Query to be run:
>>>
>>> select * from City c, Person p;
>>>
>>> or
>>> *SELECT* * *FROM* City *AS* c *CROSS* *join* Person *AS* p;
>>>
>>>
>>>
>>>

Re: distributed sql join not working as mentioned in documentation

Posted by Shiva Kumar <sh...@gmail.com>.

Hi Evgenii,
Even with *where condition*, I am getting the same error.
I have some use case where I can't collocate tables data, as Ignite doc
says non-collocated distributed join or cross join is supported in Ignite I
am trying to use that but getting this exception when I create tables in
replicated mode.
I have filed a bug  https://issues.apache.org/jira/browse/IGNITE-12201

regards,
shiva

On Mon, Sep 23, 2019 at 3:57 PM Evgenii Zhuravlev <e....@gmail.com>
wrote:

> Hi,
>
> To make work this query, you can add one where clause or join condition in
> the query, for example: where c.id = city_id;. I don't really understand
> why do you want to run a fully distributed cross join on these tables - it
> doesn't make sense, moreover, it will lead to the a lot of data movement
> between nodes.
>
> What are you trying to achieve?
>
> Best Regards,
> Evgenii
>
> чт, 19 сент. 2019 г. в 16:18, Shiva Kumar <sh...@gmail.com>:
>
>> Hi all,
>> I am trying to do a simple cross join on two tables with non-collocated
>> data (without affinity key),
>> This non-collocated distributed join always fails with the error message:
>>
>> *"java.sql.SQLException: javax.cache.CacheException: Failed to prepare
>> distributed join query: join condition does not use index "*
>>
>> If I create one of the tables in replicated mode and another one in
>> partitioned mode this Join operation works but documentation mentions that
>> Ignite supports non-collocated joins without any condition.
>> And we tried with 3 tables and 1 in replicated and other 2 in partitioned
>> then we observed that it failed.
>> we are running the Join operations with *distributedJoins=true.*
>> *We observed that if there are N tables in Join operation then (N-1)
>> should be in replicated mode, is our understanding right?*
>> *If our understanding is correct then to do Join operation the
>> dimensioning of cluster increases by many folds which can't be used in a
>> production environment.*
>> *To reproduce:*
>> *Ignite with 4 node cluster with native persistence enabled.*
>> *create the following tables*
>>
>> CREATE TABLE City (
>>
>>   id LONG PRIMARY KEY, name VARCHAR)
>>
>>   WITH "backup=1";
>>
>> CREATE TABLE Person (
>>
>>   id LONG, name VARCHAR, city_id LONG, PRIMARY KEY (id, city_id))
>>
>>   WITH "backups=1";
>>
>> CREATE INDEX idx_city_name ON City (name);
>>
>> CREATE INDEX idx_person_name ON Person (name);
>>
>>
>> INSERT INTO City (id, name) VALUES (1, 'Forest Hill');
>>
>> INSERT INTO City (id, name) VALUES (2, 'Denver');
>>
>> INSERT INTO City (id, name) VALUES (3, 'St. Petersburg');
>>
>> INSERT INTO Person (id, name, city_id) VALUES (1, 'John Doe', 3);
>>
>> INSERT INTO Person (id, name, city_id) VALUES (2, 'Jane Roe', 2);
>>
>> INSERT INTO Person (id, name, city_id) VALUES (3, 'Mary Major', 1);
>>
>> INSERT INTO Person (id, name, city_id) VALUES (4, 'Richard Miles', 2);
>>
>>
>> Query to be run:
>>
>> select * from City c, Person p;
>>
>> or
>> *SELECT* * *FROM* City *AS* c *CROSS* *join* Person *AS* p;
>>
>>
>>
>>

Re: distributed sql join not working as mentioned in documentation

Posted by Evgenii Zhuravlev <e....@gmail.com>.

Hi,

To make work this query, you can add one where clause or join condition in
the query, for example: where c.id = city_id;. I don't really understand
why do you want to run a fully distributed cross join on these tables - it
doesn't make sense, moreover, it will lead to the a lot of data movement
between nodes.

What are you trying to achieve?

Best Regards,
Evgenii

чт, 19 сент. 2019 г. в 16:18, Shiva Kumar <sh...@gmail.com>:

> Hi all,
> I am trying to do a simple cross join on two tables with non-collocated
> data (without affinity key),
> This non-collocated distributed join always fails with the error message:
>
> *"java.sql.SQLException: javax.cache.CacheException: Failed to prepare
> distributed join query: join condition does not use index "*
>
> If I create one of the tables in replicated mode and another one in
> partitioned mode this Join operation works but documentation mentions that
> Ignite supports non-collocated joins without any condition.
> And we tried with 3 tables and 1 in replicated and other 2 in partitioned
> then we observed that it failed.
> we are running the Join operations with *distributedJoins=true.*
> *We observed that if there are N tables in Join operation then (N-1)
> should be in replicated mode, is our understanding right?*
> *If our understanding is correct then to do Join operation the
> dimensioning of cluster increases by many folds which can't be used in a
> production environment.*
> *To reproduce:*
> *Ignite with 4 node cluster with native persistence enabled.*
> *create the following tables*
>
> CREATE TABLE City (
>
>   id LONG PRIMARY KEY, name VARCHAR)
>
>   WITH "backup=1";
>
> CREATE TABLE Person (
>
>   id LONG, name VARCHAR, city_id LONG, PRIMARY KEY (id, city_id))
>
>   WITH "backups=1";
>
> CREATE INDEX idx_city_name ON City (name);
>
> CREATE INDEX idx_person_name ON Person (name);
>
>
> INSERT INTO City (id, name) VALUES (1, 'Forest Hill');
>
> INSERT INTO City (id, name) VALUES (2, 'Denver');
>
> INSERT INTO City (id, name) VALUES (3, 'St. Petersburg');
>
> INSERT INTO Person (id, name, city_id) VALUES (1, 'John Doe', 3);
>
> INSERT INTO Person (id, name, city_id) VALUES (2, 'Jane Roe', 2);
>
> INSERT INTO Person (id, name, city_id) VALUES (3, 'Mary Major', 1);
>
> INSERT INTO Person (id, name, city_id) VALUES (4, 'Richard Miles', 2);
>
>
> Query to be run:
>
> select * from City c, Person p;
>
> or
> *SELECT* * *FROM* City *AS* c *CROSS* *join* Person *AS* p;
>
>
>
>