You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@ignite.apache.org by iceguo <gu...@ffcs.cn> on 2015/10/27 16:29:00 UTC

The result of cross-cache SQL joins is incomplete.

I had started 2 Ignite nodes, and put table "Person" and "Organization" to
these 2 Ignite nodes with all cache mode is ”Partitioned“,

 then I did a cross-cache SQL joins as following:
"select Person.firstName from Person, \"OrganizationCache\".Organization
where Person.orgid = Organization.id and Organization.name = ?".

then I found the result in incomplete, it losted almost half results.

When I use cache mode is "replicated" or I use only one node, the result is
complete.

I want to know how to get the complete result between two PARTITIONED data
sets with "Partitioned" cache mode.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by iceguo <gu...@ffcs.cn>.

Thank you for your reply.
I will try it.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1731.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by iceguo <gu...@ffcs.cn>.

Size of T3 is more than 10 million, size of T2 is about 1 million, and size
of T1 is less than 10 thousand.

They are growing at the same rate.

The cluster will include 10 servers.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1851.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by Dmitriy Setrakyan <ds...@apache.org>.

I also want to add that if you have Star Schema design described by Alexey
Goncharuk in this thread, then you do not need to worry about collocation.

We can make better suggestions, if you describe the data size of your
tables. Also, are they growing at the same rate, or some of them grow
faster than others.

Also, what is the size of the cluster you plan to use for your deployment?

D.

On Tue, Nov 3, 2015 at 2:33 AM, iceguo <gu...@ffcs.cn> wrote:

> Thank you for your kindness.
>
> Could you please help me to collocate the following object?
>
> CREATE TABLE  T1
>  ( LAT_ID        SMALLINT,
>    DEV_ID    VARCHAR(20),
>    DEV_LV    SMALLINT
>  );
>
>   CREATE TABLE  T2
>  ( THE_DATE              DATE,
>    DEV_ID           VARCHAR(20),
>    DEV_LV          SMALLINT
>  );
>
>  CREATE TABLE  T3
>  ( ID         INTEGER           PRIMARY KEY    NOT NULL,
>    DEV_TYPE   VARCHAR(200),
>    DEV_ID     VARCHAR(100)
>  );
>
>  p.s:
> table T1 and T2 have no primary key.
>
> SQL query:
> SELECT t1.* from t1,t2,t3 where t2.DEV_ID = t1.DEV_ID and
>                                              t2.DEV_LV = t1.DEV_LV and
>                                              t1.DEV_ID = t3.DEV_ID;
>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1819.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>

Re: The result of cross-cache SQL joins is incomplete.

Posted by ght230 <gh...@163.com>.

Thank you for your reply!



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1924.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by vkulichenko <va...@gmail.com>.

Hi,

See my answers inline...

-Val


ght230 wrote
> 1.What is wrong with my setting?

If you collocate more than two tables, they need to have the same affinity
key. In your case you collocate UserOrder/OrderGood pair by orderid and
UserOrder/User pair by userid. Since orders are not collocated with users
(and vice versa), you get inconsistent query results. To solve this you can
switch either User or OrderGood cache to replicated mode.

ght230 wrote
> 2.For such crosscache queries, are there common rules(such as how to set
> cache mode, how to set AffinityKey) used for programming, 
> because we have so many queries and some queries are not fixed.If each
> query need to be individually programmed it will be difficult.

This always depends on your data model. The common approach to start with is
to put all relatively small and less frequently updated tables into
replicated caches and then use partitioned caches with proper collocation
for other tables. You should choose one entity by which you want to divide
your data and collocate everything by its ID. For example, if you choose to
collocate by user ID, you will have all data that belongs to some user
(orders, etc.) on the same node where the user resides. In your particular
case it's a bit more complicated because of many-to-many relationship - you
don't know where to store an order because it can belong to different users
that can be stored on different nodes. In this case you have to move more
tables to replicated mode or denormalize data model so that many-to-many
relationships are replaced with one-to-many.

ght230 wrote
> 3.I have learned that in version 1.5, Ignite will support Distributive SQL
> Joins.Does it meant we do not need to care about cache mode and
> AffinityKey?

Yes, but this will potentially cause performance degradation (collocated
queries are always faster than non-collocated). So you should use this only
if collocation is not an option for some reason or if the application is not
performance sensitive.

ght230 wrote
> 4.When will Ignite-1.5 release?

I don't know the exact date. We're in the final stage, but still have
several critical issues to fix. Anyway, I think you can expect it in the
nearest future.

-Val



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1921.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by ght230 <gh...@163.com>.

Hi
Thank you for your reply.
I have tried again accroding to your advises.
I set table "day_report", "good" and "good_type" as Replicated cache mode,
set table "order_good", "user", "user_order" as Partitioned cache mode,
and set AffinityKey
AffinityKey<Integer>(user_order.id, user_order.usr_id)
AffinityKey<Integer>(order_good.id, order_good.order_id)

My SQL query is 
"SELECT o.*, gt.name FROM user_order2 o, order_good2 og, good g, good_type
gt, day_report2 dr, user u
WHERE o.id = og.order_id AND g.id = og.good_id AND g.type = gt.id AND
o.usr_id = u.id
AND dr.report_date = o.create_time"

But the query result is still incomplete.

1.What is wrong with my setting?

2.For such crosscache queries, are there common rules(such as how to set
cache mode, how to set AffinityKey) used for programming, 
because we have so many queries and some queries are not fixed.If each query
need to be individually programmed it will be difficult.

3.I have learned that in version 1.5, Ignite will support Distributive SQL
Joins.Does it meant we do not need to care about cache mode and AffinityKey?

4.When will Ignite-1.5 release?



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1904.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by Andrey Gura <ag...@gridgain.com>.

Hi,

Replicated vs partititoned is desicion that depends on many factors: data
set size, growth rate, reliability, colocation requirements.

I don't know any details about your domain. So I can only make some
assumptions.

1. "good" and "good_type" tables are dictionary tables. Usually tables like
this are not big. So we can represent they as replicated caches. It means
that all rows will be replicated on each cluster node. It leads to usefull
effect: this tables don't require any colocation.

2. "day_report" table isn't big also. If I understand correctly this table
contains one data row per day. So this table can b erepresented as
replicated cache.

3. "user", "user_order" and "order_good" tables can have a lot of data and
growth rate can be big enough. It makes sense to represent this tables as
partitioned cache.

4. If you have partitioned caches that involved into crosscache queries
then you should colocate data from this caches in order to provide correct
joining and good performance. Thus data about user orders should be
colocated with users (i.e. all user order entries should be on the same
nodes where this user entries are placed) and data about goods in orders
should be colocated with orders (i.e. all "order_good" rows for specific
order should be placed on the same nodes where corresponding order entry is
placed).

About your questions:

1. Some tables do not have primary key, should I need to add primary key on
> them first before importing them to Ignite?

Yes, you should. Any entry in cache must have unique key.

2. Which tables should be setting as Replicated cache mode? And which table
> should be setting as partition cache mode?

See my analysis above.

3. Should I need to set any AffinityKey?

Yes, you should. For example, in order to colocate user orders with users
you should use AffinityKey for "user_order" table: AffinityKey<Integer>(
user_order.id, user_order.usr_id).

4. Some join connections are not connected to primary key(such as
> day_report.report_date = user_order.create_time where create_time is not a
> primary key), Does it have any affect?
>

Primary key is a just unique not null key. Join can be executed on any
field, but it would be better if this field is indexed. Probably in your
case make sense to truncate user_order.create_time field to date type
(without info about hours, minutes, etc).

On Sat, Nov 7, 2015 at 5:22 PM, iceguo <gu...@ffcs.cn> wrote:

> Thank you for your reply, but I am still not very clear about the rules of
> setting a table for Replicated cache mode or Partition cache mode.
> I have another a little more complicated example, could you help to
> analysis
> it?
>
> In this example, I have 6 tables
>
> /*Table structure for table `day_report` */
> CREATE TABLE `day_report` (
>   `report_date` date DEFAULT NULL,
>   `money` decimal(10,0) DEFAULT NULL,
>   `people` int(11) DEFAULT NULL
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
>
> /*Table structure for table `good` */
>
> CREATE TABLE `good` (
>   `id` int(11) NOT NULL,
>   `name` varchar(200) DEFAULT NULL,
>   `type` int(11) DEFAULT NULL,
>   `money` decimal(10,0) DEFAULT NULL,
>   PRIMARY KEY (`id`)
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
>
> /*Table structure for table `good_type` */
>
> CREATE TABLE `good_type` (
>   `id` int(11) NOT NULL,
>   `name` varchar(200) DEFAULT NULL,
>   `pid` int(11) DEFAULT NULL,
>   PRIMARY KEY (`id`)
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
>
> /*Table structure for table `order_good` */
>
> CREATE TABLE `order_good` (
>   `order_id` int(11) DEFAULT NULL,
>   `good_id` int(11) DEFAULT NULL
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
>
> /*Table structure for table `user` */
>
> CREATE TABLE `user` (
>   `id` int(11) NOT NULL,
>   `name` varchar(50) DEFAULT NULL,
>   PRIMARY KEY (`id`)
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
>
> /*Table structure for table `user_order` */
>
> CREATE TABLE `user_order` (
>   `id` int(11) NOT NULL AUTO_INCREMENT,
>   `usr_id` int(11) DEFAULT NULL,
>   `create_time` date DEFAULT NULL,
>   `money` decimal(10,0) DEFAULT NULL,
>   KEY `id` (`id`)
> ) ENGINE=InnoDB AUTO_INCREMENT=500000 DEFAULT CHARSET=utf8;
>
> The join connections of them as following:
> day_report.report_date = user_order.create_time
> user_order.type         = good_type.pid
> user_order.id = order_good.order_id
> good.id = order_good.good_id
>
> My questions are:
> 1. Some tables do not have primary key, should I need to add primary key on
> them first before importing them to Ignite?
> 2. Which tables should be setting as Replicated cache mode? And which table
> should be setting as partition cache mode?
> 3. Should I need to set any AffinityKey?
> 4. Some join connections are not connected to primary key(such as
> day_report.report_date = user_order.create_time where create_time is not a
> primary key), Does it have any affect?
>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1880.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>

-- 
Andrey Gura
GridGain Systems, Inc.
www.gridgain.com

Re: The result of cross-cache SQL joins is incomplete.

Posted by iceguo <gu...@ffcs.cn>.

Thank you for your reply, but I am still not very clear about the rules of
setting a table for Replicated cache mode or Partition cache mode.
I have another a little more complicated example, could you help to analysis
it?

In this example, I have 6 tables 

/*Table structure for table `day_report` */
CREATE TABLE `day_report` (
  `report_date` date DEFAULT NULL,
  `money` decimal(10,0) DEFAULT NULL,
  `people` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

/*Table structure for table `good` */

CREATE TABLE `good` (
  `id` int(11) NOT NULL,
  `name` varchar(200) DEFAULT NULL,
  `type` int(11) DEFAULT NULL,
  `money` decimal(10,0) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

/*Table structure for table `good_type` */

CREATE TABLE `good_type` (
  `id` int(11) NOT NULL,
  `name` varchar(200) DEFAULT NULL,
  `pid` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

/*Table structure for table `order_good` */

CREATE TABLE `order_good` (
  `order_id` int(11) DEFAULT NULL,
  `good_id` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

/*Table structure for table `user` */

CREATE TABLE `user` (
  `id` int(11) NOT NULL,
  `name` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

/*Table structure for table `user_order` */

CREATE TABLE `user_order` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `usr_id` int(11) DEFAULT NULL,
  `create_time` date DEFAULT NULL,
  `money` decimal(10,0) DEFAULT NULL,
  KEY `id` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=500000 DEFAULT CHARSET=utf8;

The join connections of them as following:
day_report.report_date = user_order.create_time
user_order.type 	= good_type.pid
user_order.id = order_good.order_id
good.id = order_good.good_id

My questions are:
1. Some tables do not have primary key, should I need to add primary key on
them first before importing them to Ignite?
2. Which tables should be setting as Replicated cache mode? And which table
should be setting as partition cache mode?
3. Should I need to set any AffinityKey?
4. Some join connections are not connected to primary key(such as
day_report.report_date = user_order.create_time where create_time is not a
primary key), Does it have any affect?



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1880.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by Dmitriy Setrakyan <ds...@apache.org>.

On Thu, Nov 5, 2015 at 5:42 PM, iceguo <gu...@ffcs.cn> wrote:

> I also need to confirm 2 point.
>
> 1. If I choose T2 for partition mode, T1 and T3 for replication mode, I do
> not need to set any AffinityKey.
> Is it right?
>

Yes, correct.

>
> 2. If the size of T1 is too big, it is not proper to set it for replication
> mode. How can I do?
>

If the T1 data does not change a lot, and you have enough memory on each
node to hold the full T1 data set, it is OK to replicate it.

If none of the suggestions work for you, you can try executing your query
in 2 steps, in which case the result of the 1st query will server as a
parameter for the 2nd one.

>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1862.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>

Re: The result of cross-cache SQL joins is incomplete.

Posted by iceguo <gu...@ffcs.cn>.

I also need to confirm 2 point.

1. If I choose T2 for partition mode, T1 and T3 for replication mode, I do
not need to set any AffinityKey.
Is it right?

2. If the size of T1 is too big, it is not proper to set it for replication
mode. How can I do?



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1862.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by Dmitriy Setrakyan <ds...@apache.org>.

On Thu, Nov 5, 2015 at 5:16 PM, iceguo <gu...@ffcs.cn> wrote:

> In my example, only T1 is both connected to the other two tables(T2 and
> T3).
>
> I think the situations of T2 and T3 are equal, if T3 can be partition, T2
> can also be partition.

Am I right?
>

I think Yes. In this case, T1 can be Replicated and T2, T3 can be
Partitioned with DEV_ID as a collocation key.


>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1860.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>

Re: The result of cross-cache SQL joins is incomplete.

Posted by iceguo <gu...@ffcs.cn>.

In my example, only T1 is both connected to the other two tables(T2 and T3).

I think the situations of T2 and T3 are equal, if T3 can be partition, T2
can also be partition.

Am I right?



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1860.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by Andrey Gura <ag...@gridgain.com>.

>
> I still want to know why you choose T3 for partition mode, why not T2 or
> T1,
> is there any rules?


Because all T1 and T2 data will be available on the any node in the cluster
(REPLICATED) T3 cache can be distributed between cluster nodes
(PARTITIONED) and provide required colocation.

On Thu, Nov 5, 2015 at 3:22 PM, iceguo <gu...@ffcs.cn> wrote:

> Sorry for reply late.
>
> Table T1 and T2 have no primary key originally.
>
> Before putting entries into cache, I manually added to the table a column
> "ID" for primary key.
>
> Then I use ignite-schema-import to generate Java domain model POJOs.
>
> I still want to know why you choose T3 for partition mode, why not T2 or
> T1,
> is there any rules?
>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1850.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>



-- 
Andrey Gura
GridGain Systems, Inc.
www.gridgain.com

Re: The result of cross-cache SQL joins is incomplete.

Posted by Denis Magda <dm...@gridgain.com>.

> I still want to know why you choose T3 for partition mode, why not T2 or
T1, is there any rules?

It was just Andrey's assumption that both T2 and T1 suite well for the
replicated mode and T3 - for the partitioned.

As Andrey and Dmitriy stated above we need to know more info about all these
tables. It will help to decide whether it's possible or not to use any cache
in the replicated mode.

Please provide us with answers on Dmitriy's and Andrey's questions above
first.

--
Denis



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1887.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by iceguo <gu...@ffcs.cn>.

Sorry for reply late.

Table T1 and T2 have no primary key originally. 

Before putting entries into cache, I manually added to the table a column
"ID" for primary key.

Then I use ignite-schema-import to generate Java domain model POJOs.

I still want to know why you choose T3 for partition mode, why not T2 or T1,
is there any rules?



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1850.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by Andrey Gura <ag...@gridgain.com>.

If I understand correctly T1 and T2 can have repeated rows. But cache
always identifies entries by some key. What is key for this tables in your
case? How do you put entries into cache?

The other question is how DEV_ID and DEV_LV columns

It seems that in current data scheme there is only one way to colocate data
correctly:

  - T1 replicated cache
  - T2 replicated cache
  - T3 partitioned cache

There is no need any affinity keys. But if T1 and T2 are large datasets
then such colocation will be ineffective.

Another way to colocate datasets like this it have compound key that
contains all table columns. But it lead to ineffective memory consumption
because all value data is duplicated in the key. See code example here:
https://gist.github.com/agura/8a2ce3d028d4b69c9a07

On Tue, Nov 3, 2015 at 1:33 PM, iceguo <gu...@ffcs.cn> wrote:

> Thank you for your kindness.
>
> Could you please help me to collocate the following object?
>
> CREATE TABLE  T1
>  ( LAT_ID        SMALLINT,
>    DEV_ID    VARCHAR(20),
>    DEV_LV    SMALLINT
>  );
>
>   CREATE TABLE  T2
>  ( THE_DATE              DATE,
>    DEV_ID           VARCHAR(20),
>    DEV_LV          SMALLINT
>  );
>
>  CREATE TABLE  T3
>  ( ID         INTEGER           PRIMARY KEY    NOT NULL,
>    DEV_TYPE   VARCHAR(200),
>    DEV_ID     VARCHAR(100)
>  );
>
>  p.s:
> table T1 and T2 have no primary key.
>
> SQL query:
> SELECT t1.* from t1,t2,t3 where t2.DEV_ID = t1.DEV_ID and
>                                              t2.DEV_LV = t1.DEV_LV and
>                                              t1.DEV_ID = t3.DEV_ID;
>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1819.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>

-- 
Andrey Gura
GridGain Systems, Inc.
www.gridgain.com

Re: The result of cross-cache SQL joins is incomplete.

Posted by iceguo <gu...@ffcs.cn>.

Thank you for your kindness.

Could you please help me to collocate the following object?

CREATE TABLE  T1 
 ( LAT_ID        SMALLINT,
   DEV_ID    VARCHAR(20),
   DEV_LV    SMALLINT
 );
  
  CREATE TABLE  T2
 ( THE_DATE              DATE,
   DEV_ID           VARCHAR(20),
   DEV_LV          SMALLINT
 );
 
 CREATE TABLE  T3 
 ( ID         INTEGER     	PRIMARY KEY    NOT NULL,
   DEV_TYPE   VARCHAR(200),
   DEV_ID     VARCHAR(100)
 );

 p.s:
table T1 and T2 have no primary key.

SQL query:
SELECT t1.* from t1,t2,t3 where t2.DEV_ID = t1.DEV_ID and 
					     t2.DEV_LV = t1.DEV_LV and 
					     t1.DEV_ID = t3.DEV_ID;



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1819.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by Denis Magda <dm...@gridgain.com>.

> I compared the SQL querys between yours and mine, and I found something
different. 
> For your SQL query "emp.depid = dep.id", dep.id is primary key of table
> Department. 
> But for my SQL query "A.mid = B.mid", B.mid is NOT primary key of table B. 

Actually, I've provided you with the example that shows how you can
collocate entries that belong to 3 different tables.

After the collocation has been done properly you can freely and safely use
any kind of parameters in your SQL queries (like A.mid = B.mid).

So you should properly collocate A and B before. Just as an example. 
A a = new A(aPrimaryKey);
B b = new B(bPrimaryKey);

cacheA.put(a.aPrimaryKey, a);

cacheB.put(new AffinityKey(a.primaryKey, b.primaryKey), b);

As you see above when we put the 'b' object to the cache we use a special
kind of key - AffinityKey. It will guarantee that this 'b' object will be
placed on the same node where the object 'a' with a.primaryKey is located.
After that all the queries should work fine for you.

If you still have difficulties with this please provide us with exact
objects and the list of their keys. We will try to help you to collocate
them properly.

 




--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1807.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by iceguo <gu...@ffcs.cn>.

Thank you for your reply.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1818.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by Denis Magda <dm...@gridgain.com>.

Alex, thanks for providing the correct example! Shame on me, didn't pay much
attention while was providing my version ;)



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1811.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by Alexey Kuznetsov <ak...@gridgain.com>.

iceguo and Deniss

I think Denis provided slightly incorrect code snippet to work with async.

One should use varable to work with async like this:
IgniteCache asyncCache = cache.withAsync();
asyncCache.put(...);
....
asyncCache.future().get();

Denis, take a look.


On Mon, Nov 2, 2015 at 4:40 PM, Denis Magda <dm...@gridgain.com> wrote:

> > There is another question, when I put table data into REPLICATED cache,
> This operation is synchronous or asynchronous?
>
> cache.put is synchronous. However, you can make behave him asynchronously
> this way: cache.withAsync().put(...)
>
> > If it is asynchronous, how can I know when this operation has finished?
>
> cache.withAsync().put(...);
>
> // blocks until the put above is finished
> cache.future().get();
>
>
> In addition I would like to let you know that you always can change the way
> data is synchronized during put or other modification operations using
> CacheWriteSynchronizationMode that can be changed for a CacheConfiguration.
>
> Regards,
> Denis
>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1808.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>



-- 
Alexey Kuznetsov
GridGain Systems
www.gridgain.com

Re: The result of cross-cache SQL joins is incomplete.

Posted by Denis Magda <dm...@gridgain.com>.

> There is another question, when I put table data into REPLICATED cache,
This operation is synchronous or asynchronous? 

cache.put is synchronous. However, you can make behave him asynchronously
this way: cache.withAsync().put(...)

> If it is asynchronous, how can I know when this operation has finished? 

cache.withAsync().put(...);

// blocks until the put above is finished
cache.future().get();


In addition I would like to let you know that you always can change the way
data is synchronized during put or other modification operations using
CacheWriteSynchronizationMode that can be changed for a CacheConfiguration.

Regards,
Denis



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1808.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by iceguo <gu...@ffcs.cn>.

Thank you for your reply, I think it is reasonable.

There is another question, when I put table data into REPLICATED cache, This
operation is synchronous or asynchronous?

If it is asynchronous, how can I know when this operation has finished?

Or can I change the operation from asynchronous to synchronous ?



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1802.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by Alexey Goncharuk <al...@gmail.com>.

In cases when there are more than 2 tables joined together it is not always
possible to achieve proper colocation. However, in vast majority of such
cases the tables being joined comply with the Star-schema concept, i.e.
some of the tables have significantly smaller number of entries. In this
case you can simply put this data into REPLICATED cache and keep in
PARTITIONED cache only entities with the large number of entries. In this
case you do not need to worry about colocation between PARTITIONED and
REPLICATED caches as entries in REPLICATED cache are present on all nodes.

There is an example in Ignite demonstrating this approach, you can take a
look
at org.apache.ignite.examples.datagrid.starschema.CacheStarSchemaExample

Hope this helps.

2015-10-31 15:14 GMT+03:00 iceguo <gu...@ffcs.cn>:

> I have tried according to your advises, but it failed, the result is
> incomplete.
>
> I compared the SQL querys between yours and mine, and I found something
> different.
>
> For your SQL query "emp.depid = dep.id", dep.id is primary key of table
> Department.
>
> But for my SQL query "A.mid = B.mid", B.mid is NOT primary key of table B.
>
> I want to know whether this is the reason of the problem.
>
> If this is the reason, and I can not change the relation of the tables in
> my
> SQL query, how do I deal with it?
>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1795.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>

Re: The result of cross-cache SQL joins is incomplete.

Posted by iceguo <gu...@ffcs.cn>.

I have tried according to your advises, but it failed, the result is
incomplete.

I compared the SQL querys between yours and mine, and I found something
different.

For your SQL query "emp.depid = dep.id", dep.id is primary key of table
Department.

But for my SQL query "A.mid = B.mid", B.mid is NOT primary key of table B.

I want to know whether this is the reason of the problem.

If this is the reason, and I can not change the relation of the tables in my
SQL query, how do I deal with it?



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1795.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by Denis Magda <dm...@gridgain.com>.

To properly build affinity keys you need to follow parent-child relationship
between A, B and C.

Let's consider that 
A - is a Department
B - is an Employee
C - is Salary

And we want to store all employees data that belong to a particular
department on the same node (collocate departments with their employees).
Also we need to collocate employees with their salary information.

To fulfill this task we just need to put employees and their salaries on the
same node where an employee's department is located.

Example,

Department dep = new Department();
dep.setId(10);

departmentsCache.put(dep.getId(), dep);

Employee emp = new Employee();
emp.setId(200334);
emp.setDepId(dep.getId());

// Employee will be stored on the same node where a department with
emp.getDepId() is located
employeesCache.put(new AffinityKey(emp.getDepId(), emp.getId());
  

Salary sal = new Salary();
sal.setId(47382947);
sal.setEmpId(emp.getId());

// Salary will be stored on the same node where a department with
emp.getDepId() is located
salaryCache.put(new AffinityKey(emp.getDepId(), sal.getId());


Hope this example helps you to understand how to properly build AffinityKeys
and you'll be able to do the same for your A, B and C object.

Don't hesitate to ask additional questions if any.

Regards,
Denis



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1782.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by iceguo <gu...@ffcs.cn>.

Thank you for your reply.

Now I have a more complicated example, it include 3 tables(A, B, C) to do
the following join conditions:
A.mid = B.mid and A.nid = B.nid and A.kid = C.kid.

I have tried to setting "A.key = new AffinityKey<>(A.id, A.mid)", but It
seems not enough.

I want to know how to colocate A, B and C in this case.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1778.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by iceguo <gu...@ffcs.cn>.

Thank you for your advises.

You have mentioned compound affinity keys in your reply, I want to know how
to setting compound affinity keys? Could you please give me an example?



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1794.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by Dmitriy Setrakyan <ds...@apache.org>.

On Wed, Oct 28, 2015 at 7:48 PM, iceguo <gu...@ffcs.cn> wrote:

> In the above example, there is only 1 join condition in the SQL,
> ”Person.orgid = Organization.id “,
> we can collocate Persons with their organizations by setting
> “Person.key = new AffinityKey<>(Person.id, Person.orgid)".
>
> But if my join SQL including more than 1 join conditions. such as:
> "select Person.firstName from Person, \"OrganizationCache\".Organization
> where Person.orgid = Organization.id and Person.firstName =
> Organization.name”.
>

I think in the example you are providing it should be always enough to
colocate on one field, say “orgId”, and the 2nd join criteria should
automatically work in colocated fashion.

It is also possible to have compound affinity keys (keys that are based on
more than one field), if it makes sense for you. For example, you can have
a CompoundKey class with ID and NAME fields, and then have both, Person and
Organization colocate based on this class.

Hope this helps.

>
> How to collocate Persons with their organizations?
>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1754.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>

Re: The result of cross-cache SQL joins is incomplete.

Posted by iceguo <gu...@ffcs.cn>.

In the above example, there is only 1 join condition in the SQL,
”Person.orgid = Organization.id “, 
we can collocate Persons with their organizations by setting 
“Person.key = new AffinityKey<>(Person.id, Person.orgid)".

But if my join SQL including more than 1 join conditions. such as:
"select Person.firstName from Person, \"OrganizationCache\".Organization
where 
Person.orgid = Organization.id and Person.firstName = Organization.name".

How to collocate Persons with their organizations? 



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1754.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: The result of cross-cache SQL joins is incomplete.

Posted by Sergi Vladykin <se...@gmail.com>.

You have to collocate Persons with their organizations, so that all
the Persons for any given Organization reside on the same node
with that Organization.

Read more here: https://apacheignite.readme.io/docs/affinity-collocation

Sergi


2015-10-27 18:29 GMT+03:00 iceguo <gu...@ffcs.cn>:

> I had started 2 Ignite nodes, and put table "Person" and "Organization" to
> these 2 Ignite nodes with all cache mode is ”Partitioned“,
>
>  then I did a cross-cache SQL joins as following:
> "select Person.firstName from Person, \"OrganizationCache\".Organization
> where Person.orgid = Organization.id and Organization.name = ?".
>
> then I found the result in incomplete, it losted almost half results.
>
> When I use cache mode is "replicated" or I use only one node, the result is
> complete.
>
> I want to know how to get the complete result between two PARTITIONED data
> sets with "Partitioned" cache mode.
>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>