Posted to user@ignite.apache.org by ght230 <gh...@163.com> on 2016/10/25 02:57:48 UTC

How to avoid data skew in collocate data with data

I loaded my data onto 3 nodes using data-with-data collocation and found
that it is not evenly distributed across the three nodes.
Node A stores 30 entries, Node B stores 30 entries, and Node C stores 900
entries.

The example data is as follows:
Node  AffinityKey  Count
A     530          9
A     531          10
A     532          11
B     540          12
B     541          14
B     542          4
C     550          100
C     551          200
C     552          50
C     553          400
C     554          150

How can I distribute the data evenly across the three nodes?




--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/How-to-avoid-data-skew-in-collocate-data-with-data-tp8454.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: How to avoid data skew in collocate data with data

Posted by Vladislav Pyatkov <vl...@gmail.com>.
Hi,

No, it will not help, because the affinity function has no information about
the volume of data in each partition.
The purpose of the function (RendezvousAffinityFunction or
FairAffinityFunction) is to distribute the number of partitions evenly
between nodes (not the data). You can understand the affinity contract more
deeply by looking at its interface [1].

[1]:
https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/cache/affinity/AffinityFunction.java
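
To make the distinction concrete, here is a minimal sketch in plain Java with
no Ignite dependency. It assumes a deliberately simplified two-stage mapping
(affinity key to partition by modulo, partition to node by round-robin). The
class and method names are hypothetical, and this is not how
RendezvousAffinityFunction or FairAffinityFunction is implemented. It only
illustrates that an even number of partitions per node says nothing about the
number of entries per node.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

class AffinitySkewSketch {
    // Stage 1 (simplified stand-in, an assumption): affinity key -> partition.
    static int partition(int affinityKey, int partitions) {
        return Math.floorMod(affinityKey, partitions);
    }

    // Stage 2 (simplified stand-in, an assumption): partition -> node,
    // round-robin, so every node owns an almost equal number of partitions.
    static int node(int partition, int nodes) {
        return partition % nodes;
    }

    // Number of partitions owned by each node.
    static int[] partitionsPerNode(int partitions, int nodes) {
        int[] owned = new int[nodes];
        for (int p = 0; p < partitions; p++)
            owned[node(p, nodes)]++;
        return owned;
    }

    // Number of entries stored on each node, given entry counts per affinity key.
    static long[] entriesPerNode(Map<Integer, Integer> countByKey, int partitions, int nodes) {
        long[] stored = new long[nodes];
        for (Map.Entry<Integer, Integer> e : countByKey.entrySet())
            stored[node(partition(e.getKey(), partitions), nodes)] += e.getValue();
        return stored;
    }

    // Entry counts per affinity key, copied from the table in the first message.
    static Map<Integer, Integer> threadData() {
        int[][] rows = {{530, 9}, {531, 10}, {532, 11}, {540, 12}, {541, 14}, {542, 4},
                        {550, 100}, {551, 200}, {552, 50}, {553, 400}, {554, 150}};
        Map<Integer, Integer> m = new LinkedHashMap<>();
        for (int[] r : rows)
            m.put(r[0], r[1]);
        return m;
    }

    public static void main(String[] args) {
        System.out.println("partitions per node: "
            + Arrays.toString(partitionsPerNode(1024, 3)));
        System.out.println("entries per node:    "
            + Arrays.toString(entriesPerNode(threadData(), 1024, 3)));
    }
}
```

With 1024 partitions and 3 nodes, this toy mapping gives the nodes 342, 341,
and 341 partitions, yet the entries from the example table end up as 72, 525,
and 363 per node. The exact placement differs from the one reported in the
first message because the mapping here is not Ignite's.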

On Wed, Oct 26, 2016 at 5:57 PM, ght230 <gh...@163.com> wrote:

> For some reason, I cannot choose a more selective affinity key than the
> current one.
>
> If I increase the number of partitions, will it help me avoid the case where
> keys 553, 551, 554, and 550 land on one node?



-- 
Vladislav Pyatkov

Re: How to avoid data skew in collocate data with data

Posted by ght230 <gh...@163.com>.
For some reason, I cannot choose a more selective affinity key than the
current one.

If I increase the number of partitions, will it help me avoid the case where
keys 553, 551, 554, and 550 land on one node?




Re: How to avoid data skew in collocate data with data

Posted by Vladislav Pyatkov <vl...@gmail.com>.
Hi,

You can increase the number of partitions through the affinity function
configuration, but the default value (1024 partitions) should be enough.

<bean class="org.apache.ignite.configuration.CacheConfiguration">
  ...
  <property name="affinity">
    <bean class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction">
      <!-- Number of partitions for this cache; 1024 is the default. -->
      <property name="partitions" value="1024"/>
    </bean>
  </property>
</bean>

I am not sure that FairAffinityFunction will help you if keys 553, 551, 554,
and 550 end up on one node.

How about choosing a more selective affinity key than the current one?
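
One way to read "more selective" is a composite affinity key: the original key
plus a small bucket number derived from each entry. The sketch below is plain
Java and purely illustrative. The bucket scheme, the hash-modulo node mapping,
and all names are assumptions and are not part of Ignite. Note the trade-off:
entries of the same original key that fall into different buckets are no
longer guaranteed to be collocated, so this only fits if related data can
share a bucket.

```java
import java.util.Arrays;
import java.util.Objects;

class SaltedAffinityKeySketch {
    // Hypothetical: derive a small bucket number from an entry's own id.
    static int bucket(long entryId, int buckets) {
        return (int) Math.floorMod(entryId, (long) buckets);
    }

    // Hypothetical stand-in for "affinity key -> node": hash the composite
    // (affinityKey, bucket) pair and take it modulo the node count.
    static int node(int affinityKey, int bucket, int nodes) {
        return Math.floorMod(Objects.hash(affinityKey, bucket), nodes);
    }

    // Entries per node for one affinity key whose entries have ids 0..entries-1.
    static long[] entriesPerNode(int affinityKey, int entries, int buckets, int nodes) {
        long[] stored = new long[nodes];
        for (long id = 0; id < entries; id++)
            stored[node(affinityKey, bucket(id, buckets), nodes)]++;
        return stored;
    }

    public static void main(String[] args) {
        // Key 553 holds 400 entries in the example from the first message.
        System.out.println("1 bucket:  " + Arrays.toString(entriesPerNode(553, 400, 1, 3)));
        System.out.println("4 buckets: " + Arrays.toString(entriesPerNode(553, 400, 4, 3)));
    }
}
```

With a single bucket, all 400 entries of key 553 land on one node. With four
buckets, they are split into four groups of 100 that can be placed
independently.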

On Wed, Oct 26, 2016 at 5:16 AM, ght230 <gh...@163.com> wrote:

> How do I increase the number of partitions?
>
> And if data skew has happened, how can I rebalance the data?
> By using FairAffinityFunction()?



-- 
Vladislav Pyatkov

Re: How to avoid data skew in collocate data with data

Posted by ght230 <gh...@163.com>.
How do I increase the number of partitions?

And if data skew has happened, how can I rebalance the data?
By using FairAffinityFunction()?




Re: How to avoid data skew in collocate data with data

Posted by Vladislav Pyatkov <vl...@gmail.com>.
Hi,

I agree with Alexey: you need to increase the number of partitions.

In addition, it looks like your affinity key has poor selectivity.
Why is so much data bound to keys 553 and 551, but so little to 542 and 530?

I recommend selecting another key as the affinity key or accepting the
imbalance.

On Tue, Oct 25, 2016 at 4:24 PM, Alexey Kuznetsov <ak...@apache.org>
wrote:

> Hi!
>
> How many partitions are configured in your cache?
> As far as I can see, 11 partitions?
> Could you try configuring more (64, 128, 256)
> and see how the data is distributed?
>
> By default, Ignite caches are configured with 1024 partitions.



-- 
Vladislav Pyatkov

Re: How to avoid data skew in collocate data with data

Posted by Alexey Kuznetsov <ak...@apache.org>.
Hi!

How many partitions are configured in your cache?
As far as I can see, 11 partitions?
Could you try configuring more (64, 128, 256)
and see how the data is distributed?

By default, Ignite caches are configured with 1024 partitions.



On Tue, Oct 25, 2016 at 8:20 PM, ght230 <gh...@163.com> wrote:

> When data skew has happened, what can I do to rebalance all the data across
> the 3 nodes?



-- 
Alexey Kuznetsov

Re: How to avoid data skew in collocate data with data

Posted by ght230 <gh...@163.com>.
When data skew has happened, what can I do to rebalance all the data across
the 3 nodes?




Re: How to avoid data skew in collocate data with data

Posted by Vladislav Pyatkov <vl...@gmail.com>.
Hi,

One AffinityKey is always bound to one node. This is what AffinityKey means:
all entries with the same AffinityKey are stored on one node.

On Thu, Oct 27, 2016 at 9:45 AM, ght230 <gh...@163.com> wrote:

> If too much data is related to one affinity key, even more than the
> capacity of one node, will Ignite automatically split that data and store
> it on several nodes?



-- 
Vladislav Pyatkov

Re: How to avoid data skew in collocate data with data

Posted by ght230 <gh...@163.com>.
If too much data is related to one affinity key, even more than the
capacity of one node, will Ignite automatically split that data and store
it on several nodes?




Re: How to avoid data skew in collocate data with data

Posted by vkulichenko <va...@gmail.com>.
No, for that you will have to implement your own affinity function. But there
are a lot of cases that you should keep in mind. What if a node fails: how
will the data be redistributed? What if a new node joins: will it take some of
the data (in other words, will you be able to scale)? And so on.

-Val
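
The failure and join questions above can be explored with a small sketch of
rendezvous (highest-random-weight) hashing, the general technique that
RendezvousAffinityFunction is named after. This is plain Java and not Ignite's
implementation: the score function, the node ids, and all names are
assumptions, and backups are ignored. It shows the property a custom function
would want to keep: when the topology changes, a partition moves only if its
old owner left or a newly joined node wins it.

```java
import java.util.Arrays;
import java.util.List;

class RendezvousSketch {
    // Hypothetical mixing function: a pseudo-random score per (partition, node) pair.
    static long score(int partition, String nodeId) {
        long h = partition * 0x9E3779B97F4A7C15L + nodeId.hashCode();
        h ^= (h >>> 33);
        h *= 0xff51afd7ed558ccdL;
        h ^= (h >>> 33);
        return h;
    }

    // The owner of a partition is the node with the highest score for it.
    // Ties are broken by node id so the result does not depend on list order.
    // Returns null for an empty node list.
    static String owner(int partition, List<String> nodes) {
        String best = null;
        long bestScore = 0;
        for (String n : nodes) {
            long s = score(partition, n);
            if (best == null || s > bestScore || (s == bestScore && n.compareTo(best) < 0)) {
                best = n;
                bestScore = s;
            }
        }
        return best;
    }

    // Partitions whose owner differs between the two topologies.
    static int moved(int partitions, List<String> before, List<String> after) {
        int count = 0;
        for (int p = 0; p < partitions; p++)
            if (!owner(p, before).equals(owner(p, after)))
                count++;
        return count;
    }

    // Partitions that moved even though the old owner is still present and
    // the new owner is not a newly joined node.
    static int unnecessaryMoves(int partitions, List<String> before, List<String> after) {
        int count = 0;
        for (int p = 0; p < partitions; p++) {
            String oldOwner = owner(p, before);
            String newOwner = owner(p, after);
            if (!oldOwner.equals(newOwner) && after.contains(oldOwner) && before.contains(newOwner))
                count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> three = Arrays.asList("A", "B", "C");
        System.out.println("moved after C fails: " + moved(1024, three, Arrays.asList("A", "B")));
        System.out.println("moved after D joins: " + moved(1024, three, Arrays.asList("A", "B", "C", "D")));
    }
}
```

A function that tries to balance by data volume would have to preserve this
kind of stability as well, which is part of why writing one is not trivial.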




Re: How to avoid data skew in collocate data with data

Posted by ght230 <gh...@163.com>.
I would like to pre-distribute data to designated nodes according to a
certain algorithm.

Is there an API that can be used to put data on a specified node?




Re: How to avoid data skew in collocate data with data

Posted by vkulichenko <va...@gmail.com>.
Hi,

If you have such a data distribution, then most likely your collocation
strategy is incorrect or at least not optimal. If the number of partitions
is much bigger than the number of nodes, and the number of unique affinity
keys is much bigger than the number of partitions, such skew will be
minimized. "Much bigger" here generally means orders of magnitude. Having
said that, I would recommend revisiting your data model (in case this
question comes from a real use case). Otherwise, you can always implement
your own AffinityFunction and customize data distribution, but this is not a
trivial task.

-Val
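
The two "much bigger" conditions can be checked with a rough plain-Java
sketch. It assumes a toy modulo mapping in place of a real affinity function,
made-up uniform data (100,000 unique keys with 10 entries each), and
hypothetical names throughout. It reports the load of the busiest node
divided by the average node load, where 1.0 means perfectly even.

```java
class SkewRatioSketch {
    // Affinity keys and entry counts copied from the table in the first message.
    static final int[] THREAD_KEYS = {530, 531, 532, 540, 541, 542, 550, 551, 552, 553, 554};
    static final int[] THREAD_COUNTS = {9, 10, 11, 12, 14, 4, 100, 200, 50, 400, 150};

    // Toy stand-in for an affinity function (an assumption): key -> partition -> node.
    static int node(int affinityKey, int partitions, int nodes) {
        return Math.floorMod(affinityKey, partitions) % nodes;
    }

    // Load of the busiest node divided by the average node load.
    static double imbalance(int[] keys, int[] entriesPerKey, int partitions, int nodes) {
        long[] load = new long[nodes];
        long total = 0;
        for (int i = 0; i < keys.length; i++) {
            load[node(keys[i], partitions, nodes)] += entriesPerKey[i];
            total += entriesPerKey[i];
        }
        long max = 0;
        for (long l : load)
            max = Math.max(max, l);
        return max / ((double) total / nodes);
    }

    // Keys 0..n-1, standing in for "many unique affinity keys".
    static int[] sequentialKeys(int n) {
        int[] keys = new int[n];
        for (int i = 0; i < n; i++)
            keys[i] = i;
        return keys;
    }

    // An array of n copies of the same entry count.
    static int[] constant(int n, int value) {
        int[] a = new int[n];
        java.util.Arrays.fill(a, value);
        return a;
    }

    public static void main(String[] args) {
        int[] many = sequentialKeys(100_000);
        int[] ten = constant(100_000, 10);
        System.out.println("11 uneven keys, 1024 partitions: " + imbalance(THREAD_KEYS, THREAD_COUNTS, 1024, 3));
        System.out.println("100k even keys, 1024 partitions: " + imbalance(many, ten, 1024, 3));
        System.out.println("100k even keys, 4 partitions:    " + imbalance(many, ten, 4, 3));
    }
}
```

Under these assumptions, the 11 keys from the thread give a ratio of about
1.64, the many-keys case with 1024 partitions is within one percent of even,
and dropping to 4 partitions on 3 nodes pushes the ratio back up to 1.5
because one node has to own two of the four partitions.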


