You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@ignite.apache.org by Michael Griggs <mi...@gridgain.com> on 2017/03/15 13:09:14 UTC

Distribution of keys to partitions

Hi Igniters,

Last week I was working with a group of Ignite users.  They are inserting
several million string keys in to a cache.  Each string key was
approximately 22-characters in length.  When I exported the partition
counts (via GG Visor) I was able to see an unusual periodicity in the
number of keys allocated to partitions.  I charted this in Excel [1].
After further investigation, it appears that there is a relationship
between the number of keys being inserted, the number of partitions
assigned to the cache and amount of apparent periodicity: a small number of
partitions will cause periodicity to appear with lower numbers of keys.

The RendezvousAffinityFunction#partition function performs a simple
calculation of key hashcode modulo partition-count:

U.safeAbs(key.hashCode() % parts)


Digging further I was led to the fact that this is how the Java HashMap
*used* to behave [2], but was upgraded around Java 1.4 to perform the
following:

key.hashCode() & (parts - 1)

which performs more efficiently.  It was then updated further to do the
following:

(h = key.hashCode()) ^ (h >>> 16);

with the bit-shift performed to

incorporate impact of the highest bits that would otherwise
never be used in index calculations because of table bounds


When using this function, rather than our
RendezvousAffinityFunction#partition implementation, I also saw a
significant decrease in the periodicity and a better distribution of keys
amongst partitions [3].

I would like to suggest that we adopt this modified hash function inside
RendezvousAffinityFunction.

Regards
Mike


[1]: https://i.imgur.com/0FtCZ2A.png
[2]:
https://www.quora.com/Why-does-Java-use-a-mediocre-hashCode-implementation-for-strings
[3]: https://i.imgur.com/8ZuCSA3.png

Re: Distribution of keys to partitions

Posted by Yakov Zhdanov <yz...@apache.org>.

Guys, I replied in ticket. Overall I liked the changes but I think this
needs additional elaboration. Please see
https://issues.apache.org/jira/browse/IGNITE-4828

--Yakov

2017-03-31 20:09 GMT+03:00 Denis Magda <dm...@apache.org>:

> Michael, thanks.
>
> I did minor improvements and merged them to IGNITE-4828 branch and
> triggered TeamCity tests:
> http://ci.ignite.apache.org/viewQueued.html?itemId=526101&
> tab=queuedBuildOverviewTab
>
> *Michael*, please check the tests results lately. I won’t be available in
> the nearest 4 days.
>
> *Ilya*, please benchmark IGNITE-4828 branch against master branch in
> FULL_SYNC mode.
>
> —
> Denis
>
> > On Mar 31, 2017, at 2:59 AM, michael.griggs <mi...@gridgain.com>
> wrote:
> >
> > The change is now ready for review:
> >
> > https://github.com/apache/ignite/pull/1707
> >
> >
> >
> > --
> > View this message in context: http://apache-ignite-
> developers.2346864.n4.nabble.com/Distribution-of-keys-to-
> partitions-tp15455p16005.html
> > Sent from the Apache Ignite Developers mailing list archive at
> Nabble.com.
>
>

Re: Distribution of keys to partitions

Posted by Denis Magda <dm...@apache.org>.

Michael, thanks.

I did minor improvements and merged them to IGNITE-4828 branch and triggered TeamCity tests:
http://ci.ignite.apache.org/viewQueued.html?itemId=526101&tab=queuedBuildOverviewTab

*Michael*, please check the tests results lately. I won’t be available in the nearest 4 days.

*Ilya*, please benchmark IGNITE-4828 branch against master branch in FULL_SYNC mode.

—
Denis

> On Mar 31, 2017, at 2:59 AM, michael.griggs <mi...@gridgain.com> wrote:
> 
> The change is now ready for review:
> 
> https://github.com/apache/ignite/pull/1707
> 
> 
> 
> --
> View this message in context: http://apache-ignite-developers.2346864.n4.nabble.com/Distribution-of-keys-to-partitions-tp15455p16005.html
> Sent from the Apache Ignite Developers mailing list archive at Nabble.com.

Re: Distribution of keys to partitions

Posted by Dmitriy Setrakyan <ds...@apache.org>.

Vova,

Not sure I understand what you mean. I believe that we just agreed that if
the partition count is set to the power of 2, then we can improve the
performance with a better hashing algorithm.

D.

On Fri, Mar 31, 2017 at 1:46 AM, Vladimir Ozerov <vo...@gridgain.com>
wrote:

> In order to overcome this problem, Hazelcast guys set non-power-of-two
> partition count by default - 257.
>
> On Fri, Mar 31, 2017 at 9:59 AM, michael.griggs <
> michael.griggs@gridgain.com
> > wrote:
>
> > The change is now ready for review:
> >
> > https://github.com/apache/ignite/pull/1707
> >
> >
> >
> > --
> > View this message in context: http://apache-ignite-
> > developers.2346864.n4.nabble.com/Distribution-of-keys-to-
> > partitions-tp15455p16005.html
> > Sent from the Apache Ignite Developers mailing list archive at
> Nabble.com.
> >
>

Re: Distribution of keys to partitions

Posted by Vladimir Ozerov <vo...@gridgain.com>.

In order to overcome this problem, Hazelcast guys set non-power-of-two
partition count by default - 257.

On Fri, Mar 31, 2017 at 9:59 AM, michael.griggs <michael.griggs@gridgain.com
> wrote:

> The change is now ready for review:
>
> https://github.com/apache/ignite/pull/1707
>
>
>
> --
> View this message in context: http://apache-ignite-
> developers.2346864.n4.nabble.com/Distribution-of-keys-to-
> partitions-tp15455p16005.html
> Sent from the Apache Ignite Developers mailing list archive at Nabble.com.
>

Re: Distribution of keys to partitions

Posted by "michael.griggs" <mi...@gridgain.com>.

The change is now ready for review:

https://github.com/apache/ignite/pull/1707



--
View this message in context: http://apache-ignite-developers.2346864.n4.nabble.com/Distribution-of-keys-to-partitions-tp15455p16005.html
Sent from the Apache Ignite Developers mailing list archive at Nabble.com.

Re: Distribution of keys to partitions

Posted by "michael.griggs" <mi...@gridgain.com>.

I created a PR to implement this.  I ran the TC tests, but there are a lot of
errors.  However, the errors seem unrelated to the change.  

I see that other PRs are suffering with similar test failures.  Have some
tests been broken by new 2.0 functionality and not fixed yet?

http://ci.ignite.apache.org/project.html?projectId=IgniteTests&branch_IgniteTests=pull/1645/head

Regards
Mike





--
View this message in context: http://apache-ignite-developers.2346864.n4.nabble.com/Distribution-of-keys-to-partitions-tp15455p15676.html
Sent from the Apache Ignite Developers mailing list archive at Nabble.com.

Re: Distribution of keys to partitions

Posted by Denis Magda <dm...@apache.org>.

Excellent discovery, thanks Michael!

I would suggest doing the following. If we see that a number of partitions is a power of two then the new algorithm will be applied, otherwise the warning will be printed out and the *old* one approach will be used. Does this resolver all the concerns?

Michael, could you complete the contribution performing all the required changes? I think we should squeeze the update into AI 2.0 where it’s safe to break the compatibility.

Before merging the changes we can ask Ilya to run a set of benchmarks to confirm that there is no performance hit.

—
Denis

> On Mar 15, 2017, at 12:31 PM, Dmitriy Setrakyan <ds...@apache.org> wrote:
> 
> On Wed, Mar 15, 2017 at 9:51 AM, Michael Griggs <michael.griggs@gridgain.com
>> wrote:
> 
>> Have we ever heard of somebody needing to set the partition count to a
>> non-power-of-two number?  Perhaps we could restrict the method so that it
>> will only accept a power of two as the partition count?
>> 
> 
> As Valentin suggested, we should not restrict it, but should print out a
> warning.

Re: Distribution of keys to partitions

Posted by Dmitriy Setrakyan <ds...@apache.org>.

On Wed, Mar 15, 2017 at 9:51 AM, Michael Griggs <michael.griggs@gridgain.com
> wrote:

> Have we ever heard of somebody needing to set the partition count to a
> non-power-of-two number?  Perhaps we could restrict the method so that it
> will only accept a power of two as the partition count?
>

As Valentin suggested, we should not restrict it, but should print out a
warning.

RE: Distribution of keys to partitions

Posted by Michael Griggs <mi...@gridgain.com>.

Have we ever heard of somebody needing to set the partition count to a non-power-of-two number?  Perhaps we could restrict the method so that it will only accept a power of two as the partition count?

-----Original Message-----
From: Valentin Kulichenko [mailto:valentin.kulichenko@gmail.com] 
Sent: 15 March 2017 16:22
To: dev@ignite.apache.org
Subject: Re: Distribution of keys to partitions

Andrey,

Absolutely, your point is correct. I'm talking about default behavior which must be as effective as possible. In case we do this optimization, I would also show a warning if number of partitions is not a power of two.

-Val

On Wed, Mar 15, 2017 at 5:09 PM, Andrey Gura <ag...@apache.org> wrote:

> Anyway, we can't always use this optimization because it will not work 
> for non power of two values.
>
> On Wed, Mar 15, 2017 at 6:48 PM, Valentin Kulichenko 
> <va...@gmail.com> wrote:
> > In 99% of cases number of partition is a power of two, because it's 
> > the default value. Almost no one changes it. If this change actually 
> > provides better distribution, it absolutely makes sense to do it.
> >
> > Michael, can you create a Jira ticket and put you findings there?
> >
> > -Val
> >
> > On Wed, Mar 15, 2017 at 3:58 PM, Andrey Gura <ag...@apache.org> wrote:
> >
> >> Michael,
> >>
> >> it makes sense only for cases when partitions count is power of two.
> >> Affinity function doesn't have this limitation.
> >>
> >> Bu, of course, we can check, that partitions count is power of two 
> >> and use optimized hash code calculation.
> >>
> >>
> >> On Wed, Mar 15, 2017 at 4:09 PM, Michael Griggs 
> >> <mi...@gridgain.com> wrote:
> >> > Hi Igniters,
> >> >
> >> > Last week I was working with a group of Ignite users.  They are
> inserting
> >> > several million string keys in to a cache.  Each string key was 
> >> > approximately 22-characters in length.  When I exported the 
> >> > partition counts (via GG Visor) I was able to see an unusual 
> >> > periodicity in the number of keys allocated to partitions.  I charted this in Excel [1].
> >> > After further investigation, it appears that there is a 
> >> > relationship between the number of keys being inserted, the 
> >> > number of partitions assigned to the cache and amount of apparent 
> >> > periodicity: a small
> number
> >> of
> >> > partitions will cause periodicity to appear with lower numbers of
> keys.
> >> >
> >> > The RendezvousAffinityFunction#partition function performs a 
> >> > simple calculation of key hashcode modulo partition-count:
> >> >
> >> > U.safeAbs(key.hashCode() % parts)
> >> >
> >> >
> >> > Digging further I was led to the fact that this is how the Java
> HashMap
> >> > *used* to behave [2], but was upgraded around Java 1.4 to perform 
> >> > the
> >> > following:
> >> >
> >> > key.hashCode() & (parts - 1)
> >> >
> >> > which performs more efficiently.  It was then updated further to 
> >> > do
> the
> >> > following:
> >> >
> >> > (h = key.hashCode()) ^ (h >>> 16);
> >> >
> >> > with the bit-shift performed to
> >> >
> >> > incorporate impact of the highest bits that would otherwise never 
> >> > be used in index calculations because of table bounds
> >> >
> >> >
> >> > When using this function, rather than our 
> >> > RendezvousAffinityFunction#partition implementation, I also saw a 
> >> > significant decrease in the periodicity and a better distribution 
> >> > of
> keys
> >> > amongst partitions [3].
> >> >
> >> > I would like to suggest that we adopt this modified hash function
> inside
> >> > RendezvousAffinityFunction.
> >> >
> >> > Regards
> >> > Mike
> >> >
> >> >
> >> > [1]: https://i.imgur.com/0FtCZ2A.png
> >> > [2]:
> >> > https://www.quora.com/Why-does-Java-use-a-mediocre-
> >> hashCode-implementation-for-strings
> >> > [3]: https://i.imgur.com/8ZuCSA3.png
> >>
>

Re: Distribution of keys to partitions

Posted by Valentin Kulichenko <va...@gmail.com>.

Andrey,

Absolutely, your point is correct. I'm talking about default behavior which
must be as effective as possible. In case we do this optimization, I would
also show a warning if number of partitions is not a power of two.

-Val

On Wed, Mar 15, 2017 at 5:09 PM, Andrey Gura <ag...@apache.org> wrote:

> Anyway, we can't always use this optimization because it will not work
> for non power of two values.
>
> On Wed, Mar 15, 2017 at 6:48 PM, Valentin Kulichenko
> <va...@gmail.com> wrote:
> > In 99% of cases number of partition is a power of two, because it's the
> > default value. Almost no one changes it. If this change actually provides
> > better distribution, it absolutely makes sense to do it.
> >
> > Michael, can you create a Jira ticket and put you findings there?
> >
> > -Val
> >
> > On Wed, Mar 15, 2017 at 3:58 PM, Andrey Gura <ag...@apache.org> wrote:
> >
> >> Michael,
> >>
> >> it makes sense only for cases when partitions count is power of two.
> >> Affinity function doesn't have this limitation.
> >>
> >> Bu, of course, we can check, that partitions count is power of two and
> >> use optimized hash code calculation.
> >>
> >>
> >> On Wed, Mar 15, 2017 at 4:09 PM, Michael Griggs
> >> <mi...@gridgain.com> wrote:
> >> > Hi Igniters,
> >> >
> >> > Last week I was working with a group of Ignite users.  They are
> inserting
> >> > several million string keys in to a cache.  Each string key was
> >> > approximately 22-characters in length.  When I exported the partition
> >> > counts (via GG Visor) I was able to see an unusual periodicity in the
> >> > number of keys allocated to partitions.  I charted this in Excel [1].
> >> > After further investigation, it appears that there is a relationship
> >> > between the number of keys being inserted, the number of partitions
> >> > assigned to the cache and amount of apparent periodicity: a small
> number
> >> of
> >> > partitions will cause periodicity to appear with lower numbers of
> keys.
> >> >
> >> > The RendezvousAffinityFunction#partition function performs a simple
> >> > calculation of key hashcode modulo partition-count:
> >> >
> >> > U.safeAbs(key.hashCode() % parts)
> >> >
> >> >
> >> > Digging further I was led to the fact that this is how the Java
> HashMap
> >> > *used* to behave [2], but was upgraded around Java 1.4 to perform the
> >> > following:
> >> >
> >> > key.hashCode() & (parts - 1)
> >> >
> >> > which performs more efficiently.  It was then updated further to do
> the
> >> > following:
> >> >
> >> > (h = key.hashCode()) ^ (h >>> 16);
> >> >
> >> > with the bit-shift performed to
> >> >
> >> > incorporate impact of the highest bits that would otherwise
> >> > never be used in index calculations because of table bounds
> >> >
> >> >
> >> > When using this function, rather than our
> >> > RendezvousAffinityFunction#partition implementation, I also saw a
> >> > significant decrease in the periodicity and a better distribution of
> keys
> >> > amongst partitions [3].
> >> >
> >> > I would like to suggest that we adopt this modified hash function
> inside
> >> > RendezvousAffinityFunction.
> >> >
> >> > Regards
> >> > Mike
> >> >
> >> >
> >> > [1]: https://i.imgur.com/0FtCZ2A.png
> >> > [2]:
> >> > https://www.quora.com/Why-does-Java-use-a-mediocre-
> >> hashCode-implementation-for-strings
> >> > [3]: https://i.imgur.com/8ZuCSA3.png
> >>
>

Re: Distribution of keys to partitions

Posted by Andrey Gura <ag...@apache.org>.

Anyway, we can't always use this optimization because it will not work
for non power of two values.

On Wed, Mar 15, 2017 at 6:48 PM, Valentin Kulichenko
<va...@gmail.com> wrote:
> In 99% of cases number of partition is a power of two, because it's the
> default value. Almost no one changes it. If this change actually provides
> better distribution, it absolutely makes sense to do it.
>
> Michael, can you create a Jira ticket and put you findings there?
>
> -Val
>
> On Wed, Mar 15, 2017 at 3:58 PM, Andrey Gura <ag...@apache.org> wrote:
>
>> Michael,
>>
>> it makes sense only for cases when partitions count is power of two.
>> Affinity function doesn't have this limitation.
>>
>> Bu, of course, we can check, that partitions count is power of two and
>> use optimized hash code calculation.
>>
>>
>> On Wed, Mar 15, 2017 at 4:09 PM, Michael Griggs
>> <mi...@gridgain.com> wrote:
>> > Hi Igniters,
>> >
>> > Last week I was working with a group of Ignite users.  They are inserting
>> > several million string keys in to a cache.  Each string key was
>> > approximately 22-characters in length.  When I exported the partition
>> > counts (via GG Visor) I was able to see an unusual periodicity in the
>> > number of keys allocated to partitions.  I charted this in Excel [1].
>> > After further investigation, it appears that there is a relationship
>> > between the number of keys being inserted, the number of partitions
>> > assigned to the cache and amount of apparent periodicity: a small number
>> of
>> > partitions will cause periodicity to appear with lower numbers of keys.
>> >
>> > The RendezvousAffinityFunction#partition function performs a simple
>> > calculation of key hashcode modulo partition-count:
>> >
>> > U.safeAbs(key.hashCode() % parts)
>> >
>> >
>> > Digging further I was led to the fact that this is how the Java HashMap
>> > *used* to behave [2], but was upgraded around Java 1.4 to perform the
>> > following:
>> >
>> > key.hashCode() & (parts - 1)
>> >
>> > which performs more efficiently.  It was then updated further to do the
>> > following:
>> >
>> > (h = key.hashCode()) ^ (h >>> 16);
>> >
>> > with the bit-shift performed to
>> >
>> > incorporate impact of the highest bits that would otherwise
>> > never be used in index calculations because of table bounds
>> >
>> >
>> > When using this function, rather than our
>> > RendezvousAffinityFunction#partition implementation, I also saw a
>> > significant decrease in the periodicity and a better distribution of keys
>> > amongst partitions [3].
>> >
>> > I would like to suggest that we adopt this modified hash function inside
>> > RendezvousAffinityFunction.
>> >
>> > Regards
>> > Mike
>> >
>> >
>> > [1]: https://i.imgur.com/0FtCZ2A.png
>> > [2]:
>> > https://www.quora.com/Why-does-Java-use-a-mediocre-
>> hashCode-implementation-for-strings
>> > [3]: https://i.imgur.com/8ZuCSA3.png
>>

Re: Distribution of keys to partitions

Posted by "michael.griggs" <mi...@gridgain.com>.

Valentin Kulichenko wrote
> In 99% of cases number of partition is a power of two, because it's the
> default value. Almost no one changes it. If this change actually provides
> better distribution, it absolutely makes sense to do it.
> 
> Michael, can you create a Jira ticket and put you findings there?
> 
> -Val

Done - https://issues.apache.org/jira/browse/IGNITE-4828


Andrey Gura wrote
> On Wed, Mar 15, 2017 at 3:58 PM, Andrey Gura &lt;

> agura@

> &gt; wrote:
> 
>> it makes sense only for cases when partitions count is power of two.
>> Affinity function doesn't have this limitation.

The Replicated cache, which is where this issue was found, has 512
partitions by default.  The default value for Partitioned caches is 1024. 
Presumably we can provide a custom implementation of the hash function when
the number of partitions is modified to a non-power-of-two value?



--
View this message in context: http://apache-ignite-developers.2346864.n4.nabble.com/Distribution-of-keys-to-partitions-tp15455p15471.html
Sent from the Apache Ignite Developers mailing list archive at Nabble.com.

Re: Distribution of keys to partitions

Posted by Valentin Kulichenko <va...@gmail.com>.

In 99% of cases number of partition is a power of two, because it's the
default value. Almost no one changes it. If this change actually provides
better distribution, it absolutely makes sense to do it.

Michael, can you create a Jira ticket and put you findings there?

-Val

On Wed, Mar 15, 2017 at 3:58 PM, Andrey Gura <ag...@apache.org> wrote:

> Michael,
>
> it makes sense only for cases when partitions count is power of two.
> Affinity function doesn't have this limitation.
>
> Bu, of course, we can check, that partitions count is power of two and
> use optimized hash code calculation.
>
>
> On Wed, Mar 15, 2017 at 4:09 PM, Michael Griggs
> <mi...@gridgain.com> wrote:
> > Hi Igniters,
> >
> > Last week I was working with a group of Ignite users.  They are inserting
> > several million string keys in to a cache.  Each string key was
> > approximately 22-characters in length.  When I exported the partition
> > counts (via GG Visor) I was able to see an unusual periodicity in the
> > number of keys allocated to partitions.  I charted this in Excel [1].
> > After further investigation, it appears that there is a relationship
> > between the number of keys being inserted, the number of partitions
> > assigned to the cache and amount of apparent periodicity: a small number
> of
> > partitions will cause periodicity to appear with lower numbers of keys.
> >
> > The RendezvousAffinityFunction#partition function performs a simple
> > calculation of key hashcode modulo partition-count:
> >
> > U.safeAbs(key.hashCode() % parts)
> >
> >
> > Digging further I was led to the fact that this is how the Java HashMap
> > *used* to behave [2], but was upgraded around Java 1.4 to perform the
> > following:
> >
> > key.hashCode() & (parts - 1)
> >
> > which performs more efficiently.  It was then updated further to do the
> > following:
> >
> > (h = key.hashCode()) ^ (h >>> 16);
> >
> > with the bit-shift performed to
> >
> > incorporate impact of the highest bits that would otherwise
> > never be used in index calculations because of table bounds
> >
> >
> > When using this function, rather than our
> > RendezvousAffinityFunction#partition implementation, I also saw a
> > significant decrease in the periodicity and a better distribution of keys
> > amongst partitions [3].
> >
> > I would like to suggest that we adopt this modified hash function inside
> > RendezvousAffinityFunction.
> >
> > Regards
> > Mike
> >
> >
> > [1]: https://i.imgur.com/0FtCZ2A.png
> > [2]:
> > https://www.quora.com/Why-does-Java-use-a-mediocre-
> hashCode-implementation-for-strings
> > [3]: https://i.imgur.com/8ZuCSA3.png
>

Re: Distribution of keys to partitions

Posted by Andrey Gura <ag...@apache.org>.

Michael,

it makes sense only for cases when partitions count is power of two.
Affinity function doesn't have this limitation.

Bu, of course, we can check, that partitions count is power of two and
use optimized hash code calculation.


On Wed, Mar 15, 2017 at 4:09 PM, Michael Griggs
<mi...@gridgain.com> wrote:
> Hi Igniters,
>
> Last week I was working with a group of Ignite users.  They are inserting
> several million string keys in to a cache.  Each string key was
> approximately 22-characters in length.  When I exported the partition
> counts (via GG Visor) I was able to see an unusual periodicity in the
> number of keys allocated to partitions.  I charted this in Excel [1].
> After further investigation, it appears that there is a relationship
> between the number of keys being inserted, the number of partitions
> assigned to the cache and amount of apparent periodicity: a small number of
> partitions will cause periodicity to appear with lower numbers of keys.
>
> The RendezvousAffinityFunction#partition function performs a simple
> calculation of key hashcode modulo partition-count:
>
> U.safeAbs(key.hashCode() % parts)
>
>
> Digging further I was led to the fact that this is how the Java HashMap
> *used* to behave [2], but was upgraded around Java 1.4 to perform the
> following:
>
> key.hashCode() & (parts - 1)
>
> which performs more efficiently.  It was then updated further to do the
> following:
>
> (h = key.hashCode()) ^ (h >>> 16);
>
> with the bit-shift performed to
>
> incorporate impact of the highest bits that would otherwise
> never be used in index calculations because of table bounds
>
>
> When using this function, rather than our
> RendezvousAffinityFunction#partition implementation, I also saw a
> significant decrease in the periodicity and a better distribution of keys
> amongst partitions [3].
>
> I would like to suggest that we adopt this modified hash function inside
> RendezvousAffinityFunction.
>
> Regards
> Mike
>
>
> [1]: https://i.imgur.com/0FtCZ2A.png
> [2]:
> https://www.quora.com/Why-does-Java-use-a-mediocre-hashCode-implementation-for-strings
> [3]: https://i.imgur.com/8ZuCSA3.png