Posted to user@ignite.apache.org by Chris Berry <ch...@gmail.com> on 2017/10/09 17:57:33 UTC

BackupFilter for the RendezvousAffinityFunction Questions

Hi,

I have 2 availability zones (AZs), and an Ignite Grid that spans them.
I have implemented a BackupFilter for the RendezvousAffinityFunction, which
attempts to keep the Primary and Backups balanced.

In other words, if I have 1 Primary and 3 Backups (for a PARTITIONED cache)
across 16 Nodes (8 per AZ), then I will have 4 copies of the data – with 2
copies in each AZ.

This way I can lose an entire AZ – for maintenance or whatever – and be able
to withstand it.

My questions:

1) By messing with the RendezvousAffinityFunction, am I messing with the
Cache Affinity? (I believe not?)
We have many caches – and they all use the same cache keys (the same set of
UUIDs – imagine a User Id), which ensures that all data affiliated with a
particular UUID lives on the same Node, and is thus collocated in the
Compute Grid.
This is essential to our system’s performance, and we want to be certain
that we are not affecting it by implementing the BackupFilter.

2) How can I visualize the distribution of cache data across the Primary &
Backups?
I’d love to determine if all of this is working as expected.
Even being able to dump this to a log would be helpful.
Better would be something similar to how Elasticsearch can show you the
Shard distribution across Nodes.

Thanks,
-- Chris 




--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: BackupFilter for the RendezvousAffinityFunction Questions

Posted by Andrey Mashenkov <an...@gmail.com>.
Hi Chris,

1. Just to clarify.
Ignite uses an AffinityFunction for each cache (cache group since 2.1 [1]) to
make two kinds of mapping: key->partition and partition->node.
When we talk about data collocation (affinity collocation), we mean that
collocated data belongs to the same partition.

So, entries with the same key (more precisely, with the same hash code of the
key's binary representation) will reside in the same partition.
This is true for entries belonging to the same cache. It will also be true
for entries from different caches if the AffinityFunction is idempotent, the
caches have the same number of partitions, and the function doesn't rely on
the previous distribution, e.g. RendezvousAffinityFunction.
But I wouldn't recommend relying on this unless you understand exactly what
you are doing, as it is an error-prone approach.
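
If you want a quick sanity check, you can compare the partition each cache's
Affinity assigns to the same key. A minimal sketch (the config path, cache
names and sample key below are placeholders):

import java.util.UUID;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.affinity.Affinity;

public class CollocationCheck {
    public static void main(String[] args) {
        // Connect to the running grid; the Spring config path is a placeholder.
        try (Ignite ignite = Ignition.start("client-config.xml")) {
            UUID userId = UUID.randomUUID();

            Affinity<Object> affA = ignite.affinity("cacheA");
            Affinity<Object> affB = ignite.affinity("cacheB");

            // With RendezvousAffinityFunction and the same partition count,
            // both caches should map the same key to the same partition.
            System.out.println("cacheA partition: " + affA.partition(userId));
            System.out.println("cacheB partition: " + affB.partition(userId));
        }
    }
}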

FYI: some time ago we had a problem with FairAffinityFunction, where there
was no guarantee that the same keys from different caches would belong to the
same partition. FairAffinityFunction was removed from the code, and there is
a ticket for its resurrection.

As for the partition->node mapping: it is used by Ignite to distribute
partitions among nodes.

Actually, BackupFilter is deprecated and AffinityBackupFilter should be used
instead.
AffinityBackupFilter shouldn't have a noticeable performance impact.
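
For example, an AZ-balancing filter could look roughly like the sketch below
(the "AVAILABILITY_ZONE" user attribute, the cache name and the per-AZ limit
are illustrative assumptions, not something Ignite provides out of the box):

import java.util.List;
import java.util.Objects;

import org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction;
import org.apache.ignite.cluster.ClusterNode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.lang.IgniteBiPredicate;

public class AzBalancedAffinity {
    // Hypothetical user attribute, set via IgniteConfiguration.setUserAttributes(...).
    private static final String AZ_ATTR = "AVAILABILITY_ZONE";

    public static CacheConfiguration<?, ?> cacheConfig() {
        RendezvousAffinityFunction aff = new RendezvousAffinityFunction();

        // First argument is the candidate node, second is the list of nodes
        // already assigned to the partition (primary first).
        aff.setAffinityBackupFilter(new IgniteBiPredicate<ClusterNode, List<ClusterNode>>() {
            @Override public boolean apply(ClusterNode candidate, List<ClusterNode> assigned) {
                Object candidateAz = candidate.attribute(AZ_ATTR);

                long sameAz = assigned.stream()
                    .filter(n -> Objects.equals(n.attribute(AZ_ATTR), candidateAz))
                    .count();

                // With 1 primary + 3 backups over 2 AZs, allow at most 2 copies per AZ.
                return sameAz < 2;
            }
        });

        return new CacheConfiguration<>("myCache")
            .setBackups(3)
            .setAffinity(aff);
    }
}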

FYI: for the partition mapping calculation, Ignite uses rendezvous hashing
[2] by default.
In short, for every partition it builds a sorted list of node hashes
(actually hashes of node+partition), where the first node is the primary for
that partition and the others are candidates to be backups. Ignite then just
goes through the list and assigns backup nodes according to the
AffinityBackupFilter.
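
Conceptually it is something like this toy sketch (not Ignite's actual
implementation; the hash mixing is purely illustrative):

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;

// Toy illustration of rendezvous hashing for a single partition.
public class RendezvousToy {
    static List<UUID> orderNodesForPartition(List<UUID> nodeIds, int partition) {
        List<UUID> ordered = new ArrayList<>(nodeIds);

        // Sort nodes by hash(node + partition); the first entry is the primary,
        // the rest are backup candidates (filtered by the backup filter).
        ordered.sort(Comparator.comparingLong((UUID id) -> mix(id.hashCode(), partition)));

        return ordered;
    }

    // Simple hash mixing, for illustration only.
    static long mix(long nodeHash, int partition) {
        long h = nodeHash * 31 + partition;
        h ^= (h >>> 33);
        h *= 0xff51afd7ed558ccdL;
        h ^= (h >>> 33);
        return h;
    }
}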


2. You can try to use WebConsole to check the distribution, but I doubt it
has such an ability for now. Feel free to create a ticket if you find it
useful.
Also, you can run a cluster-wide task to collect data distribution statistics
(see the sketch after this list). See these methods:
 IgniteCache.sizeLong(partition, peekMode) [3]
 ignite.affinity(cacheName).primaryPartitions(node) [4]
 ignite.affinity(cacheName).backupPartitions(node) [4]
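
For example, something like this sketch (the config path, cache name and the
"AVAILABILITY_ZONE" attribute are placeholders) would print the partition and
key distribution per server node:

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CachePeekMode;
import org.apache.ignite.cache.affinity.Affinity;
import org.apache.ignite.cluster.ClusterNode;

public class DistributionReport {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start("client-config.xml")) {
            IgniteCache<Object, Object> cache = ignite.cache("myCache");
            Affinity<Object> aff = ignite.affinity("myCache");

            for (ClusterNode node : ignite.cluster().forServers().nodes()) {
                int[] primary = aff.primaryPartitions(node);
                int[] backup = aff.backupPartitions(node);

                // Sum the number of primary keys hosted by this node's primary partitions.
                long primaryKeys = 0;
                for (int p : primary)
                    primaryKeys += cache.sizeLong(p, CachePeekMode.PRIMARY);

                System.out.printf("node=%s az=%s primaryParts=%d backupParts=%d primaryKeys=%d%n",
                    node.consistentId(), node.attribute("AVAILABILITY_ZONE"),
                    primary.length, backup.length, primaryKeys);
            }
        }
    }
}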



[1] https://issues.apache.org/jira/browse/IGNITE-5075
[2] https://en.wikipedia.org/wiki/Rendezvous_hashing
[3]
https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/IgniteCache.html
[4]
https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/cache/affinity/Affinity.html




-- 
Best regards,
Andrey V. Mashenkov