Posted to user@ignite.apache.org by the_palakkaran <ji...@suntecsbs.com> on 2018/06/12 08:55:31 UTC

How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Hi,

How can I use an AffinityFunction to map a set of keys (for example, an id range) to
a particular node?

What I need is for this node to always be responsible for
loading/handling this set of keys.

Also, if the node fails, these keys should be redistributed to the other nodes.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Posted by the_palakkaran <ji...@suntecsbs.com>.

Thanks a ton man !!

One last doubt.

Will this overridden partition method be invoked every time a new node is
added to the cluster (since a new partition may be created at that
time)?

Similarly, when one node in the cluster goes down and its partition is
gone, will this also be invoked so that rebalancing is done?

I would have tried this and checked on my own, but it's hard to peek into a node
to see the cache entries it holds.




Re: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Posted by dkarachentsev <dk...@gridgain.com>.

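1. The affinity function knows that, because it does the assignments. The
assignPartitions() method returns those assignments. Please read the javadoc [1].

2. I just described how keys could be assigned to a partition. For example:
    @Override public int partition(Object key) {
        // Integer keys are grouped in blocks of 10000:
        // 0..9999 -> partition 0, 10000..19999 -> partition 1, and so on.
        if (key instanceof Integer)
            return (Integer)key / 10000;

        // Any other key type falls into partition 0.
        return 0;
    }
How this applies to your case is something you need to work out.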

3. Please check how assignPartitions() is implemented in
RendezvousAffinityFunction.

It doesn't matter how many nodes, or what kind of nodes, you load data from;
affinity works the same way on every node.
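
As the javadoc [1] describes, the value returned by assignPartitions() is a list indexed by partition number, where each element lists the nodes holding that partition, primary first and backups after. The sketch below only mimics that shape in plain Java, with node names as strings. The class AssignmentShape and its round-robin rule are hypothetical and are not how RendezvousAffinityFunction distributes partitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only: no Ignite classes are used here.
class AssignmentShape {
    // Builds a partition -> [primary, backup...] table with a simple
    // round-robin rule (an assumption made for this example).
    static List<List<String>> assign(List<String> nodes, int parts, int backups) {
        List<List<String>> assignment = new ArrayList<>();
        int copies = Math.min(backups + 1, nodes.size());
        for (int p = 0; p < parts; p++) {
            List<String> owners = new ArrayList<>();
            for (int i = 0; i < copies; i++)
                owners.add(nodes.get((p + i) % nodes.size()));
            assignment.add(owners);
        }
        return assignment;
    }

    // The primary node of a partition is the first entry in its list.
    static String primary(List<List<String>> assignment, int part) {
        return assignment.get(part).get(0);
    }

    public static void main(String[] args) {
        List<List<String>> a = assign(Arrays.asList("N1", "N2", "N3"), 6, 1);
        for (int p = 0; p < a.size(); p++)
            System.out.println("partition " + p + " -> " + a.get(p));
    }
}
```

Reading "which node is this partition on" then comes down to indexing this table with the value returned by partition().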

[1]
https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/cache/affinity/AffinityFunction.html#assignPartitions-org.apache.ignite.cache.affinity.AffinityFunctionContext-

Thanks!
-Dmitry




Re: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Posted by the_palakkaran <ji...@suntecsbs.com>.

Thanks a ton again.

Still a few more doubts:

//It's much better if your AffinityFunction.partition() method chooses the
partition according to your key. If you have keys 1-10000, they should go
to partitions that belong to a single node. At the same time, the
assignPartitions() method should assign those related partitions to the same node. //


1. AffinityFunction.partition() calculates the partition a key goes
into. How do I know which node/machine this partition is on? Is there
an API for this?

2.  // If you have keys 1-10000, they should go to partitions that belong to a
single node //

How can I be sure about this, or how can I force this? The ids are
unique and do not have anything in common.

3.  //At the same time, the assignPartitions() method should assign those related
partitions to the same node.//
Is this used for assigning partitions to nodes? When I checked, it seemed to return
something else.

And finally, and most importantly: will this all
work (load, rebalance, failover) if I load data from a single node and the other
nodes do no data loading?





Re: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Posted by dkarachentsev <dk...@gridgain.com>.

There are various possible ways, but using one partition per node is
definitely a bad idea, because you lose the ability to scale. If you
have 5 partitions and 5 nodes, then a 6th node will be empty.

It's much better if your AffinityFunction.partition() method chooses the
partition according to your key. If you have keys 1-10000, they should go
to partitions that belong to a single node. At the same time, the
assignPartitions() method should assign those related partitions to the same node.

Or (a worse solution, but easier): use 5 partitions, distribute them across the nodes, and
put related keys into the proper partition.
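
The first suggestion can be pictured as a two-level rule: an id range selects a block of several partitions, and each block is placed on one node. The following plain-Java sketch shows only the arithmetic. RangeGrouping, PARTS_PER_NODE and nodeIndex() are names invented for this example, positive ids are assumed, and a real AffinityFunction would also have to handle backups and the current topology.

```java
// Plain-Java illustration only: no Ignite classes are used here.
class RangeGrouping {
    static final int RANGE = 10_000;      // ids per group (from the thread)
    static final int PARTS_PER_NODE = 4;  // assumption for this example

    // Ids 1..10000 form group 0, 10001..20000 form group 1, and so on.
    static int group(int customerId) {
        return (customerId - 1) / RANGE;
    }

    // Each group owns a block of PARTS_PER_NODE consecutive partitions;
    // the id is spread over that block with a modulo.
    static int partition(int customerId) {
        return group(customerId) * PARTS_PER_NODE + customerId % PARTS_PER_NODE;
    }

    // Every block of PARTS_PER_NODE partitions is placed on one node index;
    // with fewer nodes than blocks, the blocks wrap around.
    static int nodeIndex(int partition, int nodeCount) {
        return (partition / PARTS_PER_NODE) % nodeCount;
    }

    public static void main(String[] args) {
        int[] ids = {1, 10_000, 10_001, 50_000};
        for (int id : ids) {
            int p = partition(id);
            System.out.println("id " + id + " -> partition " + p
                + " -> node index " + nodeIndex(p, 5));
        }
    }
}
```

Because each node holds several partitions here, a block could later be split across new nodes, which a one-partition-per-node layout does not allow.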

Thanks!
-Dmitry




RE: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Posted by Stanislav Lukyanov <st...@gmail.com>.

Hi,

1. Yes
2. This is controlled by the AffinityFunction::assignPartitions implementation. If the default one doesn’t suit you, you have to write another one yourself.

Reading this thread I have a feeling that this is a case of the XY problem (https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem).
You’re asking about how to make sure that your keys 1-10000 are stored on the node 1, but this is not a business problem (this is the Y, not the X).

I believe that 99% of Ignite’s use cases don’t require implementing an AffinityFunction (or even tuning the default one).
In these cases, choosing an affinity key for your objects and defining it via @AffinityKeyMapped is enough. And a lot of the time you don’t even need the affinity key.
The developer normally doesn’t even need to think about the partition-to-node mappings (at least, not from the start).
That’s part of the beauty of scalable systems: you may add or remove a node and the system will just work, no need to change anything.

Try solving your problem by just defining the @AffinityKeyMapped field; this thread has a lot of info on how affinity collocation works.
If you’re still unsure of how to proceed, I suggest you describe your use case in more detail.

Thanks,
Stan

From: the_palakkaran
Sent: 13 June 2018, 12:11
To: user@ignite.apache.org
Subject: Re: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Thanks for the detailed reply, now everything makes more sense to me.

1. Do I have control over the number of partitions Ignite creates? I think this
is done using affinity.setPartitions(5), right?

2. Can I tell Ignite to have a single partition of my cache on each node? [5
nodes => 5 partitions]

Your suggested method 1 would work for me if I knew in which
partition, on which node, all entries marked with (int) affinity = 1 (meaning
ids ranging from 1-10000) are stored. I know about using
CacheConfiguration().getAffinityFunction().partition(Object key), but I
can't call this for each record; I need to know it in advance. I also need
to know which partitions are on which node.




Re: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Posted by the_palakkaran <ji...@suntecsbs.com>.

Thanks for the detailed reply, now everything makes more sense to me.

1. Do I have control over the number of partitions Ignite creates? I think this
is done using affinity.setPartitions(5), right?

2. Can I tell Ignite to have a single partition of my cache on each node? [5
nodes => 5 partitions]

Your suggested method 1 would work for me if I knew in which
partition, on which node, all entries marked with (int) affinity = 1 (meaning
ids ranging from 1-10000) are stored. I know about using
CacheConfiguration().getAffinityFunction().partition(Object key), but I
can't call this for each record; I need to know it in advance. I also need
to know which partitions are on which node.




Re: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Posted by dkarachentsev <dk...@gridgain.com>.

Hi,

I totally agree with Val that implementing your own AffinityFunction is quite
a complex approach. The requirement you described is called affinity co-location, as
I wrote before.
Let me explain in more detail what to do and what the drawbacks are.

1. Use @AffinityKeyMapped for all your keys. For example, on each cache
save, you set this field for a group of keys. Let's say CustomerKey contains
an additional annotated field "int affinity", equal to customerId /
10000. In this case you can be sure that all keys grouped by "affinity"
will fall into the same partition. You do not have to implement
AffinityFunction and it works automagically.
*BUT* there is no guarantee that each node will hold only one such
partition; there is a high risk that one node will hold two or more
partitions while others are empty.

2. Suppose you know for certain that you will never need more than 5 nodes.
In this case everything becomes much easier. You implement an AffinityFunction
that has 5 partitions and assigns only one partition to each node.
The partition() method groups your keys by the rule I showed before. If you
have fewer than 5 nodes, you may just put more than one partition on some nodes.
If you have more than 50000 customers, you need to enhance your partition()
method:
 // parts is the number of partitions (5 here); the modulo wraps larger ids around
 if (key instanceof Integer)
            return (Integer)key / 10000 % parts;

Everything is fine, *BUT* you cannot scale. If you add more nodes, they will
be empty, just because you have only 5 partitions (actually you may write
the affinity function in a more complex way, but in the end you'll finish with the regular
RendezvousAffinityFunction).

So analyze your requirements and choose the right way.
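
The scaling limit of option 2 is easy to see with a few lines of arithmetic. This is a plain-Java sketch under stated assumptions: FixedPartitions is a hypothetical class, partition() repeats the rule from the message with parts fixed at 5, and partitionsPerNode() uses a simple in-order placement that is only one of many possible assignments.

```java
import java.util.Arrays;

// Plain-Java illustration only: no Ignite classes are used here.
class FixedPartitions {
    static final int PARTS = 5; // fixed partition count from the example

    // Same rule as in the message: ids are grouped by 10000 and wrapped.
    static int partition(int customerId) {
        return customerId / 10_000 % PARTS;
    }

    // Places partition p on node (p % nodeCount) and returns how many
    // partitions each node ends up holding (assumed placement rule).
    static int[] partitionsPerNode(int nodeCount) {
        int[] counts = new int[nodeCount];
        for (int p = 0; p < PARTS; p++)
            counts[p % nodeCount]++;
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(partitionsPerNode(3))); // [2, 2, 1]
        System.out.println(Arrays.toString(partitionsPerNode(5))); // [1, 1, 1, 1, 1]
        System.out.println(Arrays.toString(partitionsPerNode(6))); // [1, 1, 1, 1, 1, 0]
    }
}
```

With six nodes the last counter stays at zero: there is simply no sixth partition to give to the new node.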

Thanks!
-Dmitry




Re: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Posted by the_palakkaran <ji...@suntecsbs.com>.

Hi Val,

My requirement is something like this:

5 nodes, 50000 customers.

I will keep the data of 10000 customers on each node.

Whenever a file in the range 1-10000 comes in, I will process it on node
1, so that the read time is minimal (rather than looking for data on all
nodes).

I hope you got my requirement.




Re: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Posted by vkulichenko <va...@gmail.com>.

Yes, every time the topology changes, assignments will be recalculated and
data will be rebalanced.

By partitioning mechanisms I meant basically the same thing that Dmitry was
describing before. Ignite automatically distributes cache entries across
nodes based on a built-in affinity function. This function is pluggable and
you are free to create your own, but this task is usually not trivial; I
would recommend doing that only if you're absolutely sure that's the only
way to go. As a matter of fact, I have never seen a use case where it would
be a strong requirement.
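
To give a feel for what recalculating assignments on a topology change can look like, here is a minimal sketch of rendezvous (highest-random-weight) hashing in plain Java. It is not Ignite's RendezvousAffinityFunction: the class RendezvousSketch, the score() mixing function and the string node names are all assumptions made for illustration, and backups are ignored.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Plain-Java illustration only: no Ignite classes are used here.
class RendezvousSketch {
    // Deterministic score for a (partition, node) pair; the mixing
    // steps are an arbitrary choice for this example.
    static int score(int part, String node) {
        int h = Objects.hash(part, node);
        h ^= (h >>> 16);
        h *= 0x45d9f3b;
        h ^= (h >>> 16);
        return h;
    }

    // The owner of a partition is the node with the highest score
    // (the first such node in list order if scores tie).
    static String owner(int part, List<String> nodes) {
        String best = null;
        int bestScore = 0;
        for (String n : nodes) {
            int s = score(part, n);
            if (best == null || s > bestScore) {
                best = n;
                bestScore = s;
            }
        }
        return best;
    }

    // Counts partitions whose owner differs between two topologies.
    static int moved(int parts, List<String> before, List<String> after) {
        int count = 0;
        for (int p = 0; p < parts; p++)
            if (!owner(p, before).equals(owner(p, after)))
                count++;
        return count;
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("N1", "N2", "N3", "N4");
        List<String> after = Arrays.asList("N1", "N2", "N4"); // N3 left
        System.out.println("partitions moved: " + moved(1024, before, after));
    }
}
```

The useful property is that when a node leaves, only the partitions it owned get a new owner; every other partition keeps the same highest-scoring node, so rebalancing touches only the affected data.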

I also believe that the actual partitioning algorithm (regardless of which
one you use) should be abstracted from the application, meaning that the
business logic should not depend on it. It sounds like you want to break
this abstraction, which makes me think that you might be going in the wrong
direction. But unfortunately I don't think I can help more since you're not
ready to disclose details :)

-Val




Re: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Posted by the_palakkaran <ji...@suntecsbs.com>.

The requirement is an actual scenario, mate. I would have explained it if it
was not a public forum.

Also this is now a personal interest of mine, you have a great product,
buddy.

What are those "partitioning" mechanisms?




Re: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Posted by vkulichenko <va...@gmail.com>.

Hi the_palakkaran,

Where is this requirement coming from? Why don't you just use the partitioning
mechanisms Ignite provides out of the box?

-Val




Re: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Posted by the_palakkaran <ji...@suntecsbs.com>.

Actually my requirement is something like the below:

I have customers with customer numbers from 1 to 50000, I have 5 nodes.
I need my first 10000 in node1, next 10000 in node2 and so on.
So I guess I cannot do that here.

Again about the below:

//AffinityFunction does two things: it maps partitions to nodes and keys to
partitions. If you override RendezvousAffinityFunction so that
partitions 1-4 go to node 1, you need to make sure that your keys
fall into those partitions. //

Does this mean a node can have multiple partitions? How can I make it a single
partition per node, so that if I have 5 nodes there are at
most 5 partitions?





Re: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Posted by Mikhail <mi...@gmail.com>.

Hi

You can gather your data into one partition with @AffinityKeyMapped, but you
cannot force a partition to be stored on a particular node.
So you can use the same @AffinityKeyMapped value for all data related to a
particular city, which means all data about this city and/or related to
this city will be on one particular node. However, this also means that data
related to London and Zurich might end up in the same partition.
Anyway, if your list of cities has more than 1000 entries, the distribution should be fair enough. In the
case of a small set of cities, you can try to implement your own affinity
function, but this isn't an easy task and can take significant time.
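
The point about London and Zurich sharing a partition follows from hashing a small set of affinity values into a limited number of partitions. The sketch below shows the effect in plain Java. CityPartitions is a hypothetical class and String.hashCode() with a modulo is only a stand-in: Ignite's own key hashing is different, so the concrete groupings printed here say nothing about a real cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only: no Ignite classes are used here.
class CityPartitions {
    // Stand-in for the key-to-partition step (assumed hashing rule).
    static int partition(String cityCode, int parts) {
        return Math.floorMod(cityCode.hashCode(), parts);
    }

    // Groups city codes by the partition they land in.
    static Map<Integer, List<String>> group(List<String> cities, int parts) {
        Map<Integer, List<String>> byPart = new TreeMap<>();
        for (String c : cities)
            byPart.computeIfAbsent(partition(c, parts), k -> new ArrayList<>()).add(c);
        return byPart;
    }

    public static void main(String[] args) {
        List<String> cities =
            Arrays.asList("Zurich", "London", "Melbourne", "Wellingdon");
        // Four cities over three partitions: at least two must share one.
        System.out.println(group(cities, 3));
    }
}
```

All records of one city always land together, but which cities end up as neighbours is decided by the hash, not by the application.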

Thanks,
Mike.




Re: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Posted by krishd <de...@tcs.com>.

Hello

I have a similar requirement:
I need data related to a particular AffinityKey to go to partitions OF PREDEFINED
NODES ONLY.
For example:
a PersonCache with 5 partitions distributed across 3 nodes.
The Person object has cityCode as its @AffinityKeyMapped field.
The Person objects can have any of these 4 cityCodes: Zurich, London,
Melbourne, Wellingdon.
While streaming a large number of Person objects into the PersonCache, I want
all persons belonging to Zurich & Melbourne to go to Node1, London to
Node2, and Wellingdon to Node3.

Is there a way to enforce this?

The reason I need this is that I have a lot of other master/reference
data that is specific to the city, and I don't want to load all of it on
all nodes.

thanks




Re: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Posted by dkarachentsev <dk...@gridgain.com>.

Normally (without @AffinityKeyMapped) Ignite will use the CustomerKey hash code
(not the object's hashCode()) to find a partition. Ignite consults the
AffinityFunction (partition() method) to decide which partition the key goes to, and uses
assignPartitions() to find the concrete node that holds that partition.

On the other hand, if you annotate a field with @AffinityKeyMapped, the value
of that field will be used for mapping to a partition. In your case, I
suppose, you need to map a field that is common to your keys and by which you
can group them into one partition. For example, annotating the customer
name means that keys with the same customer name will always hit the
same partition.
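
The difference between the two cases can be sketched without Ignite at all. In the plain-Java example below, CustomerKeySketch is a hypothetical key class, the comment marks the field that would carry @AffinityKeyMapped in a real key, and both partition methods use simple stand-in hashing rather than Ignite's own.

```java
import java.util.Objects;

// Plain-Java illustration only: no Ignite classes are used here.
class CustomerKeySketch {
    final int customerId;
    final String customerName; // would carry @AffinityKeyMapped in Ignite

    CustomerKeySketch(int customerId, String customerName) {
        this.customerId = customerId;
        this.customerName = customerName;
    }

    // Without an affinity field: the whole key decides the partition
    // (stand-in hashing, assumed for this example).
    static int partitionByKey(CustomerKeySketch key, int parts) {
        return Math.floorMod(Objects.hash(key.customerId, key.customerName), parts);
    }

    // With an affinity field: only that field decides the partition.
    static int partitionByAffinity(CustomerKeySketch key, int parts) {
        return Math.floorMod(key.customerName.hashCode(), parts);
    }

    public static void main(String[] args) {
        CustomerKeySketch a = new CustomerKeySketch(1, "Acme");
        CustomerKeySketch b = new CustomerKeySketch(2, "Acme");
        System.out.println("by key:      " + partitionByKey(a, 1024)
            + " vs " + partitionByKey(b, 1024));
        System.out.println("by affinity: " + partitionByAffinity(a, 1024)
            + " vs " + partitionByAffinity(b, 1024));
    }
}
```

Two keys with different ids but the same customer name always agree under partitionByAffinity(), which is the co-location guarantee described above; under partitionByKey() they generally do not.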

AffinityFunction does two things: it maps partitions to nodes and keys to
partitions. If you override RendezvousAffinityFunction so that
partitions 1-4 go to node 1, you need to make sure that your keys
fall into those partitions.

You may start with the annotation first (this technique is called affinity
co-location: related keys are put into the same partition); I think that
is what you need.

The affinity implementation is set via CacheConfiguration.setAffinity().

Thanks!
-Dmitry




Re: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Posted by the_palakkaran <ji...@suntecsbs.com>.

Thanks.

I still have a few doubts:

1. If I have customer id, customer number, and customer name fields in
CustomerKey, but only customer number is unique, I should annotate only
customer number with @AffinityKeyMapped. Is that so?

2. If so, Ignite internally decides which node a key with the annotation
should always go to, right? To override this, are you suggesting
RendezvousAffinityFunction?

3. How do I get a RendezvousAffinityFunction instance? How do I map it to
a node, say N1 in a cluster of 3 nodes N1, N2, N3?

Sorry to ask for spoon-feeding; I am in a hurry to complete this.




Re: How to use Affinity Function to Map a set of Keys to a particular node in a cluster?

Posted by dkarachentsev <dk...@gridgain.com>.

Hi,

Make sure that your keys go to a specific partition. Only one node can
hold that partition at a time (except for backups, of course). To do that, you
may use the @AffinityKeyMapped annotation [1].

Additionally, you can implement your own AffinityFunction that will assign
the partitions you need to specific node(s). You may try to extend
RendezvousAffinityFunction for that. In this case you may assign the number of
partitions you need to the proper node.

[1]
https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/model/EmployeeKey.java#L33

Thanks!
-Dmitry


