Posted to user@geode.apache.org by aashish choudhary <aa...@gmail.com> on 2019/08/19 20:19:04 UTC

Gemfire functions giving duplicate records

We use a data-aware function that makes a call to region X using the
getLocalData API and then does a getAll. Recently we introduced redundancy
for our partitioned region, and now we are getting duplicate entries for
region X in the function response. My hunch is that it is because of the
getLocalData + getAll call, so if we change it to getLocalPrimaryData (hope
the name is correct) for region X, it should only get the primary copies. Is
that the correct way of handling it?

With best regards,
Ashish

Re: Gemfire functions giving duplicate records

Posted by Barry Oglesby <bo...@pivotal.io>.
> For option 3, I'm not sure I really get it: "Still use a filter, but just
> use the region in that case. Some of the gets will be remote."

> So you are saying that if I make the onRegion function call with
> withFilter, then the other region X call where we get duplicates will get
> resolved? And is that irrespective of whether optimizeForWrite is true or
> false?

onRegion.withFilter will split the keys according to optimizeForWrite, so
each key will only be sent to one server. If you use
cache.getRegion(regionX).getAll(keys) in the function, there won't be any
duplicates. Some of those gets will be local and some will be remote
depending on where the buckets for region X are defined.
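
For context, here is a minimal sketch of what the server-side function could
look like for option 3. This is a sketch only: the class name, the region
name "X", and the Map result shape are illustrative assumptions (and it
assumes a recent Apache Geode Function API), not something taken from this
thread.

import java.util.HashMap;
import java.util.Set;

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.execute.Function;
import org.apache.geode.cache.execute.FunctionContext;
import org.apache.geode.cache.execute.RegionFunctionContext;

public class RegionXGetAllFunction implements Function<Object> {

  @Override
  public void execute(FunctionContext<Object> context) {
    RegionFunctionContext rfc = (RegionFunctionContext) context;

    // The subset of the filter keys routed to this member. With withFilter,
    // each key is routed to exactly one member, so nothing is fetched twice.
    Set<?> keys = rfc.getFilter();

    // Use the plain region rather than PartitionRegionHelper.getLocalData:
    // keys whose region X buckets are not hosted locally are fetched from
    // the members that do host them instead of coming back as null.
    Cache cache = CacheFactory.getAnyInstance();
    Region<Object, Object> regionX = cache.getRegion("X");

    HashMap<Object, Object> result = new HashMap<>(regionX.getAll(keys));
    context.getResultSender().lastResult(result);
  }

  @Override
  public String getId() {
    return getClass().getSimpleName();
  }
}

Registered on each server, this pairs with the onRegion(...).withFilter(keys)
call shown below.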

Execute the function like:

Set keys = ...;
Object result = FunctionService.onRegion(this.region)
    .withFilter(keys)
    .execute("MyFunction")
    .getResult();

With these keys:

keysSize=24; keys=[77, 24, 193, 184, 272, 196, 264, 165, 342, 397, 2, 101,
125, 158, 433, 335, 446, 204, 205, 414, 7, 306, 405, 31]

You'll see this kind of behavior:

----------------------
optimizeForWrite false
----------------------

Server1
-------
keysSize=9; keys=[264, 342, 158, 335, 405, 306, 193, 272, 184]
regionKeysAndValues=9; regionKeysAndValues={264=264, 342=342, 158=158,
335=335, 405=405, 306=306, 193=193, 272=272, 184=184}

Server2
-------
keysSize=16; keys=[36, 15, 27, 19, 273, 386, 465, 136, 368, 357, 314, 6,
216, 306, 339, 63]
regionKeysAndValues=16; regionKeysAndValues={36=36, 15=15, 27=27, 19=19,
273=273, 386=386, 465=465, 136=136, 368=368, 357=357, 314=314, 6=6,
216=216, 306=306, 339=339, 63=63}

---------------------
optimizeForWrite true
---------------------

Server1
-------
keysSize=9; keys=[165, 397, 101, 24, 204, 446, 7, 414, 31]
regionKeysAndValues=9; regionKeysAndValues={165=165, 397=397, 101=101,
24=24, 204=204, 446=446, 7=7, 414=414, 31=31}

Server2
-------
keysSize=7; keys=[2, 158, 335, 205, 306, 272, 184]
regionKeysAndValues=7; regionKeysAndValues={2=2, 158=158, 335=335, 205=205,
306=306, 272=272, 184=184}

Server3
-------
keysSize=8; keys=[77, 264, 342, 125, 433, 405, 193, 196]
regionKeysAndValues=8; regionKeysAndValues={77=77, 264=264, 342=342,
125=125, 433=433, 405=405, 193=193, 196=196}

Thanks,
Barry Oglesby




Re: Gemfire functions giving duplicate records

Posted by aashish choudhary <aa...@gmail.com>.
Thanks, Barry, for the thorough analysis. I am kind of in favour of option 1,
as we are not so sure about colocating the regions.
For option 3, I'm not sure I really get it: "Still use a filter, but just use
the region in that case. Some of the gets will be remote."

So you are saying that if I make the onRegion function call with withFilter,
then the other region X call where we get duplicates will get resolved? And
is that irrespective of whether optimizeForWrite is true or false?

I was also thinking about making a nested function call to region X, but I'm
not sure whether that is recommended or whether it could run into some
distributed lock situation.


Re: Gemfire functions giving duplicate records

Posted by Barry Oglesby <bo...@pivotal.io>.
Ashish,

Here is a bunch of analysis on that scenario.

-------------
No redundancy
-------------
With partitioned regions, no redundancy and no filter, the function is
being sent to every member that contains buckets.

In that case, you see this kind of behavior (I have 3 servers in my test):

The argument containing the keys is sent to every member. In this case, I
have 24 keys.

keysSize=24; keys=[44, 67, 59, 49, 162, 261, 284, 473, 397, 475, 376, 387,
101, 157, 366, 301, 469, 403, 427, 70, 229, 108, 50, 85]

When you call PartitionRegionHelper.getLocalData or getLocalPrimaryData,
you're getting back a LocalDataSet. Calling get or getAll on a LocalDataSet
returns null if the value is not in that LocalDataSet. This causes all the
get calls to be local and a bunch of nulls in the result.
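
To make that concrete, here is a minimal sketch of the pattern being
described, as it might appear inside a data-aware function. The class name,
the region name "X", and the printlns are illustrative assumptions.

import java.util.Map;
import java.util.Set;

import org.apache.geode.cache.Region;
import org.apache.geode.cache.partition.PartitionRegionHelper;

public final class LocalDataSetExample {

  // Called from inside a data-aware function; 'regionX' is the full region
  // and 'keys' is the key set passed in as the function argument.
  public static void compare(Region<Object, Object> regionX, Set<Object> keys) {
    // Both helpers return a LocalDataSet view of region X on this member.
    Region<Object, Object> localData = PartitionRegionHelper.getLocalData(regionX);
    Region<Object, Object> primaryData = PartitionRegionHelper.getLocalPrimaryData(regionX);

    // getAll on a LocalDataSet never goes remote: keys whose buckets are not
    // hosted on this member (or not primary, for the primary view) come back
    // as null instead of being fetched from another server.
    Map<Object, Object> localValues = localData.getAll(keys);
    Map<Object, Object> primaryValues = primaryData.getAll(keys);

    System.out.println("localDataKeysAndValues=" + localValues);
    System.out.println("primaryDataKeysAndValues=" + primaryValues);
  }
}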

If I print the LocalDataSet and the value of getAll in all three servers, I
see 24 non-null results across the servers.

Server1
-------
localData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION
;bucketIds=[2, 3, 7, 10, 11, 15, 16, 19, 20, 25, 26, 29, 31, 34, 40, 41,
46, 49, 52, 56, 57, 58, 60, 65, 67, 68, 71, 75, 78, 82, 84, 87, 92, 93,
102, 104, 110]]
localDataKeysAndValuesSize=24; nonNullLocalDataKeysAndValuesSize=10;
localDataKeysAndValues={44=44, 67=67, 59=null, 49=49, 162=162, 261=null,
284=null, 473=473, 397=null, 475=null, 376=null, 387=387, 101=null,
157=157, 366=366, 301=null, 469=null, 403=null, 427=427, 70=70, 229=null,
108=null, 50=null, 85=null}; nonNullLocalDataKeysAndValues={44=44, 473=473,
67=67, 387=387, 157=157, 366=366, 49=49, 427=427, 70=70, 162=162}

Server2
-------
localData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION
;bucketIds=[0, 5, 6, 8, 13, 18, 21, 24, 27, 32, 35, 36, 37, 38, 43, 45, 48,
51, 55, 61, 64, 66, 69, 72, 74, 79, 80, 83, 86, 91, 94, 96, 99, 100, 105,
106, 108, 112]]
localDataKeysAndValuesSize=24; nonNullLocalDataKeysAndValuesSize=8;
localDataKeysAndValues={44=null, 67=null, 59=59, 49=null, 162=null,
261=null, 284=284, 473=null, 397=397, 475=null, 376=null, 387=null,
101=101, 157=null, 366=null, 301=301, 469=null, 403=403, 427=null, 70=null,
229=null, 108=108, 50=null, 85=85}; nonNullLocalDataKeysAndValues={397=397,
101=101, 59=59, 301=301, 403=403, 108=108, 85=85, 284=284}

Server3
-------
localData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION
;bucketIds=[1, 4, 9, 12, 14, 17, 22, 23, 28, 30, 33, 39, 42, 44, 47, 50,
53, 54, 59, 62, 63, 70, 73, 76, 77, 81, 85, 88, 89, 90, 95, 97, 98, 101,
103, 107, 109, 111]]
localDataKeysAndValuesSize=24; nonNullLocalDataKeysAndValuesSize=6;
localDataKeysAndValues={44=null, 67=null, 59=null, 49=null, 162=null,
261=261, 284=null, 473=null, 397=null, 475=475, 376=376, 387=null,
101=null, 157=null, 366=null, 301=null, 469=469, 403=null, 427=null,
70=null, 229=229, 108=null, 50=50, 85=null};
nonNullLocalDataKeysAndValues={475=475, 376=376, 469=469, 229=229, 50=50,
261=261}

---------------------------------
Redundancy optimizeForWrite false
---------------------------------
If I change both regions to be redundant, I see very different behavior.
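
(For reference, a partitioned region with redundancy can be created along
these lines; the region name and the redundant-copies value of 1 are
assumptions for the sketch.)

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.PartitionAttributesFactory;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.RegionFactory;
import org.apache.geode.cache.RegionShortcut;

public final class RedundantRegionExample {

  public static Region<Object, Object> create(Cache cache, String name) {
    // One redundant copy of each bucket in addition to the primary.
    PartitionAttributesFactory<Object, Object> paf = new PartitionAttributesFactory<>();
    paf.setRedundantCopies(1);

    RegionFactory<Object, Object> factory = cache.createRegionFactory(RegionShortcut.PARTITION);
    factory.setPartitionAttributes(paf.create());
    return factory.create(name);
  }

  public static void main(String[] args) {
    Cache cache = new CacheFactory().create();
    create(cache, "data-2");
  }
}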

First, with optimizeForWrite returning false (the default), only 2 of the
servers invoke the function. In the optimizeForWrite false case, the
function is sent to the fewest number of servers that include all the
buckets.

keysSize=24; keys=[390, 292, 370, 261, 250, 273, 130, 460, 274, 452, 123,
388, 113, 268, 455, 400, 159, 435, 314, 429, 51, 419, 84, 43]

As you saw, getLocalData will produce duplicates since some of the buckets
overlap between the servers. In this case, you'll see all the data. If you
call getLocalPrimaryData, you probably won't see all the data since some of
the primaries will be in the server that doesn't execute the function.

You can see below the local data set returns 35 entries for the 24 keys;
the primary set returns only 15.

Server1
-------
localData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION
;bucketIds=[0, 2, 3, 4, 7, 9, 10, 12, 15, 16, 17, 19, 21, 22, 23, 24, 26,
28, 29, 31, 32, 33, 34, 36, 37, 38, 40, 41, 43, 45, 46, 47, 48, 49, 51, 53,
54, 56, 58, 61, 62, 63, 66, 67, 68, 69, 70, 72, 74, 75, 76, 78, 80, 81, 82,
84, 86, 87, 88, 89, 90, 91, 93, 94, 95, 97, 98, 99, 100, 101, 103, 105,
106, 110, 111]]
nonNullLocalDataKeysAndValuesSize=19;
nonNullLocalDataKeysAndValues={390=390, 292=292, 261=261, 250=250, 273=273,
130=130, 460=460, 274=274, 452=452, 123=123, 388=388, 113=113, 400=400,
159=159, 435=435, 429=429, 51=51, 84=84, 43=43}

primaryData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION
;bucketIds=[0, 66, 4, 68, 70, 7, 10, 74, 12, 76, 78, 17, 82, 84, 21, 87,
24, 90, 28, 95, 32, 33, 99, 37, 101, 38, 103, 106, 43, 45, 46, 110, 49, 51,
54, 58, 62, 63]]
nonNullPrimaryDataKeysAndValuesSize=7;
nonNullPrimaryDataKeysAndValues={452=452, 388=388, 435=435, 429=429, 51=51,
250=250, 274=274}

Server2
-------
localData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION
;bucketIds=[0, 1, 4, 5, 6, 8, 9, 10, 11, 13, 14, 15, 16, 18, 20, 21, 23,
25, 27, 28, 29, 30, 33, 35, 36, 39, 40, 41, 42, 44, 46, 49, 50, 52, 53, 55,
56, 57, 59, 60, 61, 64, 65, 66, 67, 69, 71, 72, 73, 75, 76, 77, 78, 79, 81,
82, 83, 85, 87, 88, 91, 92, 93, 96, 97, 98, 101, 102, 103, 104, 105, 107,
108, 109, 111, 112]]
nonNullLocalDataKeysAndValuesSize=16;
nonNullLocalDataKeysAndValues={370=370, 261=261, 250=250, 130=130, 460=460,
274=274, 388=388, 113=113, 268=268, 455=455, 400=400, 435=435, 314=314,
419=419, 84=84, 43=43}

primaryData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION
;bucketIds=[64, 1, 67, 5, 69, 72, 9, 11, 75, 15, 79, 16, 81, 18, 83, 20,
23, 88, 91, 29, 93, 30, 97, 98, 35, 36, 102, 40, 41, 105, 108, 109, 111,
50, 53, 56, 61]]
nonNullPrimaryDataKeysAndValuesSize=8;
nonNullPrimaryDataKeysAndValues={113=113, 400=400, 419=419, 84=84, 261=261,
130=130, 460=460, 43=43}

--------------------------------
Redundancy optimizeForWrite true
--------------------------------
With optimizeForWrite returning true, I see exactly 24 results. That's
because the function is executed on all members, and each key is primary on
only one of them. The approach still won't work, though, if you have some
servers with no primaries.

keysSize=24; keys=[13, 58, 25, 181, 491, 183, 282, 173, 294, 495, 122, 365,
222, 486, 476, 146, 236, 139, 70, 449, 60, 51, 63, 43]

nonNullPrimaryDataKeysAndValuesSize=8;
nonNullPrimaryDataKeysAndValues={365=365, 146=146, 236=236, 139=139,
491=491, 183=183, 173=173, 43=43}
nonNullPrimaryDataKeysAndValuesSize=7;
nonNullPrimaryDataKeysAndValues={13=13, 222=222, 449=449, 282=282, 51=51,
294=294, 63=63}
nonNullPrimaryDataKeysAndValuesSize=9;
nonNullPrimaryDataKeysAndValues={495=495, 122=122, 486=486, 58=58, 25=25,
476=476, 70=70, 181=181, 60=60}
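
Note that optimizeForWrite is a property of the Function implementation
itself. Here is a minimal sketch of overriding it, assuming a recent Geode
Function API with default methods (the class name and the placeholder result
are illustrative):

import org.apache.geode.cache.execute.Function;
import org.apache.geode.cache.execute.FunctionContext;

public class PrimaryOnlyFunction implements Function<Object> {

  @Override
  public boolean optimizeForWrite() {
    // true routes the execution to the members hosting the primary buckets,
    // so each filter key is handled by exactly one member.
    return true;
  }

  @Override
  public void execute(FunctionContext<Object> context) {
    // The real work would go here; just acknowledge for the sketch.
    context.getResultSender().lastResult(Boolean.TRUE);
  }
}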

-------
Options
-------
There are a few things you could look into:

1. Plan for duplicates by defining a custom ResultCollector that filters out
the duplicates (a sketch follows this list).
2. Co-locate your regions, which means the same buckets will be on the same
servers and will be primary on the same servers. Then use a filter instead of
an argument and return true for optimizeForWrite. In this case, it shouldn't
matter how you get the other region (a configuration sketch follows the
option 2 output below).
3. If you can't colocate your regions, then don't use either getLocalData or
getLocalPrimaryData. Still use a filter, but just use the region in that
case. Some of the gets will be remote.
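
Here is a minimal sketch of option 1, assuming each member returns its
portion of the results as a Map of key/value pairs. The class name and that
result shape are assumptions; adapt it to whatever your function actually
sends.

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

import org.apache.geode.cache.execute.FunctionException;
import org.apache.geode.cache.execute.ResultCollector;
import org.apache.geode.distributed.DistributedMember;

public class DeduplicatingResultCollector
    implements ResultCollector<Map<Object, Object>, Map<Object, Object>> {

  private final Map<Object, Object> merged = Collections.synchronizedMap(new HashMap<>());

  @Override
  public void addResult(DistributedMember member, Map<Object, Object> partialResult) {
    if (partialResult == null) {
      return;
    }
    // Merging into one map keyed by entry key drops entries returned by more
    // than one member; null values (local misses) are skipped.
    for (Map.Entry<Object, Object> entry : partialResult.entrySet()) {
      if (entry.getValue() != null) {
        merged.put(entry.getKey(), entry.getValue());
      }
    }
  }

  @Override
  public Map<Object, Object> getResult() throws FunctionException {
    return merged;
  }

  @Override
  public Map<Object, Object> getResult(long timeout, TimeUnit unit) throws FunctionException {
    return merged;
  }

  @Override
  public void endResults() {
    // Nothing extra to do when the last result arrives.
  }

  @Override
  public void clearResults() {
    merged.clear();
  }
}

It would be plugged in with something like
FunctionService.onRegion(region).withCollector(new DeduplicatingResultCollector()).execute("MyFunction").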

The result of option 2 on three servers with 24 keys looks like the output
below. The keys are split among the servers, region.getAll is entirely local,
and all the values are returned.

keysSize=9; keys=[222, 179, 476, 147, 346, 259, 128, 81, 153]
regionKeysAndValues=9; regionKeysAndValues={222=222, 179=179, 476=476,
147=147, 346=346, 259=259, 128=128, 81=81, 153=153}

keysSize=9; keys=[133, 45, 387, 343, 58, 234, 107, 9, 141]
regionKeysAndValues=9; regionKeysAndValues={133=133, 45=45, 387=387,
343=343, 58=58, 234=234, 107=107, 9=9, 141=141}

keysSize=6; keys=[122, 279, 237, 381, 131, 351]
regionKeysAndValues=6; regionKeysAndValues={122=122, 279=279, 237=237,
381=381, 131=131, 351=351}
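
And a sketch of the colocation side of option 2 (region names and the
redundancy level are placeholders; colocated regions have to agree on
redundancy and bucket count):

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.PartitionAttributesFactory;
import org.apache.geode.cache.RegionFactory;
import org.apache.geode.cache.RegionShortcut;

public final class ColocatedRegionsExample {

  public static void create(Cache cache) {
    // The region the function is executed on. PARTITION_REDUNDANT gives one
    // redundant copy per bucket.
    cache.createRegionFactory(RegionShortcut.PARTITION_REDUNDANT).create("data");

    // Region X, colocated with "data": matching buckets (and their primaries)
    // live on the same members, so a filtered getAll never leaves the member.
    PartitionAttributesFactory<Object, Object> paf = new PartitionAttributesFactory<>();
    paf.setRedundantCopies(1);
    paf.setColocatedWith("data");

    RegionFactory<Object, Object> factory = cache.createRegionFactory(RegionShortcut.PARTITION);
    factory.setPartitionAttributes(paf.create());
    factory.create("X");
  }
}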

Thanks,
Barry Oglesby




Re: Gemfire functions giving duplicate records

Posted by aashish choudhary <aa...@gmail.com>.
You have a data-aware function (invoked by onRegion) from which you call
getAll in region X. That's correct.

Is region X the region on which the function is executed? Or is it another
region? X is a different region.
If multiple regions are involved, are they co-located? Not colocated.

How do you determine the keys to getAll? Let's just say the key passed to
both regions is the same; we basically merge the data and return the result.

Are they passed into the function? If so, as a filter or as an argument? As
an argument. Using a filter could have been a better approach.

What does optimizeForWrite return? How many members are running? I have to
check and confirm. We have 12 nodes running.


Re: Gemfire functions giving duplicate records

Posted by Barry Oglesby <bo...@pivotal.io>.
Ashish,

Sorry for all the questions, but I want to make sure I understand the
scenario. You have a data-aware function (invoked by onRegion) from which
you call getAll in region X. Is region X the region on which the function
is executed? Or is it another region? If multiple regions are involved, are
they co-located? How do you determine the keys to getAll? Are they passed
into the function? If so, as a filter or as an argument? What does
optimizeForWrite return? How many members are running?

Thanks,
Barry Oglesby


