Posted to user@ignite.apache.org by Akash Shinde <ak...@gmail.com> on 2020/06/06 13:10:06 UTC

CountDownLatch issue in Ignite 2.6 version

*Issue:* The countdown latch gets reinitialized to its original value (4)
when one or more (but not all) nodes go down. *(Partition loss happened.)*

We are using Ignite's distributed countdown latch to make sure that cache
loading is completed on all server nodes. We do this to make sure that our
Kafka consumers start only after cache loading is complete on all server
nodes. This is the basic criterion that needs to be fulfilled before actual
processing starts.


We have 4 server nodes and the countdown latch is initialized to 4. We use
the cache.loadCache method to start cache loading. When each server
completes cache loading, it reduces the count by 1 using the countDown
method, so when all the nodes complete cache loading, the count reaches
zero. When the count reaches zero, we start the Kafka consumers on all
server nodes.
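
A minimal sketch of this startup sequence (the config path, the latch and
cache names, and startKafkaConsumers() are illustrative placeholders, not
our actual code):

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCountDownLatch;
    import org.apache.ignite.Ignition;

    public class CacheLoadGate {
        public static void main(String[] args) {
            Ignite ignite = Ignition.start("ignite-config.xml"); // placeholder config

            // Create (or join) a cluster-wide latch initialized to the node count.
            IgniteCountDownLatch latch = ignite.countDownLatch(
                "cacheLoadLatch", // latch name (placeholder)
                4,                // initial count = number of server nodes
                false,            // autoDelete: keep the latch once it reaches zero
                true);            // create the latch if it does not exist yet

            ignite.cache("myCache").loadCache(null); // start cache loading
            latch.countDown();                       // this node finished loading

            latch.await();         // block until all 4 nodes have counted down
            startKafkaConsumers(); // placeholder: begin consuming only now
        }

        private static void startKafkaConsumers() {
            // placeholder for the real consumer startup logic
        }
    }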

But we saw weird behavior in the prod environment: 3 server nodes were shut
down at the same time, but 1 node was still alive. When this happened, the
countdown was reinitialized to its original value, i.e. 4. I am not able to
reproduce this in the dev environment.

Is it a bug that when one or more (but not all) nodes go down, the count
reinitializes back to its original value?

Thanks,
Akash

Re: CountDownLatch issue in Ignite 2.6 version

Posted by Evgenii Zhuravlev <e....@gmail.com>.
Prasad,

Please don't use the dev list for questions regarding product usage; the
dev list is used for development-related activities.

To see how this configuration is used for the CountDownLatch, you can take
a look at these 2 methods:
https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/processors/datastructures/DataStructuresProcessor.java#L1187

https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/processors/datastructures/DataStructuresProcessor.java#L495

Evgenii

Mon, Jun 8, 2020 at 20:43, Prasad Bhalerao <pr...@gmail.com>:

> I just checked the Ignite doc for atomic configuration.
> But it doesn't say that it is applicable to distributed data structures.
>
> Is it really applicable to distributed data structures like the countdown
> latch?
>
> On Tue 9 Jun, 2020, 7:26 AM Prasad Bhalerao <prasadbhalerao1983@gmail.com
> wrote:
>
>> Hi,
>> I was under the impression that the countdown latch is implemented on a
>> replicated cache, so when any number of nodes go down it does not lose
>> its state.
>>
>> Can you please explain why atomic data structures use only 1 backup when
>> their state is so important?
>>
>> Can we force atomic data structures to use a replicated cache?
>>
>> Which cache does Ignite use to store atomic data structures?
>>
>> Thanks
>> Prasad
>>
>> On Mon 8 Jun, 2020, 11:58 PM Evgenii Zhuravlev <e.zhuravlev.wk@gmail.com
>> wrote:
>>
>>> Hi,
>>>
>>> By default, the cache that stores all atomic structures has only 1
>>> backup, so after losing all data for this particular latch, Ignite
>>> recreates it. To change the default atomic configuration, use
>>> IgniteConfiguration.setAtomicConfiguration.
>>>
>>> Evgenii
>>>
Sat, Jun 6, 2020 at 06:20, Akash Shinde <ak...@gmail.com>:
>>>
>>>> *Issue:* The countdown latch gets reinitialized to its original value
>>>> (4) when one or more (but not all) nodes go down. *(Partition loss
>>>> happened.)*
>>>>
>>>> We are using Ignite's distributed countdown latch to make sure that
>>>> cache loading is completed on all server nodes. We do this to make sure
>>>> that our Kafka consumers start only after cache loading is complete on
>>>> all server nodes. This is the basic criterion that needs to be fulfilled
>>>> before actual processing starts.
>>>>
>>>> We have 4 server nodes and the countdown latch is initialized to 4. We
>>>> use the cache.loadCache method to start cache loading. When each server
>>>> completes cache loading, it reduces the count by 1 using the countDown
>>>> method, so when all the nodes complete cache loading, the count reaches
>>>> zero. When the count reaches zero, we start the Kafka consumers on all
>>>> server nodes.
>>>>
>>>> But we saw weird behavior in the prod environment: 3 server nodes were
>>>> shut down at the same time, but 1 node was still alive. When this
>>>> happened, the countdown was reinitialized to its original value, i.e. 4.
>>>> I am not able to reproduce this in the dev environment.
>>>>
>>>> Is it a bug that when one or more (but not all) nodes go down, the count
>>>> reinitializes back to its original value?
>>>>
>>>> Thanks,
>>>> Akash
>>>>
>>>

Re: CountDownLatch issue in Ignite 2.6 version

Posted by Prasad Bhalerao <pr...@gmail.com>.
I just checked the Ignite doc for atomic configuration.
But it doesn't say that it is applicable to distributed data structures.

Is it really applicable to distributed data structures like the countdown
latch?

On Tue 9 Jun, 2020, 7:26 AM Prasad Bhalerao <prasadbhalerao1983@gmail.com
wrote:

> Hi,
> I was under the impression that the countdown latch is implemented on a
> replicated cache, so when any number of nodes go down it does not lose
> its state.
>
> Can you please explain why atomic data structures use only 1 backup when
> their state is so important?
>
> Can we force atomic data structures to use a replicated cache?
>
> Which cache does Ignite use to store atomic data structures?
>
> Thanks
> Prasad
>
> On Mon 8 Jun, 2020, 11:58 PM Evgenii Zhuravlev <e.zhuravlev.wk@gmail.com
> wrote:
>
>> Hi,
>>
>> By default, the cache that stores all atomic structures has only 1
>> backup, so after losing all data for this particular latch, Ignite
>> recreates it. To change the default atomic configuration, use
>> IgniteConfiguration.setAtomicConfiguration.
>>
>> Evgenii
>>
Sat, Jun 6, 2020 at 06:20, Akash Shinde <ak...@gmail.com>:
>>
>>> *Issue:* The countdown latch gets reinitialized to its original value
>>> (4) when one or more (but not all) nodes go down. *(Partition loss
>>> happened.)*
>>>
>>> We are using Ignite's distributed countdown latch to make sure that
>>> cache loading is completed on all server nodes. We do this to make sure
>>> that our Kafka consumers start only after cache loading is complete on
>>> all server nodes. This is the basic criterion that needs to be fulfilled
>>> before actual processing starts.
>>>
>>> We have 4 server nodes and the countdown latch is initialized to 4. We
>>> use the cache.loadCache method to start cache loading. When each server
>>> completes cache loading, it reduces the count by 1 using the countDown
>>> method, so when all the nodes complete cache loading, the count reaches
>>> zero. When the count reaches zero, we start the Kafka consumers on all
>>> server nodes.
>>>
>>> But we saw weird behavior in the prod environment: 3 server nodes were
>>> shut down at the same time, but 1 node was still alive. When this
>>> happened, the countdown was reinitialized to its original value, i.e. 4.
>>> I am not able to reproduce this in the dev environment.
>>>
>>> Is it a bug that when one or more (but not all) nodes go down, the count
>>> reinitializes back to its original value?
>>>
>>> Thanks,
>>> Akash
>>>
>>

Re: CountDownLatch issue in Ignite 2.6 version

Posted by Prasad Bhalerao <pr...@gmail.com>.
Hi,
I was under the impression that the countdown latch is implemented on a
replicated cache, so when any number of nodes go down it does not lose
its state.

Can you please explain why atomic data structures use only 1 backup when
their state is so important?

Can we force atomic data structures to use a replicated cache?

Which cache does Ignite use to store atomic data structures?

Thanks
Prasad

On Mon 8 Jun, 2020, 11:58 PM Evgenii Zhuravlev <e.zhuravlev.wk@gmail.com
wrote:

> Hi,
>
> By default, the cache that stores all atomic structures has only 1 backup,
> so after losing all data for this particular latch, Ignite recreates it.
> To change the default atomic configuration, use
> IgniteConfiguration.setAtomicConfiguration.
>
> Evgenii
>
Sat, Jun 6, 2020 at 06:20, Akash Shinde <ak...@gmail.com>:
>
>> *Issue:* The countdown latch gets reinitialized to its original value (4)
>> when one or more (but not all) nodes go down. *(Partition loss happened.)*
>>
>> We are using Ignite's distributed countdown latch to make sure that cache
>> loading is completed on all server nodes. We do this to make sure that our
>> Kafka consumers start only after cache loading is complete on all server
>> nodes. This is the basic criterion that needs to be fulfilled before
>> actual processing starts.
>>
>> We have 4 server nodes and the countdown latch is initialized to 4. We use
>> the cache.loadCache method to start cache loading. When each server
>> completes cache loading, it reduces the count by 1 using the countDown
>> method, so when all the nodes complete cache loading, the count reaches
>> zero. When the count reaches zero, we start the Kafka consumers on all
>> server nodes.
>>
>> But we saw weird behavior in the prod environment: 3 server nodes were
>> shut down at the same time, but 1 node was still alive. When this
>> happened, the countdown was reinitialized to its original value, i.e. 4.
>> I am not able to reproduce this in the dev environment.
>>
>> Is it a bug that when one or more (but not all) nodes go down, the count
>> reinitializes back to its original value?
>>
>> Thanks,
>> Akash
>>
>

Re: CountDownLatch issue in Ignite 2.6 version

Posted by Evgenii Zhuravlev <e....@gmail.com>.
Hi,

By default, the cache that stores all atomic structures has only 1 backup,
so after losing all data for this particular latch, Ignite recreates it. To
change the default atomic configuration, use
IgniteConfiguration.setAtomicConfiguration.
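
For illustration, a minimal configuration sketch (the backup count of 3 and
the commented-out REPLICATED alternative are illustrative assumptions, not
values from this thread):

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.CacheMode;
    import org.apache.ignite.configuration.AtomicConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class AtomicsBackupConfig {
        public static void main(String[] args) {
            // Configure the internal cache that backs all atomic data
            // structures (latches, sequences, atomic longs).
            AtomicConfiguration atomicCfg = new AtomicConfiguration();
            atomicCfg.setBackups(3); // illustrative: tolerate losing 3 of 4 nodes

            // Alternatively, keep a copy of the atomics cache on every node:
            // atomicCfg.setCacheMode(CacheMode.REPLICATED);

            IgniteConfiguration cfg = new IgniteConfiguration();
            cfg.setAtomicConfiguration(atomicCfg);

            Ignite ignite = Ignition.start(cfg);
        }
    }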

Evgenii

Sat, Jun 6, 2020 at 06:20, Akash Shinde <ak...@gmail.com>:

> *Issue:* The countdown latch gets reinitialized to its original value (4)
> when one or more (but not all) nodes go down. *(Partition loss happened.)*
>
> We are using Ignite's distributed countdown latch to make sure that cache
> loading is completed on all server nodes. We do this to make sure that our
> Kafka consumers start only after cache loading is complete on all server
> nodes. This is the basic criterion that needs to be fulfilled before
> actual processing starts.
>
> We have 4 server nodes and the countdown latch is initialized to 4. We use
> the cache.loadCache method to start cache loading. When each server
> completes cache loading, it reduces the count by 1 using the countDown
> method, so when all the nodes complete cache loading, the count reaches
> zero. When the count reaches zero, we start the Kafka consumers on all
> server nodes.
>
> But we saw weird behavior in the prod environment: 3 server nodes were
> shut down at the same time, but 1 node was still alive. When this
> happened, the countdown was reinitialized to its original value, i.e. 4.
> I am not able to reproduce this in the dev environment.
>
> Is it a bug that when one or more (but not all) nodes go down, the count
> reinitializes back to its original value?
>
> Thanks,
> Akash
>