Posted to dev@ignite.apache.org by Dmitry Karachentsev <dk...@gridgain.com> on 2017/04/04 09:05:40 UTC

Re: IgniteSemaphore and failoverSafe flag

Hi Vladisav,

I see you've been working on [1] for a while. Have you had a chance to fix
it? If not, is there an estimate?

[1] https://issues.apache.org/jira/browse/IGNITE-1977

Thanks!

-Dmitry.



20.03.2017 10:28, Alexey Goncharuk wrote:
> I think re-creation should be handled by a user who will make sure that
> nobody else is currently executing the guarded logic before the
> re-creation. This is exactly the same semantics as with
> BrokenBarrierException for j.u.c.CyclicBarrier.
>
> 2017-03-17 2:39 GMT+03:00 Vladisav Jelisavcic <vl...@gmail.com>:
>
>> Hi everyone,
>>
>> I agree with Val, he's got a point; recreating the lock doesn't seem
>> possible (at least not with the transactional cache lock/semaphore we have).
>> Is this re-create behavior really needed?
>>
>> Best regards,
>> Vladisav
>>
>>
>>
>> On Thu, Mar 16, 2017 at 8:34 PM, Valentin Kulichenko <
>> valentin.kulichenko@gmail.com> wrote:
>>
>>> Guys,
>>>
>>> How does recreation of the lock help? My understanding is that the
>>> scenario is the following:
>>>
>>> 1. Client A creates and acquires a lock, and then starts to execute
>>> guarded logic.
>>> 2. Client B tries to acquire the same lock and parks to wait.
>>> 3. Before client A unlocks, all affinity nodes for the lock fail, and the
>>> lock disappears from the cache.
>>> 4. Client B fails with an exception, recreates the lock, acquires it, and
>>> starts to execute guarded logic concurrently with client A.
>>>
>>> In my view this is wrong anyway, regardless of whether this happens
>>> silently or with an exception handled in the user's code, because this code
>>> doesn't have any way to know whether client A still holds the lock.
>>>
>>> Am I missing something?
>>>
>>> -Val
>>>
>>> On Tue, Mar 14, 2017 at 10:14 AM, Dmitriy Setrakyan <dsetrakyan@apache.org>
>>> wrote:
>>>
>>>> On Tue, Mar 14, 2017 at 12:46 AM, Alexey Goncharuk <
>>>> alexey.goncharuk@gmail.com> wrote:
>>>>
>>>>>> Which user operation would result in exception? To my knowledge, user may
>>>>>> already be holding the lock and not invoking any Ignite APIs, no?
>>>>>>
>>>>> Yes, this is exactly my point.
>>>>>
>>>>> Imagine that a node already holds a lock and another node is waiting for
>>>>> the lock. If all partition nodes leave the grid and the lock is re-created,
>>>>> this second node will immediately acquire the lock and we will have two
>>>>> lock owners. I think in this case this second node (blocked on lock())
>>>>> should get an exception saying that the lock was lost (which is, by the
>>>>> way, the current behavior), and the first node should get an exception on
>>>>> unlock.
>>>>>
>>>> Makes sense.
>>>>
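
The j.u.c.CyclicBarrier semantics that Alexey compares this to can be tried
out directly. The following is a minimal, JDK-only sketch and not Ignite
code; the class name BrokenBarrierDemo and its method are made up for
illustration. It parks one thread on a barrier, breaks the barrier with
reset(), and checks that the parked thread is woken with
BrokenBarrierException instead of silently passing, after which the barrier
is usable again.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; only java.util.concurrent is used.
final class BrokenBarrierDemo {
    /**
     * Parks one thread on a two-party barrier, breaks the barrier, and reports
     * whether the parked thread was woken with BrokenBarrierException.
     */
    static boolean waiterSeesBrokenBarrier() {
        CyclicBarrier barrier = new CyclicBarrier(2);
        AtomicBoolean sawBroken = new AtomicBoolean(false);

        Thread waiter = new Thread(() -> {
            try {
                barrier.await(); // parks: the second party never arrives
            } catch (BrokenBarrierException e) {
                sawBroken.set(true); // explicit failure, not a silent pass
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();

        try {
            // Wait until the waiter is actually parked on the barrier.
            while (barrier.getNumberWaiting() == 0) {
                Thread.sleep(1);
            }
            // reset() breaks the current generation: parked threads return
            // with BrokenBarrierException and the barrier starts afresh.
            barrier.reset();
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // The waiter failed loudly, and the barrier is no longer broken.
        return sawBroken.get() && !barrier.isBroken();
    }

    public static void main(String[] args) {
        System.out.println("waiter saw BrokenBarrierException: "
            + waiterSeesBrokenBarrier());
    }
}
```

By analogy, a thread blocked on a lost lock would get an explicit failure,
and deciding when it is safe to re-create the lock would be left to the user.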

Re: IgniteSemaphore and failoverSafe flag

Posted by Dmitry Karachentsev <dk...@gridgain.com>.

It's not 100% reproducible; to get it to fail locally I ran it many times
in a loop (an IntelliJ IDEA feature).
N.B. This test was muted before the fix, so yes, the fix may not be the cause.

Thanks!

14.04.2017 17:23, Vladisav Jelisavcic wrote:
> Hmm, I cannot reproduce this behavior locally.
> My guess is that the interrupt flag is not always cleared properly in the
> #GridCacheSemaphore.acquire method (but that has nothing to do with the
> latest fix).
>
> Can you make it reproducible?
>
> On Fri, Apr 14, 2017 at 2:46 PM, Dmitry Karachentsev
> <dkarachentsev@gridgain.com> wrote:
>
>     Vladisav,
>
>     One more thing: this test [1] started failing on semaphore close
>     when this fix [2] was introduced.
>     Could you please check it?
>
>     [1]
>     http://ci.ignite.apache.org/viewLog.html?buildId=547151&tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteDataStrucutures#testNameId-979977708202725050
>     [2] https://issues.apache.org/jira/browse/IGNITE-1977
>
>     Thanks!
>
>     14.04.2017 15:27, Dmitry Karachentsev wrote:
>>     Vladisav,
>>
>>     Yep, you're right. I'll fix it.
>>
>>     Thanks!
>>
>>     14.04.2017 15:18, Vladisav Jelisavcic wrote:
>>>     Hi Dmitry,
>>>
>>>     it looks to me that this test is not valid: after semaphore 2
>>>     fails, the permits are redistributed, so the expected number of
>>>     permits should really be 20, not 10. Do you agree?
>>>
>>>     I guess that before the latest fix this test was (incorrectly)
>>>     passing because permits weren't released properly.
>>>
>>>     What do you think?
>>>
>>>     On Fri, Apr 14, 2017 at 11:27 AM, Dmitry Karachentsev
>>>     <dkarachentsev@gridgain.com> wrote:
>>>
>>>         Hi Vladisav,
>>>
>>>         It looks like these tests [1] started failing after the fix
>>>         was merged. Could you please take a look?
>>>
>>>         [1]
>>>         http://ci.ignite.apache.org/viewLog.html?buildId=544238&tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteBinaryObjectsDataStrucutures
>>>
>>>         Thanks!
>>>
>>>         -Dmitry.
>>>
>>>         13.04.2017 16:15, Dmitry Karachentsev wrote:
>>>>         Thanks a lot!
>>>>
>>>>         12.04.2017 16:35, Vladisav Jelisavcic wrote:
>>>>>         Hi Dmitry,
>>>>>
>>>>>         sure, I made a fix, take a look at the PR and the comments
>>>>>         in the ticket.
>>>>>
>>>>>         Best regards,
>>>>>         Vladisav
>>>>>
>>>>>         On Tue, Apr 11, 2017 at 3:00 PM, Dmitry Karachentsev
>>>>>         <dkarachentsev@gridgain.com> wrote:
>>>>>
>>>>>             Hi Vladisav,
>>>>>
>>>>>             Thanks for your contribution! But it doesn't seem to fix
>>>>>             the related tickets, in particular [1].
>>>>>             Could you please take a look?
>>>>>
>>>>>             [1] https://issues.apache.org/jira/browse/IGNITE-4173
>>>>>
>>>>>             Thanks!
>>>>>
>>>>>             06.04.2017 16:27, Vladisav Jelisavcic wrote:
>>>>>>             Hey Dmitry,
>>>>>>
>>>>>>             sorry for the late reply, I'll try to bake a pr later
>>>>>>             during the day.
>>>>>>
>>>>>>             Best regards,
>>>>>>             Vladisav

Re: IgniteSemaphore and failoverSafe flag

Posted by Vladisav Jelisavcic <vl...@gmail.com>.

Hmm, I cannot reproduce this behavior locally.
My guess is that the interrupt flag is not always cleared properly in the
#GridCacheSemaphore.acquire method (but that has nothing to do with the
latest fix).

Can you make it reproducible?
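
How a leftover interrupt flag can make a blocking acquire fail can be shown
with the JDK's own semaphore. This is only an illustration using
java.util.concurrent.Semaphore; that something similar happens in Ignite is
an assumption, and the sketch does not reproduce GridCacheSemaphore
internals. StaleInterruptDemo is a hypothetical name.

```java
import java.util.concurrent.Semaphore;

// Hypothetical demo class; JDK only, not Ignite code.
final class StaleInterruptDemo {
    /**
     * Returns true if acquire() fails although a permit is free, purely
     * because the calling thread's interrupt flag was already set.
     */
    static boolean acquireFailsOnStaleFlag() {
        Semaphore sem = new Semaphore(1);
        // Simulate a flag left over from earlier, unrelated work.
        Thread.currentThread().interrupt();
        try {
            sem.acquire(); // throws at once: the flag is checked on entry
            sem.release();
            return false;
        } catch (InterruptedException e) {
            // The exception clears the flag, and no permit was consumed.
            return !Thread.currentThread().isInterrupted()
                && sem.availablePermits() == 1;
        }
    }

    /** Returns true if clearing the flag first lets acquire() succeed. */
    static boolean acquireSucceedsAfterClearing() {
        Semaphore sem = new Semaphore(1);
        Thread.currentThread().interrupt();
        Thread.interrupted(); // reads and clears the interrupt flag
        try {
            sem.acquire();
            sem.release();
            return true;
        } catch (InterruptedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("fails on stale flag: " + acquireFailsOnStaleFlag());
        System.out.println("succeeds after clearing: " + acquireSucceedsAfterClearing());
    }
}
```

Whether a test hits this depends on which thread ran what beforehand, which
would fit a failure that only shows up when the test is run many times.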

On Fri, Apr 14, 2017 at 2:46 PM, Dmitry Karachentsev <
dkarachentsev@gridgain.com> wrote:

> Vladisav,
>
> One more thing: this test [1] started failing on semaphore close when this
> fix [2] was introduced.
> Could you please check it?
>
> [1] http://ci.ignite.apache.org/viewLog.html?buildId=547151&tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteDataStrucutures#testNameId-979977708202725050
> [2] https://issues.apache.org/jira/browse/IGNITE-1977
>
> Thanks!

Re: IgniteSemaphore and failoverSafe flag

Posted by Dmitry Karachentsev <dk...@gridgain.com>.

Vladisav,

One more thing: this test [1] started failing on semaphore close when
this fix [2] was introduced.
Could you please check it?

[1] 
http://ci.ignite.apache.org/viewLog.html?buildId=547151&tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteDataStrucutures#testNameId-979977708202725050
[2] https://issues.apache.org/jira/browse/IGNITE-1977

Thanks!

14.04.2017 15:27, Dmitry Karachentsev wrote:
> Vladisav,
>
> Yep, you're right. I'll fix it.
>
> Thanks!

Re: IgniteSemaphore and failoverSafe flag

Posted by Dmitry Karachentsev <dk...@gridgain.com>.

Vladisav,

Yep, you're right. I'll fix it.

Thanks!
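
The permit arithmetic behind the 20-versus-10 point can be pictured with a
toy in-memory model. This is a sketch under stated assumptions, not Ignite
code: PermitModel, the node name "node-2", and the setup (20 permits in
total, 10 of them held by the node that fails) are invented for
illustration, since the thread does not show the actual test. The model only
captures the idea that a failover-safe semaphore returns a failed node's
permits to the pool.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model of permit bookkeeping; not the Ignite implementation.
final class PermitModel {
    private final boolean failoverSafe;
    private final Map<String, Integer> heldByNode = new HashMap<>();
    private int available;

    PermitModel(int permits, boolean failoverSafe) {
        this.available = permits;
        this.failoverSafe = failoverSafe;
    }

    /** Takes n permits for the given node if enough are free. */
    boolean tryAcquire(String node, int n) {
        if (n > available) {
            return false;
        }
        available -= n;
        heldByNode.merge(node, n, Integer::sum);
        return true;
    }

    /** Simulates the node leaving the topology while holding permits. */
    void nodeFailed(String node) {
        Integer held = heldByNode.remove(node);
        if (held != null && failoverSafe) {
            available += held; // permits of the failed node become free again
        }
        // Simplification: without failoverSafe the permits just stay
        // unavailable; how Ignite reacts in that case is not modeled here.
    }

    int availablePermits() {
        return available;
    }

    /** Assumed scenario: 20 permits, node-2 takes 10 and then fails. */
    static int permitsAfterOwnerFailure(boolean failoverSafe) {
        PermitModel sem = new PermitModel(20, failoverSafe);
        sem.tryAcquire("node-2", 10);
        sem.nodeFailed("node-2");
        return sem.availablePermits();
    }

    public static void main(String[] args) {
        System.out.println("failoverSafe=true:  " + permitsAfterOwnerFailure(true));
        System.out.println("failoverSafe=false: " + permitsAfterOwnerFailure(false));
    }
}
```

In this model a test that expects 10 after the failure only passes if the
failed node's permits are never handed back, which matches the explanation
that the test used to pass because permits weren't released properly.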

14.04.2017 15:18, Vladisav Jelisavcic wrote:
> Hi Dmitry,
>
> it looks to me that this test is not valid: after semaphore 2 fails,
> the permits are redistributed, so the expected number of permits
> should really be 20, not 10. Do you agree?
>
> I guess that before the latest fix this test was (incorrectly) passing
> because permits weren't released properly.
>
> What do you think?
>>>>                             <ma...@gmail.com>> wrote:
>>>>
>>>>                                     Which user operation would
>>>>                                     result in exception? To my
>>>>                                     knowledge,
>>>>
>>>>                     user
>>>>
>>>>                             may
>>>>
>>>>                                     already be holding the lock and
>>>>                                     not invoking any Ignite APIs, no?
>>>>
>>>>                                 Yes, this is exactly my point.
>>>>
>>>>                                 Imagine that a node already holds a
>>>>                                 lock and another node is waiting
>>>>
>>>>                         for
>>>>
>>>>                                 the lock. If all partition nodes
>>>>                                 leave the grid and the lock is
>>>>
>>>>                             re-created,
>>>>
>>>>                                 this second node will immediately
>>>>                                 acquire the lock and we will have
>>>>
>>>>                     two
>>>>
>>>>                                 lock owners. I think in this case
>>>>                                 this second node (blocked on
>>>>
>>>>                     lock())
>>>>
>>>>                                 should get an exception saying that
>>>>                                 the lock was lost (which is, by
>>>>
>>>>                     the
>>>>
>>>>                                 way, the current behavior), and the
>>>>                                 first node should get an
>>>>
>>>>                     exception
>>>>
>>>>                         on
>>>>
>>>>                                 unlock.
>>>>
>>>>                             Makes sense.
>>>>
>>>>
>>>>
>>>
>>>
>>
>
>

Re: IgniteSemaphore and failoverSafe flag

Posted by Vladisav Jelisavcic <vl...@gmail.com>.

Hi Dmitry,

It looks to me like this test is not valid: after semaphore 2 fails, the
permits are redistributed, so the expected number of permits should really
be 20, not 10. Do you agree?

I guess that before the latest fix this test was passing (incorrectly)
because permits weren't released properly.

What do you think?
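
To make the permit arithmetic above concrete, here is a minimal single-JVM sketch. It is not Ignite code and not the failing test: the class PermitLedger, the node name "node-2", and the split of 20 total permits with 10 held by the failed node are all assumptions chosen only to match the numbers mentioned in this message.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of failover-safe permit accounting (hypothetical, not Ignite code). */
class PermitLedger {
    private final int total;
    private final Map<String, Integer> held = new HashMap<>();

    PermitLedger(int total) {
        this.total = total;
    }

    /** A node takes n permits if that many are free; returns whether it succeeded. */
    synchronized boolean acquire(String node, int n) {
        if (n > available())
            return false;
        held.merge(node, n, Integer::sum);
        return true;
    }

    /** Failover-safe handling: permits held by a failed node return to the pool. */
    synchronized void nodeFailed(String node) {
        held.remove(node);
    }

    /** Permits that are currently free. */
    synchronized int available() {
        int used = 0;
        for (int n : held.values())
            used += n;
        return total - used;
    }

    public static void main(String[] args) {
        PermitLedger ledger = new PermitLedger(20); // assumed total
        ledger.acquire("node-2", 10);               // assumed holder and count
        System.out.println("before failure: " + ledger.available());
        ledger.nodeFailed("node-2");
        System.out.println("after failure:  " + ledger.available());
    }
}
```

Under these assumptions the model reports 10 free permits while node-2 is alive and 20 once its permits are released, which is the reasoning behind expecting 20 rather than 10.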

On Fri, Apr 14, 2017 at 11:27 AM, Dmitry Karachentsev <
dkarachentsev@gridgain.com> wrote:

> Hi Vladislav,
>
> It looks like after fix was merged these tests [1] started failing. Could
> you please take a look?
>
> [1] http://ci.ignite.apache.org/viewLog.html?buildId=544238&
> tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteBinaryObjectsDataStrucut
> ures
>
> Thanks!
>
> -Dmitry.
>
> 13.04.2017 16:15, Dmitry Karachentsev wrote:
>
> Thanks a lot!
>
> 12.04.2017 16:35, Vladisav Jelisavcic wrote:
>
> Hi Dmitry,
>
> sure, I made a fix, take a look at the PR and the comments in the ticket.
>
> Best regards,
> Vladisav
>
> On Tue, Apr 11, 2017 at 3:00 PM, Dmitry Karachentsev <
> dkarachentsev@gridgain.com> wrote:
>
>> Hi Vladislav,
>>
>> Thanks for your contribution! But it seems doesn't fix related tickets,
>> in particular [1].
>> Could you please take a look?
>>
>> [1] https://issues.apache.org/jira/browse/IGNITE-4173
>>
>> Thanks!
>>
>> 06.04.2017 16:27, Vladisav Jelisavcic wrote:
>>
>> Hey Dmitry,
>>
>> sorry for the late reply, I'll try to bake a pr later during the day.
>>
>> Best regards,
>> Vladisav
>>
>>
>>
>> On Tue, Apr 4, 2017 at 11:05 AM, Dmitry Karachentsev <
>> dkarachentsev@gridgain.com> wrote:
>>
>>> Hi Vladislav,
>>>
>>> I see you're developing [1] for a while, did you have any chance to fix
>>> it? If no, is there any estimate?
>>>
>>> [1] https://issues.apache.org/jira/browse/IGNITE-1977
>>>
>>> Thanks!
>>>
>>> -Dmitry.
>>>
>>>
>>>
>>> 20.03.2017 10:28, Alexey Goncharuk wrote:
>>>
>>> I think re-creation should be handled by a user who will make sure that
>>>> nobody else is currently executing the guarded logic before the
>>>> re-creation. This is exactly the same semantics as with
>>>> BrokenBarrierException for j.u.c.CyclicBarrier.
>>>>
>>>> 2017-03-17 2:39 GMT+03:00 Vladisav Jelisavcic <vl...@gmail.com>:
>>>>
>>>> Hi everyone,
>>>>>
>>>>> I agree with Val, he's got a point; recreating the lock doesn't seem
>>>>> possible
>>>>> (at least not the with the transactional cache lock/semaphore we have).
>>>>> Is this re-create behavior really needed?
>>>>>
>>>>> Best regards,
>>>>> Vladisav
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Mar 16, 2017 at 8:34 PM, Valentin Kulichenko <
>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>
>>>>> Guys,
>>>>>>
>>>>>> How does recreation of the lock helps? My understanding is that
>>>>>> scenario
>>>>>>
>>>>> is
>>>>>
>>>>>> the following:
>>>>>>
>>>>>> 1. Client A creates and acquires a lock, and then starts to execute
>>>>>>
>>>>> guarded
>>>>>
>>>>>> logic.
>>>>>> 2. Client B tries to acquire the same lock and parks to wait.
>>>>>> 3. Before client A unlocks, all affinity nodes for the lock fail, lock
>>>>>> disappears from the cache.
>>>>>> 4. Client B fails with exception, recreates the lock, acquires it, and
>>>>>> starts to execute guarded logic concurrently with client A.
>>>>>>
>>>>>> In my view this is wrong anyway, regardless of whether this happens
>>>>>> silently or with an exception handled in user's code. Because this
>>>>>> code
>>>>>> doesn't have any way to know if client A still holds the lock or not.
>>>>>>
>>>>>> Am I missing something?
>>>>>>
>>>>>> -Val
>>>>>>
>>>>>> On Tue, Mar 14, 2017 at 10:14 AM, Dmitriy Setrakyan <
>>>>>>
>>>>> dsetrakyan@apache.org
>>>>>
>>>>>> wrote:
>>>>>>
>>>>>> On Tue, Mar 14, 2017 at 12:46 AM, Alexey Goncharuk <
>>>>>>> alexey.goncharuk@gmail.com> wrote:
>>>>>>>
>>>>>>> Which user operation would result in exception? To my knowledge,
>>>>>>>>>
>>>>>>>> user
>>>>>
>>>>>> may
>>>>>>>
>>>>>>>> already be holding the lock and not invoking any Ignite APIs, no?
>>>>>>>>>
>>>>>>>>> Yes, this is exactly my point.
>>>>>>>>
>>>>>>>> Imagine that a node already holds a lock and another node is waiting
>>>>>>>>
>>>>>>> for
>>>>>>
>>>>>>> the lock. If all partition nodes leave the grid and the lock is
>>>>>>>>
>>>>>>> re-created,
>>>>>>>
>>>>>>>> this second node will immediately acquire the lock and we will have
>>>>>>>>
>>>>>>> two
>>>>>
>>>>>> lock owners. I think in this case this second node (blocked on
>>>>>>>>
>>>>>>> lock())
>>>>>
>>>>>> should get an exception saying that the lock was lost (which is, by
>>>>>>>>
>>>>>>> the
>>>>>
>>>>>> way, the current behavior), and the first node should get an
>>>>>>>>
>>>>>>> exception
>>>>>
>>>>>> on
>>>>>>
>>>>>>> unlock.
>>>>>>>>
>>>>>>>> Makes sense.
>>>>>>>
>>>>>>>
>>>
>>
>>
>
>
>

Re: IgniteSemaphore and failoverSafe flag

Posted by Dmitry Karachentsev <dk...@gridgain.com>.

Hi Vladisav,

It looks like these tests [1] started failing after the fix was merged.
Could you please take a look?

[1] 
http://ci.ignite.apache.org/viewLog.html?buildId=544238&tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteBinaryObjectsDataStrucutures

Thanks!

-Dmitry.

13.04.2017 16:15, Dmitry Karachentsev wrote:
> Thanks a lot!
>
> 12.04.2017 16:35, Vladisav Jelisavcic wrote:
>> Hi Dmitry,
>>
>> sure, I made a fix, take a look at the PR and the comments in the ticket.
>>
>> Best regards,
>> Vladisav
>>
>> On Tue, Apr 11, 2017 at 3:00 PM, Dmitry Karachentsev 
>> <dkarachentsev@gridgain.com <ma...@gridgain.com>> wrote:
>>
>>     Hi Vladislav,
>>
>>     Thanks for your contribution! But it seems doesn't fix related
>>     tickets, in particular [1].
>>     Could you please take a look?
>>
>>     [1] https://issues.apache.org/jira/browse/IGNITE-4173
>>     <https://issues.apache.org/jira/browse/IGNITE-4173>
>>
>>     Thanks!
>>
>>     06.04.2017 16:27, Vladisav Jelisavcic wrote:
>>>     Hey Dmitry,
>>>
>>>     sorry for the late reply, I'll try to bake a pr later during the
>>>     day.
>>>
>>>     Best regards,
>>>     Vladisav
>>>
>>>
>>>
>>>     On Tue, Apr 4, 2017 at 11:05 AM, Dmitry Karachentsev
>>>     <dkarachentsev@gridgain.com <ma...@gridgain.com>>
>>>     wrote:
>>>
>>>         Hi Vladislav,
>>>
>>>         I see you're developing [1] for a while, did you have any
>>>         chance to fix it? If no, is there any estimate?
>>>
>>>         [1] https://issues.apache.org/jira/browse/IGNITE-1977
>>>         <https://issues.apache.org/jira/browse/IGNITE-1977>
>>>
>>>         Thanks!
>>>
>>>         -Dmitry.
>>>
>>>
>>>
>>>         20.03.2017 10:28, Alexey Goncharuk wrote:
>>>
>>>             I think re-creation should be handled by a user who will
>>>             make sure that
>>>             nobody else is currently executing the guarded logic
>>>             before the
>>>             re-creation. This is exactly the same semantics as with
>>>             BrokenBarrierException for j.u.c.CyclicBarrier.
>>>
>>>             2017-03-17 2:39 GMT+03:00 Vladisav Jelisavcic
>>>             <vladisavj@gmail.com <ma...@gmail.com>>:
>>>
>>>                 Hi everyone,
>>>
>>>                 I agree with Val, he's got a point; recreating the
>>>                 lock doesn't seem
>>>                 possible
>>>                 (at least not the with the transactional cache
>>>                 lock/semaphore we have).
>>>                 Is this re-create behavior really needed?
>>>
>>>                 Best regards,
>>>                 Vladisav
>>>
>>>
>>>
>>>                 On Thu, Mar 16, 2017 at 8:34 PM, Valentin Kulichenko <
>>>                 valentin.kulichenko@gmail.com
>>>                 <ma...@gmail.com>> wrote:
>>>
>>>                     Guys,
>>>
>>>                     How does recreation of the lock helps? My
>>>                     understanding is that scenario
>>>
>>>                 is
>>>
>>>                     the following:
>>>
>>>                     1. Client A creates and acquires a lock, and
>>>                     then starts to execute
>>>
>>>                 guarded
>>>
>>>                     logic.
>>>                     2. Client B tries to acquire the same lock and
>>>                     parks to wait.
>>>                     3. Before client A unlocks, all affinity nodes
>>>                     for the lock fail, lock
>>>                     disappears from the cache.
>>>                     4. Client B fails with exception, recreates the
>>>                     lock, acquires it, and
>>>                     starts to execute guarded logic concurrently
>>>                     with client A.
>>>
>>>                     In my view this is wrong anyway, regardless of
>>>                     whether this happens
>>>                     silently or with an exception handled in user's
>>>                     code. Because this code
>>>                     doesn't have any way to know if client A still
>>>                     holds the lock or not.
>>>
>>>                     Am I missing something?
>>>
>>>                     -Val
>>>
>>>                     On Tue, Mar 14, 2017 at 10:14 AM, Dmitriy
>>>                     Setrakyan <
>>>
>>>                 dsetrakyan@apache.org <ma...@apache.org>
>>>
>>>                     wrote:
>>>
>>>                         On Tue, Mar 14, 2017 at 12:46 AM, Alexey
>>>                         Goncharuk <
>>>                         alexey.goncharuk@gmail.com
>>>                         <ma...@gmail.com>> wrote:
>>>
>>>                                 Which user operation would result in
>>>                                 exception? To my knowledge,
>>>
>>>                 user
>>>
>>>                         may
>>>
>>>                                 already be holding the lock and not
>>>                                 invoking any Ignite APIs, no?
>>>
>>>                             Yes, this is exactly my point.
>>>
>>>                             Imagine that a node already holds a lock
>>>                             and another node is waiting
>>>
>>>                     for
>>>
>>>                             the lock. If all partition nodes leave
>>>                             the grid and the lock is
>>>
>>>                         re-created,
>>>
>>>                             this second node will immediately
>>>                             acquire the lock and we will have
>>>
>>>                 two
>>>
>>>                             lock owners. I think in this case this
>>>                             second node (blocked on
>>>
>>>                 lock())
>>>
>>>                             should get an exception saying that the
>>>                             lock was lost (which is, by
>>>
>>>                 the
>>>
>>>                             way, the current behavior), and the
>>>                             first node should get an
>>>
>>>                 exception
>>>
>>>                     on
>>>
>>>                             unlock.
>>>
>>>                         Makes sense.
>>>
>>>
>>>
>>
>>
>

Re: IgniteSemaphore and failoverSafe flag

Posted by Dmitry Karachentsev <dk...@gridgain.com>.

Thanks a lot!

12.04.2017 16:35, Vladisav Jelisavcic wrote:
> Hi Dmitry,
>
> sure, I made a fix, take a look at the PR and the comments in the ticket.
>
> Best regards,
> Vladisav
>
> On Tue, Apr 11, 2017 at 3:00 PM, Dmitry Karachentsev 
> <dkarachentsev@gridgain.com <ma...@gridgain.com>> wrote:
>
>     Hi Vladislav,
>
>     Thanks for your contribution! But it seems doesn't fix related
>     tickets, in particular [1].
>     Could you please take a look?
>
>     [1] https://issues.apache.org/jira/browse/IGNITE-4173
>     <https://issues.apache.org/jira/browse/IGNITE-4173>
>
>     Thanks!
>
>     06.04.2017 16:27, Vladisav Jelisavcic wrote:
>>     Hey Dmitry,
>>
>>     sorry for the late reply, I'll try to bake a pr later during the day.
>>
>>     Best regards,
>>     Vladisav
>>
>>
>>
>>     On Tue, Apr 4, 2017 at 11:05 AM, Dmitry Karachentsev
>>     <dkarachentsev@gridgain.com <ma...@gridgain.com>>
>>     wrote:
>>
>>         Hi Vladislav,
>>
>>         I see you're developing [1] for a while, did you have any
>>         chance to fix it? If no, is there any estimate?
>>
>>         [1] https://issues.apache.org/jira/browse/IGNITE-1977
>>         <https://issues.apache.org/jira/browse/IGNITE-1977>
>>
>>         Thanks!
>>
>>         -Dmitry.
>>
>>
>>
>>         20.03.2017 10:28, Alexey Goncharuk wrote:
>>
>>             I think re-creation should be handled by a user who will
>>             make sure that
>>             nobody else is currently executing the guarded logic
>>             before the
>>             re-creation. This is exactly the same semantics as with
>>             BrokenBarrierException for j.u.c.CyclicBarrier.
>>
>>             2017-03-17 2:39 GMT+03:00 Vladisav Jelisavcic
>>             <vladisavj@gmail.com <ma...@gmail.com>>:
>>
>>                 Hi everyone,
>>
>>                 I agree with Val, he's got a point; recreating the
>>                 lock doesn't seem
>>                 possible
>>                 (at least not the with the transactional cache
>>                 lock/semaphore we have).
>>                 Is this re-create behavior really needed?
>>
>>                 Best regards,
>>                 Vladisav
>>
>>
>>
>>                 On Thu, Mar 16, 2017 at 8:34 PM, Valentin Kulichenko <
>>                 valentin.kulichenko@gmail.com
>>                 <ma...@gmail.com>> wrote:
>>
>>                     Guys,
>>
>>                     How does recreation of the lock helps? My
>>                     understanding is that scenario
>>
>>                 is
>>
>>                     the following:
>>
>>                     1. Client A creates and acquires a lock, and then
>>                     starts to execute
>>
>>                 guarded
>>
>>                     logic.
>>                     2. Client B tries to acquire the same lock and
>>                     parks to wait.
>>                     3. Before client A unlocks, all affinity nodes
>>                     for the lock fail, lock
>>                     disappears from the cache.
>>                     4. Client B fails with exception, recreates the
>>                     lock, acquires it, and
>>                     starts to execute guarded logic concurrently with
>>                     client A.
>>
>>                     In my view this is wrong anyway, regardless of
>>                     whether this happens
>>                     silently or with an exception handled in user's
>>                     code. Because this code
>>                     doesn't have any way to know if client A still
>>                     holds the lock or not.
>>
>>                     Am I missing something?
>>
>>                     -Val
>>
>>                     On Tue, Mar 14, 2017 at 10:14 AM, Dmitriy Setrakyan <
>>
>>                 dsetrakyan@apache.org <ma...@apache.org>
>>
>>                     wrote:
>>
>>                         On Tue, Mar 14, 2017 at 12:46 AM, Alexey
>>                         Goncharuk <
>>                         alexey.goncharuk@gmail.com
>>                         <ma...@gmail.com>> wrote:
>>
>>                                 Which user operation would result in
>>                                 exception? To my knowledge,
>>
>>                 user
>>
>>                         may
>>
>>                                 already be holding the lock and not
>>                                 invoking any Ignite APIs, no?
>>
>>                             Yes, this is exactly my point.
>>
>>                             Imagine that a node already holds a lock
>>                             and another node is waiting
>>
>>                     for
>>
>>                             the lock. If all partition nodes leave
>>                             the grid and the lock is
>>
>>                         re-created,
>>
>>                             this second node will immediately acquire
>>                             the lock and we will have
>>
>>                 two
>>
>>                             lock owners. I think in this case this
>>                             second node (blocked on
>>
>>                 lock())
>>
>>                             should get an exception saying that the
>>                             lock was lost (which is, by
>>
>>                 the
>>
>>                             way, the current behavior), and the first
>>                             node should get an
>>
>>                 exception
>>
>>                     on
>>
>>                             unlock.
>>
>>                         Makes sense.
>>
>>
>>
>
>

Re: IgniteSemaphore and failoverSafe flag

Posted by Vladisav Jelisavcic <vl...@gmail.com>.

Hi Dmitry,

Sure, I made a fix; please take a look at the PR and the comments in the ticket.

Best regards,
Vladisav

On Tue, Apr 11, 2017 at 3:00 PM, Dmitry Karachentsev <
dkarachentsev@gridgain.com> wrote:

> Hi Vladislav,
>
> Thanks for your contribution! But it seems doesn't fix related tickets, in
> particular [1].
> Could you please take a look?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-4173
>
> Thanks!
>
> 06.04.2017 16:27, Vladisav Jelisavcic wrote:
>
> Hey Dmitry,
>
> sorry for the late reply, I'll try to bake a pr later during the day.
>
> Best regards,
> Vladisav
>
>
>
> On Tue, Apr 4, 2017 at 11:05 AM, Dmitry Karachentsev <
> dkarachentsev@gridgain.com> wrote:
>
>> Hi Vladislav,
>>
>> I see you're developing [1] for a while, did you have any chance to fix
>> it? If no, is there any estimate?
>>
>> [1] https://issues.apache.org/jira/browse/IGNITE-1977
>>
>> Thanks!
>>
>> -Dmitry.
>>
>>
>>
>> 20.03.2017 10:28, Alexey Goncharuk wrote:
>>
>> I think re-creation should be handled by a user who will make sure that
>>> nobody else is currently executing the guarded logic before the
>>> re-creation. This is exactly the same semantics as with
>>> BrokenBarrierException for j.u.c.CyclicBarrier.
>>>
>>> 2017-03-17 2:39 GMT+03:00 Vladisav Jelisavcic <vl...@gmail.com>:
>>>
>>> Hi everyone,
>>>>
>>>> I agree with Val, he's got a point; recreating the lock doesn't seem
>>>> possible
>>>> (at least not the with the transactional cache lock/semaphore we have).
>>>> Is this re-create behavior really needed?
>>>>
>>>> Best regards,
>>>> Vladisav
>>>>
>>>>
>>>>
>>>> On Thu, Mar 16, 2017 at 8:34 PM, Valentin Kulichenko <
>>>> valentin.kulichenko@gmail.com> wrote:
>>>>
>>>> Guys,
>>>>>
>>>>> How does recreation of the lock helps? My understanding is that
>>>>> scenario
>>>>>
>>>> is
>>>>
>>>>> the following:
>>>>>
>>>>> 1. Client A creates and acquires a lock, and then starts to execute
>>>>>
>>>> guarded
>>>>
>>>>> logic.
>>>>> 2. Client B tries to acquire the same lock and parks to wait.
>>>>> 3. Before client A unlocks, all affinity nodes for the lock fail, lock
>>>>> disappears from the cache.
>>>>> 4. Client B fails with exception, recreates the lock, acquires it, and
>>>>> starts to execute guarded logic concurrently with client A.
>>>>>
>>>>> In my view this is wrong anyway, regardless of whether this happens
>>>>> silently or with an exception handled in user's code. Because this code
>>>>> doesn't have any way to know if client A still holds the lock or not.
>>>>>
>>>>> Am I missing something?
>>>>>
>>>>> -Val
>>>>>
>>>>> On Tue, Mar 14, 2017 at 10:14 AM, Dmitriy Setrakyan <
>>>>>
>>>> dsetrakyan@apache.org
>>>>
>>>>> wrote:
>>>>>
>>>>> On Tue, Mar 14, 2017 at 12:46 AM, Alexey Goncharuk <
>>>>>> alexey.goncharuk@gmail.com> wrote:
>>>>>>
>>>>>> Which user operation would result in exception? To my knowledge,
>>>>>>>>
>>>>>>> user
>>>>
>>>>> may
>>>>>>
>>>>>>> already be holding the lock and not invoking any Ignite APIs, no?
>>>>>>>>
>>>>>>>> Yes, this is exactly my point.
>>>>>>>
>>>>>>> Imagine that a node already holds a lock and another node is waiting
>>>>>>>
>>>>>> for
>>>>>
>>>>>> the lock. If all partition nodes leave the grid and the lock is
>>>>>>>
>>>>>> re-created,
>>>>>>
>>>>>>> this second node will immediately acquire the lock and we will have
>>>>>>>
>>>>>> two
>>>>
>>>>> lock owners. I think in this case this second node (blocked on
>>>>>>>
>>>>>> lock())
>>>>
>>>>> should get an exception saying that the lock was lost (which is, by
>>>>>>>
>>>>>> the
>>>>
>>>>> way, the current behavior), and the first node should get an
>>>>>>>
>>>>>> exception
>>>>
>>>>> on
>>>>>
>>>>>> unlock.
>>>>>>>
>>>>>>> Makes sense.
>>>>>>
>>>>>>
>>
>
>

Re: IgniteSemaphore and failoverSafe flag

Posted by Dmitry Karachentsev <dk...@gridgain.com>.

Hi Vladisav,

Thanks for your contribution! However, it doesn't seem to fix the related
tickets, in particular [1].
Could you please take a look?

[1] https://issues.apache.org/jira/browse/IGNITE-4173

Thanks!

06.04.2017 16:27, Vladisav Jelisavcic wrote:
> Hey Dmitry,
>
> sorry for the late reply, I'll try to bake a pr later during the day.
>
> Best regards,
> Vladisav
>
>
>
> On Tue, Apr 4, 2017 at 11:05 AM, Dmitry Karachentsev 
> <dkarachentsev@gridgain.com <ma...@gridgain.com>> wrote:
>
>     Hi Vladislav,
>
>     I see you're developing [1] for a while, did you have any chance
>     to fix it? If no, is there any estimate?
>
>     [1] https://issues.apache.org/jira/browse/IGNITE-1977
>     <https://issues.apache.org/jira/browse/IGNITE-1977>
>
>     Thanks!
>
>     -Dmitry.
>
>
>
>     20.03.2017 10:28, Alexey Goncharuk wrote:
>
>         I think re-creation should be handled by a user who will make
>         sure that
>         nobody else is currently executing the guarded logic before the
>         re-creation. This is exactly the same semantics as with
>         BrokenBarrierException for j.u.c.CyclicBarrier.
>
>         2017-03-17 2:39 GMT+03:00 Vladisav Jelisavcic
>         <vladisavj@gmail.com <ma...@gmail.com>>:
>
>             Hi everyone,
>
>             I agree with Val, he's got a point; recreating the lock
>             doesn't seem
>             possible
>             (at least not the with the transactional cache
>             lock/semaphore we have).
>             Is this re-create behavior really needed?
>
>             Best regards,
>             Vladisav
>
>
>
>             On Thu, Mar 16, 2017 at 8:34 PM, Valentin Kulichenko <
>             valentin.kulichenko@gmail.com
>             <ma...@gmail.com>> wrote:
>
>                 Guys,
>
>                 How does recreation of the lock helps? My
>                 understanding is that scenario
>
>             is
>
>                 the following:
>
>                 1. Client A creates and acquires a lock, and then
>                 starts to execute
>
>             guarded
>
>                 logic.
>                 2. Client B tries to acquire the same lock and parks
>                 to wait.
>                 3. Before client A unlocks, all affinity nodes for the
>                 lock fail, lock
>                 disappears from the cache.
>                 4. Client B fails with exception, recreates the lock,
>                 acquires it, and
>                 starts to execute guarded logic concurrently with
>                 client A.
>
>                 In my view this is wrong anyway, regardless of whether
>                 this happens
>                 silently or with an exception handled in user's code.
>                 Because this code
>                 doesn't have any way to know if client A still holds
>                 the lock or not.
>
>                 Am I missing something?
>
>                 -Val
>
>                 On Tue, Mar 14, 2017 at 10:14 AM, Dmitriy Setrakyan <
>
>             dsetrakyan@apache.org <ma...@apache.org>
>
>                 wrote:
>
>                     On Tue, Mar 14, 2017 at 12:46 AM, Alexey Goncharuk <
>                     alexey.goncharuk@gmail.com
>                     <ma...@gmail.com>> wrote:
>
>                             Which user operation would result in
>                             exception? To my knowledge,
>
>             user
>
>                     may
>
>                             already be holding the lock and not
>                             invoking any Ignite APIs, no?
>
>                         Yes, this is exactly my point.
>
>                         Imagine that a node already holds a lock and
>                         another node is waiting
>
>                 for
>
>                         the lock. If all partition nodes leave the
>                         grid and the lock is
>
>                     re-created,
>
>                         this second node will immediately acquire the
>                         lock and we will have
>
>             two
>
>                         lock owners. I think in this case this second
>                         node (blocked on
>
>             lock())
>
>                         should get an exception saying that the lock
>                         was lost (which is, by
>
>             the
>
>                         way, the current behavior), and the first node
>                         should get an
>
>             exception
>
>                 on
>
>                         unlock.
>
>                     Makes sense.
>
>
>

Re: IgniteSemaphore and failoverSafe flag

Posted by Vladisav Jelisavcic <vl...@gmail.com>.

Hey Dmitry,

Sorry for the late reply. I'll try to prepare a PR later today.

Best regards,
Vladisav



On Tue, Apr 4, 2017 at 11:05 AM, Dmitry Karachentsev <
dkarachentsev@gridgain.com> wrote:

> Hi Vladislav,
>
> I see you're developing [1] for a while, did you have any chance to fix
> it? If no, is there any estimate?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-1977
>
> Thanks!
>
> -Dmitry.
>
>
>
> 20.03.2017 10:28, Alexey Goncharuk wrote:
>
> I think re-creation should be handled by a user who will make sure that
>> nobody else is currently executing the guarded logic before the
>> re-creation. This is exactly the same semantics as with
>> BrokenBarrierException for j.u.c.CyclicBarrier.
>>
>> 2017-03-17 2:39 GMT+03:00 Vladisav Jelisavcic <vl...@gmail.com>:
>>
>> Hi everyone,
>>>
>>> I agree with Val, he's got a point; recreating the lock doesn't seem
>>> possible
>>> (at least not the with the transactional cache lock/semaphore we have).
>>> Is this re-create behavior really needed?
>>>
>>> Best regards,
>>> Vladisav
>>>
>>>
>>>
>>> On Thu, Mar 16, 2017 at 8:34 PM, Valentin Kulichenko <
>>> valentin.kulichenko@gmail.com> wrote:
>>>
>>> Guys,
>>>>
>>>> How does recreation of the lock helps? My understanding is that scenario
>>>>
>>> is
>>>
>>>> the following:
>>>>
>>>> 1. Client A creates and acquires a lock, and then starts to execute
>>>>
>>> guarded
>>>
>>>> logic.
>>>> 2. Client B tries to acquire the same lock and parks to wait.
>>>> 3. Before client A unlocks, all affinity nodes for the lock fail, lock
>>>> disappears from the cache.
>>>> 4. Client B fails with exception, recreates the lock, acquires it, and
>>>> starts to execute guarded logic concurrently with client A.
>>>>
>>>> In my view this is wrong anyway, regardless of whether this happens
>>>> silently or with an exception handled in user's code. Because this code
>>>> doesn't have any way to know if client A still holds the lock or not.
>>>>
>>>> Am I missing something?
>>>>
>>>> -Val
>>>>
>>>> On Tue, Mar 14, 2017 at 10:14 AM, Dmitriy Setrakyan <
>>>>
>>> dsetrakyan@apache.org
>>>
>>>> wrote:
>>>>
>>>> On Tue, Mar 14, 2017 at 12:46 AM, Alexey Goncharuk <
>>>>> alexey.goncharuk@gmail.com> wrote:
>>>>>
>>>>> Which user operation would result in exception? To my knowledge,
>>>>>>>
>>>>>> user
>>>
>>>> may
>>>>>
>>>>>> already be holding the lock and not invoking any Ignite APIs, no?
>>>>>>>
>>>>>>> Yes, this is exactly my point.
>>>>>>
>>>>>> Imagine that a node already holds a lock and another node is waiting
>>>>>>
>>>>> for
>>>>
>>>>> the lock. If all partition nodes leave the grid and the lock is
>>>>>>
>>>>> re-created,
>>>>>
>>>>>> this second node will immediately acquire the lock and we will have
>>>>>>
>>>>> two
>>>
>>>> lock owners. I think in this case this second node (blocked on
>>>>>>
>>>>> lock())
>>>
>>>> should get an exception saying that the lock was lost (which is, by
>>>>>>
>>>>> the
>>>
>>>> way, the current behavior), and the first node should get an
>>>>>>
>>>>> exception
>>>
>>>> on
>>>>
>>>>> unlock.
>>>>>>
>>>>>> Makes sense.
>>>>>
>>>>>
>
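
For readers unfamiliar with the j.u.c.CyclicBarrier semantics that Alexey's quoted message refers to, the sketch below shows the standard JDK behavior only. It does not use any Ignite API, and the class name BrokenBarrierDemo is hypothetical. Once the barrier is broken, waiters get BrokenBarrierException, and it is the caller's job to decide when it is safe to call reset().

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Shows that a broken CyclicBarrier fails waiters until the user resets it. */
class BrokenBarrierDemo {
    static String run() {
        CyclicBarrier barrier = new CyclicBarrier(2);
        StringBuilder log = new StringBuilder();

        try {
            // Only one of the two parties arrives, so this wait times out,
            // which puts the barrier into the broken state.
            barrier.await(10, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            log.append("timeout;");
        } catch (InterruptedException | BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }
        log.append("broken=").append(barrier.isBroken()).append(';');

        try {
            // Any later wait on the broken barrier fails immediately.
            barrier.await();
        } catch (BrokenBarrierException e) {
            log.append("BrokenBarrierException;");
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }

        // Recovery is explicit and left to the caller, who must know that
        // no other party is still relying on the old barrier state.
        barrier.reset();
        log.append("afterReset=").append(barrier.isBroken());
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The analogy with the lock discussion is that the primitive only reports the broken state; making sure nobody is still executing the guarded logic before re-creation stays with the user.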