Posted to issues@ignite.apache.org by "Alexander Menshikov (JIRA)" <ji...@apache.org> on 2017/11/29 09:39:00 UTC

[jira] [Comment Edited] (IGNITE-4908) Ignite.reentrantLock looks much slower than IgniteCache.lock.

    [ https://issues.apache.org/jira/browse/IGNITE-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16270474#comment-16270474 ] 

Alexander Menshikov edited comment on IGNITE-4908 at 11/29/17 9:38 AM:
-----------------------------------------------------------------------

[~amashenkov] [~avinogradov]
I’ve implemented the new fast reentrant lock. Please review it (the PR and other materials have already been linked).

The main idea:
The current lock implementation is based on shared state in a cache, while the new lock uses the IgniteCache#invoke* methods to update the shared state atomically. The new implementation doesn't use a continuous query, so the cache is now ATOMIC.
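To illustrate the idea, here is a minimal sketch of an invoke-based acquisition. LockState and TryAcquireProcessor are hypothetical names used only for this sketch; the actual implementation uses internal classes such as GridCacheLockState2Base (see the FAQ below).

{code:java}
import java.io.Serializable;
import java.util.UUID;
import javax.cache.processor.EntryProcessor;
import javax.cache.processor.MutableEntry;

/** Hypothetical shared state: the current owner of the lock (null if free) and a reentrancy counter. */
class LockState implements Serializable {
    UUID owner;
    int holdCount;
}

/** Tries to acquire the lock in a single atomic read-modify-write on the primary node. */
class TryAcquireProcessor implements EntryProcessor<String, LockState, Boolean>, Serializable {
    @Override public Boolean process(MutableEntry<String, LockState> entry, Object... args) {
        UUID me = (UUID)args[0];
        LockState state = entry.exists() ? entry.getValue() : new LockState();

        if (state.owner == null || state.owner.equals(me)) {
            state.owner = me;
            state.holdCount++;
            entry.setValue(state); // the whole update is applied atomically, no transaction needed
            return true;
        }

        return false; // busy: the caller waits locally until a release notification arrives
    }
}

// Usage: a single invoke per acquisition attempt on an ATOMIC cache, no continuous query.
// Boolean acquired = cache.invoke("my-lock", new TryAcquireProcessor(), localOwnerId);
{code}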

The new lock implementation comes in two variants, fair and unfair, split into different classes for performance reasons.

Benchmark results (hardware: Core i5 (2nd gen) + 6 GB RAM):

|| type || Number of nodes || threads per node || us/op (fair mode) ||  us/op (unfair mode) ||
|Old lock|  1               |        1         | 696    ± 22        | 688   ± 19            |
|New lock|  1               |        1         | 30.5   ± 0.4       | 30.7  ± 0.4           |
|Old lock|  2               |        1         | 892    ± 36        | 855   ± 34            |
|New lock|  2               |        1         | 260    ± 5         | 267   ± 9             |
|Old lock|  5               |        1         | 8881   ± 1376      | 7087  ± 225           |
|New lock|  5               |        1         | 895    ± 30        | 884   ± 16            |
|Old lock|  10              |        1         | 38302  ± 1042      | 39484 ± 2466          |
|New lock|  10              |        1         | 2136   ± 112       | 2074  ± 75            |
|Old lock|  1               |        2         | 305    ± 10        | 288  ± 11             |
|New lock|  1               |        2         | 78     ± 2         | 8.6   ±  0.1          |
|Old lock|  1               |        10        | 1777   ± 105       | 10298 ± 894           |
|New lock|  1               |        10        | 507    ± 7         | 49    ± 1             |
|Old lock|  5               |        2         | 22027  ± 1789      | 15904 ± 2321          |
|New lock|  5               |        2         | 1635   ± 66        | 50    ± 3             |
|Old lock|  5               |        10        | 127560 ± 18090     | 97304 ± 6703          |
|New lock|  5               |        10        | 8479   ± 277       | 250   ± 5             |

Speed up: single thread + fair:   21.9x (1 node), 3.4x (2 nodes), 9.9x (5 nodes), 17.9x (10 nodes)
Speed up: single thread + unfair: 22.4x (1 node), 3.2x (2 nodes), 8.0x (5 nodes), 19.0x (10 nodes)

Speed up: multi-threads + fair:   3.9x (1 node, 2 threads), 3.5x (1 node, 10 threads), 13.5x (5 nodes, 2 threads), 15.0x (5 nodes, 10 threads)
Speed up: multi-threads + unfair: 33.5x (1 node, 2 threads), 210x (1 node, 10 threads), 318x (5 nodes, 2 threads), 389x (5 nodes, 10 threads)

Benchmark result details:
1) The unfair lock uses a local reentrant lock for local synchronization as a guard in front of the shared state, which allows performance close to a plain local reentrant lock (see the sketch after this list).
2) A single server node is also the primary node for the shared state, which explains the extra performance boost in the one-node case.
3) Speedup grows with the number of nodes.
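A minimal sketch of the local-guard idea from item 1, with hypothetical names and deliberately simplified bookkeeping; the real classes in the PR differ.

{code:java}
import java.util.concurrent.locks.ReentrantLock;

/** Sketch only: local threads are serialized by a JVM-local ReentrantLock, and the node keeps
 *  the global lock for a short window, so back-to-back local acquisitions skip the network hop. */
class UnfairLockSketch {
    private final ReentrantLock localGuard = new ReentrantLock();

    private boolean globallyHeld; // guarded by localGuard

    public void lock() {
        localGuard.lock(); // local contention is resolved at in-memory speed

        if (!globallyHeld) {
            acquireGlobal(); // one IgniteCache#invoke for the whole node, not one per thread
            globallyHeld = true;
        }
    }

    public void unlock() {
        // The global release is deferred (the hold window is hardcoded to 50 ms, cf. item 5 below),
        // so a thread that locks again shortly afterwards reuses the already-held global lock.
        // A background release would clear globallyHeld after the window (not shown).
        localGuard.unlock();
    }

    private void acquireGlobal() { /* blocking IgniteCache#invoke(...) on the shared state */ }
}
{code}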


What is NOT implemented:

1) Conditions.

Why:
It's better to create a separate Jira issue for this task because the implementation could be complicated. It also requires additional benchmarks and tests.

2) Removing shared lock state from the cache when it is no longer needed.

Why:
In the old reentrant lock, the #close method can remove the shared state from the cache while other nodes are still using it (though the lock should be free at that moment). This is not a trivial task either if we want the behavior to be safer (which I believe we should). There are a lot of details, such as:
- Should the #close method wait until all threads on the local node release the lock?
- If yes, how should an Ignite#close() call be handled (it may hang while waiting for the release inside lock#close)?
And so on. This also needs a new Jira issue and a discussion.

3) Failover = false mode.

Why:
IgniteCache#invoke* makes failover easy to support, so in the current implementation failover is always switched on (the price is very low). But if you don't want to lose the shared state, you have to use a REPLICATED cache (or set some backups) plus FULL_SYNC write synchronization (see IgniteReentrantLockTest; a minimal configuration sketch follows below). The same is true for the old lock, which is why I left users the option to set these parameters.
The idea of a breakable lock object (as it works in the old lock) seems odd, because you can't find out that the lock was broken while you are still holding it, before calling #unlock.
To be honest, I don't see a reason to support a failover-off mode in a distributed system.
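The configuration sketch mentioned above; the cache name is hypothetical, and how these settings are wired into the lock's internal cache is defined by the PR and IgniteReentrantLockTest, not shown here.

{code:java}
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheWriteSynchronizationMode;
import org.apache.ignite.configuration.CacheConfiguration;

class LockCacheConfigSketch {
    /** Settings that keep the shared lock state available when a node fails. */
    static CacheConfiguration<Object, Object> replicatedFullSync() {
        CacheConfiguration<Object, Object> cfg = new CacheConfiguration<>("lock-state-cache");

        cfg.setCacheMode(CacheMode.REPLICATED); // or PARTITIONED with cfg.setBackups(n)
        cfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC); // wait for all copies on each update

        return cfg;
    }
}
{code}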

4) Some status methods: hasWaiters, getWaitQueueLength, hasQueuedThread, hasQueuedThreads.

Why:
I just haven't had enough time. I think these methods are not important for the first step; we can discuss more important topics now and leave the status methods for the future.

5) A method for configuring how long a node holds the lock without reacquiring it. Right now it's hardcoded at 50 ms.

Why:
The same reason as for the status methods above. Also, I think we could derive the time from the number of nodes (e.g. 5*n ms, i.e. 25 ms for a 5-node cluster). So it needs a short discussion too.



FAQ:

1) Can you create a super class for the GridCacheLockImpl2Unfair.GlobalSync and the GridCacheLockImpl2Fair.Sync and remove some code duplication?

I have tried, but the main problem is how Java generics work. Please look at lockView, and you will see that IgniteInternalCache<GridCacheInternalKey, GridCacheLockState2Base<?>> cannot be cast to IgniteInternalCache<GridCacheInternalKey, GridCacheLockState2Base<LockOwner>> or IgniteInternalCache<GridCacheInternalKey, GridCacheLockState2Base<UUID>>. So all cache operations (which are the main operations in the *Sync classes) have to be duplicated. It's the price of type safety.
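A stripped-down illustration of the generics limitation, with View and Base standing in for IgniteInternalCache and GridCacheLockState2Base (simplified stand-ins, not the real types):

{code:java}
class Base<T> { }   // stands in for GridCacheLockState2Base<T>
class View<V> { }   // stands in for IgniteInternalCache<GridCacheInternalKey, V>

class GenericsLimitation {
    static void demo(View<Base<?>> wildcardView) {
        // Generics are invariant: View<Base<?>> is not a View<Base<UUID>>, even though every
        // Base<UUID> is a Base<?>. The assignment below does not compile, so each concrete
        // state type needs its own typed view, and the cache operations end up duplicated.

        // View<Base<UUID>> typed = wildcardView;   // error: incompatible types
    }
}
{code}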

Most of the other lines differ slightly between the fair and unfair implementations, so there are very few lines we could extract into a superclass. And I'm not sure we should create a superclass for such a small gain.




> Ignite.reentrantLock looks much slower than IgniteCache.lock.
> -------------------------------------------------------------
>
>                 Key: IGNITE-4908
>                 URL: https://issues.apache.org/jira/browse/IGNITE-4908
>             Project: Ignite
>          Issue Type: Improvement
>          Components: data structures
>    Affects Versions: 1.8
>            Reporter: Andrew Mashenkov
>            Assignee: Alexander Menshikov
>
> Design discussed with Alexander:
> 1) Lock 
> Entry Processor (sync) -> 
> ....add candidate. 
> ....returns "added candidate at first position"
> ....retry failover -> 
> ........if already at first position -> return true
> In case lock not acquired, wait for acquire (AbstractQueuedSynchronizer should be used).
> 2) Unlock 
> Entry Processor (async) -> 
> ....remove candidate at first position
> ....retry failover -> remove only if "candidate at first position" equals to expected
> ....listener ->
> ........notify current "candidate at first position" it got lock
> 3) Failover
> 3.1) Originating node failed
> Failed node listener ->
> ....For each local(primary) lock ->
> ........Entry Processor (async) ->
> ............remove candidates related to the failed node
> ............retry failover not needed
> ............listener -> 
> ................if "candidate at first position" removed ->
> ....................notify current "candidate at first position" it got lock
> 3.2) Primary node failed
> After rebalancing schedule Callable ->
> ....For each local(primary) lock ->
> ........Entry Processor (async) ->
> ............remove candidates related to failed nodes
> ............retry failover not needed
> ............listener -> 
> ................notify current "candidate at first position" it got lock
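
For reference, a minimal sketch of the lock step from the design quoted above, with local waiting delegated to AbstractQueuedSynchronizer as the design suggests. Candidates, AddCandidateProcessor and AcquireSync are illustrative names for this sketch, not classes from the PR.

{code:java}
import java.io.Serializable;
import java.util.ArrayDeque;
import java.util.UUID;
import java.util.concurrent.locks.AbstractQueuedSynchronizer;
import javax.cache.processor.EntryProcessor;
import javax.cache.processor.MutableEntry;

/** Shared state: FIFO queue of candidates; the candidate at the first position owns the lock. */
class Candidates implements Serializable {
    final ArrayDeque<UUID> queue = new ArrayDeque<>();
}

/** Lock step: add a candidate and report whether it ended up at the first position. */
class AddCandidateProcessor implements EntryProcessor<String, Candidates, Boolean>, Serializable {
    @Override public Boolean process(MutableEntry<String, Candidates> e, Object... args) {
        UUID candidate = (UUID)args[0];
        Candidates state = e.exists() ? e.getValue() : new Candidates();

        if (!state.queue.contains(candidate)) // retry failover: a repeated add must stay idempotent
            state.queue.add(candidate);

        e.setValue(state);

        return candidate.equals(state.queue.peekFirst());
    }
}

/** Local waiting: the unlock listener calls releaseShared(1) when this candidate reaches the head. */
class AcquireSync extends AbstractQueuedSynchronizer {
    @Override protected int tryAcquireShared(int ignore) { return getState() == 1 ? 1 : -1; }
    @Override protected boolean tryReleaseShared(int ignore) { setState(1); return true; }
}

// Usage sketch:
//   Boolean first = cache.invoke(lockName, new AddCandidateProcessor(), localId);
//   if (!first) sync.acquireSharedInterruptibly(1); // parked until notified by the unlock listener
{code}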



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)