Posted to commits@harmony.apache.org by "weldon washburn (JIRA)" <ji...@apache.org> on 2007/06/08 07:49:26 UTC

[jira] Updated: (HARMONY-3995) [drlvm][thread][performance] Performance improvement for uncontended synchronization.

     [ https://issues.apache.org/jira/browse/HARMONY-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

weldon washburn updated HARMONY-3995:
-------------------------------------

    Attachment: rough_ideas.diff

> [drlvm][thread][performance] Performance improvement for uncontended synchronization.
> -------------------------------------------------------------------------------------
>
>                 Key: HARMONY-3995
>                 URL: https://issues.apache.org/jira/browse/HARMONY-3995
>             Project: Harmony
>          Issue Type: Improvement
>          Components: DRLVM
>            Reporter: Sergey Kuksenko
>            Assignee: weldon washburn
>         Attachments: rough_ideas.diff, soft_unreserv_2.patch, synchTest.zip
>
>
> It is a fact that even simple atomic instructions (lock cmpxchg, etc.) have a large impact on performance, especially on multiprocessor systems. DRLVM uses a reservation lock scheme for uncontended synchronization: when synchronization is local (from a single thread) and uncontended, all monitor_enter and monitor_exit primitives are executed without atomic instructions. When synchronization is non-local (from several threads) but still uncontended, DRLVM uses a thin lock scheme (with atomic instructions). Lock unreservation is a rather expensive operation because the owner thread has to be stopped. That is why DRLVM unreserves a lock only once, when switching it to a thin lock. On the other hand, there is a common situation that the current scheme does not cover: transferring locality, where after several synchronizations from one thread the data are transferred to another thread and locality (access from a single thread) continues in the new thread.
> The attached patch provides an improvement for the case of transferring locality. The following heuristic is used:
> - If the owner thread is already stopped at the moment of unreservation, the lock is unreserved but is not switched to the thin lock state. The lock stays in reservation mode and will be reserved for the next thread that tries to acquire it. In other words, if unreservation costs nothing (the thread is already stopped: in a wait, sleep, terminated, ... state), DRLVM unreserves the lock but keeps it available for future reservations.
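The heuristic above can be pictured as a small state machine. The following is only a rough sketch under stated assumptions, not DRLVM code: the class, the `LockState` names, the `Owner` type, and the starting state of a fresh lock are all hypothetical, and real reservation and thin locks operate on a lock word with atomic instructions that this model leaves out.

```java
// Hypothetical model of the lock states described in the issue; not DRLVM code.
public class SoftUnreservationSketch {
    // Assumed state names: a lock is reservable, reserved for one thread, or thin.
    enum LockState { RESERVABLE, RESERVED, THIN }

    /** Stand-in for a thread; 'stopped' covers wait, sleep, terminated, ... */
    static final class Owner {
        final boolean stopped;
        Owner(boolean stopped) { this.stopped = stopped; }
    }

    static final class ObjectLock {
        LockState state = LockState.RESERVABLE; // assumption: fresh locks are reservable
        Owner reservedFor;                      // null when no thread holds the reservation
    }

    /** Models what happens to the lock when 'requester' tries to acquire it. */
    static void acquire(ObjectLock lock, Owner requester, boolean softUnreservation) {
        switch (lock.state) {
            case RESERVABLE:
                lock.state = LockState.RESERVED;
                lock.reservedFor = requester;
                break;
            case RESERVED:
                if (lock.reservedFor == requester) {
                    break; // local synchronization: no atomic instruction needed
                }
                if (softUnreservation && lock.reservedFor.stopped) {
                    // Owner is already stopped, so unreservation is free:
                    // keep the lock in reservation mode and hand it to the requester.
                    lock.reservedFor = requester;
                } else {
                    // Current scheme: unreserve once and switch to a thin lock.
                    lock.state = LockState.THIN;
                    lock.reservedFor = null;
                }
                break;
            case THIN:
                break; // thin locks would use atomic instructions from here on
        }
    }

    /** First thread reserves the lock, then a second thread acquires it. */
    static LockState stateAfterTransfer(boolean ownerStopped, boolean softUnreservation) {
        ObjectLock lock = new ObjectLock();
        Owner first = new Owner(ownerStopped);
        Owner second = new Owner(false);
        acquire(lock, first, softUnreservation);
        acquire(lock, second, softUnreservation);
        return lock.state;
    }

    public static void main(String[] args) {
        System.out.println(stateAfterTransfer(true, true));   // prints RESERVED
        System.out.println(stateAfterTransfer(false, true));  // prints THIN
        System.out.println(stateAfterTransfer(true, false));  // prints THIN
    }
}
```

Only the first case differs from the current scheme: the lock stays reserved, now for the second thread, so its later local synchronizations can again avoid atomic instructions.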
> There are a number of applications where this gives a performance boost. I have also attached a microbenchmark that shows the performance boost from the patch. On the other hand, additional investigation is needed to find out where the patch gives a boost. That is why the patch does not change the current unreservation behavior. The patch introduces a new option, "-XX:thread.soft_unreservation", which is turned off by default. Turning it on enables the new (soft) unreservation scheme.
> Some details about the attached microbenchmark. It emulates the following scenario:
> - the main thread creates a bunch of data (objects with synchronized access)
> - the main thread splits all data among 4 "processing" threads
> - the main thread runs the 4 processing threads and waits for their results.
> The reported number is the number of synchronized operations per second divided by 10 (higher is better).
> For example:
> synchronized OPS     = 7147           - here we have ~71470 synchronized operations per second.
> non-synchronized OPS = 19891
> The last number shows the speed of the same operations without any synchronization.
> The ratio between synchronized and non-synchronized OPS shows the degradation caused by synchronization (even uncontended).
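The scenario described above could look roughly like the sketch below. It is not the attached synchTest benchmark, whose source is not shown here: the class name, the `Counter` type, the `run` helper, and all sizes are hypothetical, and it uses lambdas, so it needs a newer Java than the VMs measured in this issue. It only illustrates the access pattern: every lock is first taken by the main thread and afterwards only by one worker.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "transferring locality" scenario; not synchTest itself.
public class TransferLocalitySketch {
    /** Data item with synchronized access. */
    static final class Counter {
        private long value;
        synchronized void increment() { value++; }
        synchronized long get() { return value; }
    }

    /** Runs the scenario and returns the total number of synchronized increments. */
    static long run(int threads, int itemsPerThread, int rounds) {
        // The main thread creates all data and synchronizes on each object once.
        List<List<Counter>> slices = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            List<Counter> slice = new ArrayList<>();
            for (int i = 0; i < itemsPerThread; i++) {
                Counter c = new Counter();
                c.increment();
                slice.add(c);
            }
            slices.add(slice);
        }
        // Each processing thread works only on its own slice, so access stays
        // uncontended but moves from the main thread to the worker.
        List<Thread> workers = new ArrayList<>();
        for (List<Counter> slice : slices) {
            Thread worker = new Thread(() -> {
                for (int r = 0; r < rounds; r++) {
                    for (Counter c : slice) {
                        c.increment();
                    }
                }
            });
            workers.add(worker);
            worker.start();
        }
        // The main thread waits for the results.
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        long total = 0;
        for (List<Counter> slice : slices) {
            for (Counter c : slice) {
                total += c.get();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long ops = run(4, 1000, 1000);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("synchronized ops = %d in %.3f s%n", ops, seconds);
    }
}
```

While the workers run, the main thread is blocked in `join`, which is the kind of already-stopped owner that the heuristic in this issue targets.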
> Here are some measurements for Sun1.6 and DRLVM on the microbenchmark:
> 1. Sun1.6
> CMDLINE:  java -server -jar synchTest.jar
> Measure phase; threads(4); time(180)
> synchronized OPS     = 7886
> non-synchronized OPS = 59907
> 2. DRLVM 
> 2.1 java -XX:thread.soft_unreservation=false -Xem:server -jar synchTest.jar
> synchronized OPS     = 7939
> non-synchronized OPS = 50985
> 2.2 java -XX:thread.soft_unreservation=true -Xem:server -jar synchTest.jar
> synchronized OPS     = 25735
> non-synchronized OPS = 50998
> Thus, turning the option on gives DRLVM a speedup of about 3.2x.
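The speedup and the synchronized/non-synchronized ratios can be recomputed from the figures quoted above. The class and method names in this small sketch are made up for illustration; the numbers are the ones reported in the issue.

```java
// Recomputes the ratios from the OPS figures reported in this issue.
public class SynchRatios {
    /** Fraction of non-synchronized throughput that remains with synchronization. */
    static double ratio(long synchronizedOps, long nonSynchronizedOps) {
        return (double) synchronizedOps / nonSynchronizedOps;
    }

    /** Speedup of synchronized throughput with the option on versus off. */
    static double speedup(long opsOptionOn, long opsOptionOff) {
        return (double) opsOptionOn / opsOptionOff;
    }

    public static void main(String[] args) {
        System.out.printf("Sun1.6 ratio:          %.2f%n", ratio(7886, 59907));
        System.out.printf("DRLVM, option off:     %.2f%n", ratio(7939, 50985));
        System.out.printf("DRLVM, option on:      %.2f%n", ratio(25735, 50998));
        System.out.printf("DRLVM speedup on/off:  %.2f%n", speedup(25735, 7939));
    }
}
```

With the option on, synchronized throughput on DRLVM rises from about 16% to about 50% of the non-synchronized throughput in this benchmark.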

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.