Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/07/09 22:58:06 UTC

[GitHub] [incubator-druid] himanshug commented on issue #8031: remove unnecessary synchronization overhead from complex Aggregators

himanshug commented on issue #8031: remove unnecessary synchronization overhead from complex Aggregators
URL: https://github.com/apache/incubator-druid/issues/8031#issuecomment-509840466
 
 
   I ran this benchmark:
   ```java
   import java.util.concurrent.TimeUnit;

   import org.openjdk.jmh.annotations.Benchmark;
   import org.openjdk.jmh.annotations.BenchmarkMode;
   import org.openjdk.jmh.annotations.Mode;
   import org.openjdk.jmh.annotations.OutputTimeUnit;
   import org.openjdk.jmh.annotations.Scope;
   import org.openjdk.jmh.annotations.State;
   import org.openjdk.jmh.infra.Blackhole;
   import org.openjdk.jmh.runner.Runner;
   import org.openjdk.jmh.runner.RunnerException;
   import org.openjdk.jmh.runner.options.Options;
   import org.openjdk.jmh.runner.options.OptionsBuilder;

   @State(Scope.Benchmark)
   public class SynchronizeOverhead
   {
     private int n = 1000000;

     // baseline: n Blackhole.consume() calls with no locking
     @Benchmark
     @BenchmarkMode(Mode.AverageTime)
     @OutputTimeUnit(TimeUnit.MICROSECONDS)
     public void unsynchronizedAdd(Blackhole blackhole)
     {
       for (int i = 0; i < n; i++) {
         blackhole.consume(1);
       }
     }

     // same loop, but each consume() is wrapped in an uncontended synchronized block
     @Benchmark
     @BenchmarkMode(Mode.AverageTime)
     @OutputTimeUnit(TimeUnit.MICROSECONDS)
     public void synchronizedAdd(Blackhole blackhole)
     {
       for (int i = 0; i < n; i++) {
         synchronized (this) {
           blackhole.consume(1);
         }
       }
     }

     public static void main(String[] args) throws RunnerException
     {
       Options opt = new OptionsBuilder()
           .include(SynchronizeOverhead.class.getSimpleName())
           .warmupIterations(5)
           .measurementIterations(10)
           .forks(1)
           .build();

       new Runner(opt).run();
     }
   }
   ```
   
   which gave me...
   ```
   Benchmark                              Mode  Cnt      Score     Error  Units
   SynchronizeOverhead.synchronizedAdd    avgt   10  18509.829 ± 372.391  us/op
   SynchronizeOverhead.unsynchronizedAdd  avgt   10   1828.303 ±  50.494  us/op
   ```
   
   Sidenote/Caution: I ran it on my Mac laptop (which is not ideal for running benchmarks) as I didn't have a Linux box or EC2 instance handy, but I did 5 warmup and 10 measurement iterations, so it should be good enough for a high-level understanding.
   
   That says there is roughly 17 ms of overhead for 1 million lock/release cycles on object monitors, i.e. about 17 ns per uncontended lock/unlock. I am pretty sure this is negligible compared to the time spent doing sketch operations inside all those synchronized blocks. So, I don't think this change will have any measurable impact on overall performance.
   (In fact, I am not entirely sure, but it is also possible that the server JVM can further optimize the lock away dynamically when its JIT profiling discovers that the synchronized block is never contended; that would make the overhead go away altogether.)
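   To spell out the arithmetic, here is a small back-of-the-envelope check using the scores from the output above (the class name and variable names are purely illustrative):
   ```java
   // Derives the per-lock overhead from the benchmark scores quoted above.
   public class OverheadMath
   {
     public static void main(String[] args)
     {
       double synchronizedUs = 18509.829;   // avgt score for synchronizedAdd, us per call
       double unsynchronizedUs = 1828.303;  // avgt score for unsynchronizedAdd, us per call
       int locksPerCall = 1_000_000;        // n iterations, one lock/unlock each

       double overheadUs = synchronizedUs - unsynchronizedUs;   // ~16,681 us ~= 17 ms
       double perLockNs = overheadUs * 1_000.0 / locksPerCall;  // ~16.7 ns per lock/unlock

       System.out.printf("total overhead: %.1f ms, per lock/unlock: %.1f ns%n",
                         overheadUs / 1_000.0, perLockNs);
     }
   }
   ```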
   
   So, the code changes for this proposal would only be useful for reducing some of the confusion among extension writers, where this question keeps coming up: do they have unnecessary synchronization overhead inside their aggregator implementations? It might be good enough to document these facts in this proposal so that those worries can be alleviated by simply pointing such conversations to this page.
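   For reference, the pattern in question looks roughly like the sketch below. This is a hypothetical, simplified aggregator (the class name, the `Set`-based "sketch", and the method signatures are illustrative only, not Druid's actual Aggregator interface): each update and read pays for an uncontended monitor acquire/release, while the real work inside the block dominates the cost.
   ```java
   import java.util.HashSet;
   import java.util.Set;

   // Hypothetical illustration, not actual Druid code: a sketch-style aggregator
   // that wraps every operation in a synchronized block.
   public class SynchronizedSketchLikeAggregator
   {
     // stand-in for a real sketch object (e.g. a theta or HLL sketch)
     private final Set<Object> sketch = new HashSet<>();

     public void aggregate(Object value)
     {
       synchronized (this) {             // ~17 ns of uncontended overhead per call
         sketch.add(value);              // the actual sketch update dominates this cost
       }
     }

     public Object get()
     {
       synchronized (this) {
         return new HashSet<>(sketch);   // snapshot taken under the same lock
       }
     }
   }
   ```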
   
   
