You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/12/04 06:26:01 UTC

[GitHub] [pinot] mqliang opened a new issue #7866: Benchmark result of ThreadTimer, System.currentTimeMillis() and System.nanoTime()

mqliang opened a new issue #7866:
URL: https://github.com/apache/pinot/issues/7866


   Benchmark code
   ```
   @BenchmarkMode(Mode.AverageTime)
   @OutputTimeUnit(TimeUnit.NANOSECONDS)
   @Measurement(iterations = 5, time = 30)
   @Fork(1)
   @State(Scope.Benchmark)
   public class BenchmarkTimer {
   
     public static void main(String[] args)
         throws Exception {
       ChainedOptionsBuilder opt = new OptionsBuilder().include(BenchmarkTimer.class.getSimpleName());
       new Runner(opt.build()).run();
     }
   
     @Benchmark
     public void benchmarkThreadCpuTimer() {
       ThreadTimer threadTimer = new ThreadTimer();
       long totalThreadCpuTimeNs = threadTimer.getThreadTimeNs();
     }
   
     @Benchmark
     public void benchmarkSystemCurrentTimeMillis() {
       long startWallClockTimeMs = System.currentTimeMillis();
       long totalWallClockTimeMs = System.currentTimeMillis() - startWallClockTimeMs;
       long totalWallClockTimeNs = TimeUnit.MILLISECONDS.toNanos(totalWallClockTimeMs);
     }
   
     @Benchmark
     public void benchmarkSystemNanoTime() {
       long startWallClockTimeNs = System.nanoTime();
       long totalWallClockTimeNs = System.nanoTime() - startWallClockTimeNs;
     }
   }
   
   ```
   
   Benchmark result of adoptopenjdk-11.jdk on macOS Monterey v12.0.1
   ```
   Benchmark                                        Mode  Cnt   Score    Error  Units
   BenchmarkTimer.benchmarkSystemCurrentTimeMillis  avgt    5  54.111 ± 12.025  ns/op
   BenchmarkTimer.benchmarkSystemNanoTime           avgt    5  61.704 ±  0.645  ns/op
   BenchmarkTimer.benchmarkThreadCpuTimer           avgt    5   0.526 ±  0.023  ns/op
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] richardstartin commented on issue #7866: Benchmark of ThreadTimer, System.currentTimeMillis() and System.nanoTime()

Posted by GitBox <gi...@apache.org>.
richardstartin commented on issue #7866:
URL: https://github.com/apache/pinot/issues/7866#issuecomment-985997796


   For the sake of posterity, here are the results for `benchmarkThreadCpuTimer` with and without changes on my MacBook:
   
   DCE:
   ```java
     @Benchmark
     public void benchmarkThreadCpuTimer() {
       ThreadTimer threadTimer = new ThreadTimer();
       long totalThreadCpuTimeNs = threadTimer.getThreadTimeNs();
     }
   ```
   
   ```
   Benchmark                               Mode  Cnt  Score   Error  Units
   BenchmarkTimer.benchmarkThreadCpuTimer  avgt    5  0.523 ± 0.106  ns/op
   ```
   
   Blackhole, measurement disabled:
   
   ```java
     @Benchmark
     public long benchmarkThreadCpuTimer() {
       ThreadTimer threadTimer = new ThreadTimer();
       return threadTimer.getThreadTimeNs();
     }
   ```
   
   ```
   Benchmark                               Mode  Cnt  Score   Error  Units
   BenchmarkTimer.benchmarkThreadCpuTimer  avgt    5  1.988 ± 0.151  ns/op
   ```
   
   Blackhole, measurement enabled:
   
   ```java
     @Setup(Level.Trial)
     public void setup() {
       ThreadTimer.setThreadCpuTimeMeasurementEnabled(true);
     }
   
     @Benchmark
     public long benchmarkThreadCpuTimer() {
       ThreadTimer threadTimer = new ThreadTimer();
       return threadTimer.getThreadTimeNs();
     }
   ```
   
   ```
   Benchmark                               Mode  Cnt     Score     Error  Units
   BenchmarkTimer.benchmarkThreadCpuTimer  avgt    5  2382.225 ± 304.219  ns/op
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] richardstartin commented on issue #7866: Benchmark of ThreadTimer, System.currentTimeMillis() and System.nanoTime()

Posted by GitBox <gi...@apache.org>.
richardstartin commented on issue #7866:
URL: https://github.com/apache/pinot/issues/7866#issuecomment-985996820


   In fact there's a simpler problem with the benchmark (once you prevent dead code elimination you should see ~2-5ns for `benchmarkThreadCpuTimer`) but you need to call `ThreadTimer.setThreadCpuTimeMeasurementEnabled(true)` to actually measure anything:
   
   ```java
   @BenchmarkMode(Mode.AverageTime)
   @OutputTimeUnit(TimeUnit.NANOSECONDS)
   @Warmup(iterations = 5, time = 1)
   @Measurement(iterations = 5, time = 1)
   @Fork(1)
   @State(Scope.Benchmark)
   public class BenchmarkTimer {
   
     public static void main(String[] args)
         throws Exception {
       ChainedOptionsBuilder opt = new OptionsBuilder().include(BenchmarkTimer.class.getSimpleName());
       new Runner(opt.build()).run();
     }
   
     @Setup(Level.Trial)
     public void setup() {
       ThreadTimer.setThreadCpuTimeMeasurementEnabled(true);
     }
   
     @Benchmark
     public long benchmarkThreadCpuTimer() {
       ThreadTimer threadTimer = new ThreadTimer();
       return threadTimer.getThreadTimeNs();
     }
   
     @Benchmark
     public long benchmarkSystemCurrentTimeMillis() {
       long startWallClockTimeMs = System.currentTimeMillis();
       long totalWallClockTimeMs = System.currentTimeMillis() - startWallClockTimeMs;
       return TimeUnit.MILLISECONDS.toNanos(totalWallClockTimeMs);
     }
   
     @Benchmark
     public long benchmarkSystemNanoTime() {
       long startWallClockTimeNs = System.nanoTime();
       return System.nanoTime() - startWallClockTimeNs;
     }
   }
   ```
   
   I ran this on my MacBook Pro
   ```
   Benchmark                                        Mode  Cnt     Score     Error  Units
   BenchmarkTimer.benchmarkSystemCurrentTimeMillis  avgt    5    52.619 ±   4.729  ns/op
   BenchmarkTimer.benchmarkSystemNanoTime           avgt    5    61.667 ±   6.172  ns/op
   BenchmarkTimer.benchmarkThreadCpuTimer           avgt    5  2504.818 ± 181.180  ns/op
   ```
   
   Again, if you want to see this really blow up, run it on Xen.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] richardstartin edited a comment on issue #7866: Benchmark of ThreadTimer, System.currentTimeMillis() and System.nanoTime()

Posted by GitBox <gi...@apache.org>.
richardstartin edited a comment on issue #7866:
URL: https://github.com/apache/pinot/issues/7866#issuecomment-985992967


   Hi @mqliang, there’s a problem with your benchmark: unless you present the produced value to a blackhole (or return the value if there’s only one) the JIT compiler may notice that the value is never used, and then eliminate all the code used to produce the value. That is, the code you’re measuring may not execute. There may be a side effect in the invocation of the method which prevents dead code elimination.
   
   A good way to spot this is to ask yourself how many cycles must the measured code have taken at your processor frequency. Assuming your processor frequency is 4GHz, 0.5ns is 2 cycles, which isn’t enough to have made even a userspace function call to to read a clock.
   
   You will get different results if you return the value:
   
    @Benchmark
     public long benchmarkThreadCpuTimer() {
       ThreadTimer threadTimer = new ThreadTimer();
       long totalThreadCpuTimeNs = threadTimer.getThreadTimeNs();
       return totalThreadCpuTimeNs;
     }
   
     @Benchmark
     public long benchmarkSystemCurrentTimeMillis() {
       long startWallClockTimeMs = System.currentTimeMillis();
       long totalWallClockTimeMs = System.currentTimeMillis() - startWallClockTimeMs;
       long totalWallClockTimeNs = TimeUnit.MILLISECONDS.toNanos(totalWallClockTimeMs);
      return totalWallClockTimeNs;
     }
   
     @Benchmark
     public long benchmarkSystemNanoTime() {
       long startWallClockTimeNs = System.nanoTime();
       long totalWallClockTimeNs = System.nanoTime() - startWallClockTimeNs;
       return startWallClockTimeNs;
     }
   
   System.currentTimeMills() and System.nanoTime() generally do *not* require a syscall because they call in to clock_gettime(CLOCK_REALTIME) and clock_gettime(CLOCK_MONOTONIC) respectively which have been in vDSO for a very long time. When these clocks are userspace calls, you should expect latencies of 20-30ns. When they are syscalls expect hundreds of nanoseconds. To see what actual syscall overhead looks like, run the fixed benchmark on a virtualised instance using a ‘xen’ clocksource and you will see the difference.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] richardstartin edited a comment on issue #7866: Benchmark of ThreadTimer, System.currentTimeMillis() and System.nanoTime()

Posted by GitBox <gi...@apache.org>.
richardstartin edited a comment on issue #7866:
URL: https://github.com/apache/pinot/issues/7866#issuecomment-985992967


   Hi @mqliang, there’s a problem with your benchmark: unless you present the produced value to a blackhole (or return the value if there’s only one) the JIT compiler may notice that the value is never used, and then eliminate all the code used to produce the value. That is, the code you’re measuring may not execute. There may be a side effect in the invocation of the method which prevents dead code elimination.
   
   A good way to spot this is to ask yourself how many cycles must the measured code have taken at your processor frequency. Assuming your processor frequency is 4GHz, 0.5ns is 2 cycles, which isn’t enough to have made a even a userspace function call to to read a clock.
   
   You will get different results if you return the value:
   
    @Benchmark
     public long benchmarkThreadCpuTimer() {
       ThreadTimer threadTimer = new ThreadTimer();
       long totalThreadCpuTimeNs = threadTimer.getThreadTimeNs();
       return totalThreadCpuTimeNs;
     }
   
     @Benchmark
     public long benchmarkSystemCurrentTimeMillis() {
       long startWallClockTimeMs = System.currentTimeMillis();
       long totalWallClockTimeMs = System.currentTimeMillis() - startWallClockTimeMs;
       long totalWallClockTimeNs = TimeUnit.MILLISECONDS.toNanos(totalWallClockTimeMs);
      return totalWallClockTimeNs;
     }
   
     @Benchmark
     public long benchmarkSystemNanoTime() {
       long startWallClockTimeNs = System.nanoTime();
       long totalWallClockTimeNs = System.nanoTime() - startWallClockTimeNs;
       return startWallClockTimeNs;
     }
   
   System.currentTimeMills() and System.nanoTime() generally do *not* require a syscall because they call in to clock_gettime(CLOCK_REALTIME) and clock_gettime(CLOCK_MONOTONIC) respectively which have been in vDSO for a very long time. When these clocks are userspace calls, you should expect latencies of 20-30ns. When they are syscalls expect hundreds of nanoseconds. To see what actual syscall overhead looks like, run the fixed benchmark on a virtualised instance using a ‘xen’ clocksource and you will see the difference.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] richardstartin edited a comment on issue #7866: Benchmark of ThreadTimer, System.currentTimeMillis() and System.nanoTime()

Posted by GitBox <gi...@apache.org>.
richardstartin edited a comment on issue #7866:
URL: https://github.com/apache/pinot/issues/7866#issuecomment-985992967


   Hi @mqliang, there’s a problem with your benchmark: unless you present the produced value to a blackhole (or return the value if there’s only one) the JIT compiler may notice that the value is never used, and then eliminate all the code used to produce the value. That is, the code you’re measuring may not execute. There may be a side effect in the invocation of the method which prevents dead code elimination.
   
   A good way to spot this is to ask yourself how many cycles must the measured code have taken at your processor frequency. Assuming your processor frequency is 4GHz, 0.5ns is 2 cycles, which isn’t enough to have made even a userspace function call to to read a clock.
   
   You will get different results if you return the value:
   
   ```java
    @Benchmark
     public long benchmarkThreadCpuTimer() {
       ThreadTimer threadTimer = new ThreadTimer();
       long totalThreadCpuTimeNs = threadTimer.getThreadTimeNs();
       return totalThreadCpuTimeNs;
     }
   
     @Benchmark
     public long benchmarkSystemCurrentTimeMillis() {
       long startWallClockTimeMs = System.currentTimeMillis();
       long totalWallClockTimeMs = System.currentTimeMillis() - startWallClockTimeMs;
       long totalWallClockTimeNs = TimeUnit.MILLISECONDS.toNanos(totalWallClockTimeMs);
      return totalWallClockTimeNs;
     }
   
     @Benchmark
     public long benchmarkSystemNanoTime() {
       long startWallClockTimeNs = System.nanoTime();
       long totalWallClockTimeNs = System.nanoTime() - startWallClockTimeNs;
       return startWallClockTimeNs;
     }
   ```
   
   System.currentTimeMills() and System.nanoTime() generally do *not* require a syscall because they call in to clock_gettime(CLOCK_REALTIME) and clock_gettime(CLOCK_MONOTONIC) respectively which have been in vDSO for a very long time. When these clocks are userspace calls, you should expect latencies of 20-30ns. When they are syscalls expect hundreds of nanoseconds. To see what actual syscall overhead looks like, run the fixed benchmark on a virtualised instance using a ‘xen’ clocksource and you will see the difference.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] richardstartin commented on issue #7866: Benchmark of ThreadTimer, System.currentTimeMillis() and System.nanoTime()

Posted by GitBox <gi...@apache.org>.
richardstartin commented on issue #7866:
URL: https://github.com/apache/pinot/issues/7866#issuecomment-985992967


   Hi @mqliang, there’s a problem with your benchmark: unless you present the produced value to a blackhole (or return the value if there’s only one) the JIT compiler may notice that the value is never used, and then eliminate all the code used to produce the value. That is, the code you’re measuring may not execute. There may be some a side effect in the invocation of the method which prevents dead code elimination.
   
   A good way to spot this is to ask yourself how many cycles must the measured code have taken at your processor frequency. Assuming your processor frequency is 4GHz, 0.5ns is 2 cycles, which isn’t enough to have made a even a userspace function call to to read a clock.
   
   You will get different results if you return the value:
   
    @Benchmark
     public long benchmarkThreadCpuTimer() {
       ThreadTimer threadTimer = new ThreadTimer();
       long totalThreadCpuTimeNs = threadTimer.getThreadTimeNs();
       return totalThreadCpuTimeNs;
     }
   
     @Benchmark
     public long benchmarkSystemCurrentTimeMillis() {
       long startWallClockTimeMs = System.currentTimeMillis();
       long totalWallClockTimeMs = System.currentTimeMillis() - startWallClockTimeMs;
       long totalWallClockTimeNs = TimeUnit.MILLISECONDS.toNanos(totalWallClockTimeMs);
      return totalWallClockTimeNs;
     }
   
     @Benchmark
     public long benchmarkSystemNanoTime() {
       long startWallClockTimeNs = System.nanoTime();
       long totalWallClockTimeNs = System.nanoTime() - startWallClockTimeNs;
       return startWallClockTimeNs;
     }
   
   System.currentTimeMills() and System.nanoTime() generally do *not* require a syscall because they call in to clock_gettime(CLOCK_REALTIME) and clock_gettime(CLOCK_MONOTONIC) respectively which have been in vDSO for a very long time. When these clocks are userspace calls, you should expect latencies of 20-30ns. When they are syscalls expect hundreds of nanoseconds. To see what actual syscall overhead looks like, run the fixed benchmark on a virtualised instance using a ‘xen’ clocksource and you will see the difference.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] mqliang commented on issue #7866: Benchmark of ThreadTimer, System.currentTimeMillis() and System.nanoTime()

Posted by GitBox <gi...@apache.org>.
mqliang commented on issue #7866:
URL: https://github.com/apache/pinot/issues/7866#issuecomment-985981705


   A good doc about currentTimeMillis(),nanoTime() and vDSO : https://pzemtsov.github.io/2017/07/23/the-slow-currenttimemillis.html


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org