Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2021/03/16 00:06:06 UTC

[GitHub] [kafka] jsancio opened a new pull request #10324: MINOR: Add a few more benchmark for the timeline map

jsancio opened a new pull request #10324:
URL: https://github.com/apache/kafka/pull/10324


   Improve the benchmarks for TimelineHashMap by adding benchmarks for adding and removing entries, and by comparing against Scala's immutable hash map.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] jsancio commented on pull request #10324: MINOR: Add a few more benchmark for the timeline map

Posted by GitBox <gi...@apache.org>.
jsancio commented on pull request #10324:
URL: https://github.com/apache/kafka/pull/10324#issuecomment-799843745


   Benchmark results:
   
   ```
   # Run complete. Total time: 00:26:54
   
   Benchmark                                                 Mode  Cnt    Score    Error  Units
   TimelineHashMapBenchmark.testAddEntriesInHashMap          avgt   10  238.332 ±  4.554  ms/op
   TimelineHashMapBenchmark.testAddEntriesInImmutableMap     avgt   10  366.732 ±  6.463  ms/op
   TimelineHashMapBenchmark.testAddEntriesInTimelineMap      avgt   10  277.197 ±  4.699  ms/op
   TimelineHashMapBenchmark.testAddEntriesWithSnapshots      avgt   10  302.747 ±  4.959  ms/op
   TimelineHashMapBenchmark.testRemoveEntriesInHashMap       avgt   10  201.004 ±  2.675  ms/op
   TimelineHashMapBenchmark.testRemoveEntriesInImmutableMap  avgt   10  479.964 ±  7.254  ms/op
   TimelineHashMapBenchmark.testRemoveEntriesInTimelineMap   avgt   10  195.382 ±  1.917  ms/op
   TimelineHashMapBenchmark.testRemoveEntriesWithSnapshots   avgt   10  427.747 ± 12.865  ms/op
   TimelineHashMapBenchmark.testUpdateEntriesInHashMap       avgt   10  267.895 ± 20.143  ms/op
   TimelineHashMapBenchmark.testUpdateEntriesInImmutableMap  avgt   10  532.843 ±  5.766  ms/op
   TimelineHashMapBenchmark.testUpdateEntriesInTimelineMap   avgt   10  364.766 ± 25.154  ms/op
   TimelineHashMapBenchmark.testUpdateEntriesWithSnapshots   avgt   10  488.308 ± 43.992  ms/op
   JMH benchmarks done
   ```
   
   Benchmark configuration:
   
   ```
   # JMH version: 1.27
   # VM version: JDK 11.0.10, OpenJDK 64-Bit Server VM, 11.0.10+9-Ubuntu-0ubuntu1.20.10
   # VM invoker: /usr/lib/jvm/java-11-openjdk-amd64/bin/java
   # VM options: <none>
   # JMH blackhole mode: full blackhole + dont-inline hint
   # Warmup: 3 iterations, 10 s each
   # Measurement: 10 iterations, 10 s each
   # Timeout: 10 min per iteration
   # Threads: 1 thread, will synchronize iterations
   # Benchmark mode: Average time, time/op
   ```
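   
   For anyone wanting to reproduce a run with the same shape of configuration, here is a minimal sketch using JMH's programmatic runner (an illustration only, not the invocation used for the numbers above; it assumes the benchmark class is on the jmh-benchmarks classpath):
   
   ```
   import java.util.concurrent.TimeUnit;
   
   import org.openjdk.jmh.annotations.Mode;
   import org.openjdk.jmh.runner.Runner;
   import org.openjdk.jmh.runner.RunnerException;
   import org.openjdk.jmh.runner.options.Options;
   import org.openjdk.jmh.runner.options.OptionsBuilder;
   import org.openjdk.jmh.runner.options.TimeValue;
   
   public class RunTimelineHashMapBenchmark {
       public static void main(String[] args) throws RunnerException {
           Options opt = new OptionsBuilder()
               .include("TimelineHashMapBenchmark")       // regex over benchmark names
               .warmupIterations(3)
               .warmupTime(TimeValue.seconds(10))
               .measurementIterations(10)
               .measurementTime(TimeValue.seconds(10))
               .mode(Mode.AverageTime)
               .timeUnit(TimeUnit.MILLISECONDS)
               .threads(1)
               .forks(1)
               .build();
           new Runner(opt).run();
       }
   }
   ```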





[GitHub] [kafka] jsancio edited a comment on pull request #10324: MINOR: Add a few more benchmark for the timeline map

Posted by GitBox <gi...@apache.org>.
jsancio edited a comment on pull request #10324:
URL: https://github.com/apache/kafka/pull/10324#issuecomment-799843745


   Benchmark results:
   
   ```
   # Run complete. Total time: 00:26:54
   
   Benchmark                                                  Mode  Cnt    Score    Error  Units
   TimelineHashMapBenchmark.testAddEntriesInHashMap           avgt   10  238.332 ±  4.554  ms/op
   TimelineHashMapBenchmark.testAddEntriesInImmutableMap      avgt   10  366.732 ±  6.463  ms/op
   TimelineHashMapBenchmark.testAddEntriesInTimelineMap       avgt   10  277.197 ±  4.699  ms/op
   TimelineHashMapBenchmark.testAddEntriesWithSnapshots       avgt   10  302.747 ±  4.959  ms/op
   TimelineHashMapBenchmark.testRemoveEntriesInHashMap        avgt   10  201.004 ±  2.675  ms/op
   TimelineHashMapBenchmark.testRemoveEntriesInImmutableMap   avgt   10  479.964 ±  7.254  ms/op
   TimelineHashMapBenchmark.testRemoveEntriesInTimelineMap    avgt   10  195.382 ±  1.917  ms/op
   TimelineHashMapBenchmark.testRemoveEntriesWithSnapshots    avgt   10  427.747 ± 12.865  ms/op
   TimelineHashMapBenchmark.testUpdateEntriesInHashMap        avgt   10  267.895 ± 20.143  ms/op
   TimelineHashMapBenchmark.testUpdateEntriesInImmutableMap   avgt   10  532.843 ±  5.766  ms/op
   TimelineHashMapBenchmark.testUpdateEntriesInTimelineMap    avgt   10  364.766 ± 25.154  ms/op
   TimelineHashMapBenchmark.testUpdateEntriesWithSnapshots    avgt   10  488.308 ± 43.992  ms/op
   TimelineHashMapBenchmark.testIterateEntriesInHashMap       avgt   10   42.297 ±  1.655  ms/op
   TimelineHashMapBenchmark.testIterateEntriesInImmutableMap  avgt   10   34.913 ±  0.903  ms/op
   JMH benchmarks done
   ```
   
   Benchmark configuration:
   
   ```
   # JMH version: 1.27
   # VM version: JDK 11.0.10, OpenJDK 64-Bit Server VM, 11.0.10+9-Ubuntu-0ubuntu1.20.10
   # VM invoker: /usr/lib/jvm/java-11-openjdk-amd64/bin/java
   # VM options: <none>
   # JMH blackhole mode: full blackhole + dont-inline hint
   # Warmup: 3 iterations, 10 s each
   # Measurement: 10 iterations, 10 s each
   # Timeout: 10 min per iteration
   # Threads: 1 thread, will synchronize iterations
   # Benchmark mode: Average time, time/op
   ```





[GitHub] [kafka] jsancio edited a comment on pull request #10324: MINOR: Add a few more benchmark for the timeline map

Posted by GitBox <gi...@apache.org>.
jsancio edited a comment on pull request #10324:
URL: https://github.com/apache/kafka/pull/10324#issuecomment-799843745


   Benchmark results:
   
   ```
   Benchmark                                                  Mode  Cnt    Score    Error  Units
   TimelineHashMapBenchmark.testAddEntriesInHashMap           avgt   10  238.332 ±  4.554  ms/op
   TimelineHashMapBenchmark.testAddEntriesInImmutableMap      avgt   10  366.732 ±  6.463  ms/op
   TimelineHashMapBenchmark.testAddEntriesInTimelineMap       avgt   10  277.197 ±  4.699  ms/op
   TimelineHashMapBenchmark.testAddEntriesWithSnapshots       avgt   10  302.747 ±  4.959  ms/op
   TimelineHashMapBenchmark.testRemoveEntriesInHashMap        avgt   10  201.004 ±  2.675  ms/op
   TimelineHashMapBenchmark.testRemoveEntriesInImmutableMap   avgt   10  479.964 ±  7.254  ms/op
   TimelineHashMapBenchmark.testRemoveEntriesInTimelineMap    avgt   10  195.382 ±  1.917  ms/op
   TimelineHashMapBenchmark.testRemoveEntriesWithSnapshots    avgt   10  427.747 ± 12.865  ms/op
   TimelineHashMapBenchmark.testUpdateEntriesInHashMap        avgt   10  267.895 ± 20.143  ms/op
   TimelineHashMapBenchmark.testUpdateEntriesInImmutableMap   avgt   10  532.843 ±  5.766  ms/op
   TimelineHashMapBenchmark.testUpdateEntriesInTimelineMap    avgt   10  364.766 ± 25.154  ms/op
   TimelineHashMapBenchmark.testUpdateEntriesWithSnapshots    avgt   10  488.308 ± 43.992  ms/op
   TimelineHashMapBenchmark.testIterateEntriesInHashMap       avgt   10   42.297 ±  1.655  ms/op
   TimelineHashMapBenchmark.testIterateEntriesInImmutableMap  avgt   10   34.913 ±  0.903  ms/op
   JMH benchmarks done
   ```
   
   Benchmark configuration:
   
   ```
   # JMH version: 1.27
   # VM version: JDK 11.0.10, OpenJDK 64-Bit Server VM, 11.0.10+9-Ubuntu-0ubuntu1.20.10
   # VM invoker: /usr/lib/jvm/java-11-openjdk-amd64/bin/java
   # VM options: <none>
   # JMH blackhole mode: full blackhole + dont-inline hint
   # Warmup: 3 iterations, 10 s each
   # Measurement: 10 iterations, 10 s each
   # Timeout: 10 min per iteration
   # Threads: 1 thread, will synchronize iterations
   # Benchmark mode: Average time, time/op
   ```





[GitHub] [kafka] jsancio commented on a change in pull request #10324: MINOR: Add a few more benchmark for the timeline map

Posted by GitBox <gi...@apache.org>.
jsancio commented on a change in pull request #10324:
URL: https://github.com/apache/kafka/pull/10324#discussion_r595483066



##########
File path: jmh-benchmarks/src/main/java/org/apache/kafka/jmh/timeline/TimelineHashMapBenchmark.java
##########
@@ -44,33 +49,126 @@
 public class TimelineHashMapBenchmark {
     private final static int NUM_ENTRIES = 1_000_000;
 
+    @State(Scope.Thread)
+    public static class HashMapInput {
+        public HashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            map = new HashMap<>(keys.size());
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class ImmutableMapInput {
+        scala.collection.immutable.HashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            map = new scala.collection.immutable.HashMap<>();
+            for (Integer key : keys) {
+                map = map.updated(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class TimelineMapInput {
+        public SnapshotRegistry snapshotRegistry;
+        public TimelineHashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            snapshotRegistry = new SnapshotRegistry(new LogContext());
+            map = new TimelineHashMap<>(snapshotRegistry, keys.size());
+
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class TimelineMapSnapshotInput {
+        public SnapshotRegistry snapshotRegistry;
+        public TimelineHashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            snapshotRegistry = new SnapshotRegistry(new LogContext());
+            map = new TimelineHashMap<>(snapshotRegistry, keys.size());
+
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            int count = 0;
+            for (Integer key : keys) {
+                if (count % 1_000 == 0) {
+                    snapshotRegistry.deleteSnapshotsUpTo(count - 10_000);
+                    snapshotRegistry.createSnapshot(count);
+                }
+                map.put(key, String.valueOf(key));
+                count++;
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+
     @Benchmark
     public Map<Integer, String> testAddEntriesInHashMap() {
-        HashMap<Integer, String> map = new HashMap<>(NUM_ENTRIES);
+        HashMap<Integer, String> map = new HashMap<>();
         for (int i = 0; i < NUM_ENTRIES; i++) {
             int key = (int) (0xffffffff & ((i * 2862933555777941757L) + 3037000493L));
             map.put(key, String.valueOf(key));
         }
+
+        return map;
+    }
+
+    @Benchmark
+    public scala.collection.immutable.HashMap<Integer, String> testAddEntriesInImmutableMap() {
+        scala.collection.immutable.HashMap<Integer, String> map = new scala.collection.immutable.HashMap<>();
+        for (int i = 0; i < NUM_ENTRIES; i++) {
+            int key = (int) (0xffffffff & ((i * 2862933555777941757L) + 3037000493L));
+            map = map.updated(key, String.valueOf(key));

Review comment:
       Good catch. It looks like we were mostly measuring converting an int to a String!
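   
   A minimal sketch of the kind of fix being discussed, as a fragment of the benchmark class (the `PrecomputedValues` name, fields, and method are illustrative, not the PR's actual code): the value strings are built once in `@Setup`, so the measured loop only exercises the map.
   
   ```
   @State(Scope.Thread)
   public static class PrecomputedValues {
       public final List<Integer> keys = createKeys(NUM_ENTRIES);
       public String[] values;
   
       @Setup(Level.Trial)
       public void setup() {
           values = new String[keys.size()];
           for (int i = 0; i < values.length; i++) {
               values[i] = String.valueOf(keys.get(i));  // int -> String conversion paid here, not in the benchmark
           }
       }
   }
   
   @Benchmark
   public Map<Integer, String> addEntriesPrecomputed(PrecomputedValues input) {
       HashMap<Integer, String> map = new HashMap<>(NUM_ENTRIES);
       for (int i = 0; i < input.values.length; i++) {
           map.put(input.keys.get(i), input.values[i]);  // only the put() is measured
       }
       return map;
   }
   ```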







[GitHub] [kafka] jsancio edited a comment on pull request #10324: MINOR: Add a few more benchmark for the timeline map

Posted by GitBox <gi...@apache.org>.
jsancio edited a comment on pull request #10324:
URL: https://github.com/apache/kafka/pull/10324#issuecomment-799843745


   Benchmark results:
   
   ```
   # Run complete. Total time: 00:26:54
   
   Benchmark                                                  Mode  Cnt    Score    Error  Units
   TimelineHashMapBenchmark.testAddEntriesInHashMap           avgt   10  238.332 ±  4.554  ms/op
   TimelineHashMapBenchmark.testAddEntriesInImmutableMap      avgt   10  366.732 ±  6.463  ms/op
   TimelineHashMapBenchmark.testAddEntriesInTimelineMap       avgt   10  277.197 ±  4.699  ms/op
   TimelineHashMapBenchmark.testAddEntriesWithSnapshots       avgt   10  302.747 ±  4.959  ms/op
   TimelineHashMapBenchmark.testRemoveEntriesInHashMap        avgt   10  201.004 ±  2.675  ms/op
   TimelineHashMapBenchmark.testRemoveEntriesInImmutableMap   avgt   10  479.964 ±  7.254  ms/op
   TimelineHashMapBenchmark.testRemoveEntriesInTimelineMap    avgt   10  195.382 ±  1.917  ms/op
   TimelineHashMapBenchmark.testRemoveEntriesWithSnapshots    avgt   10  427.747 ± 12.865  ms/op
   TimelineHashMapBenchmark.testUpdateEntriesInHashMap        avgt   10  267.895 ± 20.143  ms/op
   TimelineHashMapBenchmark.testUpdateEntriesInImmutableMap   avgt   10  532.843 ±  5.766  ms/op
   TimelineHashMapBenchmark.testUpdateEntriesInTimelineMap    avgt   10  364.766 ± 25.154  ms/op
   TimelineHashMapBenchmark.testUpdateEntriesWithSnapshots    avgt   10  488.308 ± 43.992  ms/op
   TimelineHashMapBenchmark.testIterateEntriesInHashMap       avgt   10   42.297 ±  1.655  ms/op
   TimelineHashMapBenchmark.testIterateEntriesInImmutableMap  avgt   10   34.913 ±  0.903  ms/op
   JMH benchmarks done
   ```
   
   Benchmark configuration:
   
   ```
   # JMH version: 1.27
   # VM version: JDK 11.0.10, OpenJDK 64-Bit Server VM, 11.0.10+9-Ubuntu-0ubuntu1.20.10
   # VM invoker: /usr/lib/jvm/java-11-openjdk-amd64/bin/java
   # VM options: <none>
   # JMH blackhole mode: full blackhole + dont-inline hint
   # Warmup: 3 iterations, 10 s each
   # Measurement: 10 iterations, 10 s each
   # Timeout: 10 min per iteration
   # Threads: 1 thread, will synchronize iterations
   # Benchmark mode: Average time, time/op
   ```





[GitHub] [kafka] jsancio commented on a change in pull request #10324: MINOR: Add a few more benchmark for the timeline map

Posted by GitBox <gi...@apache.org>.
jsancio commented on a change in pull request #10324:
URL: https://github.com/apache/kafka/pull/10324#discussion_r595670643



##########
File path: jmh-benchmarks/src/main/java/org/apache/kafka/jmh/timeline/TimelineHashMapBenchmark.java
##########
@@ -44,33 +49,126 @@
 public class TimelineHashMapBenchmark {
     private final static int NUM_ENTRIES = 1_000_000;
 
+    @State(Scope.Thread)
+    public static class HashMapInput {
+        public HashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            map = new HashMap<>(keys.size());
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class ImmutableMapInput {
+        scala.collection.immutable.HashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            map = new scala.collection.immutable.HashMap<>();
+            for (Integer key : keys) {
+                map = map.updated(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class TimelineMapInput {
+        public SnapshotRegistry snapshotRegistry;
+        public TimelineHashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            snapshotRegistry = new SnapshotRegistry(new LogContext());
+            map = new TimelineHashMap<>(snapshotRegistry, keys.size());
+
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class TimelineMapSnapshotInput {
+        public SnapshotRegistry snapshotRegistry;
+        public TimelineHashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            snapshotRegistry = new SnapshotRegistry(new LogContext());
+            map = new TimelineHashMap<>(snapshotRegistry, keys.size());
+
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            int count = 0;
+            for (Integer key : keys) {
+                if (count % 1_000 == 0) {
+                    snapshotRegistry.deleteSnapshotsUpTo(count - 10_000);
+                    snapshotRegistry.createSnapshot(count);
+                }
+                map.put(key, String.valueOf(key));
+                count++;
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+
     @Benchmark
     public Map<Integer, String> testAddEntriesInHashMap() {
-        HashMap<Integer, String> map = new HashMap<>(NUM_ENTRIES);
+        HashMap<Integer, String> map = new HashMap<>();
         for (int i = 0; i < NUM_ENTRIES; i++) {
             int key = (int) (0xffffffff & ((i * 2862933555777941757L) + 3037000493L));
             map.put(key, String.valueOf(key));
         }
+
+        return map;
+    }
+
+    @Benchmark
+    public scala.collection.immutable.HashMap<Integer, String> testAddEntriesInImmutableMap() {
+        scala.collection.immutable.HashMap<Integer, String> map = new scala.collection.immutable.HashMap<>();
+        for (int i = 0; i < NUM_ENTRIES; i++) {
+            int key = (int) (0xffffffff & ((i * 2862933555777941757L) + 3037000493L));
+            map = map.updated(key, String.valueOf(key));
+        }
+
         return map;
     }
 
     @Benchmark
     public Map<Integer, String> testAddEntriesInTimelineMap() {
         SnapshotRegistry snapshotRegistry = new SnapshotRegistry(new LogContext());
-        TimelineHashMap<Integer, String> map =
-            new TimelineHashMap<>(snapshotRegistry, NUM_ENTRIES);
+        TimelineHashMap<Integer, String> map = new TimelineHashMap<>(snapshotRegistry, 16);
         for (int i = 0; i < NUM_ENTRIES; i++) {
             int key = (int) (0xffffffff & ((i * 2862933555777941757L) + 3037000493L));

Review comment:
       Done.







[GitHub] [kafka] ijuma commented on pull request #10324: MINOR: Add a few more benchmark for the timeline map

Posted by GitBox <gi...@apache.org>.
ijuma commented on pull request #10324:
URL: https://github.com/apache/kafka/pull/10324#issuecomment-800284467


   Thanks. It would help to explain the goal of these benchmarks so that we can better review the results. Is it to avoid future regressions or also to compare with existing map implementations?





[GitHub] [kafka] ijuma commented on a change in pull request #10324: MINOR: Add a few more benchmark for the timeline map

Posted by GitBox <gi...@apache.org>.
ijuma commented on a change in pull request #10324:
URL: https://github.com/apache/kafka/pull/10324#discussion_r595472914



##########
File path: jmh-benchmarks/src/main/java/org/apache/kafka/jmh/timeline/TimelineHashMapBenchmark.java
##########
@@ -44,33 +49,126 @@
 public class TimelineHashMapBenchmark {
     private final static int NUM_ENTRIES = 1_000_000;
 
+    @State(Scope.Thread)
+    public static class HashMapInput {
+        public HashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            map = new HashMap<>(keys.size());
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class ImmutableMapInput {
+        scala.collection.immutable.HashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            map = new scala.collection.immutable.HashMap<>();
+            for (Integer key : keys) {
+                map = map.updated(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class TimelineMapInput {
+        public SnapshotRegistry snapshotRegistry;
+        public TimelineHashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            snapshotRegistry = new SnapshotRegistry(new LogContext());
+            map = new TimelineHashMap<>(snapshotRegistry, keys.size());
+
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class TimelineMapSnapshotInput {
+        public SnapshotRegistry snapshotRegistry;
+        public TimelineHashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            snapshotRegistry = new SnapshotRegistry(new LogContext());
+            map = new TimelineHashMap<>(snapshotRegistry, keys.size());
+
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            int count = 0;
+            for (Integer key : keys) {
+                if (count % 1_000 == 0) {
+                    snapshotRegistry.deleteSnapshotsUpTo(count - 10_000);
+                    snapshotRegistry.createSnapshot(count);
+                }
+                map.put(key, String.valueOf(key));
+                count++;
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+
     @Benchmark
     public Map<Integer, String> testAddEntriesInHashMap() {
-        HashMap<Integer, String> map = new HashMap<>(NUM_ENTRIES);
+        HashMap<Integer, String> map = new HashMap<>();
         for (int i = 0; i < NUM_ENTRIES; i++) {
             int key = (int) (0xffffffff & ((i * 2862933555777941757L) + 3037000493L));
             map.put(key, String.valueOf(key));
         }
+
+        return map;
+    }
+
+    @Benchmark
+    public scala.collection.immutable.HashMap<Integer, String> testAddEntriesInImmutableMap() {
+        scala.collection.immutable.HashMap<Integer, String> map = new scala.collection.immutable.HashMap<>();
+        for (int i = 0; i < NUM_ENTRIES; i++) {
+            int key = (int) (0xffffffff & ((i * 2862933555777941757L) + 3037000493L));
+            map = map.updated(key, String.valueOf(key));
+        }
+
         return map;
     }
 
     @Benchmark
     public Map<Integer, String> testAddEntriesInTimelineMap() {
         SnapshotRegistry snapshotRegistry = new SnapshotRegistry(new LogContext());
-        TimelineHashMap<Integer, String> map =
-            new TimelineHashMap<>(snapshotRegistry, NUM_ENTRIES);
+        TimelineHashMap<Integer, String> map = new TimelineHashMap<>(snapshotRegistry, 16);
         for (int i = 0; i < NUM_ENTRIES; i++) {
             int key = (int) (0xffffffff & ((i * 2862933555777941757L) + 3037000493L));

Review comment:
       Hmm, I'd just generate the randoms during set-up and add them to an array.
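   
   For reference, a sketch of that suggestion as a fragment of the benchmark class (the `RandomKeys` name and the fixed-seed `java.util.Random` are illustrative assumptions): the keys are generated once per trial in `@Setup`, and a benchmark method would then take `RandomKeys input` and iterate `input.keys` directly instead of computing keys inside the measured loop.
   
   ```
   @State(Scope.Thread)
   public static class RandomKeys {
       public int[] keys;
   
       @Setup(Level.Trial)
       public void setup() {
           Random random = new Random(42);   // fixed seed keeps runs comparable across implementations
           keys = new int[NUM_ENTRIES];
           for (int i = 0; i < keys.length; i++) {
               keys[i] = random.nextInt();   // random generation paid once, outside the measured code
           }
       }
   }
   ```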







[GitHub] [kafka] jsancio commented on pull request #10324: MINOR: Add a few more benchmark for the timeline map

Posted by GitBox <gi...@apache.org>.
jsancio commented on pull request #10324:
URL: https://github.com/apache/kafka/pull/10324#issuecomment-800747908


   @cmccabe the `testGetEntries` benchmark throws an exception when used against a snapshot:
   ```
   java.lang.IndexOutOfBoundsException: Index 0 out of bounds for length 0
           at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
           at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
           at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:248)
           at java.base/java.util.Objects.checkIndex(Objects.java:372)
           at java.base/java.util.ArrayList.get(ArrayList.java:459)
           at org.apache.kafka.jmh.timeline.TimelineHashMapBenchmark.testGetEntries(TimelineHashMapBenchmark.java:221)
           at org.apache.kafka.jmh.timeline.jmh_generated.TimelineHashMapBenchmark_testGetEntries_jmhTest.testGetEntries_avgt_jmhStub(TimelineHashMapBenchmark_testGetEntries_jmhTest.java:246)
           at org.apache.kafka.jmh.timeline.jmh_generated.TimelineHashMapBenchmark_testGetEntries_jmhTest.testGetEntries_AverageTime(TimelineHashMapBenchmark_testGetEntries_jmhTest.java:183)
           at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
           at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
           at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.base/java.lang.reflect.Method.invoke(Method.java:566)
           at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:453)
           at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:437)
           at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
           at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
           at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
           at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
           at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
           at java.base/java.lang.Thread.run(Thread.java:834)
   ```





[GitHub] [kafka] ijuma commented on a change in pull request #10324: MINOR: Add a few more benchmark for the timeline map

Posted by GitBox <gi...@apache.org>.
ijuma commented on a change in pull request #10324:
URL: https://github.com/apache/kafka/pull/10324#discussion_r595396880



##########
File path: jmh-benchmarks/src/main/java/org/apache/kafka/jmh/timeline/TimelineHashMapBenchmark.java
##########
@@ -44,33 +49,126 @@
 public class TimelineHashMapBenchmark {
     private final static int NUM_ENTRIES = 1_000_000;
 
+    @State(Scope.Thread)
+    public static class HashMapInput {
+        public HashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            map = new HashMap<>(keys.size());
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class ImmutableMapInput {
+        scala.collection.immutable.HashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            map = new scala.collection.immutable.HashMap<>();
+            for (Integer key : keys) {
+                map = map.updated(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class TimelineMapInput {
+        public SnapshotRegistry snapshotRegistry;
+        public TimelineHashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            snapshotRegistry = new SnapshotRegistry(new LogContext());
+            map = new TimelineHashMap<>(snapshotRegistry, keys.size());
+
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class TimelineMapSnapshotInput {
+        public SnapshotRegistry snapshotRegistry;
+        public TimelineHashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            snapshotRegistry = new SnapshotRegistry(new LogContext());
+            map = new TimelineHashMap<>(snapshotRegistry, keys.size());
+
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            int count = 0;
+            for (Integer key : keys) {
+                if (count % 1_000 == 0) {
+                    snapshotRegistry.deleteSnapshotsUpTo(count - 10_000);
+                    snapshotRegistry.createSnapshot(count);
+                }
+                map.put(key, String.valueOf(key));
+                count++;
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+
     @Benchmark
     public Map<Integer, String> testAddEntriesInHashMap() {
-        HashMap<Integer, String> map = new HashMap<>(NUM_ENTRIES);
+        HashMap<Integer, String> map = new HashMap<>();
         for (int i = 0; i < NUM_ENTRIES; i++) {
             int key = (int) (0xffffffff & ((i * 2862933555777941757L) + 3037000493L));
             map.put(key, String.valueOf(key));
         }
+
+        return map;
+    }
+
+    @Benchmark
+    public scala.collection.immutable.HashMap<Integer, String> testAddEntriesInImmutableMap() {
+        scala.collection.immutable.HashMap<Integer, String> map = new scala.collection.immutable.HashMap<>();
+        for (int i = 0; i < NUM_ENTRIES; i++) {
+            int key = (int) (0xffffffff & ((i * 2862933555777941757L) + 3037000493L));
+            map = map.updated(key, String.valueOf(key));

Review comment:
       We don't want to be converting from int to string in the benchmark code.







[GitHub] [kafka] jsancio commented on pull request #10324: MINOR: Add a few more benchmark for the timeline map

Posted by GitBox <gi...@apache.org>.
jsancio commented on pull request #10324:
URL: https://github.com/apache/kafka/pull/10324#issuecomment-800429068


   > Thanks. It would help to explain the goal of these benchmarks so that we can better review the results. Is it to avoid future regressions or also to compare with existing map implementations?
   
   Both.
   1. I suspect that we are going to make improvements to TimelineHashMap. It would be good to track those improvements.
   2. I was interested to see how Scala's immutable hash map compared to these maps.
   3. I wanted to better understand the performance of some of the other operations: remove, delete and iteration.
   
   I decided to upstream it as it may be useful in the future. What do you think @ijuma?





[GitHub] [kafka] jsancio commented on a change in pull request #10324: MINOR: Add a few more benchmark for the timeline map

Posted by GitBox <gi...@apache.org>.
jsancio commented on a change in pull request #10324:
URL: https://github.com/apache/kafka/pull/10324#discussion_r594806269



##########
File path: jmh-benchmarks/src/main/java/org/apache/kafka/jmh/timeline/TimelineHashMapBenchmark.java
##########
@@ -87,4 +189,129 @@
         }
         return map;
     }
+
+    @Benchmark
+    public Map<Integer, String> testUpdateEntriesInHashMap(HashMapInput input) {
+        for (Integer key : input.keys) {
+            input.map.put(key, String.valueOf(key));
+        }
+        return input.map;
+    }
+
+    @Benchmark
+    public scala.collection.Map testUpdateEntriesInImmutableMap(ImmutableMapInput input) {
+        scala.collection.immutable.HashMap<Integer, String> map = input.map;
+        for (Integer key : input.keys) {
+            map = map.updated(key, String.valueOf(key));
+        }
+        return map;
+    }
+
+    @Benchmark
+    public Map<Integer, String> testUpdateEntriesInTimelineMap(TimelineMapInput input) {
+        for (Integer key : input.keys) {
+            input.map.put(key, String.valueOf(key));
+        }
+        return input.map;
+    }
+
+    @Benchmark
+    public Map<Integer, String> testUpdateEntriesWithSnapshots(TimelineMapInput input) {
+        long epoch = 0;
+        int j = 0;
+        for (Integer key : input.keys) {
+            if (j > 1_000) {
+                input.snapshotRegistry.deleteSnapshotsUpTo(epoch - 10_000);
+                input.snapshotRegistry.createSnapshot(epoch);
+                j = 0;
+            } else {
+                j++;
+            }
+            input.map.put(key, String.valueOf(key));
+            epoch++;
+        }
+        return input.map;
+    }
+
+    @Benchmark
+    public Map<Integer, String> testRemoveEntriesInHashMap(HashMapInput input) {
+        for (Integer key : input.keys) {
+            input.map.remove(key);
+        }
+        return input.map;
+    }
+
+    @Benchmark
+    public scala.collection.Map testRemoveEntriesInImmutableMap(ImmutableMapInput input) {
+        scala.collection.immutable.HashMap<Integer, String> map = input.map;
+        for (Integer key : input.keys) {
+            map = map.removed(key);
+        }
+        return map;
+    }
+
+    @Benchmark
+    public Map<Integer, String> testRemoveEntriesInTimelineMap(TimelineMapInput input) {
+        for (Integer key : input.keys) {
+            input.map.remove(key);
+        }
+        return input.map;
+    }
+
+    @Benchmark
+    public Map<Integer, String> testRemoveEntriesWithSnapshots(TimelineMapInput input) {
+        long epoch = 0;
+        int j = 0;
+        for (Integer key : input.keys) {
+            if (j > 1_000) {
+                input.snapshotRegistry.deleteSnapshotsUpTo(epoch - 10_000);
+                input.snapshotRegistry.createSnapshot(epoch);
+                j = 0;
+            } else {
+                j++;
+            }
+            input.map.remove(key, String.valueOf(key));
+            epoch++;
+        }
+        return input.map;
+    }
+
+    @Benchmark
+    public int testIterateEntriesInHashMap(HashMapInput input) {
+        int count = 0;
+        for (HashMap.Entry<Integer, String> entry : input.map.entrySet()) {
+            count++;
+        }
+        return count;
+    }
+
+    @Benchmark
+    public int testIterateEntriesInImmutableMap(ImmutableMapInput input) {
+        int count = 0;
+        scala.collection.Iterator<scala.Tuple2<Integer, String>> iterator = input.map.iterator();
+        while (iterator.hasNext()) {
+            iterator.next();
+            count++;
+        }
+        return count;
+    }
+
+    @Benchmark
+    public int testIterateEntriesWithSnapshots(TimelineMapSnapshotInput input) {
+        int count = 0;
+        for (TimelineHashMap.Entry<Integer, String> entry : input.map.entrySet(input.epoch)) {

Review comment:
       @cmccabe It looks like this benchmark fails with the following exception. Any idea what the issue is?
   ```
   java.lang.ArrayIndexOutOfBoundsException: Index 1024 out of bounds for length 1024
           at org.apache.kafka.timeline.BaseHashTable.unpackSlot(BaseHashTable.java:210)
           at org.apache.kafka.timeline.SnapshottableHashTable$HistoricalIterator.hasNext(SnapshottableHashTable.java:255)
           at org.apache.kafka.timeline.TimelineHashMap$EntryIterator.hasNext(TimelineHashMap.java:359)
           at org.apache.kafka.jmh.timeline.TimelineHashMapBenchmark.testIterateEntriesWithSnapshots(TimelineHashMapBenchmark.java:303)
           at org.apache.kafka.jmh.timeline.jmh_generated.TimelineHashMapBenchmark_testIterateEntriesWithSnapshots_jmhTest.testIterateEntriesWithSnapshots_avgt_jmhStub(TimelineHashMapBenchmark_testIterateEntriesWithSnapshots_jmhTest.java:204)
           at org.apache.kafka.jmh.timeline.jmh_generated.TimelineHashMapBenchmark_testIterateEntriesWithSnapshots_jmhTest.testIterateEntriesWithSnapshots_AverageTime(TimelineHashMapBenchmark_testIterateEntriesWithSnapshots_jmhTest.java:162)
           at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
           at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
           at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.base/java.lang.reflect.Method.invoke(Method.java:566)
           at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:453)
           at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:437)
           at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
           at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
           at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
           at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
           at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
           at java.base/java.lang.Thread.run(Thread.java:834)
   ```







[GitHub] [kafka] ijuma commented on pull request #10324: MINOR: Add a few more benchmark for the timeline map

Posted by GitBox <gi...@apache.org>.
ijuma commented on pull request #10324:
URL: https://github.com/apache/kafka/pull/10324#issuecomment-800538930


   Btw, you can find openjdk benchmarks here: https://github.com/openjdk/jdk/tree/master/test/micro/org/openjdk/bench/java/util





[GitHub] [kafka] ijuma commented on a change in pull request #10324: MINOR: Add a few more benchmark for the timeline map

Posted by GitBox <gi...@apache.org>.
ijuma commented on a change in pull request #10324:
URL: https://github.com/apache/kafka/pull/10324#discussion_r595397258



##########
File path: jmh-benchmarks/src/main/java/org/apache/kafka/jmh/timeline/TimelineHashMapBenchmark.java
##########
@@ -44,33 +49,126 @@
 public class TimelineHashMapBenchmark {
     private final static int NUM_ENTRIES = 1_000_000;
 
+    @State(Scope.Thread)
+    public static class HashMapInput {
+        public HashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            map = new HashMap<>(keys.size());
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class ImmutableMapInput {
+        scala.collection.immutable.HashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            map = new scala.collection.immutable.HashMap<>();
+            for (Integer key : keys) {
+                map = map.updated(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class TimelineMapInput {
+        public SnapshotRegistry snapshotRegistry;
+        public TimelineHashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            snapshotRegistry = new SnapshotRegistry(new LogContext());
+            map = new TimelineHashMap<>(snapshotRegistry, keys.size());
+
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class TimelineMapSnapshotInput {
+        public SnapshotRegistry snapshotRegistry;
+        public TimelineHashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            snapshotRegistry = new SnapshotRegistry(new LogContext());
+            map = new TimelineHashMap<>(snapshotRegistry, keys.size());
+
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            int count = 0;
+            for (Integer key : keys) {
+                if (count % 1_000 == 0) {
+                    snapshotRegistry.deleteSnapshotsUpTo(count - 10_000);
+                    snapshotRegistry.createSnapshot(count);
+                }
+                map.put(key, String.valueOf(key));
+                count++;
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+
     @Benchmark
     public Map<Integer, String> testAddEntriesInHashMap() {
-        HashMap<Integer, String> map = new HashMap<>(NUM_ENTRIES);
+        HashMap<Integer, String> map = new HashMap<>();
         for (int i = 0; i < NUM_ENTRIES; i++) {
             int key = (int) (0xffffffff & ((i * 2862933555777941757L) + 3037000493L));
             map.put(key, String.valueOf(key));
         }
+
+        return map;
+    }
+
+    @Benchmark
+    public scala.collection.immutable.HashMap<Integer, String> testAddEntriesInImmutableMap() {
+        scala.collection.immutable.HashMap<Integer, String> map = new scala.collection.immutable.HashMap<>();
+        for (int i = 0; i < NUM_ENTRIES; i++) {
+            int key = (int) (0xffffffff & ((i * 2862933555777941757L) + 3037000493L));
+            map = map.updated(key, String.valueOf(key));
+        }
+
         return map;
     }
 
     @Benchmark
     public Map<Integer, String> testAddEntriesInTimelineMap() {
         SnapshotRegistry snapshotRegistry = new SnapshotRegistry(new LogContext());
-        TimelineHashMap<Integer, String> map =
-            new TimelineHashMap<>(snapshotRegistry, NUM_ENTRIES);
+        TimelineHashMap<Integer, String> map = new TimelineHashMap<>(snapshotRegistry, 16);
         for (int i = 0; i < NUM_ENTRIES; i++) {
             int key = (int) (0xffffffff & ((i * 2862933555777941757L) + 3037000493L));

Review comment:
       Why are we doing these things?

##########
File path: jmh-benchmarks/src/main/java/org/apache/kafka/jmh/timeline/TimelineHashMapBenchmark.java
##########
@@ -44,33 +49,126 @@
 public class TimelineHashMapBenchmark {
     private final static int NUM_ENTRIES = 1_000_000;
 
+    @State(Scope.Thread)
+    public static class HashMapInput {
+        public HashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            map = new HashMap<>(keys.size());
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class ImmutableMapInput {
+        scala.collection.immutable.HashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            map = new scala.collection.immutable.HashMap<>();
+            for (Integer key : keys) {
+                map = map.updated(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class TimelineMapInput {
+        public SnapshotRegistry snapshotRegistry;
+        public TimelineHashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            snapshotRegistry = new SnapshotRegistry(new LogContext());
+            map = new TimelineHashMap<>(snapshotRegistry, keys.size());
+
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class TimelineMapSnapshotInput {
+        public SnapshotRegistry snapshotRegistry;
+        public TimelineHashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            snapshotRegistry = new SnapshotRegistry(new LogContext());
+            map = new TimelineHashMap<>(snapshotRegistry, keys.size());
+
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            int count = 0;
+            for (Integer key : keys) {
+                if (count % 1_000 == 0) {
+                    snapshotRegistry.deleteSnapshotsUpTo(count - 10_000);
+                    snapshotRegistry.createSnapshot(count);
+                }
+                map.put(key, String.valueOf(key));
+                count++;
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+
     @Benchmark
     public Map<Integer, String> testAddEntriesInHashMap() {
-        HashMap<Integer, String> map = new HashMap<>(NUM_ENTRIES);
+        HashMap<Integer, String> map = new HashMap<>();
         for (int i = 0; i < NUM_ENTRIES; i++) {
             int key = (int) (0xffffffff & ((i * 2862933555777941757L) + 3037000493L));
             map.put(key, String.valueOf(key));
         }
+
+        return map;
+    }
+
+    @Benchmark
+    public scala.collection.immutable.HashMap<Integer, String> testAddEntriesInImmutableMap() {

Review comment:
       Can we make it clear that this is ScalaImmutableMap?







[GitHub] [kafka] jsancio commented on pull request #10324: MINOR: Add a few more benchmark for the timeline map

Posted by GitBox <gi...@apache.org>.
jsancio commented on pull request #10324:
URL: https://github.com/apache/kafka/pull/10324#issuecomment-800747379


   @ijuma Ready for review. I changed the benchmark structure to remove duplicate code.
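   
   As a rough illustration of that restructuring (a sketch only; the enum values, names, and method body below are assumptions, not the PR's exact code), the duplication can be collapsed into one parameterized benchmark driven by JMH `@Param`, which is consistent with the `(mapType)` and `(size)` columns in the results posted later in this thread:
   
   ```
   public enum MapType { HASH_MAP, SCALA_HASH_MAP, TIMELINE_MAP }
   
   @State(Scope.Thread)
   public static class Input {
       @Param({"HASH_MAP", "SCALA_HASH_MAP", "TIMELINE_MAP"})
       public MapType mapType;
   
       @Param({"1000000"})
       public int size;
   
       public List<Integer> keys;
   
       @Setup(Level.Trial)
       public void setup() {
           keys = createKeys(size);
       }
   }
   
   @Benchmark
   public Object testAddEntries(Input input) {
       switch (input.mapType) {
           case HASH_MAP: {
               Map<Integer, String> map = new HashMap<>(input.size);
               for (Integer key : input.keys) {
                   map.put(key, String.valueOf(key));
               }
               return map;
           }
           // the SCALA_HASH_MAP and TIMELINE_MAP cases would follow the same pattern
           default:
               throw new UnsupportedOperationException("not sketched: " + input.mapType);
       }
   }
   ```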





[GitHub] [kafka] jsancio commented on a change in pull request #10324: MINOR: Add a few more benchmark for the timeline map

Posted by GitBox <gi...@apache.org>.
jsancio commented on a change in pull request #10324:
URL: https://github.com/apache/kafka/pull/10324#discussion_r595464237



##########
File path: jmh-benchmarks/src/main/java/org/apache/kafka/jmh/timeline/TimelineHashMapBenchmark.java
##########
@@ -44,33 +49,126 @@
 public class TimelineHashMapBenchmark {
     private final static int NUM_ENTRIES = 1_000_000;
 
+    @State(Scope.Thread)
+    public static class HashMapInput {
+        public HashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            map = new HashMap<>(keys.size());
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class ImmutableMapInput {
+        scala.collection.immutable.HashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            map = new scala.collection.immutable.HashMap<>();
+            for (Integer key : keys) {
+                map = map.updated(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class TimelineMapInput {
+        public SnapshotRegistry snapshotRegistry;
+        public TimelineHashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            snapshotRegistry = new SnapshotRegistry(new LogContext());
+            map = new TimelineHashMap<>(snapshotRegistry, keys.size());
+
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+    @State(Scope.Thread)
+    public static class TimelineMapSnapshotInput {
+        public SnapshotRegistry snapshotRegistry;
+        public TimelineHashMap<Integer, String> map;
+        public final List<Integer> keys = createKeys(NUM_ENTRIES);
+
+        @Setup(Level.Invocation)
+        public void setup() {
+            snapshotRegistry = new SnapshotRegistry(new LogContext());
+            map = new TimelineHashMap<>(snapshotRegistry, keys.size());
+
+            for (Integer key : keys) {
+                map.put(key, String.valueOf(key));
+            }
+
+            int count = 0;
+            for (Integer key : keys) {
+                if (count % 1_000 == 0) {
+                    snapshotRegistry.deleteSnapshotsUpTo(count - 10_000);
+                    snapshotRegistry.createSnapshot(count);
+                }
+                map.put(key, String.valueOf(key));
+                count++;
+            }
+
+            Collections.shuffle(keys);
+        }
+    }
+
+
     @Benchmark
     public Map<Integer, String> testAddEntriesInHashMap() {
-        HashMap<Integer, String> map = new HashMap<>(NUM_ENTRIES);
+        HashMap<Integer, String> map = new HashMap<>();
         for (int i = 0; i < NUM_ENTRIES; i++) {
             int key = (int) (0xffffffff & ((i * 2862933555777941757L) + 3037000493L));
             map.put(key, String.valueOf(key));
         }
+
+        return map;
+    }
+
+    @Benchmark
+    public scala.collection.immutable.HashMap<Integer, String> testAddEntriesInImmutableMap() {
+        scala.collection.immutable.HashMap<Integer, String> map = new scala.collection.immutable.HashMap<>();
+        for (int i = 0; i < NUM_ENTRIES; i++) {
+            int key = (int) (0xffffffff & ((i * 2862933555777941757L) + 3037000493L));
+            map = map.updated(key, String.valueOf(key));
+        }
+
         return map;
     }
 
     @Benchmark
     public Map<Integer, String> testAddEntriesInTimelineMap() {
         SnapshotRegistry snapshotRegistry = new SnapshotRegistry(new LogContext());
-        TimelineHashMap<Integer, String> map =
-            new TimelineHashMap<>(snapshotRegistry, NUM_ENTRIES);
+        TimelineHashMap<Integer, String> map = new TimelineHashMap<>(snapshotRegistry, 16);
         for (int i = 0; i < NUM_ENTRIES; i++) {
             int key = (int) (0xffffffff & ((i * 2862933555777941757L) + 3037000493L));

Review comment:
       I think this is an algorithm for generating pseudo-random numbers. I think it relates to https://nuclear.llnl.gov/CNP/rng/rngman/node4.html.
   
   If this is true, let me fix the expression, as it is supposed to multiply by `key`, not `i`.
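   
   For what it's worth, a sketch of the corrected recurrence as a hypothetical `createKeys` helper (the PR's actual helper may differ): the 64-bit LCG constants are applied to the running value rather than to the loop index `i`, so successive keys actually form a pseudo-random sequence.
   
   ```
   private static List<Integer> createKeys(int numEntries) {
       List<Integer> keys = new ArrayList<>(numEntries);
       long key = 0;
       for (int i = 0; i < numEntries; i++) {
           // LCG step: multiply the previous value, not the loop counter i
           key = key * 2862933555777941757L + 3037000493L;
           keys.add((int) key);
       }
       return keys;
   }
   ```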







[GitHub] [kafka] jsancio commented on pull request #10324: MINOR: Add a few more benchmark for the timeline map

Posted by GitBox <gi...@apache.org>.
jsancio commented on pull request #10324:
URL: https://github.com/apache/kafka/pull/10324#issuecomment-800745760


   Test results after fixing the benchmarks:
   ```
   Benchmark                                                (mapType)   (size)  Mode  Cnt    Score    Error  Units
   TimelineHashMapBenchmark.testAddEntries                   HASH_MAP  1000000  avgt   10  184.183 ± 12.318  ms/op
   TimelineHashMapBenchmark.testAddEntries             SCALA_HASH_MAP  1000000  avgt   10  350.935 ±  4.801  ms/op
   TimelineHashMapBenchmark.testAddEntries               TIMELINE_MAP  1000000  avgt   10  340.839 ± 15.397  ms/op
   TimelineHashMapBenchmark.testAddEntries      TIMELINE_SNAPSHOT_MAP  1000000  avgt   10  332.535 ± 36.350  ms/op
   TimelineHashMapBenchmark.testGetEntries                   HASH_MAP  1000000  avgt   10   37.772 ±  4.717  ms/op
   TimelineHashMapBenchmark.testGetEntries             SCALA_HASH_MAP  1000000  avgt   10  248.350 ±  4.445  ms/op
   TimelineHashMapBenchmark.testGetEntries               TIMELINE_MAP  1000000  avgt   10   83.487 ±  6.952  ms/op
   TimelineHashMapBenchmark.testIterateEntries               HASH_MAP  1000000  avgt   10   42.743 ±  1.184  ms/op
   TimelineHashMapBenchmark.testIterateEntries         SCALA_HASH_MAP  1000000  avgt   10   36.030 ±  0.937  ms/op
   TimelineHashMapBenchmark.testIterateEntries           TIMELINE_MAP  1000000  avgt   10   54.760 ±  2.866  ms/op
   TimelineHashMapBenchmark.testRemoveEntries                HASH_MAP  1000000  avgt   10   26.246 ±  1.141  ms/op
   TimelineHashMapBenchmark.testRemoveEntries          SCALA_HASH_MAP  1000000  avgt   10  430.861 ± 13.864  ms/op
   TimelineHashMapBenchmark.testRemoveEntries            TIMELINE_MAP  1000000  avgt   10   79.832 ± 12.833  ms/op
   TimelineHashMapBenchmark.testRemoveEntries   TIMELINE_SNAPSHOT_MAP  1000000  avgt   10  185.170 ± 13.464  ms/op
   TimelineHashMapBenchmark.testUpdateEntries                HASH_MAP  1000000  avgt   10   84.963 ± 10.411  ms/op
   TimelineHashMapBenchmark.testUpdateEntries          SCALA_HASH_MAP  1000000  avgt   10  426.490 ±  6.468  ms/op
   TimelineHashMapBenchmark.testUpdateEntries            TIMELINE_MAP  1000000  avgt   10  160.341 ± 13.799  ms/op
   TimelineHashMapBenchmark.testUpdateEntries   TIMELINE_SNAPSHOT_MAP  1000000  avgt   10  300.875 ± 35.965  ms/op
   JMH benchmarks done
   ```

