Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/02/04 02:01:23 UTC

[GitHub] [druid] gianm opened a new pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

gianm opened a new pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308
 
 
   With some key differences to improve speed and design simplicity:
   
   1) Uses Memory rather than ByteBuffer for its backing storage.
   2) Uses faster hashing and comparison routines (see HashTableUtils).
   3) Capacity is always a power of two, allowing simpler design and more
      efficient implementation of findBucket.
   4) Does not implement growability; instead, leaves that to its callers.
      The idea is this removes the need for subclasses, while still giving
      callers flexibility in how to handle table-full scenarios.
   
   Together, these techniques can boost the performance of per-segment
   groupBy processing on realistic queries by 30%+ (in some cases I tested,
   the improvement was even larger: 2–4x when grouping by a single long
   dimension).
   
   The idea is that in time, users of ByteBufferHashTable will be migrated
   to this implementation.
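   
   For illustration only, here is a minimal, self-contained sketch (not code from
   this PR; the class name, key values, and array-based storage are made up) of why
   a power-of-two capacity simplifies findBucket: "hash % numBuckets" becomes
   "hash & (numBuckets - 1)", and linear probing wraps around with the same mask.
   
   ```java
   // Sketch of power-of-two masking and linear probing; illustrative names only.
   public class PowerOfTwoProbeSketch
   {
     public static void main(String[] args)
     {
       final int numBuckets = 16;              // must be a power of two
       final int bucketMask = numBuckets - 1;  // clips any hash to [0, numBuckets)
       final boolean[] used = new boolean[numBuckets];
       final int[] keys = new int[numBuckets];
   
       // Insert a few keys that all collide at bucket 3 once masked.
       for (int key : new int[]{3, 19, 35}) {
         int bucket = findBucket(key, used, keys, bucketMask);
         if (bucket < 0) {                     // negative means "empty slot at -(bucket + 1)"
           int insertAt = -(bucket + 1);
           used[insertAt] = true;
           keys[insertAt] = key;
         }
       }
   
       System.out.println("bucket for 35: " + findBucket(35, used, keys, bucketMask));
     }
   
     // Mirrors the findBucket contract described in this PR: returns the bucket if the
     // key is present, or -bucket - 1 pointing at the empty bucket where it would go.
     static int findBucket(int key, boolean[] used, int[] keys, int bucketMask)
     {
       int bucket = Integer.hashCode(key) & bucketMask;
       while (true) {
         if (!used[bucket]) {
           return -bucket - 1;
         }
         if (keys[bucket] == key) {
           return bucket;
         }
         bucket = (bucket + 1) & bucketMask;   // wrap-around via the mask, no modulo
       }
     }
   }
   ```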

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
gianm commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#discussion_r374975563
 
 

 ##########
 File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/collection/MemoryOpenHashTable.java
 ##########
 @@ -0,0 +1,433 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.collection;
+
+import it.unimi.dsi.fastutil.ints.IntIterator;
+import org.apache.datasketches.memory.Memory;
+import org.apache.datasketches.memory.WritableMemory;
+import org.apache.druid.java.util.common.ISE;
+import org.apache.druid.query.groupby.epinephelinae.Groupers;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.util.NoSuchElementException;
+
+/**
+ * An open-addressed hash table with linear probing backed by {@link WritableMemory}. Does not offer a similar
+ * interface to {@link java.util.Map} because this is meant to be useful to lower-level, high-performance callers.
+ * There is no copying or serde of keys and values: callers access the backing memory of the table directly.
+ *
+ * This table will not grow itself. Callers must handle growing if required; the {@link #copyTo} method is provided
+ * to assist.
+ */
+public class MemoryOpenHashTable
+{
+  private static final byte USED_BYTE = 1;
+  private static final int USED_BYTE_SIZE = Byte.BYTES;
+
+  private final WritableMemory tableMemory;
+  private final int keySize;
+  private final int valueSize;
+  private final int bucketSize;
+
+  // Maximum number of elements in the table (based on numBuckets and maxLoadFactor).
+  private final int maxSize;
+
+  // Number of available/used buckets in the table. Always a power of two.
+  private final int numBuckets;
+
+  // Mask that clips a number to [0, numBuckets). Used when searching through buckets.
+  private final int bucketMask;
+
+  // Number of elements in the table right now.
+  private int size;
+
+  /**
+   * Create a new table.
+   *
+   * @param tableMemory backing memory for the table; must be exactly large enough to hold "numBuckets"
+   * @param numBuckets  number of buckets for the table
+   * @param maxSize     maximum number of elements for the table; must be less than numBuckets
+   * @param keySize     key size in bytes
+   * @param valueSize   value size in bytes
+   */
+  public MemoryOpenHashTable(
+      final WritableMemory tableMemory,
+      final int numBuckets,
+      final int maxSize,
+      final int keySize,
+      final int valueSize
+  )
+  {
+    this.tableMemory = tableMemory;
+    this.numBuckets = numBuckets;
+    this.bucketMask = numBuckets - 1;
+    this.maxSize = maxSize;
+    this.keySize = keySize;
+    this.valueSize = valueSize;
+    this.bucketSize = bucketSize(keySize, valueSize);
+
+    // Our main user today (HashVectorGrouper) needs the tableMemory to be backed by a big-endian ByteBuffer that is
+    // coterminous with the tableMemory, since it's going to feed that buffer into VectorAggregators instead of
+    // interacting with our WritableMemory directly. Nothing about this class actually requires that the Memory be
+    // backed by a ByteBuffer, but we'll check it here anyway for the benefit of our biggest customer.
+    verifyMemoryIsByteBuffer(tableMemory);
+
+    if (!tableMemory.getTypeByteOrder().equals(ByteOrder.nativeOrder())) {
+      throw new ISE("tableMemory must be native byte order");
+    }
+
+    if (tableMemory.getCapacity() != memoryNeeded(numBuckets, bucketSize)) {
+      throw new ISE(
+          "tableMemory must be size[%,d] but was[%,d]",
+          memoryNeeded(numBuckets, bucketSize),
+          tableMemory.getCapacity()
+      );
+    }
+
+    if (maxSize >= numBuckets) {
+      throw new ISE("maxSize must be less than numBuckets");
+    }
+
+    if (Integer.bitCount(numBuckets) != 1) {
+      throw new ISE("numBuckets must be a power of two but was[%,d]", numBuckets);
+    }
+
+    clear();
+  }
+
+  /**
+   * Returns the amount of memory needed for a table.
+   *
+   * This is just a multiplication, which is easy enough to do on your own, but sometimes it's nice for clarity's sake
+   * to call a function with a name that indicates why the multiplication is happening.
+   *
+   * @param numBuckets number of buckets
+   * @param bucketSize size per bucket (in bytes)
+   *
+   * @return size of table (in bytes)
+   */
+  public static int memoryNeeded(final int numBuckets, final int bucketSize)
+  {
+    return numBuckets * bucketSize;
+  }
+
+  /**
+   * Returns the size of each bucket in a table.
+   *
+   * @param keySize   size of keys (in bytes)
+   * @param valueSize size of values (in bytes)
+   *
+   * @return size of buckets (in bytes)
+   */
+  public static int bucketSize(final int keySize, final int valueSize)
+  {
+    return USED_BYTE_SIZE + keySize + valueSize;
 
 Review comment:
   Ah, we could, if we could introspect into the key. But we can't. It might be a nice improvement for a future version of this code. (The specific API might be that you specify keySize in _bits_ not _bytes_ and the table can re-use any unused bits.)



[GitHub] [druid] jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#discussion_r374951659
 
 

 ##########
 File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/collection/MemoryOpenHashTable.java
 ##########
 @@ -0,0 +1,433 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.collection;
+
+import it.unimi.dsi.fastutil.ints.IntIterator;
+import org.apache.datasketches.memory.Memory;
+import org.apache.datasketches.memory.WritableMemory;
+import org.apache.druid.java.util.common.ISE;
+import org.apache.druid.query.groupby.epinephelinae.Groupers;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.util.NoSuchElementException;
+
+/**
+ * An open-addressed hash table with linear probing backed by {@link WritableMemory}. Does not offer a similar
+ * interface to {@link java.util.Map} because this is meant to be useful to lower-level, high-performance callers.
+ * There is no copying or serde of keys and values: callers access the backing memory of the table directly.
+ *
+ * This table will not grow itself. Callers must handle growing if required; the {@link #copyTo} method is provided
+ * to assist.
+ */
+public class MemoryOpenHashTable
+{
+  private static final byte USED_BYTE = 1;
+  private static final int USED_BYTE_SIZE = Byte.BYTES;
+
+  private final WritableMemory tableMemory;
+  private final int keySize;
+  private final int valueSize;
+  private final int bucketSize;
+
+  // Maximum number of elements in the table (based on numBuckets and maxLoadFactor).
+  private final int maxSize;
+
+  // Number of available/used buckets in the table. Always a power of two.
+  private final int numBuckets;
+
+  // Mask that clips a number to [0, numBuckets). Used when searching through buckets.
+  private final int bucketMask;
+
+  // Number of elements in the table right now.
+  private int size;
+
+  /**
+   * Create a new table.
+   *
+   * @param tableMemory backing memory for the table; must be exactly large enough to hold "numBuckets"
+   * @param numBuckets  number of buckets for the table
+   * @param maxSize     maximum number of elements for the table; must be less than numBuckets
+   * @param keySize     key size in bytes
+   * @param valueSize   value size in bytes
+   */
+  public MemoryOpenHashTable(
+      final WritableMemory tableMemory,
+      final int numBuckets,
+      final int maxSize,
+      final int keySize,
+      final int valueSize
+  )
+  {
+    this.tableMemory = tableMemory;
+    this.numBuckets = numBuckets;
+    this.bucketMask = numBuckets - 1;
+    this.maxSize = maxSize;
+    this.keySize = keySize;
+    this.valueSize = valueSize;
+    this.bucketSize = bucketSize(keySize, valueSize);
+
+    // Our main user today (HashVectorGrouper) needs the tableMemory to be backed by a big-endian ByteBuffer that is
+    // coterminous with the tableMemory, since it's going to feed that buffer into VectorAggregators instead of
+    // interacting with our WritableMemory directly. Nothing about this class actually requires that the Memory be
+    // backed by a ByteBuffer, but we'll check it here anyway for the benefit of our biggest customer.
+    verifyMemoryIsByteBuffer(tableMemory);
+
+    if (!tableMemory.getTypeByteOrder().equals(ByteOrder.nativeOrder())) {
+      throw new ISE("tableMemory must be native byte order");
+    }
+
+    if (tableMemory.getCapacity() != memoryNeeded(numBuckets, bucketSize)) {
+      throw new ISE(
+          "tableMemory must be size[%,d] but was[%,d]",
+          memoryNeeded(numBuckets, bucketSize),
+          tableMemory.getCapacity()
+      );
+    }
+
+    if (maxSize >= numBuckets) {
+      throw new ISE("maxSize must be less than numBuckets");
+    }
+
+    if (Integer.bitCount(numBuckets) != 1) {
+      throw new ISE("numBuckets must be a power of two but was[%,d]", numBuckets);
+    }
+
+    clear();
+  }
+
+  /**
+   * Returns the amount of memory needed for a table.
+   *
+   * This is just a multiplication, which is easy enough to do on your own, but sometimes it's nice for clarity's sake
+   * to call a function with a name that indicates why the multiplication is happening.
+   *
+   * @param numBuckets number of buckets
+   * @param bucketSize size per bucket (in bytes)
+   *
+   * @return size of table (in bytes)
+   */
+  public static int memoryNeeded(final int numBuckets, final int bucketSize)
+  {
+    return numBuckets * bucketSize;
+  }
+
+  /**
+   * Returns the size of each bucket in a table.
+   *
+   * @param keySize   size of keys (in bytes)
+   * @param valueSize size of values (in bytes)
+   *
+   * @return size of buckets (in bytes)
+   */
+  public static int bucketSize(final int keySize, final int valueSize)
+  {
+    return USED_BYTE_SIZE + keySize + valueSize;
 
 Review comment:
   When the key is nullable, I'm wondering whether we can share the last byte as both the null flag and the used bucket flag to save memory space. For example, the used bucket flag and the null flag could be stored in the upper 4 bits and the lower 4 bits, respectively. This approach would need an extra branch to extract the key properly from the bucket, but I guess that would be OK?
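   
   A minimal sketch of the byte-sharing idea above (not part of this PR; the class
   name and exact bit layout are illustrative): pack the used flag into the upper
   4 bits and the null flag into the lower 4 bits of a single byte.
   
   ```java
   // Illustrative flag packing: one byte holds both the "bucket used" and "key is null" flags.
   final class BucketFlags
   {
     private static final int USED_MASK = 0xF0;  // upper 4 bits
     private static final int NULL_MASK = 0x0F;  // lower 4 bits
   
     static byte pack(boolean used, boolean nullKey)
     {
       return (byte) ((used ? 0x10 : 0) | (nullKey ? 0x01 : 0));
     }
   
     static boolean isUsed(byte flags)
     {
       return (flags & USED_MASK) != 0;
     }
   
     static boolean isNullKey(byte flags)
     {
       return (flags & NULL_MASK) != 0;
     }
   }
   ```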



[GitHub] [druid] ccaominh commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
ccaominh commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#issuecomment-582119737
 
 
   > > [WARNING] Unused declared dependencies found:
   > > [WARNING]    org.easymock:easymock:jar:4.0.2:compile
   > 
   > Hmm, this is a lie. It's required by SqlBenchmark:
   > 
   > ```
   > java.lang.NoClassDefFoundError: org/easymock/EasyMock
   > 	at org.apache.druid.sql.calcite.util.CalciteTests.createMockSystemSchema(CalciteTests.java:876)
   > 	at org.apache.druid.benchmark.query.SqlBenchmark.setup(SqlBenchmark.java:197)
   > 	at org.apache.druid.benchmark.query.generated.SqlBenchmark_querySql_jmhTest._jmh_tryInit_f_sqlbenchmark0_G(SqlBenchmark_querySql_jmhTest.java:448)
   > 	at org.apache.druid.benchmark.query.generated.SqlBenchmark_querySql_jmhTest.querySql_AverageTime(SqlBenchmark_querySql_jmhTest.java:162)
   > 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   > 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   > 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   > 	at java.lang.reflect.Method.invoke(Method.java:498)
   > 	at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:453)
   > 	at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:437)
   > 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   > ```
   > 
   > Maybe the analyzer is getting confused since EasyMock isn't being used _directly_. It's being used by `CalciteTests.createMockSystemSchema` from the druid-sql test jar. For some reason it doesn't get pulled in as a dependency unless explicitly declared in the benchmarks pom.
   > 
   > I'm not sure what to do about this...
   
   The dependency analysis plugin has configuration options for suppressing false positives:
   https://maven.apache.org/plugins/maven-dependency-plugin/examples/exclude-dependencies-from-dependency-analysis.html



[GitHub] [druid] gianm merged pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
gianm merged pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308
 
 
   



[GitHub] [druid] b-slim commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
b-slim commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#issuecomment-582104463
 
 
   @gianm I'm wondering where the perf increase is coming from. Is it the overhead of ByteBuffer bounds checks, or other factors? I'm also wondering whether it makes sense to wait for https://openjdk.java.net/jeps/370, which IMO has a cleaner API and will be part of the JDK, so there would be no low-level unsafe stuff.



[GitHub] [druid] jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#discussion_r374947772
 
 

 ##########
 File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/collection/HashTableUtils.java
 ##########
 @@ -0,0 +1,162 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.collection;
+
+import org.apache.datasketches.memory.Memory;
+
+public class HashTableUtils
+{
+  private HashTableUtils()
+  {
+    // No instantiation.
+  }
+
+  /**
+   * Computes the previous power of two less than or equal to a given "n".
+   *
+   * The integer should be between 1 (inclusive) and {@link Integer#MAX_VALUE} for best results. Other parameters will
+   * return {@link Integer#MIN_VALUE}.
+   */
+  public static int previousPowerOfTwo(final int n)
+  {
+    if (n > 0) {
+      return Integer.highestOneBit(n);
+    } else {
+      return Integer.MIN_VALUE;
+    }
+  }
+
+  /**
+   * Compute a simple, fast hash code of some memory range.
+   *
+   * @param memory   a region of memory
+   * @param position position within the memory region
+   * @param length   length of memory to hash, starting at the position
+   */
+  public static int hashMemory(final Memory memory, final long position, final int length)
+  {
+    // Special cases for small, common key sizes to speed them up: e.g. one int key, two int keys, one long key, etc.
+    // The plus-one sizes (9, 13) are for nullable dimensions. The specific choices of special cases were chosen based
+    // on benchmarking (see MemoryBenchmark) on a Skylake-based cloud instance.
+
+    switch (length) {
+      case 4:
+        return 31 + memory.getInt(position);
+
+      case 8:
+        return 31 * (31 + memory.getInt(position)) + memory.getInt(position + Integer.BYTES);
+
+      case 9:
+        return 31 * (31 * (31 + memory.getInt(position)) + memory.getInt(position + Integer.BYTES))
+               + memory.getByte(position + 2 * Integer.BYTES);
 
 Review comment:
   I guess the last byte is the flag indicating whether the key is null or not. If the key is null, should its hash value be the same no matter what the remaining 8 bytes are?
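   
   One way to see the concern concretely: a small self-contained sketch (following the
   guess above that the final byte is the null flag; the class name and values are made
   up, and it simply repeats the arithmetic of the quoted `case 9` branch). Two buckets
   whose null-flag byte is set still hash differently unless the 8 key bytes are written
   identically (for example, zeroed) whenever the key is null.
   
   ```java
   import org.apache.datasketches.memory.WritableMemory;
   
   public class NullKeyHashDemo
   {
     // Same arithmetic as the quoted "case 9" branch of hashMemory, with position = 0.
     static int hash9(WritableMemory mem)
     {
       return 31 * (31 * (31 + mem.getInt(0)) + mem.getInt(Integer.BYTES))
              + mem.getByte(2L * Integer.BYTES);
     }
   
     public static void main(String[] args)
     {
       WritableMemory a = WritableMemory.allocate(9);
       WritableMemory b = WritableMemory.allocate(9);
       a.putLong(0, 123L);       // leftover key bytes
       b.putLong(0, 456L);       // different leftover key bytes
       a.putByte(8, (byte) 1);   // null flag set
       b.putByte(8, (byte) 1);   // null flag set
       System.out.println(hash9(a) == hash9(b));  // false: hashes differ despite both keys being "null"
     }
   }
   ```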



[GitHub] [druid] gianm commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
gianm commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#discussion_r374974430
 
 

 ##########
 File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/collection/HashTableUtils.java
 ##########
 @@ -0,0 +1,162 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.collection;
+
+import org.apache.datasketches.memory.Memory;
+
+public class HashTableUtils
+{
+  private HashTableUtils()
+  {
+    // No instantiation.
+  }
+
+  /**
+   * Computes the previous power of two less than or equal to a given "n".
+   *
+   * The integer should be between 1 (inclusive) and {@link Integer#MAX_VALUE} for best results. Other parameters will
+   * return {@link Integer#MIN_VALUE}.
+   */
+  public static int previousPowerOfTwo(final int n)
+  {
+    if (n > 0) {
+      return Integer.highestOneBit(n);
+    } else {
+      return Integer.MIN_VALUE;
+    }
+  }
+
+  /**
+   * Compute a simple, fast hash code of some memory range.
+   *
+   * @param memory   a region of memory
+   * @param position position within the memory region
+   * @param length   length of memory to hash, starting at the position
+   */
+  public static int hashMemory(final Memory memory, final long position, final int length)
+  {
+    // Special cases for small, common key sizes to speed them up: e.g. one int key, two int keys, one long key, etc.
+    // The plus-one sizes (9, 13) are for nullable dimensions. The specific choices of special cases were chosen based
+    // on benchmarking (see MemoryBenchmark) on a Skylake-based cloud instance.
+
+    switch (length) {
+      case 4:
+        return 31 + memory.getInt(position);
 
 Review comment:
   It's not; I was just trying to be consistent with the other cases. Let me try benchmarking both and see if it makes a difference. If not, I'll go with the simpler one (no `+ 31`).



[GitHub] [druid] lgtm-com[bot] commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#issuecomment-582121126
 
 
   This pull request **introduces 6 alerts** when merging 9a5cc44ac85d2a99570fb62ad3d19c85783e7a6e into 768d60c7b405bca3ff71ac7fc062a0d0015fdb2d - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-b58159b475f074b495c12a31cbca982514ae26f1)
   
   **new alerts:**
   
   * 6 for Result of multiplication cast to wider type
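   
   For readers unfamiliar with this CodeQL rule, here is an illustrative example of
   the pattern it flags (made-up values, not code from this PR): an int*int product
   is computed in 32 bits and only afterwards widened to long, so it can overflow
   before the widening; casting one operand to long first multiplies in 64 bits.
   
   ```java
   public class WideningDemo
   {
     public static void main(String[] args)
     {
       int bucket = 40_000;
       int bucketSize = 100_000;
   
       long flagged = bucket * bucketSize;      // multiplies in int, overflows, then widens
       long safe = (long) bucket * bucketSize;  // multiplies in long
   
       System.out.println(flagged);  // -294967296 (wrapped around)
       System.out.println(safe);     // 4000000000
     }
   }
   ```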



[GitHub] [druid] lgtm-com[bot] commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#issuecomment-582215902
 
 
   This pull request **introduces 6 alerts** when merging e9ffe625531d3af9d22f25a336ff9935acb5df34 into 0d2b16c1d06a46c4d862f7637f55bfc1dd1c7f41 - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-35de356ca6e4643db972e1e0f0e8531d9781b6c7)
   
   **new alerts:**
   
   * 6 for Result of multiplication cast to wider type



[GitHub] [druid] jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#discussion_r374936575
 
 

 ##########
 File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/collection/MemoryOpenHashTable.java
 ##########
 @@ -0,0 +1,433 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.collection;
+
+import it.unimi.dsi.fastutil.ints.IntIterator;
+import org.apache.datasketches.memory.Memory;
+import org.apache.datasketches.memory.WritableMemory;
+import org.apache.druid.java.util.common.ISE;
+import org.apache.druid.query.groupby.epinephelinae.Groupers;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.util.NoSuchElementException;
+
+/**
+ * An open-addressed hash table with linear probing backed by {@link WritableMemory}. Does not offer a similar
+ * interface to {@link java.util.Map} because this is meant to be useful to lower-level, high-performance callers.
+ * There is no copying or serde of keys and values: callers access the backing memory of the table directly.
+ *
+ * This table will not grow itself. Callers must handle growing if required; the {@link #copyTo} method is provided
+ * to assist.
+ */
+public class MemoryOpenHashTable
+{
+  private static final byte USED_BYTE = 1;
+  private static final int USED_BYTE_SIZE = Byte.BYTES;
+
+  private final WritableMemory tableMemory;
+  private final int keySize;
+  private final int valueSize;
+  private final int bucketSize;
+
+  // Maximum number of elements in the table (based on numBuckets and maxLoadFactor).
+  private final int maxSize;
+
+  // Number of available/used buckets in the table. Always a power of two.
+  private final int numBuckets;
+
+  // Mask that clips a number to [0, numBuckets). Used when searching through buckets.
+  private final int bucketMask;
+
+  // Number of elements in the table right now.
+  private int size;
+
+  /**
+   * Create a new table.
+   *
+   * @param tableMemory backing memory for the table; must be exactly large enough to hold "numBuckets"
+   * @param numBuckets  number of buckets for the table
+   * @param maxSize     maximum number of elements for the table; must be less than numBuckets
+   * @param keySize     key size in bytes
+   * @param valueSize   value size in bytes
+   */
+  public MemoryOpenHashTable(
+      final WritableMemory tableMemory,
+      final int numBuckets,
+      final int maxSize,
+      final int keySize,
+      final int valueSize
+  )
+  {
+    this.tableMemory = tableMemory;
+    this.numBuckets = numBuckets;
+    this.bucketMask = numBuckets - 1;
+    this.maxSize = maxSize;
+    this.keySize = keySize;
+    this.valueSize = valueSize;
+    this.bucketSize = bucketSize(keySize, valueSize);
+
+    // Our main user today (HashVectorGrouper) needs the tableMemory to be backed by a big-endian ByteBuffer that is
 
 Review comment:
   `HashVectorGrouper` -> `BufferHashGrouper`?



[GitHub] [druid] jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#discussion_r374951659
 
 

 ##########
 File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/collection/MemoryOpenHashTable.java
 ##########
 @@ -0,0 +1,433 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.collection;
+
+import it.unimi.dsi.fastutil.ints.IntIterator;
+import org.apache.datasketches.memory.Memory;
+import org.apache.datasketches.memory.WritableMemory;
+import org.apache.druid.java.util.common.ISE;
+import org.apache.druid.query.groupby.epinephelinae.Groupers;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.util.NoSuchElementException;
+
+/**
+ * An open-addressed hash table with linear probing backed by {@link WritableMemory}. Does not offer a similar
+ * interface to {@link java.util.Map} because this is meant to be useful to lower-level, high-performance callers.
+ * There is no copying or serde of keys and values: callers access the backing memory of the table directly.
+ *
+ * This table will not grow itself. Callers must handle growing if required; the {@link #copyTo} method is provided
+ * to assist.
+ */
+public class MemoryOpenHashTable
+{
+  private static final byte USED_BYTE = 1;
+  private static final int USED_BYTE_SIZE = Byte.BYTES;
+
+  private final WritableMemory tableMemory;
+  private final int keySize;
+  private final int valueSize;
+  private final int bucketSize;
+
+  // Maximum number of elements in the table (based on numBuckets and maxLoadFactor).
+  private final int maxSize;
+
+  // Number of available/used buckets in the table. Always a power of two.
+  private final int numBuckets;
+
+  // Mask that clips a number to [0, numBuckets). Used when searching through buckets.
+  private final int bucketMask;
+
+  // Number of elements in the table right now.
+  private int size;
+
+  /**
+   * Create a new table.
+   *
+   * @param tableMemory backing memory for the table; must be exactly large enough to hold "numBuckets"
+   * @param numBuckets  number of buckets for the table
+   * @param maxSize     maximum number of elements for the table; must be less than numBuckets
+   * @param keySize     key size in bytes
+   * @param valueSize   value size in bytes
+   */
+  public MemoryOpenHashTable(
+      final WritableMemory tableMemory,
+      final int numBuckets,
+      final int maxSize,
+      final int keySize,
+      final int valueSize
+  )
+  {
+    this.tableMemory = tableMemory;
+    this.numBuckets = numBuckets;
+    this.bucketMask = numBuckets - 1;
+    this.maxSize = maxSize;
+    this.keySize = keySize;
+    this.valueSize = valueSize;
+    this.bucketSize = bucketSize(keySize, valueSize);
+
+    // Our main user today (HashVectorGrouper) needs the tableMemory to be backed by a big-endian ByteBuffer that is
+    // coterminous with the tableMemory, since it's going to feed that buffer into VectorAggregators instead of
+    // interacting with our WritableMemory directly. Nothing about this class actually requires that the Memory be
+    // backed by a ByteBuffer, but we'll check it here anyway for the benefit of our biggest customer.
+    verifyMemoryIsByteBuffer(tableMemory);
+
+    if (!tableMemory.getTypeByteOrder().equals(ByteOrder.nativeOrder())) {
+      throw new ISE("tableMemory must be native byte order");
+    }
+
+    if (tableMemory.getCapacity() != memoryNeeded(numBuckets, bucketSize)) {
+      throw new ISE(
+          "tableMemory must be size[%,d] but was[%,d]",
+          memoryNeeded(numBuckets, bucketSize),
+          tableMemory.getCapacity()
+      );
+    }
+
+    if (maxSize >= numBuckets) {
+      throw new ISE("maxSize must be less than numBuckets");
+    }
+
+    if (Integer.bitCount(numBuckets) != 1) {
+      throw new ISE("numBuckets must be a power of two but was[%,d]", numBuckets);
+    }
+
+    clear();
+  }
+
+  /**
+   * Returns the amount of memory needed for a table.
+   *
+   * This is just a multiplication, which is easy enough to do on your own, but sometimes it's nice for clarity's sake
+   * to call a function with a name that indicates why the multiplication is happening.
+   *
+   * @param numBuckets number of buckets
+   * @param bucketSize size per bucket (in bytes)
+   *
+   * @return size of table (in bytes)
+   */
+  public static int memoryNeeded(final int numBuckets, final int bucketSize)
+  {
+    return numBuckets * bucketSize;
+  }
+
+  /**
+   * Returns the size of each bucket in a table.
+   *
+   * @param keySize   size of keys (in bytes)
+   * @param valueSize size of values (in bytes)
+   *
+   * @return size of buckets (in bytes)
+   */
+  public static int bucketSize(final int keySize, final int valueSize)
+  {
+    return USED_BYTE_SIZE + keySize + valueSize;
 
 Review comment:
   When the key is nullable, I'm wondering whether we can share the last byte as both the null flag and the used bucket flag to save memory space. For example, the used bucket flag and the null flag could be stored in the upper 4 bits and the lower 4 bits, respectively. This approach would need an extra branch (or some bit shifts) to extract the key properly from the bucket, but I guess that would be OK?



[GitHub] [druid] gianm commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
gianm commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#discussion_r374972811
 
 

 ##########
 File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/collection/HashTableUtils.java
 ##########
 @@ -0,0 +1,162 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.collection;
+
+import org.apache.datasketches.memory.Memory;
+
+public class HashTableUtils
+{
+  private HashTableUtils()
+  {
+    // No instantiation.
+  }
+
+  /**
+   * Computes the previous power of two less than or equal to a given "n".
+   *
+   * The integer should be between 1 (inclusive) and {@link Integer#MAX_VALUE} for best results. Other parameters will
+   * return {@link Integer#MIN_VALUE}.
+   */
+  public static int previousPowerOfTwo(final int n)
+  {
+    if (n > 0) {
+      return Integer.highestOneBit(n);
+    } else {
+      return Integer.MIN_VALUE;
+    }
+  }
+
+  /**
+   * Compute a simple, fast hash code of some memory range.
+   *
+   * @param memory   a region of memory
+   * @param position position within the memory region
+   * @param length   length of memory to hash, starting at the position
+   */
+  public static int hashMemory(final Memory memory, final long position, final int length)
+  {
+    // Special cases for small, common key sizes to speed them up: e.g. one int key, two int keys, one long key, etc.
+    // The plus-one sizes (9, 13) are for nullable dimensions. The specific choices of special cases were chosen based
+    // on benchmarking (see MemoryBenchmark) on a Skylake-based cloud instance.
+
+    switch (length) {
+      case 4:
+        return 31 + memory.getInt(position);
+
+      case 8:
+        return 31 * (31 + memory.getInt(position)) + memory.getInt(position + Integer.BYTES);
 
 Review comment:
   I was doing the same kind of calculation that `Arrays.hashCode` would do:
   
   ```java
           int result = 1;
           for (int element : a)
               result = 31 * result + element;
   ```



[GitHub] [druid] gianm commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
gianm commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#discussion_r374973596
 
 

 ##########
 File path: benchmarks/src/main/java/org/apache/druid/benchmark/MemoryBenchmark.java
 ##########
 @@ -0,0 +1,169 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.benchmark;
+
+import org.apache.datasketches.memory.WritableMemory;
+import org.apache.druid.common.config.NullHandling;
+import org.apache.druid.query.groupby.epinephelinae.collection.HashTableUtils;
+import org.openjdk.jmh.annotations.Benchmark;
+import org.openjdk.jmh.annotations.BenchmarkMode;
+import org.openjdk.jmh.annotations.Fork;
+import org.openjdk.jmh.annotations.Measurement;
+import org.openjdk.jmh.annotations.Mode;
+import org.openjdk.jmh.annotations.OutputTimeUnit;
+import org.openjdk.jmh.annotations.Param;
+import org.openjdk.jmh.annotations.Scope;
+import org.openjdk.jmh.annotations.Setup;
+import org.openjdk.jmh.annotations.State;
+import org.openjdk.jmh.annotations.Warmup;
+import org.openjdk.jmh.infra.Blackhole;
+
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.util.Random;
+import java.util.concurrent.TimeUnit;
+
+@State(Scope.Benchmark)
+@Fork(value = 1)
+@Warmup(iterations = 10)
+@Measurement(iterations = 15)
+public class MemoryBenchmark
+{
+  static {
+    NullHandling.initializeForTests();
+  }
+
+  @Param({"4", "5", "8", "9", "12", "16", "31", "32", "64", "128"})
+  public int numBytes;
+
+  @Param({"offheap"})
+  public String where;
+
+  private ByteBuffer buffer1;
+  private ByteBuffer buffer2;
+  private ByteBuffer buffer3;
+  private WritableMemory memory1;
+  private WritableMemory memory2;
+  private WritableMemory memory3;
+
+  @Setup
+  public void setUp()
+  {
+    if ("onheap".equals(where)) {
+      buffer1 = ByteBuffer.allocate(numBytes).order(ByteOrder.nativeOrder());
+      buffer2 = ByteBuffer.allocate(numBytes).order(ByteOrder.nativeOrder());
+      buffer3 = ByteBuffer.allocate(numBytes).order(ByteOrder.nativeOrder());
+    } else if ("offheap".equals(where)) {
+      buffer1 = ByteBuffer.allocateDirect(numBytes).order(ByteOrder.nativeOrder());
+      buffer2 = ByteBuffer.allocateDirect(numBytes).order(ByteOrder.nativeOrder());
+      buffer3 = ByteBuffer.allocateDirect(numBytes).order(ByteOrder.nativeOrder());
+    }
+
+    memory1 = WritableMemory.wrap(buffer1, ByteOrder.nativeOrder());
+    memory2 = WritableMemory.wrap(buffer2, ByteOrder.nativeOrder());
+    memory3 = WritableMemory.wrap(buffer3, ByteOrder.nativeOrder());
+
+    // Scribble in some randomy but consistent (same seed) garbage.
 
 Review comment:
   Ha, I actually did mean "randomy", but it's not a real word. I meant, like, data generated by a random generator that isn't actually random (due to using the same seed). Random-ish. I'll rephrase it.



[GitHub] [druid] jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#discussion_r374954252
 
 

 ##########
 File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/collection/MemoryOpenHashTable.java
 ##########
 @@ -0,0 +1,433 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.collection;
+
+import it.unimi.dsi.fastutil.ints.IntIterator;
+import org.apache.datasketches.memory.Memory;
+import org.apache.datasketches.memory.WritableMemory;
+import org.apache.druid.java.util.common.ISE;
+import org.apache.druid.query.groupby.epinephelinae.Groupers;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.util.NoSuchElementException;
+
+/**
+ * An open-addressed hash table with linear probing backed by {@link WritableMemory}. Does not offer a similar
+ * interface to {@link java.util.Map} because this is meant to be useful to lower-level, high-performance callers.
+ * There is no copying or serde of keys and values: callers access the backing memory of the table directly.
+ *
+ * This table will not grow itself. Callers must handle growing if required; the {@link #copyTo} method is provided
+ * to assist.
+ */
+public class MemoryOpenHashTable
+{
+  private static final byte USED_BYTE = 1;
+  private static final int USED_BYTE_SIZE = Byte.BYTES;
+
+  private final WritableMemory tableMemory;
+  private final int keySize;
+  private final int valueSize;
+  private final int bucketSize;
+
+  // Maximum number of elements in the table (based on numBuckets and maxLoadFactor).
+  private final int maxSize;
+
+  // Number of available/used buckets in the table. Always a power of two.
+  private final int numBuckets;
+
+  // Mask that clips a number to [0, numBuckets). Used when searching through buckets.
+  private final int bucketMask;
+
+  // Number of elements in the table right now.
+  private int size;
+
+  /**
+   * Create a new table.
+   *
+   * @param tableMemory backing memory for the table; must be exactly large enough to hold "numBuckets"
+   * @param numBuckets  number of buckets for the table
+   * @param maxSize     maximum number of elements for the table; must be less than numBuckets
+   * @param keySize     key size in bytes
+   * @param valueSize   value size in bytes
+   */
+  public MemoryOpenHashTable(
+      final WritableMemory tableMemory,
+      final int numBuckets,
+      final int maxSize,
+      final int keySize,
+      final int valueSize
+  )
+  {
+    this.tableMemory = tableMemory;
+    this.numBuckets = numBuckets;
+    this.bucketMask = numBuckets - 1;
+    this.maxSize = maxSize;
+    this.keySize = keySize;
+    this.valueSize = valueSize;
+    this.bucketSize = bucketSize(keySize, valueSize);
+
+    // Our main user today (HashVectorGrouper) needs the tableMemory to be backed by a big-endian ByteBuffer that is
+    // coterminous with the tableMemory, since it's going to feed that buffer into VectorAggregators instead of
+    // interacting with our WritableMemory directly. Nothing about this class actually requires that the Memory be
+    // backed by a ByteBuffer, but we'll check it here anyway for the benefit of our biggest customer.
+    verifyMemoryIsByteBuffer(tableMemory);
+
+    if (!tableMemory.getTypeByteOrder().equals(ByteOrder.nativeOrder())) {
+      throw new ISE("tableMemory must be native byte order");
+    }
+
+    if (tableMemory.getCapacity() != memoryNeeded(numBuckets, bucketSize)) {
+      throw new ISE(
+          "tableMemory must be size[%,d] but was[%,d]",
+          memoryNeeded(numBuckets, bucketSize),
+          tableMemory.getCapacity()
+      );
+    }
+
+    if (maxSize >= numBuckets) {
+      throw new ISE("maxSize must be less than numBuckets");
+    }
+
+    if (Integer.bitCount(numBuckets) != 1) {
+      throw new ISE("numBuckets must be a power of two but was[%,d]", numBuckets);
+    }
+
+    clear();
+  }
+
+  /**
+   * Returns the amount of memory needed for a table.
+   *
+   * This is just a multiplication, which is easy enough to do on your own, but sometimes it's nice for clarity's sake
+   * to call a function with a name that indicates why the multiplication is happening.
+   *
+   * @param numBuckets number of buckets
+   * @param bucketSize size per bucket (in bytes)
+   *
+   * @return size of table (in bytes)
+   */
+  public static int memoryNeeded(final int numBuckets, final int bucketSize)
+  {
+    return numBuckets * bucketSize;
+  }
+
+  /**
+   * Returns the size of each bucket in a table.
+   *
+   * @param keySize   size of keys (in bytes)
+   * @param valueSize size of values (in bytes)
+   *
+   * @return size of buckets (in bytes)
+   */
+  public static int bucketSize(final int keySize, final int valueSize)
+  {
+    return USED_BYTE_SIZE + keySize + valueSize;
+  }
+
+  /**
+   * Clear the table, resetting size to zero.
+   */
+  public void clear()
+  {
+    size = 0;
+
+    // Clear used flags.
+    for (int bucket = 0; bucket < numBuckets; bucket++) {
+      tableMemory.putByte((long) bucket * bucketSize, (byte) 0);
+    }
+  }
+
+  /**
+   * Copy this table into another one. The other table must be large enough to hold all the copied buckets. The other
+   * table will be cleared before the copy takes place.
+   *
+   * @param other       the other table
+   * @param copyHandler a callback that is notified for each copied bucket
+   */
+  public void copyTo(final MemoryOpenHashTable other, @Nullable final BucketCopyHandler copyHandler)
+  {
+    if (other.size() > 0) {
+      other.clear();
+    }
+
+    for (int bucket = 0; bucket < numBuckets; bucket++) {
+      final int bucketOffset = bucket * bucketSize;
+      if (isOffsetUsed(bucketOffset)) {
+        final int keyPosition = bucketOffset + USED_BYTE_SIZE;
+        final int keyHash = Groupers.smear(HashTableUtils.hashMemory(tableMemory, keyPosition, keySize));
+        final int newBucket = other.findBucket(keyHash, tableMemory, keyPosition);
+
+        if (newBucket >= 0) {
+          // Not expected to happen, since we cleared the other table first.
+          throw new ISE("Found already-used bucket while copying");
+        }
+
+        if (!other.canInsertNewBucket()) {
+          throw new ISE("Unable to copy bucket to new table, size[%,d]", other.size());
+        }
+
+        final int newBucketOffset = -(newBucket + 1) * bucketSize;
+        assert !other.isOffsetUsed(newBucketOffset);
+        tableMemory.copyTo(bucketOffset, other.tableMemory, newBucketOffset, bucketSize);
+        other.size++;
+
+        if (copyHandler != null) {
+          copyHandler.bucketCopied(bucket, -(newBucket + 1), this, other);
+        }
+      }
+    }
+
+    // Sanity check.
+    if (other.size() != size) {
+      throw new ISE("New table size[%,d] != old table size[%,d] after copying", other.size(), size);
+    }
+  }
+
+  /**
+   * Finds the bucket for a particular key.
+   *
+   * @param keyHash          result of calling {@link HashTableUtils#hashMemory} on this key
+   * @param keySpace         memory containing the key
+   * @param keySpacePosition position of the key within keySpace
+   *
+   * @return bucket number if currently occupied, or {@code -bucket - 1} if not occupied (yet)
+   */
+  public int findBucket(final int keyHash, final Memory keySpace, final int keySpacePosition)
+  {
+    int bucket = keyHash & bucketMask;
+
+    while (true) {
+      final int bucketOffset = bucket * bucketSize;
+
+      if (tableMemory.getByte(bucketOffset) == 0) {
+        // Found unused bucket before finding our key.
+        return -bucket - 1;
+      }
+
+      final boolean keyFound = HashTableUtils.memoryEquals(
+          tableMemory,
+          bucketOffset + USED_BYTE_SIZE,
+          keySpace,
+          keySpacePosition,
+          keySize
+      );
+
+      if (keyFound) {
+        return bucket;
+      }
+
+      bucket = (bucket + 1) & bucketMask;
+    }
+  }
+
+  /**
+   * Returns whether this table can accept a new bucket.
+   */
+  public boolean canInsertNewBucket()
+  {
+    return size < maxSize;
+  }
+
+  /**
+   * Initialize a bucket with a particular key.
+   *
+   * Do not call this method unless the bucket is currently unused and {@link #canInsertNewBucket()} returns true.
+   *
+   * @param bucket           bucket number
+   * @param keySpace         memory containing the key
+   * @param keySpacePosition position of the key within keySpace
+   */
+  public void initBucket(final int bucket, final Memory keySpace, final int keySpacePosition)
+  {
+    final int bucketOffset = bucket * bucketSize;
+
+    // Method preconditions.
+    assert canInsertNewBucket() && !isOffsetUsed(bucketOffset);
+
+    // Mark the bucket used and write in the key.
+    tableMemory.putByte(bucketOffset, USED_BYTE);
+    keySpace.copyTo(keySpacePosition, tableMemory, bucketOffset + USED_BYTE_SIZE, keySize);
+    size++;
+  }
+
+  /**
+   * Returns the number of elements currently in the table.
+   */
+  public int size()
+  {
+    return size;
+  }
+
+  /**
+   * Returns the number of buckets in this table. Note that not all of these can actually be used. The amount that
+   * can be used depends on the "maxSize" parameter provided during construction.
+   */
+  public int numBuckets()
+  {
+    return numBuckets;
+  }
+
+  /**
+   * Returns the size of keys, in bytes.
+   */
+  public int keySize()
+  {
+    return keySize;
+  }
+
+  /**
+   * Returns the size of values, in bytes.
+   */
+  public int valueSize()
+  {
+    return valueSize;
+  }
+
+  /**
+   * Returns the offset within each bucket where the key starts.
+   */
+  public int bucketKeyOffset()
+  {
+    return USED_BYTE_SIZE;
+  }
+
+  /**
+   * Returns the offset within each bucket where the value starts.
+   */
+  public int bucketValueOffset()
+  {
+    return USED_BYTE_SIZE + keySize;
+  }
+
+  /**
+   * Returns the size in bytes of each bucket.
+   */
+  public int bucketSize()
+  {
+    return bucketSize;
+  }
+
+  /**
+   * Returns the position within {@link #memory()} where a particular bucket starts.
+   */
+  public int bucketMemoryPosition(final int bucket)
+  {
+    return bucket * bucketSize;
+  }
+
+  /**
+   * Returns the memory backing this table.
+   */
+  public WritableMemory memory()
+  {
+    return tableMemory;
+  }
+
+  /**
+   * Iterates over all used buckets, returning bucket numbers for each one.
+   *
+   * The intent is that callers will pass the bucket numbers to {@link #bucketMemoryPosition} and then use
+   * {@link #bucketKeyOffset()} and {@link #bucketValueOffset()} to extract keys and values from the buckets as needed.
+   */
+  public IntIterator bucketIterator()
+  {
+    return new IntIterator()
+    {
+      private int curr = 0;
+      private int currBucket = -1;
+
+      @Override
+      public boolean hasNext()
+      {
+        return curr < size;
+      }
+
+      @Override
+      public int nextInt()
+      {
+        if (curr >= size) {
+          throw new NoSuchElementException();
+        }
+
+        currBucket++;
+
+        while (!isOffsetUsed(currBucket * bucketSize)) {
+          currBucket++;
+        }
+
+        curr++;
+        return currBucket;
+      }
+    };
+  }
+
+  /**
+   * Returns whether the bucket at position "bucketOffset" is used or not. Note that this is a bucket position (in
+   * bytes), not a bucket number.
+   */
+  private boolean isOffsetUsed(final int bucketOffset)
+  {
+    return tableMemory.getByte(bucketOffset) == 1;
 
 Review comment:
   `== USED_BYTE`?



[GitHub] [druid] jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#discussion_r374891624
 
 

 ##########
 File path: benchmarks/src/main/java/org/apache/druid/benchmark/MemoryBenchmark.java
 ##########
 @@ -0,0 +1,169 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.benchmark;
+
+import org.apache.datasketches.memory.WritableMemory;
+import org.apache.druid.common.config.NullHandling;
+import org.apache.druid.query.groupby.epinephelinae.collection.HashTableUtils;
+import org.openjdk.jmh.annotations.Benchmark;
+import org.openjdk.jmh.annotations.BenchmarkMode;
+import org.openjdk.jmh.annotations.Fork;
+import org.openjdk.jmh.annotations.Measurement;
+import org.openjdk.jmh.annotations.Mode;
+import org.openjdk.jmh.annotations.OutputTimeUnit;
+import org.openjdk.jmh.annotations.Param;
+import org.openjdk.jmh.annotations.Scope;
+import org.openjdk.jmh.annotations.Setup;
+import org.openjdk.jmh.annotations.State;
+import org.openjdk.jmh.annotations.Warmup;
+import org.openjdk.jmh.infra.Blackhole;
+
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.util.Random;
+import java.util.concurrent.TimeUnit;
+
+@State(Scope.Benchmark)
+@Fork(value = 1)
+@Warmup(iterations = 10)
+@Measurement(iterations = 15)
+public class MemoryBenchmark
+{
+  static {
+    NullHandling.initializeForTests();
+  }
+
+  @Param({"4", "5", "8", "9", "12", "16", "31", "32", "64", "128"})
+  public int numBytes;
+
+  @Param({"offheap"})
+  public String where;
+
+  private ByteBuffer buffer1;
+  private ByteBuffer buffer2;
+  private ByteBuffer buffer3;
+  private WritableMemory memory1;
+  private WritableMemory memory2;
+  private WritableMemory memory3;
+
+  @Setup
+  public void setUp()
+  {
+    if ("onheap".equals(where)) {
+      buffer1 = ByteBuffer.allocate(numBytes).order(ByteOrder.nativeOrder());
+      buffer2 = ByteBuffer.allocate(numBytes).order(ByteOrder.nativeOrder());
+      buffer3 = ByteBuffer.allocate(numBytes).order(ByteOrder.nativeOrder());
+    } else if ("offheap".equals(where)) {
+      buffer1 = ByteBuffer.allocateDirect(numBytes).order(ByteOrder.nativeOrder());
+      buffer2 = ByteBuffer.allocateDirect(numBytes).order(ByteOrder.nativeOrder());
+      buffer3 = ByteBuffer.allocateDirect(numBytes).order(ByteOrder.nativeOrder());
+    }
+
+    memory1 = WritableMemory.wrap(buffer1, ByteOrder.nativeOrder());
+    memory2 = WritableMemory.wrap(buffer2, ByteOrder.nativeOrder());
+    memory3 = WritableMemory.wrap(buffer3, ByteOrder.nativeOrder());
+
+    // Scribble in some randomy but consistent (same seed) garbage.
 
 Review comment:
   randomy -> randomly?



[GitHub] [druid] gianm commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
gianm commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#discussion_r374983251
 
 

 ##########
 File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/collection/HashTableUtils.java
 ##########
 @@ -0,0 +1,162 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.collection;
+
+import org.apache.datasketches.memory.Memory;
+
+public class HashTableUtils
+{
+  private HashTableUtils()
+  {
+    // No instantiation.
+  }
+
+  /**
+   * Computes the previous power of two less than or equal to a given "n".
+   *
+   * The integer should be between 1 (inclusive) and {@link Integer#MAX_VALUE} for best results. Other parameters will
+   * return {@link Integer#MIN_VALUE}.
+   */
+  public static int previousPowerOfTwo(final int n)
+  {
+    if (n > 0) {
+      return Integer.highestOneBit(n);
+    } else {
+      return Integer.MIN_VALUE;
+    }
+  }
+
+  /**
+   * Compute a simple, fast hash code of some memory range.
+   *
+   * @param memory   a region of memory
+   * @param position position within the memory region
+   * @param length   length of memory to hash, starting at the position
+   */
+  public static int hashMemory(final Memory memory, final long position, final int length)
+  {
+    // Special cases for small, common key sizes to speed them up: e.g. one int key, two int keys, one long key, etc.
+    // The plus-one sizes (9, 13) are for nullable dimensions. The specific choices of special cases were chosen based
+    // on benchmarking (see MemoryBenchmark) on a Skylake-based cloud instance.
+
+    switch (length) {
+      case 4:
+        return 31 + memory.getInt(position);
 
 Review comment:
   I can't discern a difference in the benchmarks. I'll go with the simpler one, no `+ 31`.



[GitHub] [druid] jihoonson commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
jihoonson commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#issuecomment-582081707
 
 
   Please check the dependency check failure.
   
   ```
   [WARNING] Used undeclared dependencies found:
   [WARNING]    org.apache.datasketches:datasketches-memory:jar:1.2.0-incubating:compile
   [WARNING] Unused declared dependencies found:
   [WARNING]    org.easymock:easymock:jar:4.0.2:compile
   [INFO] Add the following to your pom to correct the missing dependencies: 
   [INFO] 
   <dependency>
     <groupId>org.apache.datasketches</groupId>
     <artifactId>datasketches-memory</artifactId>
     <version>1.2.0-incubating</version>
   </dependency>
   ```



[GitHub] [druid] gianm commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
gianm commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#issuecomment-582077195
 
 
   > 6 for Result of multiplication cast to wider type
   
   I think the LGTM alerts are false positives. It looks like it's flagging a potential overflow in expressions like `2 * Integer.BYTES` before the result is passed to `memory.getByte(position + 2 * Integer.BYTES)`. But it won't overflow, because `Integer.BYTES` is a small constant.
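
   To make the false positive concrete, here is a minimal, standalone illustration (my own sketch, not code from the PR; the class and variable names are made up):

   ```
   public class WideningExample
   {
     public static void main(String[] args)
     {
       final long position = 100L;

       // What LGTM flags: "2 * Integer.BYTES" is evaluated in int arithmetic and only then
       // widened to long for the addition. It cannot overflow here, because the product is the
       // constant 8.
       final long flagged = position + 2 * Integer.BYTES;

       // Doing the multiplication in long arithmetic expresses the same thing and would
       // silence the alert.
       final long unflagged = position + 2L * Integer.BYTES;

       System.out.println(flagged == unflagged); // prints "true"
     }
   }
   ```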



[GitHub] [druid] gianm commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
gianm commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#issuecomment-582128976
 
 
   > The dependency analysis plugin has configuration options for suppressing false positives:
   > https://maven.apache.org/plugins/maven-dependency-plugin/examples/exclude-dependencies-from-dependency-analysis.html
   
   I just removed easymock from druid-benchmarks for now, which will stop SqlBenchmark from working. I also raised this PR to make the CalciteTests stuff not need EasyMock anymore: https://github.com/apache/druid/pull/9310
   
   (I did it this way instead of via a suppression because I'd prefer to get rid of EasyMock anyway.)



[GitHub] [druid] gianm commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
gianm commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#issuecomment-582136796
 
 
   > @gianm wondering where the perf increase is coming from? Is it the overhead of BB bound checks, or other factors? Wondering if it makes sense to wait for https://openjdk.java.net/jeps/370, which IMO has a cleaner API and will be part of the JDK, thus no low-level unsafe stuff.
   
   Some of the increase is because a couple of important Memory APIs are implemented more efficiently than the corresponding BB API. For example, `Memory.equalTo` at a 32-bit key size is 3x faster than `BB.equals` according to MemoryBenchmark. `Memory.copyTo` is another example.
   
   Some of the increase is from things that would have been possible to do with BB as well. The key-length-specific hashing and equality code could have been implemented on top of BB, and so could the power-of-two optimizations. That could get a BB implementation pretty close to a Memory-based one, but it wouldn't get all the way there.
   
   Even so, another justification for using Memory is that IMO the Memory API is better than the ByteBuffer API for the kind of stuff we do. There isn't the position / limit bookkeeping that tends to get in our way, and it doesn't continually reset the byte order to big-endian whenever you duplicate or slice it. The JEP 370 stuff will probably be an even better fit, but it will be a while before we can count on all of our users to have it.
   
   For these two reasons (Memory does have a couple of APIs that beat BB, and Memory is a nicer API) I thought going with Memory was a good idea.
   
   Btw, Memory still does do bound checks when you enter its code, but once it does them, it uses Unsafe internally. We could be even faster by using Unsafe directly and doing our own bounds checks, but I didn't want to go that far at this time.
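
   To make the API contrast concrete, here is a rough sketch (my own illustration, not code from the PR; it assumes the datasketches-memory calls already mentioned in this thread, such as `wrap` and the `equalTo(offset, that, thatOffset, length)` overload, plus standard ByteBuffer behavior):

   ```
   import org.apache.datasketches.memory.WritableMemory;

   import java.nio.ByteBuffer;
   import java.nio.ByteOrder;

   public class MemoryVsByteBufferSketch
   {
     public static void main(String[] args)
     {
       final ByteBuffer buf = ByteBuffer.allocateDirect(16).order(ByteOrder.nativeOrder());

       // ByteBuffer: slice() and duplicate() return views whose byte order is reset to
       // BIG_ENDIAN, so callers have to remember to re-apply order() on every view.
       System.out.println(buf.slice().order());     // BIG_ENDIAN
       System.out.println(buf.duplicate().order()); // BIG_ENDIAN

       // Memory: ranges are addressed with explicit offsets, so there is no position/limit
       // state to juggle and no byte-order surprise when working with a sub-range.
       final WritableMemory mem = WritableMemory.wrap(buf, ByteOrder.nativeOrder());
       mem.putLong(0, 42L);
       mem.putLong(8, 42L);

       // equalTo compares the raw bytes of two ranges directly; per the comment above, this
       // is the kind of call that benchmarks well against ByteBuffer.equals for small
       // fixed-size keys.
       System.out.println(mem.equalTo(0, mem, 8, Long.BYTES)); // true
     }
   }
   ```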



[GitHub] [druid] gianm commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
gianm commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#discussion_r374975608
 
 

 ##########
 File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/collection/MemoryOpenHashTable.java
 ##########
 @@ -0,0 +1,433 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.collection;
+
+import it.unimi.dsi.fastutil.ints.IntIterator;
+import org.apache.datasketches.memory.Memory;
+import org.apache.datasketches.memory.WritableMemory;
+import org.apache.druid.java.util.common.ISE;
+import org.apache.druid.query.groupby.epinephelinae.Groupers;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.util.NoSuchElementException;
+
+/**
+ * An open-addressed hash table with linear probing backed by {@link WritableMemory}. Does not offer a similar
+ * interface to {@link java.util.Map} because this is meant to be useful to lower-level, high-performance callers.
+ * There is no copying or serde of keys and values: callers access the backing memory of the table directly.
+ *
+ * This table will not grow itself. Callers must handle growing if required; the {@link #copyTo} method is provided
+ * to assist.
+ */
+public class MemoryOpenHashTable
+{
+  private static final byte USED_BYTE = 1;
+  private static final int USED_BYTE_SIZE = Byte.BYTES;
+
+  private final WritableMemory tableMemory;
+  private final int keySize;
+  private final int valueSize;
+  private final int bucketSize;
+
+  // Maximum number of elements in the table (based on numBuckets and maxLoadFactor).
+  private final int maxSize;
+
+  // Number of available/used buckets in the table. Always a power of two.
+  private final int numBuckets;
+
+  // Mask that clips a number to [0, numBuckets). Used when searching through buckets.
+  private final int bucketMask;
+
+  // Number of elements in the table right now.
+  private int size;
+
+  /**
+   * Create a new table.
+   *
+   * @param tableMemory backing memory for the table; must be exactly large enough to hold "numBuckets"
+   * @param numBuckets  number of buckets for the table
+   * @param maxSize     maximum number of elements for the table; must be less than numBuckets
+   * @param keySize     key size in bytes
+   * @param valueSize   value size in bytes
+   */
+  public MemoryOpenHashTable(
+      final WritableMemory tableMemory,
+      final int numBuckets,
+      final int maxSize,
+      final int keySize,
+      final int valueSize
+  )
+  {
+    this.tableMemory = tableMemory;
+    this.numBuckets = numBuckets;
+    this.bucketMask = numBuckets - 1;
+    this.maxSize = maxSize;
+    this.keySize = keySize;
+    this.valueSize = valueSize;
+    this.bucketSize = bucketSize(keySize, valueSize);
+
+    // Our main user today (HashVectorGrouper) needs the tableMemory to be backed by a big-endian ByteBuffer that is
+    // coterminous with the tableMemory, since it's going to feed that buffer into VectorAggregators instead of
+    // interacting with our WritableMemory directly. Nothing about this class actually requires that the Memory be
+    // backed by a ByteBuffer, but we'll check it here anyway for the benefit of our biggest customer.
+    verifyMemoryIsByteBuffer(tableMemory);
+
+    if (!tableMemory.getTypeByteOrder().equals(ByteOrder.nativeOrder())) {
+      throw new ISE("tableMemory must be native byte order");
+    }
+
+    if (tableMemory.getCapacity() != memoryNeeded(numBuckets, bucketSize)) {
+      throw new ISE(
+          "tableMemory must be size[%,d] but was[%,d]",
+          memoryNeeded(numBuckets, bucketSize),
+          tableMemory.getCapacity()
+      );
+    }
+
+    if (maxSize >= numBuckets) {
+      throw new ISE("maxSize must be less than numBuckets");
+    }
+
+    if (Integer.bitCount(numBuckets) != 1) {
+      throw new ISE("numBuckets must be a power of two but was[%,d]", numBuckets);
+    }
+
+    clear();
+  }
+
+  /**
+   * Returns the amount of memory needed for a table.
+   *
+   * This is just a multiplication, which is easy enough to do on your own, but sometimes it's nice for clarity's sake
+   * to call a function with a name that indicates why the multiplication is happening.
+   *
+   * @param numBuckets number of buckets
+   * @param bucketSize size per bucket (in bytes)
+   *
+   * @return size of table (in bytes)
+   */
+  public static int memoryNeeded(final int numBuckets, final int bucketSize)
+  {
+    return numBuckets * bucketSize;
+  }
+
+  /**
+   * Returns the size of each bucket in a table.
+   *
+   * @param keySize   size of keys (in bytes)
+   * @param valueSize size of values (in bytes)
+   *
+   * @return size of buckets (in bytes)
+   */
+  public static int bucketSize(final int keySize, final int valueSize)
+  {
+    return USED_BYTE_SIZE + keySize + valueSize;
+  }
+
+  /**
+   * Clear the table, resetting size to zero.
+   */
+  public void clear()
+  {
+    size = 0;
+
+    // Clear used flags.
+    for (int bucket = 0; bucket < numBuckets; bucket++) {
+      tableMemory.putByte((long) bucket * bucketSize, (byte) 0);
+    }
+  }
+
+  /**
+   * Copy this table into another one. The other table must be large enough to hold all the copied buckets. The other
+   * table will be cleared before the copy takes place.
+   *
+   * @param other       the other table
+   * @param copyHandler a callback that is notified for each copied bucket
+   */
+  public void copyTo(final MemoryOpenHashTable other, @Nullable final BucketCopyHandler copyHandler)
+  {
+    if (other.size() > 0) {
+      other.clear();
+    }
+
+    for (int bucket = 0; bucket < numBuckets; bucket++) {
+      final int bucketOffset = bucket * bucketSize;
+      if (isOffsetUsed(bucketOffset)) {
+        final int keyPosition = bucketOffset + USED_BYTE_SIZE;
+        final int keyHash = Groupers.smear(HashTableUtils.hashMemory(tableMemory, keyPosition, keySize));
+        final int newBucket = other.findBucket(keyHash, tableMemory, keyPosition);
+
+        if (newBucket >= 0) {
+          // Not expected to happen, since we cleared the other table first.
+          throw new ISE("Found already-used bucket while copying");
+        }
+
+        if (!other.canInsertNewBucket()) {
+          throw new ISE("Unable to copy bucket to new table, size[%,d]", other.size());
+        }
+
+        final int newBucketOffset = -(newBucket + 1) * bucketSize;
+        assert !other.isOffsetUsed(newBucketOffset);
+        tableMemory.copyTo(bucketOffset, other.tableMemory, newBucketOffset, bucketSize);
+        other.size++;
+
+        if (copyHandler != null) {
+          copyHandler.bucketCopied(bucket, -(newBucket + 1), this, other);
+        }
+      }
+    }
+
+    // Sanity check.
+    if (other.size() != size) {
+      throw new ISE("New table size[%,d] != old table size[%,d] after copying", other.size(), size);
+    }
+  }
+
+  /**
+   * Finds the bucket for a particular key.
+   *
+   * @param keyHash          result of calling {@link HashTableUtils#hashMemory} on this key
+   * @param keySpace         memory containing the key
+   * @param keySpacePosition position of the key within keySpace
+   *
+   * @return bucket number if currently occupied, or {@code -bucket - 1} if not occupied (yet)
+   */
+  public int findBucket(final int keyHash, final Memory keySpace, final int keySpacePosition)
+  {
+    int bucket = keyHash & bucketMask;
+
+    while (true) {
+      final int bucketOffset = bucket * bucketSize;
+
+      if (tableMemory.getByte(bucketOffset) == 0) {
+        // Found unused bucket before finding our key.
+        return -bucket - 1;
+      }
+
+      final boolean keyFound = HashTableUtils.memoryEquals(
+          tableMemory,
+          bucketOffset + USED_BYTE_SIZE,
+          keySpace,
+          keySpacePosition,
+          keySize
+      );
+
+      if (keyFound) {
+        return bucket;
+      }
+
+      bucket = (bucket + 1) & bucketMask;
+    }
+  }
+
+  /**
+   * Returns whether this table can accept a new bucket.
+   */
+  public boolean canInsertNewBucket()
+  {
+    return size < maxSize;
+  }
+
+  /**
+   * Initialize a bucket with a particular key.
+   *
+   * Do not call this method unless the bucket is currently unused and {@link #canInsertNewBucket()} returns true.
+   *
+   * @param bucket           bucket number
+   * @param keySpace         memory containing the key
+   * @param keySpacePosition position of the key within keySpace
+   */
+  public void initBucket(final int bucket, final Memory keySpace, final int keySpacePosition)
+  {
+    final int bucketOffset = bucket * bucketSize;
+
+    // Method preconditions.
+    assert canInsertNewBucket() && !isOffsetUsed(bucketOffset);
+
+    // Mark the bucket used and write in the key.
+    tableMemory.putByte(bucketOffset, USED_BYTE);
+    keySpace.copyTo(keySpacePosition, tableMemory, bucketOffset + USED_BYTE_SIZE, keySize);
+    size++;
+  }
+
+  /**
+   * Returns the number of elements currently in the table.
+   */
+  public int size()
+  {
+    return size;
+  }
+
+  /**
+   * Returns the number of buckets in this table. Note that not all of these can actually be used. The amount that
+   * can be used depends on the "maxSize" parameter provided during construction.
+   */
+  public int numBuckets()
+  {
+    return numBuckets;
+  }
+
+  /**
+   * Returns the size of keys, in bytes.
+   */
+  public int keySize()
+  {
+    return keySize;
+  }
+
+  /**
+   * Returns the size of values, in bytes.
+   */
+  public int valueSize()
+  {
+    return valueSize;
+  }
+
+  /**
+   * Returns the offset within each bucket where the key starts.
+   */
+  public int bucketKeyOffset()
+  {
+    return USED_BYTE_SIZE;
+  }
+
+  /**
+   * Returns the offset within each bucket where the value starts.
+   */
+  public int bucketValueOffset()
+  {
+    return USED_BYTE_SIZE + keySize;
+  }
+
+  /**
+   * Returns the size in bytes of each bucket.
+   */
+  public int bucketSize()
+  {
+    return bucketSize;
+  }
+
+  /**
+   * Returns the position within {@link #memory()} where a particular bucket starts.
+   */
+  public int bucketMemoryPosition(final int bucket)
+  {
+    return bucket * bucketSize;
+  }
+
+  /**
+   * Returns the memory backing this table.
+   */
+  public WritableMemory memory()
+  {
+    return tableMemory;
+  }
+
+  /**
+   * Iterates over all used buckets, returning bucket numbers for each one.
+   *
+   * The intent is that callers will pass the bucket numbers to {@link #bucketMemoryPosition} and then use
+   * {@link #bucketKeyOffset()} and {@link #bucketValueOffset()} to extract keys and values from the buckets as needed.
+   */
+  public IntIterator bucketIterator()
+  {
+    return new IntIterator()
+    {
+      private int curr = 0;
+      private int currBucket = -1;
+
+      @Override
+      public boolean hasNext()
+      {
+        return curr < size;
+      }
+
+      @Override
+      public int nextInt()
+      {
+        if (curr >= size) {
+          throw new NoSuchElementException();
+        }
+
+        currBucket++;
+
+        while (!isOffsetUsed(currBucket * bucketSize)) {
+          currBucket++;
+        }
+
+        curr++;
+        return currBucket;
+      }
+    };
+  }
+
+  /**
+   * Returns whether the bucket at position "bucketOffset" is used or not. Note that this is a bucket position (in
+   * bytes), not a bucket number.
+   */
+  private boolean isOffsetUsed(final int bucketOffset)
+  {
+    return tableMemory.getByte(bucketOffset) == 1;
 
 Review comment:
   Yes, I'll change it.



[GitHub] [druid] gianm commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
gianm commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#discussion_r374974801
 
 

 ##########
 File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/collection/MemoryOpenHashTable.java
 ##########
 @@ -0,0 +1,433 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.collection;
+
+import it.unimi.dsi.fastutil.ints.IntIterator;
+import org.apache.datasketches.memory.Memory;
+import org.apache.datasketches.memory.WritableMemory;
+import org.apache.druid.java.util.common.ISE;
+import org.apache.druid.query.groupby.epinephelinae.Groupers;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.util.NoSuchElementException;
+
+/**
+ * An open-addressed hash table with linear probing backed by {@link WritableMemory}. Does not offer a similar
+ * interface to {@link java.util.Map} because this is meant to be useful to lower-level, high-performance callers.
+ * There is no copying or serde of keys and values: callers access the backing memory of the table directly.
+ *
+ * This table will not grow itself. Callers must handle growing if required; the {@link #copyTo} method is provided
+ * to assist.
+ */
+public class MemoryOpenHashTable
+{
+  private static final byte USED_BYTE = 1;
+  private static final int USED_BYTE_SIZE = Byte.BYTES;
+
+  private final WritableMemory tableMemory;
+  private final int keySize;
+  private final int valueSize;
+  private final int bucketSize;
+
+  // Maximum number of elements in the table (based on numBuckets and maxLoadFactor).
+  private final int maxSize;
+
+  // Number of available/used buckets in the table. Always a power of two.
+  private final int numBuckets;
+
+  // Mask that clips a number to [0, numBuckets). Used when searching through buckets.
+  private final int bucketMask;
+
+  // Number of elements in the table right now.
+  private int size;
+
+  /**
+   * Create a new table.
+   *
+   * @param tableMemory backing memory for the table; must be exactly large enough to hold "numBuckets"
+   * @param numBuckets  number of buckets for the table
+   * @param maxSize     maximum number of elements for the table; must be less than numBuckets
+   * @param keySize     key size in bytes
+   * @param valueSize   value size in bytes
+   */
+  public MemoryOpenHashTable(
+      final WritableMemory tableMemory,
+      final int numBuckets,
+      final int maxSize,
+      final int keySize,
+      final int valueSize
+  )
+  {
+    this.tableMemory = tableMemory;
+    this.numBuckets = numBuckets;
+    this.bucketMask = numBuckets - 1;
+    this.maxSize = maxSize;
+    this.keySize = keySize;
+    this.valueSize = valueSize;
+    this.bucketSize = bucketSize(keySize, valueSize);
+
+    // Our main user today (HashVectorGrouper) needs the tableMemory to be backed by a big-endian ByteBuffer that is
 
 Review comment:
   Ah, this should probably say "VectorGrouper implementations". I'll change it.



[GitHub] [druid] lgtm-com[bot] commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#issuecomment-581775131
 
 
   This pull request **introduces 7 alerts** when merging 63d13d294d18e6da4b53ff63e522fe1de89f63bc into a085685182d62e5dd1b716f1bbb9bcbbaeb1c661 - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-46d9c318cae76d30016635bdec4c2f665940b8df)
   
   **new alerts:**
   
   * 7 for Result of multiplication cast to wider type



[GitHub] [druid] gianm commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
gianm commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#discussion_r374975127
 
 

 ##########
 File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/collection/HashTableUtils.java
 ##########
 @@ -0,0 +1,162 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.collection;
+
+import org.apache.datasketches.memory.Memory;
+
+public class HashTableUtils
+{
+  private HashTableUtils()
+  {
+    // No instantiation.
+  }
+
+  /**
+   * Computes the previous power of two less than or equal to a given "n".
+   *
+   * The integer should be between 1 (inclusive) and {@link Integer#MAX_VALUE} for best results. Other parameters will
+   * return {@link Integer#MIN_VALUE}.
+   */
+  public static int previousPowerOfTwo(final int n)
+  {
+    if (n > 0) {
+      return Integer.highestOneBit(n);
+    } else {
+      return Integer.MIN_VALUE;
+    }
+  }
+
+  /**
+   * Compute a simple, fast hash code of some memory range.
+   *
+   * @param memory   a region of memory
+   * @param position position within the memory region
+   * @param length   length of memory to hash, starting at the position
+   */
+  public static int hashMemory(final Memory memory, final long position, final int length)
+  {
+    // Special cases for small, common key sizes to speed them up: e.g. one int key, two int keys, one long key, etc.
+    // The plus-one sizes (9, 13) are for nullable dimensions. The specific choices of special cases were chosen based
+    // on benchmarking (see MemoryBenchmark) on a Skylake-based cloud instance.
+
+    switch (length) {
+      case 4:
+        return 31 + memory.getInt(position);
+
+      case 8:
+        return 31 * (31 + memory.getInt(position)) + memory.getInt(position + Integer.BYTES);
+
+      case 9:
+        return 31 * (31 * (31 + memory.getInt(position)) + memory.getInt(position + Integer.BYTES))
+               + memory.getByte(position + 2 * Integer.BYTES);
 
 Review comment:
   You're right when it comes to nulls, but I didn't want this general-purpose hashing function to be so specific to the data. In fact, HashTableUtilsTest has a test that verifies that changing a single bit (any bit) will change the hashcode.
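
   For readers following along, the property being described is roughly the following (a sketch only, not the actual HashTableUtilsTest; it assumes the `hashMemory(Memory, long, int)` signature shown in the diff above):

   ```
   import org.apache.datasketches.memory.WritableMemory;
   import org.apache.druid.query.groupby.epinephelinae.collection.HashTableUtils;

   import java.util.Random;

   public class BitFlipHashSketch
   {
     public static void main(String[] args)
     {
       // 9 bytes: one of the special-cased nullable key sizes mentioned in the code comment.
       final int length = 9;
       final byte[] bytes = new byte[length];
       new Random(0).nextBytes(bytes);
       final WritableMemory memory = WritableMemory.wrap(bytes);

       final int baseline = HashTableUtils.hashMemory(memory, 0, length);

       // Flip each bit in turn, check that the hash changes, then restore the bit.
       for (int bit = 0; bit < length * Byte.SIZE; bit++) {
         final int byteIndex = bit / Byte.SIZE;
         final byte mask = (byte) (1 << (bit % Byte.SIZE));
         memory.putByte(byteIndex, (byte) (memory.getByte(byteIndex) ^ mask));
         if (HashTableUtils.hashMemory(memory, 0, length) == baseline) {
           throw new AssertionError("flipping bit " + bit + " did not change the hash");
         }
         memory.putByte(byteIndex, (byte) (memory.getByte(byteIndex) ^ mask));
       }

       System.out.println("every single-bit flip changed the hash");
     }
   }
   ```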



[GitHub] [druid] lgtm-com[bot] commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#issuecomment-582151018
 
 
   This pull request **introduces 6 alerts** when merging bdad6e919c6a7ad73ab30e81780ee1c23013ee5a into 768d60c7b405bca3ff71ac7fc062a0d0015fdb2d - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-a78ffbfa2a7503798e9863418e8c40fb4e65fc7a)
   
   **new alerts:**
   
   * 6 for Result of multiplication cast to wider type



[GitHub] [druid] gianm commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
gianm commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#issuecomment-582094589
 
 
   > [WARNING] Unused declared dependencies found:
   > [WARNING]    org.easymock:easymock:jar:4.0.2:compile
   
   Hmm, this is a lie. It's required by SqlBenchmark:
   
   ```
   java.lang.NoClassDefFoundError: org/easymock/EasyMock
   	at org.apache.druid.sql.calcite.util.CalciteTests.createMockSystemSchema(CalciteTests.java:876)
   	at org.apache.druid.benchmark.query.SqlBenchmark.setup(SqlBenchmark.java:197)
   	at org.apache.druid.benchmark.query.generated.SqlBenchmark_querySql_jmhTest._jmh_tryInit_f_sqlbenchmark0_G(SqlBenchmark_querySql_jmhTest.java:448)
   	at org.apache.druid.benchmark.query.generated.SqlBenchmark_querySql_jmhTest.querySql_AverageTime(SqlBenchmark_querySql_jmhTest.java:162)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:453)
   	at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:437)
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   ```
   
   Maybe the analyzer is getting confused since EasyMock isn't being used _directly_. It's being used by `CalciteTests.createMockSystemSchema` from the druid-sql test jar. For some reason it doesn't get pulled in as a dependency unless explicitly declared in the benchmarks pom.
   
   I'm not sure what to do about this...



[GitHub] [druid] jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#discussion_r374951659
 
 

 ##########
 File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/collection/MemoryOpenHashTable.java
 ##########
 @@ -0,0 +1,433 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.collection;
+
+import it.unimi.dsi.fastutil.ints.IntIterator;
+import org.apache.datasketches.memory.Memory;
+import org.apache.datasketches.memory.WritableMemory;
+import org.apache.druid.java.util.common.ISE;
+import org.apache.druid.query.groupby.epinephelinae.Groupers;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.util.NoSuchElementException;
+
+/**
+ * An open-addressed hash table with linear probing backed by {@link WritableMemory}. Does not offer a similar
+ * interface to {@link java.util.Map} because this is meant to be useful to lower-level, high-performance callers.
+ * There is no copying or serde of keys and values: callers access the backing memory of the table directly.
+ *
+ * This table will not grow itself. Callers must handle growing if required; the {@link #copyTo} method is provided
+ * to assist.
+ */
+public class MemoryOpenHashTable
+{
+  private static final byte USED_BYTE = 1;
+  private static final int USED_BYTE_SIZE = Byte.BYTES;
+
+  private final WritableMemory tableMemory;
+  private final int keySize;
+  private final int valueSize;
+  private final int bucketSize;
+
+  // Maximum number of elements in the table (based on numBuckets and maxLoadFactor).
+  private final int maxSize;
+
+  // Number of available/used buckets in the table. Always a power of two.
+  private final int numBuckets;
+
+  // Mask that clips a number to [0, numBuckets). Used when searching through buckets.
+  private final int bucketMask;
+
+  // Number of elements in the table right now.
+  private int size;
+
+  /**
+   * Create a new table.
+   *
+   * @param tableMemory backing memory for the table; must be exactly large enough to hold "numBuckets"
+   * @param numBuckets  number of buckets for the table
+   * @param maxSize     maximum number of elements for the table; must be less than numBuckets
+   * @param keySize     key size in bytes
+   * @param valueSize   value size in bytes
+   */
+  public MemoryOpenHashTable(
+      final WritableMemory tableMemory,
+      final int numBuckets,
+      final int maxSize,
+      final int keySize,
+      final int valueSize
+  )
+  {
+    this.tableMemory = tableMemory;
+    this.numBuckets = numBuckets;
+    this.bucketMask = numBuckets - 1;
+    this.maxSize = maxSize;
+    this.keySize = keySize;
+    this.valueSize = valueSize;
+    this.bucketSize = bucketSize(keySize, valueSize);
+
+    // Our main user today (HashVectorGrouper) needs the tableMemory to be backed by a big-endian ByteBuffer that is
+    // coterminous with the tableMemory, since it's going to feed that buffer into VectorAggregators instead of
+    // interacting with our WritableMemory directly. Nothing about this class actually requires that the Memory be
+    // backed by a ByteBuffer, but we'll check it here anyway for the benefit of our biggest customer.
+    verifyMemoryIsByteBuffer(tableMemory);
+
+    if (!tableMemory.getTypeByteOrder().equals(ByteOrder.nativeOrder())) {
+      throw new ISE("tableMemory must be native byte order");
+    }
+
+    if (tableMemory.getCapacity() != memoryNeeded(numBuckets, bucketSize)) {
+      throw new ISE(
+          "tableMemory must be size[%,d] but was[%,d]",
+          memoryNeeded(numBuckets, bucketSize),
+          tableMemory.getCapacity()
+      );
+    }
+
+    if (maxSize >= numBuckets) {
+      throw new ISE("maxSize must be less than numBuckets");
+    }
+
+    if (Integer.bitCount(numBuckets) != 1) {
+      throw new ISE("numBuckets must be a power of two but was[%,d]", numBuckets);
+    }
+
+    clear();
+  }
+
+  /**
+   * Returns the amount of memory needed for a table.
+   *
+   * This is just a multiplication, which is easy enough to do on your own, but sometimes it's nice for clarity's sake
+   * to call a function with a name that indicates why the multiplication is happening.
+   *
+   * @param numBuckets number of buckets
+   * @param bucketSize size per bucket (in bytes)
+   *
+   * @return size of table (in bytes)
+   */
+  public static int memoryNeeded(final int numBuckets, final int bucketSize)
+  {
+    return numBuckets * bucketSize;
+  }
+
+  /**
+   * Returns the size of each bucket in a table.
+   *
+   * @param keySize   size of keys (in bytes)
+   * @param valueSize size of values (in bytes)
+   *
+   * @return size of buckets (in bytes)
+   */
+  public static int bucketSize(final int keySize, final int valueSize)
+  {
+    return USED_BYTE_SIZE + keySize + valueSize;
 
 Review comment:
   When the key is nullable, I'm wondering whether we can share the last byte as both the null flag and the used-bucket flag to save memory space. For example, the used-bucket flag and the null flag can be stored in the upper 4 bits and the lower 4 bits, respectively. This would need an extra branch (or perhaps 2 shifts) to extract the key properly from the bucket, but I guess it would be OK?
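
   For illustration, sharing one byte between the two flags could look something like the sketch below (purely hypothetical names and masks, not part of the PR; it uses one bit per flag, which is enough, rather than whole nibbles):

   ```
   public class PackedFlagSketch
   {
     // Low bit: the bucket is used. Next bit: the key is null. Remaining bits are unused.
     private static final byte USED_MASK = 0x01;
     private static final byte NULL_MASK = 0x02;

     static byte packFlags(final boolean used, final boolean keyIsNull)
     {
       byte flags = 0;
       if (used) {
         flags |= USED_MASK;
       }
       if (keyIsNull) {
         flags |= NULL_MASK;
       }
       return flags;
     }

     static boolean isUsed(final byte flags)
     {
       return (flags & USED_MASK) != 0;
     }

     static boolean isKeyNull(final byte flags)
     {
       return (flags & NULL_MASK) != 0;
     }

     public static void main(String[] args)
     {
       final byte flags = packFlags(true, true);
       System.out.println(isUsed(flags));    // true
       System.out.println(isKeyNull(flags)); // true
     }
   }
   ```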



[GitHub] [druid] jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#discussion_r374896340
 
 

 ##########
 File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/collection/HashTableUtils.java
 ##########
 @@ -0,0 +1,162 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.collection;
+
+import org.apache.datasketches.memory.Memory;
+
+public class HashTableUtils
+{
+  private HashTableUtils()
+  {
+    // No instantiation.
+  }
+
+  /**
+   * Computes the previous power of two less than or equal to a given "n".
+   *
+   * The integer should be between 1 (inclusive) and {@link Integer#MAX_VALUE} for best results. Other parameters will
+   * return {@link Integer#MIN_VALUE}.
+   */
+  public static int previousPowerOfTwo(final int n)
+  {
+    if (n > 0) {
+      return Integer.highestOneBit(n);
+    } else {
+      return Integer.MIN_VALUE;
+    }
+  }
+
+  /**
+   * Compute a simple, fast hash code of some memory range.
+   *
+   * @param memory   a region of memory
+   * @param position position within the memory region
+   * @param length   length of memory to hash, starting at the position
+   */
+  public static int hashMemory(final Memory memory, final long position, final int length)
+  {
+    // Special cases for small, common key sizes to speed them up: e.g. one int key, two int keys, one long key, etc.
+    // The plus-one sizes (9, 13) are for nullable dimensions. The specific choices of special cases were chosen based
+    // on benchmarking (see MemoryBenchmark) on a Skylake-based cloud instance.
+
+    switch (length) {
+      case 4:
+        return 31 + memory.getInt(position);
 
 Review comment:
   Seems like `+ 31` isn't necessary.



[GitHub] [druid] lgtm-com[bot] commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#issuecomment-582051627
 
 
   This pull request **introduces 6 alerts** when merging f1e5047eec64e49347fb6530fd95587b1e0b84bb into a085685182d62e5dd1b716f1bbb9bcbbaeb1c661 - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-95ec21e67b93b99ca82411aa6d6284501af3a503)
   
   **new alerts:**
   
   * 6 for Result of multiplication cast to wider type
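
   For readers unfamiliar with this alert: it fires when two `int` values are multiplied and the result is then widened to `long`, so any overflow happens before the widening. A hedged sketch of the general shape (the names and values below are hypothetical, not the lines actually flagged in this PR):

```java
// Hypothetical illustration of the alert pattern; not the flagged code.
public class WideningMultiplicationSketch
{
  // Alert: the multiplication is performed in 32-bit arithmetic and can
  // overflow before the result is widened to long.
  static long bucketOffsetRisky(int bucket, int bucketSizeBytes)
  {
    return bucket * bucketSizeBytes;
  }

  // Typical fix: widen one operand first so the multiplication is done in long.
  static long bucketOffsetSafe(int bucket, int bucketSizeBytes)
  {
    return (long) bucket * bucketSizeBytes;
  }

  public static void main(String[] args)
  {
    int bucket = 1 << 20;        // 1,048,576
    int bucketSizeBytes = 4096;

    System.out.println(bucketOffsetRisky(bucket, bucketSizeBytes)); // overflows: prints 0
    System.out.println(bucketOffsetSafe(bucket, bucketSizeBytes));  // prints 4294967296
  }
}
```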



[GitHub] [druid] lgtm-com[bot] commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on issue #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#issuecomment-582186380
 
 
   This pull request **introduces 6 alerts** when merging e85219dd60b8bd594ed8cdd4c2efcd91ab2b2386 into 556a3861ed6eca91c49e5827ddaca121a313943f - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-fd29b64fe548dc95c7996751d7958ca7c8063692)
   
   **new alerts:**
   
   * 6 for Result of multiplication cast to wider type



[GitHub] [druid] jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308#discussion_r374958517
 
 

 ##########
 File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/collection/HashTableUtils.java
 ##########
 @@ -0,0 +1,162 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.groupby.epinephelinae.collection;
+
+import org.apache.datasketches.memory.Memory;
+
+public class HashTableUtils
+{
+  private HashTableUtils()
+  {
+    // No instantiation.
+  }
+
+  /**
+   * Computes the previous power of two less than or equal to a given "n".
+   *
+   * The argument should be between 1 (inclusive) and {@link Integer#MAX_VALUE} for best results. For any other
+   * argument, {@link Integer#MIN_VALUE} is returned.
+   */
+  public static int previousPowerOfTwo(final int n)
+  {
+    if (n > 0) {
+      return Integer.highestOneBit(n);
+    } else {
+      return Integer.MIN_VALUE;
+    }
+  }
+
+  /**
+   * Compute a simple, fast hash code of some memory range.
+   *
+   * @param memory   a region of memory
+   * @param position position within the memory region
+   * @param length   length of memory to hash, starting at the position
+   */
+  public static int hashMemory(final Memory memory, final long position, final int length)
+  {
+    // Special cases for small, common key sizes to speed them up: e.g. one int key, two int keys, one long key, etc.
+    // The plus-one sizes (9, 13) are for nullable dimensions. The specific choices of special cases were chosen based
+    // on benchmarking (see MemoryBenchmark) on a Skylake-based cloud instance.
+
+    switch (length) {
+      case 4:
+        return 31 + memory.getInt(position);
+
+      case 8:
+        return 31 * (31 + memory.getInt(position)) + memory.getInt(position + Integer.BYTES);
 
 Review comment:
   Did you intend `31 * (31 * memory.getInt(position)) + memory.getInt(position + Integer.BYTES);`?
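
For reference: the expression as posted already matches the standard `java.util.Arrays#hashCode(int[])`-style recurrence `h = 31 * h + element` seeded with `h = 1`, which for two ints expands to exactly `31 * (31 + e0) + e1`. A minimal sketch of that expansion (class and variable names are illustrative):

```java
import java.util.Arrays;

// Illustrative only: the posted two-int expression written as the standard
// "h = 31 * h + element" accumulation with an initial value of 1.
public class TwoIntHashSketch
{
  static int hashTwoInts(int e0, int e1)
  {
    int h = 1;
    h = 31 * h + e0;   // 31 + e0
    h = 31 * h + e1;   // 31 * (31 + e0) + e1
    return h;
  }

  public static void main(String[] args)
  {
    int e0 = 7;
    int e1 = 11;

    // All three lines print the same value (1189).
    System.out.println(hashTwoInts(e0, e1));
    System.out.println(31 * (31 + e0) + e1);
    System.out.println(Arrays.hashCode(new int[]{e0, e1}));
  }
}
```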
