Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/07/15 20:03:45 UTC

[GitHub] [incubator-pinot] Jackie-Jiang opened a new pull request #5708: Support BYTES type for group-by expression

Jackie-Jiang opened a new pull request #5708:
URL: https://github.com/apache/incubator-pinot/pull/5708


   ## Description
   Support BYTES type for group-by expression
   
   Changes:
   - Add ValueToIdMap (on-the-fly dictionary) for BYTES type
   - Reorder the operations in `NoDictionaryMultiColumnGroupKeyGenerator.generateKeysForBlock()` to avoid the per-value switch case
   - Add type-specific group key iterators in `NoDictionaryMultiColumnGroupKeyGenerator` for better performance
   - Enhance `NoDictionaryGroupKeyGeneratorTest` to test BYTES type
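
As a rough illustration of the on-the-fly dictionary idea for BYTES values, here is a minimal JDK-only sketch. It is not the code from this pull request: the class name `BytesValueToIdMap`, its methods, and the use of `ByteBuffer` as the map key are all assumptions made for the example. The point it shows is that `byte[]` has identity-based `equals`/`hashCode`, so the values need a content-based wrapper before they can be mapped to dense group ids.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical on-the-fly dictionary: assigns dense int ids to byte[] values in first-seen order. */
final class BytesValueToIdMap {
  static final int INVALID_ID = -1;

  // ByteBuffer compares and hashes by content, unlike a raw byte[] (assumption: any content-based wrapper would do).
  private final Map<ByteBuffer, Integer> _valueToId = new HashMap<>();
  private final List<byte[]> _idToValue = new ArrayList<>();

  /** Returns the id of the value, assigning the next free id if the value has not been seen before. */
  int put(byte[] value) {
    Integer existingId = _valueToId.get(ByteBuffer.wrap(value));
    if (existingId != null) {
      return existingId;
    }
    // Store a private copy so later changes to the caller's array cannot corrupt the map key.
    byte[] copy = value.clone();
    int newId = _idToValue.size();
    _valueToId.put(ByteBuffer.wrap(copy), newId);
    _idToValue.add(copy);
    return newId;
  }

  /** Returns the id of the value, or INVALID_ID if the value has never been added. */
  int getId(byte[] value) {
    Integer id = _valueToId.get(ByteBuffer.wrap(value));
    return id != null ? id : INVALID_ID;
  }

  /** Returns a copy of the value stored under the given id. */
  byte[] getValue(int id) {
    return _idToValue.get(id).clone();
  }

  int size() {
    return _idToValue.size();
  }
}
```

In this sketch, adding the same bytes twice returns the same id, which is what lets a group-by treat equal BYTES values as one group.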


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5708: Support BYTES type for group-by expression

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #5708:
URL: https://github.com/apache/incubator-pinot/pull/5708#discussion_r456656111



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/NoDictionarySingleColumnGroupKeyGenerator.java
##########
@@ -163,84 +164,201 @@ public int getCurrentGroupKeyUpperBound() {
 
   @Override
   public Iterator<GroupKey> getUniqueGroupKeys() {
-    return new GroupKeyIterator(_groupKeyMap);
+    switch (_dataType) {

Review comment:
       This may sound like an over-optimization, but adding a type-specific GroupKeyGenerator could add runtime-dispatch overhead, which Java doesn't handle well.
   
   What is the advantage of having a type-specific next() method like the following one for Double
   
   ```
   @Override
   public GroupKey next() {
     Double2IntMap.Entry entry = _iterator.next();
     _groupKey._groupId = entry.getIntValue();
     _groupKey._stringKey = Double.toString(entry.getDoubleKey());
     return _groupKey;
   }
   ```
   vs. the existing generic method?
   
   ```
   @Override
   public GroupKey next() {
     Map.Entry<Object, Integer> entry = _iterator.next();
     _groupKey._groupId = entry.getValue();
     _groupKey._stringKey = entry.getKey().toString();
     return _groupKey;
   }
   ```
   
   Both call toString().






[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5708: Support BYTES type for group-by expression

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #5708:
URL: https://github.com/apache/incubator-pinot/pull/5708#discussion_r456713649



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/NoDictionarySingleColumnGroupKeyGenerator.java
##########
@@ -163,84 +164,201 @@ public int getCurrentGroupKeyUpperBound() {
 
   @Override
   public Iterator<GroupKey> getUniqueGroupKeys() {
-    return new GroupKeyIterator(_groupKeyMap);
+    switch (_dataType) {

Review comment:
       The main difference comes from the entry-set and iterator implementations of the fastutil maps. For example, `Int2IntMap.entrySet()` is marked deprecated, and the class suggests using the type-specific method instead. With the type-specific entry set we can also use the `fastIterator()` provided by `FastEntrySet` and read unboxed values, which reduces garbage and avoids unnecessary boxing/unboxing.
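
To make the boxing and reuse argument concrete without pulling in fastutil, here is a JDK-only sketch. It is only a stand-in: `DoubleGroupKeys` and its parallel-array storage are hypothetical and are not how fastutil or Pinot store entries. It shows the two properties being discussed: keys and group ids are read as primitives (no `Double` or `Integer` objects are allocated), and one mutable `GroupKey` is reused for every call to `next()`.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a primitive double-to-int group-id map, stored as parallel arrays. */
final class DoubleGroupKeys {
  /** Mutable holder, mirroring the reused GroupKey in the snippets quoted in this thread. */
  static final class GroupKey {
    int _groupId;
    String _stringKey;
  }

  private final double[] _keys;
  private final int[] _groupIds;

  DoubleGroupKeys(double[] keys, int[] groupIds) {
    if (keys.length != groupIds.length) {
      throw new IllegalArgumentException("keys and groupIds must have the same length");
    }
    _keys = keys;
    _groupIds = groupIds;
  }

  /** Type-specific iteration: reads primitives directly and mutates a single shared GroupKey. */
  Iterator<GroupKey> iterator() {
    return new Iterator<GroupKey>() {
      private final GroupKey _groupKey = new GroupKey();
      private int _next;

      @Override
      public boolean hasNext() {
        return _next < _keys.length;
      }

      @Override
      public GroupKey next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        // No boxing here: the int and double are copied straight from the arrays.
        _groupKey._groupId = _groupIds[_next];
        _groupKey._stringKey = Double.toString(_keys[_next]);
        _next++;
        return _groupKey;
      }
    };
  }
}
```

Because the same `GroupKey` instance comes back from every `next()` call, a caller that needs to retain a key has to copy its fields before advancing the iterator.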






[GitHub] [incubator-pinot] Jackie-Jiang merged pull request #5708: Support BYTES type for group-by expression

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang merged pull request #5708:
URL: https://github.com/apache/incubator-pinot/pull/5708


   

