You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2022/04/22 17:16:58 UTC

[GitHub] [druid] gianm opened a new pull request, #12474: GroupBy: Reduce allocations by reusing entry and key holders.

gianm opened a new pull request, #12474:
URL: https://github.com/apache/druid/pull/12474

   I did some testing with this patch + a relatively simple groupBy query run 5,000 times at high concurrency. Allocations were reduced by about 18%.
   
   Pre-patch:
   
   ![image](https://user-images.githubusercontent.com/1214075/164762807-84c057f5-4ca1-4e61-91d6-65cfdbdfb0e6.png)
   
   Post-patch:
   
   ![image](https://user-images.githubusercontent.com/1214075/164762825-d82f0fc1-e40c-448b-b4c9-31a98e52aff0.png)
   
   Two main changes:
   
   1) Reuse Entry objects returned by various implementations of
      Grouper.iterator.
   
   2) Reuse key objects contained within those Entry objects.
   
   This is allowed by the contract, which states that entries must be
   processed and immediately discarded. However, not all call sites
   respected this, so this patch also updates those call sites.
   
   One particularly sneaky way that the old code retained entries too long
   is due to Guava's MergingIterator and CombiningIterator. Internally,
   these both advance to the next value prior to returning the current
   value. So, this patch addresses that in two ways:
   
   1) For merging, we have our own implementation MergeIterator already,
      although it had the same problem. So, this patch updates our
      implementation to return the current item prior to advancing to the
      next item. It also adds a forbidden-api entry to ensure that this
      safer implementation is used instead of Guava's.
   
   2) For combining, we address the problem in a different way: by copying
      the key when creating the new, combined entry.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis merged pull request #12474: GroupBy: Reduce allocations by reusing entry and key holders.

Posted by GitBox <gi...@apache.org>.
clintropolis merged PR #12474:
URL: https://github.com/apache/druid/pull/12474


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 closed pull request #12474: GroupBy: Reduce allocations by reusing entry and key holders.

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 closed pull request #12474: GroupBy: Reduce allocations by reusing entry and key holders.
URL: https://github.com/apache/druid/pull/12474


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org