You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/04/26 02:45:32 UTC

[GitHub] [incubator-doris] mrhhsg opened a new pull request, #9226: [Enhancement]Avoid unnecessary calculations in HashTableGrower

mrhhsg opened a new pull request, #9226:
URL: https://github.com/apache/incubator-doris/pull/9226

   # Proposed changes
   
   Issue Number: close #9224 
   
   ## Problem Summary:
   
   After using this enhancement, the time to build the hash table is slightly reduced.
   I test this with TPC-H Q1 with 10GB and run Q1 4 times separately.
   |seq|Without the enhancement|With the enhancement|
   |--|---------|----------|
   |1|3s480ms|3s449ms|
   |2|3s274ms|3s277ms|
   |3|3s283ms|3s273ms|
   |4|3s331ms|3s281ms|
   
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yiguolei commented on pull request #9226: [Enhancement]Avoid unnecessary calculations in HashTableGrower

Posted by GitBox <gi...@apache.org>.
yiguolei commented on PR #9226:
URL: https://github.com/apache/incubator-doris/pull/9226#issuecomment-1119332126

   I will close this PR since it only have 0.02% improvement.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] mrhhsg commented on pull request #9226: [Enhancement]Avoid unnecessary calculations in HashTableGrower

Posted by GitBox <gi...@apache.org>.
mrhhsg commented on PR #9226:
URL: https://github.com/apache/incubator-doris/pull/9226#issuecomment-1109590057

   > The query performance change looks very small, more like normal query time consuming jitter, are you sure this pr has a performance improvement?
   
   hi @yangzhg 
   In fact, the impact is very small, but a bit operation will be performed each time when data is inserted into the hash table.
   For example, it may take tens of milliseconds to perform 150,000 bit operations.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9226: [Enhancement]Avoid unnecessary calculations in HashTableGrower

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #9226:
URL: https://github.com/apache/incubator-doris/pull/9226#discussion_r858506680


##########
be/src/vec/common/hash_table/hash_table.h:
##########
@@ -245,11 +245,14 @@ struct HashTableGrower {
     /// The state of this structure is enough to get the buffer size of the hash table.
     doris::vectorized::UInt8 size_degree = initial_size_degree;
 
+    size_t _buf_size = 1ULL << initial_size_degree;
+    size_t _max_fill = 1ULL << (initial_size_degree - 1);

Review Comment:
   I think we'd better not change this. I find that it only improved about 0.02%. But the readability of the code is affected. Maybe @HappenLee  could read this PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yangzhg commented on pull request #9226: [Enhancement]Avoid unnecessary calculations in HashTableGrower

Posted by GitBox <gi...@apache.org>.
yangzhg commented on PR #9226:
URL: https://github.com/apache/incubator-doris/pull/9226#issuecomment-1109580682

   The query performance change looks very small, more like normal query time consuming jitter, are you sure this pr has a performance improvement?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9226: [Enhancement]Avoid unnecessary calculations in HashTableGrower

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #9226:
URL: https://github.com/apache/incubator-doris/pull/9226#discussion_r858489546


##########
be/src/vec/common/hash_table/hash_table.h:
##########
@@ -245,11 +245,14 @@ struct HashTableGrower {
     /// The state of this structure is enough to get the buffer size of the hash table.
     doris::vectorized::UInt8 size_degree = initial_size_degree;
 
+    size_t _buf_size = 1ULL << initial_size_degree;
+    size_t _max_fill = 1ULL << (initial_size_degree - 1);
+
     /// The size of the hash table in the cells.
-    size_t buf_size() const { return 1ULL << size_degree; }
+    size_t buf_size() const { return _buf_size; }

Review Comment:
   if the size_degree is modified in increase_size method. The buffer size should changed but in your code, the buffer size is const.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yiguolei closed pull request #9226: [Enhancement]Avoid unnecessary calculations in HashTableGrower

Posted by GitBox <gi...@apache.org>.
yiguolei closed pull request #9226: [Enhancement]Avoid unnecessary calculations in HashTableGrower
URL: https://github.com/apache/incubator-doris/pull/9226


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org