Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/04/18 01:17:49 UTC

[GitHub] [incubator-doris] englefly opened a new pull request, #9069: resize hash table before building

englefly opened a new pull request, #9069:
URL: https://github.com/apache/incubator-doris/pull/9069

   # Proposed changes
   Initialize the hash table size from the tuple count instead of the fixed value 1024, to reduce BuildTableExpanseTime.
   With the table pre-sized, the total build time decreased by 8.9% on TPC-H 10G for the query
   `select count(*) from lineitem join orders on l_orderkey = o_orderkey`
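
A minimal sketch of why pre-sizing helps, assuming `std::unordered_map` as a stand-in for the Doris `HashTable` (this is not the PR's code, and `count_rehashes` is a hypothetical helper). It counts how often the bucket array grows while inserting a known number of distinct keys, with and without reserving capacity first:

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Inserts `rows` distinct keys and returns how many times the bucket
// array changed size during the inserts. When `presize` is true, capacity
// for all rows is reserved before the first insert.
// Assumption: std::unordered_map stands in for the real hash table.
std::size_t count_rehashes(std::size_t rows, bool presize) {
    std::unordered_map<std::uint64_t, std::uint32_t> table;
    if (presize) {
        table.reserve(rows);  // one allocation up front, sized for `rows` elements
    }
    std::size_t rehashes = 0;
    std::size_t buckets = table.bucket_count();
    for (std::size_t i = 0; i < rows; ++i) {
        table.emplace(static_cast<std::uint64_t>(i), static_cast<std::uint32_t>(i));
        if (table.bucket_count() != buckets) {  // the table grew and rehashed
            ++rehashes;
            buckets = table.bucket_count();
        }
    }
    return rehashes;
}
```

Calling `count_rehashes(100000, true)` should report no growth during the inserts, while `count_rehashes(100000, false)` reports several growth steps, each of which moves every element inserted so far. How much time this saves in a real join depends on the table implementation and the data.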
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yiguolei closed pull request #9069: resize hash table before building

Posted by GitBox <gi...@apache.org>.
yiguolei closed pull request #9069: resize hash table before building
URL: https://github.com/apache/incubator-doris/pull/9069




[GitHub] [incubator-doris] zbtzbtzbt commented on a diff in pull request #9069: resize hash table before building

Posted by GitBox <gi...@apache.org>.
zbtzbtzbt commented on code in PR #9069:
URL: https://github.com/apache/incubator-doris/pull/9069#discussion_r851835048


##########
be/src/vec/common/hash_table/hash_table.h:
##########
@@ -612,6 +555,62 @@ class HashTable : private boost::noncopyable,
         return *this;
     }
 
+        /// Increase the size of the buffer.
+    void resize(size_t for_num_elems = 0, size_t for_buf_size = 0) {
+        SCOPED_RAW_TIMER(&_resize_timer_ns);

Review Comment:
   Why move this function? It seems to have no changes.





[GitHub] [incubator-doris] zbtzbtzbt commented on a diff in pull request #9069: resize hash table before building

Posted by GitBox <gi...@apache.org>.
zbtzbtzbt commented on code in PR #9069:
URL: https://github.com/apache/incubator-doris/pull/9069#discussion_r851835048


##########
be/src/vec/common/hash_table/hash_table.h:
##########
@@ -612,6 +555,62 @@ class HashTable : private boost::noncopyable,
         return *this;
     }
 
+        /// Increase the size of the buffer.
+    void resize(size_t for_num_elems = 0, size_t for_buf_size = 0) {
+        SCOPED_RAW_TIMER(&_resize_timer_ns);

Review Comment:
   Why move this function? It seems to have no changes.
   
   The only effective change is `hash_table_ctx.hash_table.resize(_rows);`, right?





[GitHub] [incubator-doris] zbtzbtzbt commented on pull request #9069: resize hash table before building

Posted by GitBox <gi...@apache.org>.
zbtzbtzbt commented on PR #9069:
URL: https://github.com/apache/incubator-doris/pull/9069#issuecomment-1101017485

   What is the hash table build time before and after your PR?




[GitHub] [incubator-doris] englefly commented on a diff in pull request #9069: resize hash table before building

Posted by GitBox <gi...@apache.org>.
englefly commented on code in PR #9069:
URL: https://github.com/apache/incubator-doris/pull/9069#discussion_r852153017


##########
be/src/vec/common/hash_table/hash_table.h:
##########
@@ -612,6 +555,62 @@ class HashTable : private boost::noncopyable,
         return *this;
     }
 
+        /// Increase the size of the buffer.
+    void resize(size_t for_num_elems = 0, size_t for_buf_size = 0) {
+        SCOPED_RAW_TIMER(&_resize_timer_ns);

Review Comment:
   HashTable::resize was moved from protected to public so that it can be invoked by ProcessHashTableBuild in vhash_join_node.cpp.
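
A rough sketch of the access change described above, using hypothetical toy types (`ToyHashTable` and `ToyBuildStep` are illustrations, not the Doris classes, and the 1024 default only mirrors the fixed size mentioned in the PR description). Once `resize` is public, build-side code outside the class can pre-size the table from a known row count:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real hash table; only the access level
// of resize() matters for this illustration.
class ToyHashTable {
public:
    // Public, so code outside the class can pre-size the table.
    // This toy version only grows; it never shrinks.
    void resize(std::size_t for_num_elems) {
        if (for_num_elems > _slots.size()) {
            _slots.resize(for_num_elems);
        }
    }

    std::size_t capacity() const { return _slots.size(); }

private:
    // Assumed fixed initial size, mirroring the 1024 from the PR description.
    std::vector<int> _slots = std::vector<int>(1024);
};

// Hypothetical analogue of the build step: it knows the row count and
// sizes the table once before any insert happens.
struct ToyBuildStep {
    std::size_t rows;
    void run(ToyHashTable& table) const { table.resize(rows); }
};
```

With `resize` left protected, `ToyBuildStep::run` would not compile, which is the reason given for moving the function.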





[GitHub] [incubator-doris] englefly commented on a diff in pull request #9069: resize hash table before building

Posted by GitBox <gi...@apache.org>.
englefly commented on code in PR #9069:
URL: https://github.com/apache/incubator-doris/pull/9069#discussion_r852155480


##########
be/src/vec/common/hash_table/hash_table.h:
##########
@@ -612,6 +555,62 @@ class HashTable : private boost::noncopyable,
         return *this;
     }
 
+        /// Increase the size of the buffer.
+    void resize(size_t for_num_elems = 0, size_t for_buf_size = 0) {
+        SCOPED_RAW_TIMER(&_resize_timer_ns);

Review Comment:
   Test result: 1419 ms -> 1293 ms.
   Tested 3 times; the average BuildTableInsertTime decreased by 8.9% (= (1419-1293) / 1419).





[GitHub] [incubator-doris] englefly commented on a diff in pull request #9069: resize hash table before building

Posted by GitBox <gi...@apache.org>.
englefly commented on code in PR #9069:
URL: https://github.com/apache/incubator-doris/pull/9069#discussion_r852155480


##########
be/src/vec/common/hash_table/hash_table.h:
##########
@@ -612,6 +555,62 @@ class HashTable : private boost::noncopyable,
         return *this;
     }
 
+        /// Increase the size of the buffer.
+    void resize(size_t for_num_elems = 0, size_t for_buf_size = 0) {
+        SCOPED_RAW_TIMER(&_resize_timer_ns);

Review Comment:
   Test result: 1419 ms -> 1293 ms.
   Tested 3 times; the average BuildTableInsertTime decreased by about 8.9% (= (1419-1293) / 1419, roughly 0.0888).
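
The percentage follows from the two timings quoted above; a small sketch (the helper name `relative_decrease` is hypothetical):

```cpp
// Relative decrease from `before` to `after`, as a fraction of `before`.
double relative_decrease(double before, double after) {
    return (before - after) / before;
}
```

For the reported averages, `relative_decrease(1419.0, 1293.0)` is 126 / 1419, or roughly 0.0888.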





[GitHub] [incubator-doris] github-actions[bot] commented on pull request #9069: resize hash table before building

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #9069:
URL: https://github.com/apache/incubator-doris/pull/9069#issuecomment-1134096229

   PR approved by at least one committer and no changes requested.




[GitHub] [incubator-doris] github-actions[bot] commented on pull request #9069: resize hash table before building

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #9069:
URL: https://github.com/apache/incubator-doris/pull/9069#issuecomment-1134096252

   PR approved by anyone and no changes requested.

