Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/10/25 10:34:17 UTC

[GitHub] [doris] jacktengg opened a new issue, #13653: [Enhancement]

jacktengg opened a new issue, #13653:
URL: https://github.com/apache/doris/issues/13653

   ### Search before asking
   
   - [X] I have searched the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Description
   
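   In the vhash join node, when the build side has many unique rows, a hash table resize has to allocate one large contiguous buffer, and that allocation can easily fail. In the trace below, a resize for 26787725 elements requests 2147483648 bytes (2 GiB) in a single call, and the failed allocation aborts the process: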
   ```
   #0  0x00007fabaebd0207 in raise () from /lib64/libc.so.6
   #1  0x00007fabaebd18f8 in abort () from /lib64/libc.so.6
   #2  0x000055ef7322b1ec in __gnu_cxx::__verbose_terminate_handler() [clone .cold] ()
   #3  0x000055ef7758dfa6 in __cxxabiv1::__terminate(void (*)()) ()
   #4  0x000055ef7758e011 in std::terminate() ()
   #5  0x000055ef7758e165 in __cxa_throw ()
   #6  0x000055ef7307a95b in doris::vectorized::throwFromErrno (s=..., code=code@entry=54, e=e@entry=12) at /mnt/disk1/dafeng/build/ldb/include/c++/11/optional:361
   #7  0x000055ef74209d49 in Allocator<true, true>::alloc_no_track () at /mnt/disk1/dafeng/build/ldb/include/c++/11/bits/char_traits.h:371
   #8  0x000055ef74209f24 in alloc (alignment=0, size=2147483648, this=<optimized out>) at /mnt/disk1/dafeng/binary/1.1.2-lts/doris/be/src/vec/common/allocator.h:103
   #9  Allocator<true, true>::realloc () at /mnt/disk1/dafeng/binary/1.1.2-lts/doris/be/src/vec/common/allocator.h:171
   #10 0x000055ef7420c98a in HashTable<unsigned long, HashMapCell<unsigned long, doris::vectorized::RowRefList, HashCRC32<unsigned long>, HashTableNoState>, HashCRC32<unsigned long>, HashTableGrower<10ul>, Allocator<true, true> >::resize (this=0x55f196b423b0, for_num_elems=26787725, for_buf_size=<optimized out>)
       at /mnt/disk1/dafeng/binary/1.1.2-lts/doris/be/src/vec/common/hash_table/hash_table.h:932
   #11 0x000055ef743086a8 in expanse_for_add_elem (num_elem=<optimized out>, this=0x55f196b423b0) at /mnt/disk1/dafeng/binary/1.1.2-lts/doris/be/src/vec/common/hash_table/hash_table.h:250
   #12 doris::vectorized::ProcessHashTableBuild<doris::vectorized::PrimaryTypeHashTableContext<unsigned long>, true, false>::operator() (this=this@entry=0x7faa28a60d00, hash_table_ctx=...,
       null_map=0x7faa28a61600, has_runtime_filter=has_runtime_filter@entry=false) at /mnt/disk1/dafeng/binary/1.1.2-lts/doris/be/src/vec/exec/join/vhash_join_node.cpp:73
   #13 0x000055ef7429512a in _ZZN5doris10vectorized12HashJoinNode20_process_build_blockEPNS_12RuntimeStateERNS0_5BlockEhENKUlOT_E0_clIRNS0_27PrimaryTypeHashTableContextImEEEEDaS7_ ()
       at /mnt/disk1/dafeng/binary/1.1.2-lts/doris/be/src/vec/exec/join/vhash_join_node.cpp:43
   #14 0x000055ef742a058e in __do_visit<std::__detail::__variant::__deduce_visit_result<void>, doris::vectorized::HashJoinNode::_process_build_block(doris::RuntimeState*, doris::vectorized::Block&, uint8_t)::<lambda(auto:38&&)>, std::variant<std::monostate, doris::vectorized::SerializedHashTableContext, doris::vectorized::PrimaryTypeHashTableContext<unsigned char>, doris::vectorized::PrimaryTypeHashTableContext<short unsigned int>, doris::vectorized::PrimaryTypeHashTableContext<unsigned int>, doris::vectorized::PrimaryTypeHashTableContext<long unsigned int>, doris::vectorized::PrimaryTypeHashTableContext<doris::vectorized::UInt128>, doris::vectorized::PrimaryTypeHashTableContext<doris::vectorized::UInt256>, doris::vectorized::FixedKeyHashTableContext<long unsigned int, true>, doris::vectorized::FixedKeyHashTableContext<long unsigned int, false>, doris::vectorized::FixedKeyHashTableContext<doris::vectorized::UInt128, true>, doris::vectorized::FixedKeyHashTableContext<doris::vectorized::UInt128, false>, doris::vectorized::FixedKeyHashTableContext<doris::vectorized::UInt256, true>, doris::vectorized::FixedKeyHashTableContext<doris::vectorized::UInt256, false> >&> (
       __visitor=<unknown type in /data/be/lib/doris_be, CU 0x1f27174d, DIE 0x1f4d5363>) at /mnt/disk1/dafeng/build/ldb/include/c++/11/variant:1708
   #15 visit<doris::vectorized::HashJoinNode::_process_build_block(doris::RuntimeState*, doris::vectorized::Block&, uint8_t)::<lambda(auto:38&&)>, std::variant<std::monostate, doris::vectorized::SerializedHashTableContext, doris::vectorized::PrimaryTypeHashTableContext<unsigned char>, doris::vectorized::PrimaryTypeHashTableContext<short unsigned int>, doris::vectorized::PrimaryTypeHashTableContext<unsigned int>, doris::vectorized::PrimaryTypeHashTableContext<long unsigned int>, doris::vectorized::PrimaryTypeHashTableContext<doris::vectorized::UInt128>, doris::vectorized::PrimaryTypeHashTableContext<doris::vectorized::UInt256>, doris::vectorized::FixedKeyHashTableContext<long unsigned int, true>, doris::vectorized::FixedKeyHashTableContext<long unsigned int, false>, doris::vectorized::FixedKeyHashTableContext<doris::vectorized::UInt128, true>, doris::vectorized::FixedKeyHashTableContext<doris::vectorized::UInt128, false>, doris::vectorized::FixedKeyHashTableContext<doris::vectorized::UInt256, true>, doris::vectorized::FixedKeyHashTableContext<doris::vectorized::UInt256, false> >&> (__visitor=<unknown type in /data/be/lib/doris_be, CU 0x1f27174d, DIE 0x1f4d5295>)
       at /mnt/disk1/dafeng/build/ldb/include/c++/11/variant:1764
   #16 doris::vectorized::HashJoinNode::_process_build_block(doris::RuntimeState*, doris::vectorized::Block&, unsigned char) ()
   ```
   
   We could use a two-level hash table in the vhash join node. Each resize would then grow only one sub-table, so the contiguous buffers requested are much smaller and the allocation is less likely to fail.
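   
   To make the idea concrete, here is a minimal sketch of a two-level table. It is an illustration under stated assumptions, not the Doris implementation: `SubTable`, `TwoLevelTable`, `mix`, the 256-way split, and the 0.5 load factor are all hypothetical choices made for this example. The top 8 bits of the hash pick a sub-table and the low bits pick the slot inside it, so a resize reallocates only one sub-table's buffer.
   
   ```cpp
   #include <array>
   #include <cstddef>
   #include <cstdint>
   #include <vector>
   
   // Hypothetical cell layout for this sketch; std::vector value-initializes
   // new cells, so `used` starts out false.
   struct Cell {
       std::uint64_t key;
       std::uint64_t value;
       bool used;
   };
   
   // 64-bit finalizer from MurmurHash3, standing in for the CRC32 hash
   // that appears in the trace.
   inline std::uint64_t mix(std::uint64_t x) {
       x ^= x >> 33;
       x *= 0xff51afd7ed558ccdULL;
       x ^= x >> 33;
       x *= 0xc4ceb9fe1a85ec53ULL;
       x ^= x >> 33;
       return x;
   }
   
   // Single-level open-addressing table with linear probing.
   // Capacity is always a power of two; load factor is kept at or below 0.5.
   class SubTable {
   public:
       SubTable() : cells_(16), size_(0) {}
   
       void insert(std::uint64_t key, std::uint64_t value) {
           if ((size_ + 1) * 2 > cells_.size()) grow();
           if (place(cells_, key, value)) ++size_;
       }
   
       const std::uint64_t* find(std::uint64_t key) const {
           const std::size_t mask = cells_.size() - 1;
           for (std::size_t i = mix(key) & mask; cells_[i].used; i = (i + 1) & mask) {
               if (cells_[i].key == key) return &cells_[i].value;
           }
           return nullptr;
       }
   
       std::size_t size() const { return size_; }
       std::size_t buffer_bytes() const { return cells_.size() * sizeof(Cell); }
   
   private:
       // Returns true if the key was not present before.
       static bool place(std::vector<Cell>& cells, std::uint64_t key, std::uint64_t value) {
           const std::size_t mask = cells.size() - 1;
           std::size_t i = mix(key) & mask;
           while (cells[i].used && cells[i].key != key) i = (i + 1) & mask;
           const bool fresh = !cells[i].used;
           cells[i].key = key;
           cells[i].value = value;
           cells[i].used = true;
           return fresh;
       }
   
       // Doubling needs one contiguous buffer of twice the current size.
       void grow() {
           std::vector<Cell> bigger(cells_.size() * 2);
           for (const Cell& c : cells_) {
               if (c.used) place(bigger, c.key, c.value);
           }
           cells_.swap(bigger);
       }
   
       std::vector<Cell> cells_;
       std::size_t size_;
   };
   
   // Two-level table: the top 8 hash bits select one of 256 sub-tables.
   class TwoLevelTable {
   public:
       static constexpr std::size_t kBuckets = 256;
   
       void insert(std::uint64_t key, std::uint64_t value) { subs_[bucket(key)].insert(key, value); }
       const std::uint64_t* find(std::uint64_t key) const { return subs_[bucket(key)].find(key); }
   
       std::size_t size() const {
           std::size_t total = 0;
           for (const SubTable& s : subs_) total += s.size();
           return total;
       }
   
       // Largest single contiguous buffer currently held by any sub-table.
       std::size_t largest_buffer_bytes() const {
           std::size_t largest = 0;
           for (const SubTable& s : subs_) {
               if (s.buffer_bytes() > largest) largest = s.buffer_bytes();
           }
           return largest;
       }
   
   private:
       static std::size_t bucket(std::uint64_t key) { return static_cast<std::size_t>(mix(key) >> 56); }
       std::array<SubTable, kBuckets> subs_;
   };
   ```
   
   Total memory is about the same in both layouts. What changes is the size of each individual allocation: with keys spread evenly over the sub-tables, the largest buffer is roughly 1/256 of what a single-level table would request in one call.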
   
   ### Solution
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org