Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2019/09/18 03:29:37 UTC

[GitHub] [incubator-doris] imay opened a new issue #1824: All olap scanner hang at ShardedLRUCache's lock

imay opened a new issue #1824: All olap scanner hang at ShardedLRUCache's lock
URL: https://github.com/apache/incubator-doris/issues/1824
 
 
   Today we found that all backends had stopped responding.
   When I logged in and checked the stacks, I found that all scanner threads were waiting for a lock in ShardedLRUCache. The stack looks like the following:
   ```
   #0  0x00007f31aef69e24 in __lll_lock_wait () from /opt/compiler/gcc-4.8.2/lib64/libpthread.so.0
   #1  0x00007f31aef656d9 in _L_lock_535 () from /opt/compiler/gcc-4.8.2/lib64/libpthread.so.0
   #2  0x00007f31aef65500 in pthread_mutex_lock () from /opt/compiler/gcc-4.8.2/lib64/libpthread.so.0
   #3  0x0000000002192a52 in pthread_mutex_lock ()
   #4  0x0000000000d58630 in doris::Mutex::lock() ()
   #5  0x0000000000d950e3 in doris::ShardedLRUCache::lookup(doris::CacheKey const&) ()
   #6  0x0000000000d87d9c in doris::FileHandler::open_with_cache(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int) ()
   #7  0x0000000000da1c76 in doris::SegmentReader::_load_segment_file() ()
   #8  0x0000000000da926a in doris::SegmentReader::init(bool) ()
   #9  0x0000000000d68cca in doris::ColumnData::_seek_to_block(doris::RowBlockPosition const&, bool) ()
   #10 0x0000000000d69f42 in doris::ColumnData::_get_block ()
   #11 0x0000000000d6c25b in doris::ColumnData::_seek_to_row(doris::RowCursor const&, bool, bool) ()
   #12 0x0000000000d6c69c in doris::ColumnData::prepare_block_read(doris::RowCursor const*, bool, doris::RowCursor const*, bool, doris::RowBlock**) ()
   #13 0x0000000000d20326 in doris::Reader::_attach_data_to_merge_set(bool, bool*) ()
   #14 0x0000000000d23476 in doris::Reader::init(doris::ReaderParams const&) ()
   #15 0x000000000127d54a in doris::OlapScanner::open() ()
   #16 0x0000000001254f94 in doris::OlapScanNode::scanner_thread(doris::OlapScanner*) ()
   #17 0x0000000000dfea96 in doris::PriorityThreadPool::work_thread(int) ()
   #18 0x000000000171cb5d in thread_proxy ()
   #19 0x00007f31aef631c3 in start_thread () from /opt/compiler/gcc-4.8.2/lib64/libpthread.so.0
   #20 0x00007f31af26012d in clone () from /opt/compiler/gcc-4.8.2/lib64/libc.so.6
   ```
   I also found that this lock is held by the thread that prunes the FdCache. Its stack looks like this:
   ```
   #0  0x00007f31af25264d in close () from /opt/compiler/gcc-4.8.2/lib64/libc.so.6
   #1  0x0000000000d8a8b8 in doris::FileHandler::_delete_cache_file_descriptor(doris::CacheKey const&, void*) ()
   #2  0x0000000000d95b2a in doris::LRUCache::_unref(doris::LRUHandle*) ()
   #3  0x0000000000d95d3f in doris::ShardedLRUCache::prune() ()
   #4  0x0000000000cba0f8 in doris::OLAPEngine::start_clean_fd_cache() ()
   #5  0x0000000000cf125f in doris::OLAPEngine::_fd_cache_clean_callback(void*) ()
   #6  0x0000000000cf130f in _ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZN5doris10OLAPEngine16_start_bg_workerEvEUlvE4_EEEEE6_M_runEv ()
   #7  0x00000000026f72ef in execute_native_thread_routine ()
   #8  0x00007f31aef631c3 in start_thread () from /opt/compiler/gcc-4.8.2/lib64/libpthread.so.0
   #9  0x00007f31af26012d in clone () from /opt/compiler/gcc-4.8.2/lib64/libc.so.6
   ```
   
   So we should improve our fd cache so that scanners are not blocked like this while the cache is being pruned.
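
One possible direction, given here only as a minimal sketch and not as the actual Doris code: detach the entries while holding the lock, then run the deleter (which ends in close() in the second stack above) after the lock has been released. The class `SketchFdCache`, its methods, and the helper `demo_prune_runs_deleter_unlocked` are hypothetical names made up for this illustration. The sketch also ignores the reference counting and sharding that the real LRUCache has.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for one cache shard. It is NOT the
// Doris LRUCache; it only illustrates moving the deleter out of the lock.
class SketchFdCache {
public:
    using Deleter = std::function<void(const std::string& key, int fd)>;

    explicit SketchFdCache(Deleter deleter) : _deleter(std::move(deleter)) {}

    void insert(const std::string& key, int fd) {
        std::lock_guard<std::mutex> guard(_mutex);
        _entries[key] = fd;
    }

    // Returns true and sets *fd if the key is cached.
    bool lookup(const std::string& key, int* fd) {
        assert(fd != nullptr);
        std::lock_guard<std::mutex> guard(_mutex);
        auto it = _entries.find(key);
        if (it == _entries.end()) return false;
        *fd = it->second;
        return true;
    }

    // Detach all entries under the lock, then run the (possibly slow)
    // deleter after the lock is released, so lookups are not blocked by it.
    std::size_t prune() {
        std::vector<std::pair<std::string, int>> victims;
        {
            std::lock_guard<std::mutex> guard(_mutex);
            victims.assign(_entries.begin(), _entries.end());
            _entries.clear();
        }
        for (const auto& v : victims) {
            _deleter(v.first, v.second);  // e.g. close(fd) in a real fd cache
        }
        return victims.size();
    }

    // Helper for the self-check below: reports whether the mutex is free.
    // Must only be called by a thread that does not already hold the mutex.
    bool lock_is_free() {
        if (!_mutex.try_lock()) return false;
        _mutex.unlock();
        return true;
    }

private:
    std::mutex _mutex;
    std::unordered_map<std::string, int> _entries;
    Deleter _deleter;
};

// Self-check: every deleter call should observe that the cache lock is free.
inline bool demo_prune_runs_deleter_unlocked() {
    SketchFdCache* self = nullptr;
    bool all_unlocked = true;
    int closed = 0;
    SketchFdCache cache([&](const std::string&, int) {
        all_unlocked = all_unlocked && self->lock_is_free();
        ++closed;
    });
    self = &cache;
    cache.insert("a.dat", 3);
    cache.insert("b.dat", 4);
    const std::size_t pruned = cache.prune();
    int fd = -1;
    return pruned == 2 && closed == 2 && all_unlocked &&
           !cache.lookup("a.dat", &fd);
}
```

With this shape a concurrent lookup() waits only for the short detach step, not for each close() call. A real fix would still have to keep entries that readers currently reference alive until they are released, which this sketch does not model.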
