Posted to issues@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2020/07/15 01:44:00 UTC

[jira] [Created] (IMPALA-9955) Internal error for a query with large rows and spilling

Quanlong Huang created IMPALA-9955:
--------------------------------------

             Summary: Internal error for a query with large rows and spilling
                 Key: IMPALA-9955
                 URL: https://issues.apache.org/jira/browse/IMPALA-9955
             Project: IMPALA
          Issue Type: Bug
            Reporter: Quanlong Huang


Encountered a query failure due to an internal error. Repro:
{code:sql}
create table bigstrs stored as parquet as select *, repeat(uuid(), cast(random() * 100000 as int)) as bigstr from functional.alltypes;
set MAX_ROW_SIZE=3.5MB;
set MEM_LIMIT=4GB;
set DISABLE_CODEGEN=true;
create table my_cnt stored as parquet as select count(*) as cnt, bigstr from bigstrs group by bigstr;
{code}
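For context, here is a rough sketch of the page-size arithmetic that pushes this repro onto the large-page path. The 2MB default spillable buffer size and the power-of-two rounding are assumptions based on Impala defaults, not values taken from this report:
{code:c++}
// Hypothetical sketch: why MAX_ROW_SIZE=3.5MB leads to 4MB "large" pages.
// Assumes the 2MB default spillable buffer size and power-of-two page sizes.
#include <cstdint>
#include <iostream>

// Round v up to the next power of two (the buffer pool sizes buffers this way).
int64_t RoundUpToPowerOfTwo(int64_t v) {
  int64_t result = 1;
  while (result < v) result <<= 1;
  return result;
}

int main() {
  const int64_t default_page_len = 2LL * 1024 * 1024;                    // 2MB
  const int64_t max_row_size = static_cast<int64_t>(3.5 * 1024 * 1024);  // 3.5MB
  const int64_t max_page_len = RoundUpToPowerOfTwo(max_row_size);
  std::cout << max_page_len << std::endl;  // 4194304, i.e. the 4MB pages below
  // Rows near MAX_ROW_SIZE land on 4MB pages, twice the default page length
  // that the stream's read reservation is sized for.
  return 0;
}
{code}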
The error is:
{code}
ERROR: Internal error: couldn't pin large page of 4194304 bytes, client only had 2097152 bytes of unused reservation:
<BufferPool::Client> 0xcf9dae0 internal state: {<BufferPool::Client> 0xbdf6ac0 name: GroupingAggregator id=3 ptr=0xcf9d900 write_status:  buffers allocated 2097152 num_pages: 2094 pinned_bytes: 41943040 dirty_unpinned_bytes: 0 in_flight_write_bytes: 0 reservation: {<ReservationTracker>: reservation_limit 9223372036854775807 reservation 46137344 used_reservation 44040192 child_reservations 0 parent:
<ReservationTracker>: reservation_limit 9223372036854775807 reservation 46137344 used_reservation 0 child_reservations 46137344 parent:
<ReservationTracker>: reservation_limit 9223372036854775807 reservation 46137344 used_reservation 0 child_reservations 46137344 parent:
<ReservationTracker>: reservation_limit 3435973836 reservation 46137344 used_reservation 0 child_reservations 46137344 parent:
<ReservationTracker>: reservation_limit 6647046144 reservation 46137344 used_reservation 0 child_reservations 46137344 parent:
NULL}
  12 pinned pages: <BufferPool::Page> 0xc9160a0 len: 2097152 pin_count: 1 buf: <BufferPool::BufferHandle> 0xc916118 client: 0xcf9dae0/0xbdf6ac0 data: 0x13200000 len: 2097152
<BufferPool::Page> 0xc919d40 len: 4194304 pin_count: 1 buf: <BufferPool::BufferHandle> 0xc919db8 client: 0xcf9dae0/0xbdf6ac0 data: 0x124600000 len: 4194304
<BufferPool::Page> 0xd42aaa0 len: 4194304 pin_count: 1 buf: <BufferPool::BufferHandle> 0xd42ab18 client: 0xcf9dae0/0xbdf6ac0 data: 0x12b200000 len: 4194304
<BufferPool::Page> 0xd42b900 len: 4194304 pin_count: 1 buf: <BufferPool::BufferHandle> 0xd42b978 client: 0xcf9dae0/0xbdf6ac0 data: 0x132a00000 len: 4194304
<BufferPool::Page> 0xd42d3e0 len: 2097152 pin_count: 1 buf: <BufferPool::BufferHandle> 0xd42d458 client: 0xcf9dae0/0xbdf6ac0 data: 0xc6a00000 len: 2097152
<BufferPool::Page> 0xd42dd40 len: 4194304 pin_count: 1 buf: <BufferPool::BufferHandle> 0xd42ddb8 client: 0xcf9dae0/0xbdf6ac0 data: 0x132e00000 len: 4194304
<BufferPool::Page> 0xd42de80 len: 4194304 pin_count: 1 buf: <BufferPool::BufferHandle> 0xd42def8 client: 0xcf9dae0/0xbdf6ac0 data: 0x137c00000 len: 4194304
<BufferPool::Page> 0x12d48320 len: 4194304 pin_count: 1 buf: <BufferPool::BufferHandle> 0x12d48398 client: 0xcf9dae0/0xbdf6ac0 data: 0x102c00000 len: 4194304
<BufferPool::Page> 0x12d483c0 len: 4194304 pin_count: 1 buf: <BufferPool::BufferHandle> 0x12d48438 client: 0xcf9dae0/0xbdf6ac0 data: 0x108a00000 len: 4194304
<BufferPool::Page> 0x12d48780 len: 4194304 pin_count: 1 buf: <BufferPool::BufferHandle> 0x12d487f8 client: 0xcf9dae0/0xbdf6ac0 data: 0x108e00000 len: 4194304
<BufferPool::Page> 0x12d492c0 len: 2097152 pin_count: 1 buf: <BufferPool::BufferHandle> 0x12d49338 client: 0xcf9dae0/0xbdf6ac0 data: 0x127600000 len: 2097152
<BufferPool::Page> 0x12d4a9e0 len: 2097152 pin_count: 1 buf: <BufferPool::BufferHandle> 0x12d4aa58 client: 0xcf9dae0/0xbdf6ac0 data: 0x12d200000 len: 2097152

  0 dirty unpinned pages: 
  0 in flight write pages: }
{code}
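Reading the numbers out of the dump: the client's unused reservation is its total reservation minus what is already used, which leaves only 2MB unused against the 4MB large page it needs to pin next. A minimal check of that arithmetic, using the values above:
{code:c++}
// Arithmetic from the dump: unused = reservation - used_reservation.
#include <cassert>
#include <cstdint>

int main() {
  const int64_t reservation = 46137344;       // 22 x 2MB, from the dump
  const int64_t used_reservation = 44040192;  // 21 x 2MB already in use
  const int64_t large_page_len = 4194304;     // the 4MB page being read next
  const int64_t unused = reservation - used_reservation;
  assert(unused == 2097152);        // the "2097152 bytes" in the error message
  assert(unused < large_page_len);  // cannot pin the page -> internal error
  return 0;
}
{code}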
The stack trace from the log shows the failure happens while the grouping aggregator reads rows back from a spilled partition:
{code}
    @          0x1c9dfbe  impala::Status::Status()
    @          0x1ca5a78  impala::Status::Status()
    @          0x2bfe4ec  impala::BufferedTupleStream::NextReadPage()
    @          0x2c04b72  impala::BufferedTupleStream::GetNextInternal<>()
    @          0x2c029e6  impala::BufferedTupleStream::GetNextInternal<>()
    @          0x2bffd19  impala::BufferedTupleStream::GetNext()
    @          0x28aa43f  impala::GroupingAggregator::ProcessStream<>()
    @          0x28a2e17  impala::GroupingAggregator::BuildSpilledPartition()
    @          0x28a2401  impala::GroupingAggregator::NextPartition()
    @          0x289df5a  impala::GroupingAggregator::GetRowsFromPartition()
    @          0x289db20  impala::GroupingAggregator::GetNext()
    @          0x28dbfc7  impala::AggregationNode::GetNext()
    @          0x2259dfc  impala::FragmentInstanceState::ExecInternal()
    @          0x22564a0  impala::FragmentInstanceState::Exec()
    @          0x22801ed  impala::QueryState::ExecFInstance()
    @          0x227e5ef  _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
    @          0x2281d8e  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
    @          0x204d7d1  boost::function0<>::operator()()
    @          0x26702d5  impala::Thread::SuperviseThread()
    @          0x2678272  boost::_bi::list5<>::operator()<>()
    @          0x2678196  boost::_bi::bind_t<>::operator()()
    @          0x2678157  boost::detail::thread_data<>::run()
    @          0x3e45d71  thread_proxy
    @     0x7fc8339656b9  start_thread
    @     0x7fc8305314dc  clone
{code}

From the code, the comment in BufferedTupleStream::NextReadPage() indicates that reaching this path is a bug in the caller:
{code:c++}
Status BufferedTupleStream::NextReadPage(ReadIterator* read_iter) {
  ...
  int64_t read_page_len = read_iter->read_page_->len();
  if (!pinned_ && read_page_len > default_page_len_
      && buffer_pool_client_->GetUnusedReservation() < read_page_len) {
    // If we are iterating over an unpinned stream and encounter a page that is larger
    // than the default page length, then unpinning the previous page may not have
    // freed up enough reservation to pin the next one. The client is responsible for
    // ensuring the reservation is available, so this indicates a bug.
    return Status(TErrorCode::INTERNAL_ERROR, Substitute("Internal error: couldn't pin "
          "large page of $0 bytes, client only had $1 bytes of unused reservation:\n$2",
          read_page_len, buffer_pool_client_->GetUnusedReservation(),
          buffer_pool_client_->DebugString()));
  }
{code}
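The comment says the caller is responsible for making enough unused reservation available before advancing to a page larger than the default, because unpinning the previous default-sized page only frees default_page_len bytes. A hedged sketch of that invariant is below; the Client struct and EnsureReservationForNextPage() are illustrative names, not Impala's API:
{code:c++}
// Hypothetical sketch of the invariant NextReadPage() relies on. The types and
// names here are illustrative only; they are not Impala's BufferPool API.
#include <cstdint>

struct Client {
  int64_t reservation = 0;  // total reservation granted to this client
  int64_t used = 0;         // bytes currently backing pinned pages
  int64_t Unused() const { return reservation - used; }
  bool IncreaseReservation(int64_t bytes) {
    reservation += bytes;   // assume the pool can grant it, for this sketch
    return true;
  }
};

// Before reading the next page of an unpinned stream, the reader must hold
// enough unused reservation to pin it. Unpinning the previous default-sized
// page frees only default_page_len bytes, so a larger page needs a top-up.
bool EnsureReservationForNextPage(Client* client, int64_t next_page_len,
                                  int64_t default_page_len) {
  if (next_page_len <= default_page_len) return true;  // default case covered
  const int64_t shortfall = next_page_len - client->Unused();
  if (shortfall <= 0) return true;                     // already enough
  return client->IncreaseReservation(shortfall);       // top up before pinning
}
{code}
In this repro the GroupingAggregator appears to reach NextReadPage() without that top-up having happened, which is exactly the situation the comment flags as a bug.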


