Posted to issues@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2020/01/11 01:46:00 UTC

[jira] [Created] (IMPALA-9288) Crash due to memory leak in scan node of orc scanner

Quanlong Huang created IMPALA-9288:
--------------------------------------

             Summary: Crash due to memory leak in scan node of orc scanner
                 Key: IMPALA-9288
                 URL: https://issues.apache.org/jira/browse/IMPALA-9288
             Project: IMPALA
          Issue Type: Bug
            Reporter: Quanlong Huang
         Attachments: alltypes_leak.orc

Hit a crash when running test_fuzz_scanners on ORC.
{code:java}
tcmalloc: large alloc 18446744073709543424 bytes == (nil) @  0x5017e88 0x260445e 0x4e3f372
F0111 09:36:43.992420 24033 exec-node.cc:308] e540ee05cca8897e:4f0ba76500000001] Check failed: mem_tracker()->consumption() == 0 (11377 vs. 0) Leaked memory.
Fragment e540ee05cca8897e:4f0ba76500000001: Reservation=0 OtherMemory=25.77 KB Total=25.77 KB Peak=36.84 MB
  AGGREGATION_NODE (id=1): Reservation=0 OtherMemory=0 Total=0 Peak=34.09 MB
    GroupingAggregator 0: Reservation=0 OtherMemory=0 Total=0 Peak=34.09 MB
  HDFS_SCAN_NODE (id=0): Reservation=0 OtherMemory=11.11 KB Total=11.11 KB Peak=2.73 MB
  KrpcDataStreamSender (dst_id=3): Total=0 Peak=1.73 KB
  CodeGen: Total=14.66 KB Peak=1.54 MB
*** Check failure stack trace: ***
    @          0x4dca1dc  google::LogMessage::Fail()
    @          0x4dcba81  google::LogMessage::SendToLog()
    @          0x4dc9bb6  google::LogMessage::Flush()
    @          0x4dcd17d  google::LogMessageFatal::~LogMessageFatal()
    @          0x257f9f4  impala::ExecNode::Close()
    @          0x2681657  impala::ScanNode::Close()
    @          0x259bad9  impala::HdfsScanNodeBase::Close()
    @          0x2707584  impala::HdfsScanNode::Close()
    @          0x257f5b5  impala::ExecNode::Close()
    @          0x273d806  impala::StreamingAggregationNode::Close()
    @          0x2138745  impala::FragmentInstanceState::Close()
    @          0x2134777  impala::FragmentInstanceState::Exec()
    @          0x2148396  impala::QueryState::ExecFInstance()
    @          0x2146665  _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
    @          0x214a078  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
    @          0x1f40beb  boost::function0<>::operator()()
    @          0x24dd2a2  impala::Thread::SuperviseThread()
    @          0x24e5626  boost::_bi::list5<>::operator()<>()
    @          0x24e554a  boost::_bi::bind_t<>::operator()()
    @          0x24e550d  boost::detail::thread_data<>::run()
    @          0x3cf33c9  thread_proxy
    @     0x7fbb5323e6b9  start_thread
    @     0x7fbb4f9f741c  clone
{code}
I'm using the latest Impala (git-hash=d46f4a68fa86ca59b9066abbfe70a5d3c8d090a3) and latest ORC lib (git-hash=c26ff4c351d7c34c4272442a6874703f510282a8).

I also tried the latest Impala with the ORC version we currently depend on (1.5.5-p1), and it doesn't crash. So the leak may have been introduced by ORC changes after 1.5.5.

*How to reproduce*
Create the table below and load the attached ORC file (alltypes_leak.orc) into it:
{code:sql}
CREATE TABLE default.alltypes_orc_leak (
  id INT,
  bool_col BOOLEAN,
  tinyint_col TINYINT,
  smallint_col SMALLINT,
  int_col INT,
  bigint_col BIGINT,
  float_col FLOAT,
  double_col DOUBLE,
  date_string_col STRING,
  string_col STRING,
  timestamp_col TIMESTAMP
) STORED AS ORC;
{code}
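The load step itself is not spelled out above. A minimal sketch, assuming the attachment has already been uploaded to HDFS at the hypothetical staging path /tmp/alltypes_leak.orc (the path is an illustration, not part of the original report):
{code:sql}
-- Assumption: /tmp/alltypes_leak.orc is a hypothetical HDFS path holding the attached file.
-- LOAD DATA moves the file from that path into the table's data directory.
LOAD DATA INPATH '/tmp/alltypes_leak.orc' INTO TABLE default.alltypes_orc_leak;
{code}
Copying the file directly into the table's directory and then running REFRESH on the table should work as well.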
Run the following query:
{code:sql}
select count(*) from (select distinct * from alltypes_orc_leak) q
{code}


