Posted to issues@impala.apache.org by "Zoltán Borók-Nagy (JIRA)" <ji...@apache.org> on 2019/03/14 13:54:00 UTC

[jira] [Resolved] (IMPALA-8257) Parquet writer sometimes hits DCHECK when handling empty string

     [ https://issues.apache.org/jira/browse/IMPALA-8257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltán Borók-Nagy resolved IMPALA-8257.
---------------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 3.2.0

> Parquet writer sometimes hits DCHECK when handling empty string
> ---------------------------------------------------------------
>
>                 Key: IMPALA-8257
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8257
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 2.13.0, Impala 3.1.0, Impala 3.2.0
>            Reporter: Tim Armstrong
>            Assignee: Zoltán Borók-Nagy
>            Priority: Blocker
>              Labels: crash, parquet
>             Fix For: Impala 3.2.0
>
>
> Encountered while doing a large insert into a Parquet table.
> {code}
> create table customer like tpcds_300_text.customer stored as parquetfile;
> insert overwrite table customer select * from tpcds_300_text.customer;
> {code}
> {noformat}
> F0227 01:34:53.052708 131295 parquet-column-stats.inline.h:213] 794c051ae3f3913c:71f00bc400000001] Check failed: static_cast<void*>(prev_page_min_value_.ptr) != static_cast<void*>(cs->min_value_.ptr) (0 vs. 0) 
> *** Check failure stack trace: ***
>     @          0x47ec7ec  google::LogMessage::Fail()
>     @          0x47ee091  google::LogMessage::SendToLog()
>     @          0x47ec1c6  google::LogMessage::Flush()
>     @          0x47ef78d  google::LogMessageFatal::~LogMessageFatal()
>     @          0x27e973c  impala::ColumnStats<>::Merge()
>     @          0x27e3c74  impala::HdfsParquetTableWriter::BaseColumnWriter::FinalizeCurrentPage()
>     @          0x27ee65f  impala::HdfsParquetTableWriter::BaseColumnWriter::AppendRow()
>     @          0x27e653b  impala::HdfsParquetTableWriter::AppendRows()
>     @          0x23177fc  impala::HdfsTableSink::WriteRowsToPartition()
>     @          0x231aeeb  impala::HdfsTableSink::Send()
>     @          0x1f53888  impala::FragmentInstanceState::ExecInternal()
>     @          0x1f4fefa  impala::FragmentInstanceState::Exec()
>     @          0x1f63333  impala::QueryState::ExecFInstance()
>     @          0x1f61615  _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
>     @          0x1f64774  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
>     @          0x1d76b9f  boost::function0<>::operator()()
>     @          0x22245ee  impala::Thread::SuperviseThread()
>     @          0x222c972  boost::_bi::list5<>::operator()<>()
>     @          0x222c896  boost::_bi::bind_t<>::operator()()
>     @          0x222c859  boost::detail::thread_data<>::run()
>     @          0x3716329  thread_proxy
>     @     0x7fba207e8dd4  start_thread
>     @     0x7fba20511eac  __clone
> {noformat}
> The same check failed on multiple machines at almost exactly the same time:
> {noformat}
> Running on machine: vc1328.halxg.cloudera.com
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> F0227 01:34:53.025667 133025 parquet-column-stats.inline.h:213] 794c051ae3f3913c:71f00bc400000005] Check failed: static_cast<void*>(prev_page_min_value_.ptr) != static_cast<void*>(cs->min_value_.ptr) (0 vs. 0) 
> ...
> F0227 01:34:53.025352 131082 parquet-column-stats.inline.h:213] 794c051ae3f3913c:71f00bc400000007] Check failed: static_cast<void*>(prev_page_min_value_.ptr) != static_cast<void*>(cs->min_value_.ptr) (0 vs. 0) 
> {noformat}
> The coordinator log shows the query failed less than a second after execution started:
> {noformat}
> I0227 01:34:48.157472 147928 impala-server.cc:1063] 794c051ae3f3913c:71f00bc400000000] Registered query query_id=794c051ae3f3913c:71f00bc400000000 session_id=3345ef7013ba6bb2:55d8d105d3690a8e
> I0227 01:34:48.157711 147928 Frontend.java:1251] 794c051ae3f3913c:71f00bc400000000] Analyzing query: insert overwrite table customer select * from tpcds_300_text.customer db: tpcds_300_decimal_parquet
> I0227 01:34:48.158025 147928 FeSupport.java:285] 794c051ae3f3913c:71f00bc400000000] Requesting prioritized load of table(s): tpcds_300_decimal_parquet.customer
> I0227 01:34:52.049566 147928 Frontend.java:1292] 794c051ae3f3913c:71f00bc400000000] Analysis finished.
> I0227 01:34:52.067458 147991 admission-controller.cc:627] 794c051ae3f3913c:71f00bc400000000] Schedule for id=794c051ae3f3913c:71f00bc400000000 in pool_name=root.systest per_host_mem_estimate=1.62 GB PoolConfig: max_requests=-1 max_queued=200 max_mem=-1.00 B
> I0227 01:34:52.067562 147991 admission-controller.cc:632] 794c051ae3f3913c:71f00bc400000000] Stats: agg_num_running=0, agg_num_queued=0, agg_mem_reserved=0,  local_host(local_mem_admitted=0, num_admitted_running=0, num_queued=0, backend_mem_reserved=0)
> I0227 01:34:52.067620 147991 admission-controller.cc:664] 794c051ae3f3913c:71f00bc400000000] Admitted query id=794c051ae3f3913c:71f00bc400000000
> I0227 01:34:52.067771 147991 coordinator.cc:93] 794c051ae3f3913c:71f00bc400000000] Exec() query_id=794c051ae3f3913c:71f00bc400000000 stmt=insert overwrite table customer select * from tpcds_300_text.customer
> I0227 01:34:52.068926 147991 coordinator.cc:359] 794c051ae3f3913c:71f00bc400000000] starting execution on 9 backends for query_id=794c051ae3f3913c:71f00bc400000000
> I0227 01:34:52.070919 47659 impala-internal-service.cc:50] 794c051ae3f3913c:71f00bc400000000] ExecQueryFInstances(): query_id=794c051ae3f3913c:71f00bc400000000 coord=vc1326.halxg.cloudera.com:22000 #instances=1
> I0227 01:34:52.071800 147994 query-state.cc:624] 794c051ae3f3913c:71f00bc400000003] Executing instance. instance_id=794c051ae3f3913c:71f00bc400000003 fragment_idx=0 per_fragment_instance_idx=3 coord_state_idx=0 #in-flight=1
> I0227 01:34:52.072952 147991 coordinator.cc:373] 794c051ae3f3913c:71f00bc400000000] started execution on 9 backends for query_id=794c051ae3f3913c:71f00bc400000000
> I0227 01:34:52.074553 147992 coordinator.cc:611] Coordinator waiting for backends to finish, 9 remaining. query_id=794c051ae3f3913c:71f00bc400000000
> F0227 01:34:52.949759 147994 parquet-column-stats.inline.h:213] 794c051ae3f3913c:71f00bc400000003] Check failed: static_cast<void*>(prev_page_min_value_.ptr) != static_cast<void*>(cs->min_value_.ptr) (0 vs. 0) 
> {noformat}
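
The "(0 vs. 0)" at the end of the failure message shows that both pointers being compared were null. The sketch below is one way to see how a pointer-only "these values live in different buffers" check can trip on empty strings. It is an illustration under stated assumptions, not Impala's code and not the actual fix: StringSlice, MakeSlice, DistinctBuffersPointerOnly and DistinctBuffersLengthAware are hypothetical names, and representing an empty string as a null pointer with length 0 is an assumption made for the example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a non-owning string value (pointer + length).
// Assumption: an empty string is represented as {nullptr, 0}.
struct StringSlice {
  const char* ptr;
  std::size_t len;
};

// Builds a slice over `s`. Empty input yields {nullptr, 0}.
inline StringSlice MakeSlice(const std::string& s) {
  if (s.empty()) return StringSlice{nullptr, 0};
  return StringSlice{s.data(), s.size()};
}

// Pointer-only test in the spirit of the failed check: two values are
// considered to live in different buffers only if their pointers differ.
// Two empty slices both carry nullptr, so this returns false for them.
inline bool DistinctBuffersPointerOnly(const StringSlice& a,
                                       const StringSlice& b) {
  return static_cast<const void*>(a.ptr) != static_cast<const void*>(b.ptr);
}

// Length-aware variant (hypothetical): an empty slice references no bytes,
// so it cannot alias another buffer regardless of its pointer value.
inline bool DistinctBuffersLengthAware(const StringSlice& a,
                                       const StringSlice& b) {
  if (a.len == 0 || b.len == 0) return true;
  return static_cast<const void*>(a.ptr) != static_cast<const void*>(b.ptr);
}
```

With two empty slices, the pointer-only test reports them as aliased (nullptr equals nullptr) even though no bytes are shared, while the length-aware variant treats them as distinct. For non-empty slices both functions behave the same.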



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)