Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2020/04/08 00:16:00 UTC

[jira] [Commented] (IMPALA-9453) S3 build failed with many strange symptoms

    [ https://issues.apache.org/jira/browse/IMPALA-9453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17077693#comment-17077693 ] 

Quanlong Huang commented on IMPALA-9453:
----------------------------------------

Saw the same issue in a non-S3 build:
{code}
I0407 10:42:38.805296  2418 impala-server.cc:1053] 9e43ce2548ae68b4:9c79604300000000] Registered query query_id=9e43ce2548ae68b4:9c79604300000000 session_id=5148c2a8f3baf5ab:327d73f93bc3beac
I0407 10:42:38.805493  2418 Frontend.java:1490] 9e43ce2548ae68b4:9c79604300000000] Analyzing query: select count(*) from (select distinct * from test_fuzz_alltypes_5ee47ec7.alltypes) q db: functional_parquet
I0407 10:42:38.806259  2418 BaseAuthorizationChecker.java:110] 9e43ce2548ae68b4:9c79604300000000] Authorization check took 0 ms
I0407 10:42:38.806308  2418 Frontend.java:1532] 9e43ce2548ae68b4:9c79604300000000] Analysis and authorization finished.
I0407 10:42:38.812016 23503 scheduler.cc:592] 9e43ce2548ae68b4:9c79604300000000] Exec at coord is false
I0407 10:42:38.812701 23503 admission-controller.cc:1296] 9e43ce2548ae68b4:9c79604300000000] Trying to admit id=9e43ce2548ae68b4:9c79604300000000 in pool_name=default-pool executor_group_name=default per_host_mem_estimate=452.02 MB dedicated_coord_mem_estimate=110.02 MB max_requests=-1 (configured statically) max_queued=200 (configured statically) max_mem=-1.00 B (configured statically)
I0407 10:42:38.812739 23503 admission-controller.cc:1308] 9e43ce2548ae68b4:9c79604300000000] Stats: agg_num_running=7, agg_num_queued=0, agg_mem_reserved=2.10 GB,  local_host(local_mem_admitted=1.76 GB, num_admitted_running=6, num_queued=0, backend_mem_reserved=544.26 MB)
I0407 10:42:38.812760 23503 admission-controller.cc:894] 9e43ce2548ae68b4:9c79604300000000] Admitting query id=9e43ce2548ae68b4:9c79604300000000
I0407 10:42:38.812821 23503 impala-server.cc:1716] 9e43ce2548ae68b4:9c79604300000000] Registering query locations
I0407 10:42:38.812863 23503 coordinator.cc:141] 9e43ce2548ae68b4:9c79604300000000] Exec() query_id=9e43ce2548ae68b4:9c79604300000000 stmt=select count(*) from (select distinct * from test_fuzz_alltypes_5ee47ec7.alltypes) q
I0407 10:42:38.813493 23503 coordinator.cc:461] 9e43ce2548ae68b4:9c79604300000000] starting execution on 1 backends for query_id=9e43ce2548ae68b4:9c79604300000000
I0407 10:42:38.815609 12543 control-service.cc:152] 9e43ce2548ae68b4:9c79604300000000] ExecQueryFInstances(): query_id=9e43ce2548ae68b4:9c79604300000000 coord=ip-172-31-44-169:22000 #instances=3
I0407 10:42:38.816548 23503 coordinator.cc:513] 9e43ce2548ae68b4:9c79604300000000] started execution on 1 backends for query_id=9e43ce2548ae68b4:9c79604300000000
I0407 10:42:38.817690 23506 query-state.cc:725] 9e43ce2548ae68b4:9c79604300000001] Executing instance. instance_id=9e43ce2548ae68b4:9c79604300000001 fragment_idx=2 per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=10
I0407 10:42:38.817741 23507 query-state.cc:725] 9e43ce2548ae68b4:9c79604300000002] Executing instance. instance_id=9e43ce2548ae68b4:9c79604300000002 fragment_idx=1 per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=11
I0407 10:42:38.817862 23508 query-state.cc:725] 9e43ce2548ae68b4:9c79604300000000] Executing instance. instance_id=9e43ce2548ae68b4:9c79604300000000 fragment_idx=0 per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=12
......
F0407 10:42:39.233121 23537 parquet-page-reader.cc:67] 9e43ce2548ae68b4:9c79604300000001] Check failed: col_end < file_desc.file_length (6838 vs. 6838)
I0407 10:42:39.282228 13573 status.cc:129] 1e4d39fdf48ed21e:eb8a3bd600000000] UDF ERROR: Decimal expression overflowed
    @          0x1c9ddd8  impala::Status::Status()
    @          0x2490146  impala::RuntimeState::SetQueryStatus()
    @          0x248f1bc  impala_udf::FunctionContext::SetError()
    @          0x290102c  impala::DecimalOperators::Subtract_DecimalVal_DecimalVal()
    @          0x28d8f7b  impala::ScalarFnCall::InterpretEval<>()
    @          0x28b82dc  impala::ScalarFnCall::GetDecimalValInterpreted()
    @          0x287d532  impala::ScalarExpr::GetDecimalVal()
    @          0x287ab81  impala::ScalarExprEvaluator::GetValue()
    @          0x287a6fe  impala::ScalarExprEvaluator::GetValue()
    @          0x2337c5a  Java_org_apache_impala_service_FeSupport_NativeEvalExprsWithoutRow
    @     0x7f7863485623  (unknown)
    @     0x7f7866b38d07  (unknown)
I0407 10:42:39.315498 13573 status.cc:129] 1e4d39fdf48ed21e:eb8a3bd600000000] Decimal expression overflowed
    @          0x1c9ddd8  impala::Status::Status()
    @          0x287a009  impala::ScalarExprEvaluator::GetError()
    @          0x2337c7b  Java_org_apache_impala_service_FeSupport_NativeEvalExprsWithoutRow
    @     0x7f7863485623  (unknown)
    @     0x7f7866b38d07  (unknown)
E0407 10:42:39.315996 13573 LiteralExpr.java:212] 1e4d39fdf48ed21e:eb8a3bd600000000] Failed to evaluate expr '-44108181408604043683579919903326347174 - -935688780337620566297976.63917800000000': Decimal expression overflowed
I0407 10:42:39.353885 23539 status.cc:129] 9e43ce2548ae68b4:9c79604300000001] File 'hdfs://localhost:20500/test-warehouse/test_fuzz_alltypes_5ee47ec7.db/alltypes/year=2010/month=4/7447249c76c4ebb1-25bd688d00000004_902734205_data.0.parq': metadata is corrupt. Column 10 has invalid data page offset (offset=6443 file_size=5915).
    @          0x1c9ddd8  impala::Status::Status()
    @          0x2d889f3  impala::ParquetMetadataUtils::ValidateOffsetInFile()
    @          0x2d882ee  impala::ParquetMetadataUtils::ValidateColumnOffsets()
    @          0x2bd447e  impala::HdfsParquetScanner::NextRowGroup()
    @          0x2bd2ea2  impala::HdfsParquetScanner::GetNextInternal()
    @          0x2bd12b6  impala::HdfsParquetScanner::ProcessSplit()
    @          0x27f856e  impala::HdfsScanNode::ProcessSplit()
    @          0x27f7884  impala::HdfsScanNode::ScannerThread()
    @          0x27f6bd7  _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
    @          0x27f9053  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
    @          0x1fd2e43  boost::function0<>::operator()()
    @          0x25a18d8  impala::Thread::SuperviseThread()
    @          0x25a9b5c  boost::_bi::list5<>::operator()<>()
    @          0x25a9a80  boost::_bi::bind_t<>::operator()()
    @          0x25a9a43  boost::detail::thread_data<>::run()
    @          0x3df5729  thread_proxy
    @     0x7f787a4a56b9  start_thread
    @     0x7f787708741c  clone
{code}
Looks like the failed query comes from fuzz_scanners_test.
Failed job link: https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/10138/
Succeeded job link after retry: https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/10142/
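
For triaging logs like the one above, here is a minimal Python sketch that pulls fatal check failures out of glog-formatted lines. It is not part of Impala; the names GLOG_RE, CHECK_RE and fatal_checks are made up for illustration, and it assumes the usual glog prefix (severity letter, mmdd, time, thread id, file:line) seen in these logs.

```python
import re

# Assumed glog prefix: severity letter, mmdd, hh:mm:ss.uuuuuu, thread id, file:line]
GLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<mmdd>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<src>[^\s:]+:\d+)\] (?P<msg>.*)$"
)
# Failed check expression and its two operands, e.g. "(6838 vs. 6838)".
CHECK_RE = re.compile(
    r"Check failed: (?P<expr>.+?) \((?P<lhs>\S+) vs\. (?P<rhs>\S+)\)"
)


def fatal_checks(lines):
    """Yield (source, expression, lhs, rhs) for each fatal check failure."""
    for line in lines:
        m = GLOG_RE.match(line.strip())
        if not m or m.group("sev") != "F":
            continue  # skip non-glog lines (stack frames) and non-fatal entries
        c = CHECK_RE.search(m.group("msg"))
        if c:
            yield m.group("src"), c.group("expr"), c.group("lhs"), c.group("rhs")


# Two lines copied from the log in this comment.
sample = [
    "I0407 10:42:38.812821 23503 impala-server.cc:1716] "
    "9e43ce2548ae68b4:9c79604300000000] Registering query locations",
    "F0407 10:42:39.233121 23537 parquet-page-reader.cc:67] "
    "9e43ce2548ae68b4:9c79604300000001] Check failed: "
    "col_end < file_desc.file_length (6838 vs. 6838)",
]

for hit in fatal_checks(sample):
    print(hit)
```

Run over a full impalad log, this lists each crash site together with the compared values, which makes it easy to spot that both operands are equal (6838 vs. 6838 here, 7010 vs. 7010 in the S3 build).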

> S3 build failed with many strange symptoms
> ------------------------------------------
>
>                 Key: IMPALA-9453
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9453
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 3.4.0
>            Reporter: Tim Armstrong
>            Assignee: Sahil Takiar
>            Priority: Blocker
>              Labels: broken-build, crash
>
> There were a lot of incorrect results:
> {noformat}
> Stacktrace
> query_test/test_mt_dop.py:49: in test_mt_dop
>     self.run_test_case('QueryTest/mt-dop', new_vector)
> common/impala_test_suite.py:690: in run_test_case
>     self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:523: in __verify_results_and_errors
>     replace_filenames_with_placeholder)
> common/test_result_verifier.py:456: in verify_raw_results
>     VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:278: in verify_query_result_is_equal
>     assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E     7300 != 6990
> Standard Error
> ERROR:test_configuration:Comparing QueryTestResults (expected vs actual):
> 7300 != 6990
> {noformat}
> The impalads eventually crashed:
> {noformat}
> F0302 00:50:55.607841   483 parquet-page-reader.cc:67] e24eb0839fa75423:8ac0bf7300000002] Check failed: col_end < file_desc.file_length (7010 vs. 7010) 
> *** Check failure stack trace: ***
>     @          0x4f7277c  google::LogMessage::Fail()
>     @          0x4f74021  google::LogMessage::SendToLog()
>     @          0x4f72156  google::LogMessage::Flush()
>     @          0x4f7571d  google::LogMessageFatal::~LogMessageFatal()
>     @          0x2e3a520  impala::ParquetPageReader::InitColumnChunk()
>     @          0x2e37dee  impala::ParquetColumnChunkReader::InitColumnChunk()
>     @          0x2cd8000  impala::BaseScalarColumnReader::Reset()
>     @          0x2c91239  impala::HdfsParquetScanner::InitScalarColumns()
>     @          0x2c8775a  impala::HdfsParquetScanner::NextRowGroup()
>     @          0x2c85c2a  impala::HdfsParquetScanner::GetNextInternal()
>     @          0x2c8403e  impala::HdfsParquetScanner::ProcessSplit()
>     @          0x28b826b  impala::HdfsScanNode::ProcessSplit()
>     @          0x28b7440  impala::HdfsScanNode::ScannerThread()
>     @          0x28b679d  _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
>     @          0x28b8d91  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
>     @          0x20aa1e9  boost::function0<>::operator()()
>     @          0x266a84a  impala::Thread::SuperviseThread()
>     @          0x2672ace  boost::_bi::list5<>::operator()<>()
>     @          0x26729f2  boost::_bi::bind_t<>::operator()()
>     @          0x26729b5  boost::detail::thread_data<>::run()
>     @          0x3e98b19  thread_proxy
>     @     0x7f8c2ccefe24  start_thread
>     @     0x7f8c2985b34c  __clone
> {noformat}
> {noformat}
> F0302 00:50:41.466794 32643 parquet-page-reader.cc:67] dd48a46583bea9c8:e3be641000000002] Check failed: col_end < file_desc.file_length (7010 vs. 7010) 
> *** Check failure stack trace: ***
>     @          0x4f7277c  google::LogMessage::Fail()
>     @          0x4f74021  google::LogMessage::SendToLog()
>     @          0x4f72156  google::LogMessage::Flush()
>     @          0x4f7571d  google::LogMessageFatal::~LogMessageFatal()
>     @          0x2e3a520  impala::ParquetPageReader::InitColumnChunk()
>     @          0x2e37dee  impala::ParquetColumnChunkReader::InitColumnChunk()
>     @          0x2cd8000  impala::BaseScalarColumnReader::Reset()
>     @          0x2c91239  impala::HdfsParquetScanner::InitScalarColumns()
>     @          0x2c8775a  impala::HdfsParquetScanner::NextRowGroup()
>     @          0x2c85c2a  impala::HdfsParquetScanner::GetNextInternal()
>     @          0x2c8403e  impala::HdfsParquetScanner::ProcessSplit()
>     @          0x28b826b  impala::HdfsScanNode::ProcessSplit()
>     @          0x28b7440  impala::HdfsScanNode::ScannerThread()
>     @          0x28b679d  _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
>     @          0x28b8d91  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
>     @          0x20aa1e9  boost::function0<>::operator()()
>     @          0x266a84a  impala::Thread::SuperviseThread()
>     @          0x2672ace  boost::_bi::list5<>::operator()<>()
>     @          0x26729f2  boost::_bi::bind_t<>::operator()()
>     @          0x26729b5  boost::detail::thread_data<>::run()
>     @          0x3e98b19  thread_proxy
>     @     0x7f16a5766e24  start_thread
>     @     0x7f16a22d234c  __clone
> Wrote minidump to /data/jenkins/workspace/impala-cdpd-master-core-s3/repos/Impala/logs/ee_tests/minidumps/impalad/49436f97-da0e-47c9-76b372a5-6b3fb146.dmp
> {noformat}
> If it makes a difference, this was using CDP, so that version of the S3 connector.


