Posted to issues-all@impala.apache.org by "Aman Sinha (Jira)" <ji...@apache.org> on 2020/08/08 18:20:00 UTC

[jira] [Created] (IMPALA-10063) Intermittent crash seen during ComputeCpuRatios

Aman Sinha created IMPALA-10063:
-----------------------------------

             Summary: Intermittent crash seen during ComputeCpuRatios
                 Key: IMPALA-10063
                 URL: https://issues.apache.org/jira/browse/IMPALA-10063
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 3.4.0
            Reporter: Aman Sinha
            Assignee: Aman Sinha


On my desktop running Ubuntu 18.04, impalad occasionally (about once every few days) hits a DCHECK and crashes in ComputeCpuRatios with the following stack:
{noformat}
system-state-info.cc:146] Check failed: total_tics > 0 (-9802454 vs. 0) 
*** Check failure stack trace: ***
    @          0x5140a1c  google::LogMessage::Fail()
    @          0x514230c  google::LogMessage::SendToLog()
    @          0x514037a  google::LogMessage::Flush()
    @          0x5143f78  google::LogMessageFatal::~LogMessageFatal()
    @          0x266a4f7  impala::SystemStateInfo::ComputeCpuRatios()
    @          0x2669c89  impala::SystemStateInfo::CaptureSystemStateSnapshot()
    @          0x2193dae  _ZZN6impala7ExecEnv19InitSystemStateInfoEvENKUlvE_clEv
    @          0x2194ab0  _ZNSt17_Function_handlerIFvvEZN6impala7ExecEnv19InitSystemStateInfoEvEUlvE_E9_M_invokeERKSt9_Any_data
    @          0x26244af  std::function<>::operator()()
    @          0x2622a97  impala::PeriodicCounterUpdater::UpdateLoop()
    @          0x262dc10  boost::_mfi::mf0<>::operator()()
    @          0x262db72  boost::_bi::list1<>::operator()<>()
    @          0x262db1a  boost::_bi::bind_t<>::operator()()
    @          0x262dadb  boost::detail::thread_data<>::run()
    @          0x3e47771  thread_proxy
    @     0x7f0a9bc6d6da  start_thread
    @     0x7f0a986a5a3e  clone
{noformat}

Since the total_tics calculation depends on the system clock, I dug further with the adjtimex utility and found odd offset and frequency values on this machine. Both are negative instead of positive:
{noformat}
$ adjtimex -p
         mode: 0
       offset: -1232794                        <--- seems wrong
    frequency: -109214                      <--- seems wrong
     maxerror: 236000
     esterror: 0
       status: 8193
time_constant: 5
    precision: 1
    tolerance: 32768000
         tick: 10000
     raw time:  1596910192s 365781168us = 1596910192.365781168
{noformat}

Regardless of the cause, we should prevent this from crashing impalad.
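
One possible way to avoid the crash is to treat a non-positive tick delta as a skipped sample rather than a fatal condition. The following is a minimal C++ sketch of that idea, not the actual Impala code: CpuCounters, CpuRatios, and ComputeCpuRatiosSafe are hypothetical names, and the three-counter layout (user, system, iowait) is an assumption made for illustration.

```cpp
#include <array>
#include <cstdint>
#include <numeric>

// Hypothetical stand-ins for illustration; these are not Impala's types.
// Assumed layout: cumulative tick counters for user, system, iowait.
using CpuCounters = std::array<int64_t, 3>;

struct CpuRatios {
  int32_t user = 0;    // basis points (1/100 of a percent)
  int32_t system = 0;
  int32_t iowait = 0;
};

// Computes per-category CPU usage between two snapshots, in basis points.
// Returns false and leaves *out untouched when the total tick delta is zero
// or negative, so that a counter that appears to go backwards causes a
// skipped sample instead of a failed check.
bool ComputeCpuRatiosSafe(const CpuCounters& prev, const CpuCounters& cur,
                          CpuRatios* out) {
  const int64_t prev_sum = std::accumulate(prev.begin(), prev.end(), int64_t{0});
  const int64_t cur_sum = std::accumulate(cur.begin(), cur.end(), int64_t{0});
  const int64_t total_tics = cur_sum - prev_sum;
  if (total_tics <= 0) return false;  // skip this sample, keep old ratios

  constexpr int64_t kBasisMax = 10000;
  out->user = static_cast<int32_t>((cur[0] - prev[0]) * kBasisMax / total_tics);
  out->system = static_cast<int32_t>((cur[1] - prev[1]) * kBasisMax / total_tics);
  out->iowait = static_cast<int32_t>((cur[2] - prev[2]) * kBasisMax / total_tics);
  return true;
}
```

A caller would keep reporting the previously computed ratios whenever the function returns false, and could log a warning so the anomaly remains visible. This sketch only guards the total; an individual counter could still decrease while the total increases, which a fuller fix might also want to handle.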