Posted to issues-all@impala.apache.org by "Riza Suminto (Jira)" <ji...@apache.org> on 2021/10/25 17:12:00 UTC

[jira] [Created] (IMPALA-10984) Improve performance of FROM_UNIXTIME function.

Riza Suminto created IMPALA-10984:
-------------------------------------

             Summary: Improve performance of FROM_UNIXTIME function.
                 Key: IMPALA-10984
                 URL: https://issues.apache.org/jira/browse/IMPALA-10984
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
    Affects Versions: Impala 4.0.0
            Reporter: Riza Suminto
            Assignee: Riza Suminto


The FROM_UNIXTIME function is implemented by calling TimestampValue::ToString() in TimestampFunctions::FromUnix().

We found that evaluation of TimestampValue::ToString() can get stuck waiting on the tcmalloc::CentralFreeList lock, as shown in this pstack output:

 
{code:java}
#0 0x000000000277d81a in base::internal::SpinLockDelay(int volatile*, int, int) ()
#1 0x00000000027d17f9 in SpinLock::SlowLock() ()
#2 0x000000000287a399 in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) ()
#3 0x00000000028882f3 in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long) ()
#4 0x00000000029c5e88 in tc_newarray ()
#5 0x00007faedc677169 in std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) () from /opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p4948.16676264/lib/impala/lib/libstdc++.so.6
#6 0x0000000000f769de in impala::TimestampValue::ToString() const ()
#7 0x00007faeb317e08e in ?? ()
#8 0x00007fad62af6068 in ?? ()
#9 0x00007faedc8c20c0 in ?? () from /opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p4948.16676264/lib/impala/lib/libstdc++.so.6
#10 0x0000000000000000 in ?? (){code}
 

This is presumably due to the combined use of stringstream, boost::gregorian::to_iso_extended_string, and boost::posix_time::to_simple_string, which involves multiple string allocations and copies.
This can be problematic when FROM_UNIXTIME is evaluated for millions of rows.

We should come up with a better implementation that involves fewer string allocations and less copying.


