Posted to issues-all@impala.apache.org by "Riza Suminto (Jira)" <ji...@apache.org> on 2021/11/09 16:44:00 UTC

[jira] [Resolved] (IMPALA-10984) Improve performance of FROM_UNIXTIME function.

     [ https://issues.apache.org/jira/browse/IMPALA-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Riza Suminto resolved IMPALA-10984.
-----------------------------------
    Resolution: Fixed

> Improve performance of FROM_UNIXTIME function.
> ----------------------------------------------
>
>                 Key: IMPALA-10984
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10984
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 4.0.0
>            Reporter: Riza Suminto
>            Assignee: Riza Suminto
>            Priority: Major
>
> The FROM_UNIXTIME function is implemented by calling TimestampValue::ToString() in TimestampFunctions::FromUnix().
> We found that evaluating TimestampValue::ToString() can stall on the tcmalloc::CentralFreeList lock, as shown in this pstack output:
>  
> {code:java}
> #0 0x000000000277d81a in base::internal::SpinLockDelay(int volatile*, int, int) ()
> #1 0x00000000027d17f9 in SpinLock::SlowLock() ()
> #2 0x000000000287a399 in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) ()
> #3 0x00000000028882f3 in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long) ()
> #4 0x00000000029c5e88 in tc_newarray ()
> #5 0x00007faedc677169 in std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) () from /opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p4948.16676264/lib/impala/lib/libstdc++.so.6
> #6 0x0000000000f769de in impala::TimestampValue::ToString() const ()
> #7 0x00007faeb317e08e in ?? ()
> #8 0x00007fad62af6068 in ?? ()
> #9 0x00007faedc8c20c0 in ?? () from /opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p4948.16676264/lib/impala/lib/libstdc++.so.6
> #10 0x0000000000000000 in ?? (){code}
>  
> This is presumably due to the combined use of stringstream, boost::gregorian::to_iso_extended_string, and boost::posix_time::to_simple_string, which involves multiple string allocations and copies.
> This can be problematic when FROM_UNIXTIME is evaluated for millions of rows.
> We should come up with a better implementation that involves fewer string allocations and less copying.
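
To illustrate the kind of change the description asks for, here is a minimal sketch of formatting a Unix timestamp into a caller-supplied buffer with no heap allocation. It is not the actual patch for IMPALA-10984. The names FormatUnixTime and WriteDigits are hypothetical, the "YYYY-MM-DD HH:MM:SS" layout and UTC interpretation are assumptions, and the date arithmetic is the widely used days-to-civil-date calculation rather than anything taken from Impala's code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of Impala): writes `value` as a zero-padded
// decimal number of exactly `width` digits into dst.
static inline void WriteDigits(char* dst, int64_t value, int width) {
  for (int i = width - 1; i >= 0; --i) {
    dst[i] = static_cast<char>('0' + value % 10);
    value /= 10;
  }
}

// Hypothetical sketch: formats seconds since the Unix epoch (UTC) as
// "YYYY-MM-DD HH:MM:SS" into buf, which must hold at least 20 bytes
// (19 characters plus the terminating NUL). Returns the string length (19).
// Assumes the resulting year is in the range 0..9999.
// No stringstream, no std::string, no heap allocation.
int FormatUnixTime(int64_t unix_seconds, char* buf) {
  // Split into whole days and seconds within the day, flooring for
  // negative inputs so that secs is always in [0, 86399].
  int64_t days = unix_seconds / 86400;
  int64_t secs = unix_seconds % 86400;
  if (secs < 0) {
    secs += 86400;
    --days;
  }

  // Convert days since 1970-01-01 to a proleptic Gregorian year/month/day.
  int64_t z = days + 719468;  // shift epoch to 0000-03-01
  int64_t era = (z >= 0 ? z : z - 146096) / 146097;
  int64_t doe = z - era * 146097;                                       // [0, 146096]
  int64_t yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;  // [0, 399]
  int64_t y = yoe + era * 400;
  int64_t doy = doe - (365 * yoe + yoe / 4 - yoe / 100);                // [0, 365]
  int64_t mp = (5 * doy + 2) / 153;                                     // [0, 11]
  int64_t d = doy - (153 * mp + 2) / 5 + 1;                             // [1, 31]
  int64_t m = mp < 10 ? mp + 3 : mp - 9;                                // [1, 12]
  if (m <= 2) ++y;
  assert(y >= 0 && y <= 9999);

  // Copy a fixed template (including the NUL) and overwrite the digits.
  std::memcpy(buf, "0000-00-00 00:00:00", 20);
  WriteDigits(buf + 0, y, 4);
  WriteDigits(buf + 5, m, 2);
  WriteDigits(buf + 8, d, 2);
  WriteDigits(buf + 11, secs / 3600, 2);
  WriteDigits(buf + 14, (secs % 3600) / 60, 2);
  WriteDigits(buf + 17, secs % 60, 2);
  return 19;
}
```

A caller would declare `char buf[20];` on the stack, call `FormatUnixTime(ts, buf)`, and copy the 19 bytes into the result string once. Because the digits are written directly into a fixed buffer, the per-row cost no longer includes the intermediate string allocations visible in the stack trace above.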



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org