Posted to issues@kudu.apache.org by "Alexey Serbin (Jira)" <ji...@apache.org> on 2023/06/22 23:36:00 UTC

[jira] [Reopened] (KUDU-2707) Improve the performance of the block cache under contention

     [ https://issues.apache.org/jira/browse/KUDU-2707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Serbin reopened KUDU-2707:
---------------------------------

> Improve the performance of the block cache under contention
> -----------------------------------------------------------
>
>                 Key: KUDU-2707
>                 URL: https://issues.apache.org/jira/browse/KUDU-2707
>             Project: Kudu
>          Issue Type: Improvement
>    Affects Versions: 1.10.0
>            Reporter: William Berkeley
>            Priority: Major
>             Fix For: NA
>
>
> While looking at a random write workload where flushes outpace compactions (i.e. the typical case when inserting as fast as possible), I observed occasional consensus service queue overflows. Analyzing the stacks of the service threads when this occurs (using the diagnostics log), I see many stacks like the following:
> {noformat}
> 0x3b6720f710 <unknown>
>            0x1fb900a base::internal::SpinLockDelay()
>            0x1fb8ea7 base::SpinLock::SlowLock()
>            0x1ef7394 kudu::(anonymous namespace)::ShardedLRUCache::Lookup()
>            0x1ce379f kudu::cfile::BlockCache::Lookup()
>            0x1cec948 kudu::cfile::CFileReader::ReadBlock()
>            0x1ce5d36 kudu::cfile::BloomFileReader::CheckKeyPresent()
>             0xb311a1 kudu::tablet::CFileSet::CheckRowPresent()
>             0xac46c4 kudu::tablet::DiskRowSet::CheckRowPresent()
>             0xa6b017 _ZZN4kudu6tablet6Tablet17BulkCheckPresenceEPKNS_2fs9IOContextEPNS0_21WriteTransactionStateEENKUlvE1_clEv
>             0xa7427e _ZNSt17_Function_handlerIFvPN4kudu6tablet6RowSetEiEZNS1_6Tablet17BulkCheckPresenceEPKNS0_2fs9IOContextEPNS1_21WriteTransactionStateEEUlS3_iE2_E9_M_invokeERKSt9_Any_dataS3_i
>             0xaee074 _ZNK4kudu22interval_tree_internal6ITNodeINS_6tablet20RowSetIntervalTraitsEE31ForEachIntervalContainingPointsIZNKS2_10RowSetTree27ForEachRowSetContainingKeysERKSt6vectorINS_5SliceESaIS8_EERKSt8functionIFvPNS2_6RowSetEiEEEUlRKNS2_12_GLOBAL__N_111QueryStructEPNS2_16RowSetWithBoundsEE_N9__gnu_cxx17__normal_iteratorIPSM_S7_ISL_SaISL_EEEEEEvT0_SX_RKT_
>             0xaee1b3 _ZNK4kudu22interval_tree_internal6ITNodeINS_6tablet20RowSetIntervalTraitsEE31ForEachIntervalContainingPointsIZNKS2_10RowSetTree27ForEachRowSetContainingKeysERKSt6vectorINS_5SliceESaIS8_EERKSt8functionIFvPNS2_6RowSetEiEEEUlRKNS2_12_GLOBAL__N_111QueryStructEPNS2_16RowSetWithBoundsEE_N9__gnu_cxx17__normal_iteratorIPSM_S7_ISL_SaISL_EEEEEEvT0_SX_RKT_
>             0xaee3a3 kudu::tablet::RowSetTree::ForEachRowSetContainingKeys()
>             0xa80c17 kudu::tablet::Tablet::BulkCheckPresence()
>             0xa8108a kudu::tablet::Tablet::ApplyRowOperations()
> {noformat}
> Note that once these workloads have been running for a while, the slow step in writes is generally CPU usage in the apply phase.
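
For readers unfamiliar with why a cache lookup would spin on a lock, the following is a minimal sketch of a sharded LRU cache. It is not Kudu's ShardedLRUCache: the class name ToyShardedLRUCache, its string-only interface, and the use of std::mutex in place of a spinlock are all assumptions made for illustration. The point it tries to show is that an LRU lookup is not read-only, because it moves the entry to the front of the recency list. Every Lookup() therefore needs exclusive access to its shard, and threads probing hot blocks in the same shard serialize on that lock.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical illustration only; not Kudu's ShardedLRUCache.
class ToyShardedLRUCache {
 public:
  ToyShardedLRUCache(size_t num_shards, size_t capacity_per_shard)
      : shards_(num_shards), capacity_per_shard_(capacity_per_shard) {
    assert(num_shards > 0 && capacity_per_shard > 0);
  }

  void Insert(const std::string& key, std::string value) {
    Shard& s = ShardFor(key);
    std::lock_guard<std::mutex> l(s.lock);
    auto it = s.index.find(key);
    if (it != s.index.end()) {
      // Existing key: overwrite the value and mark it most recently used.
      it->second->second = std::move(value);
      s.lru.splice(s.lru.begin(), s.lru, it->second);
      return;
    }
    s.lru.emplace_front(key, std::move(value));
    s.index[key] = s.lru.begin();
    if (s.lru.size() > capacity_per_shard_) {
      // Evict the least recently used entry of this shard.
      s.index.erase(s.lru.back().first);
      s.lru.pop_back();
    }
  }

  // Returns true and copies the cached value into *value on a hit.
  bool Lookup(const std::string& key, std::string* value) {
    Shard& s = ShardFor(key);
    // Exclusive lock even though this is logically a read: the splice
    // below mutates the shard's recency list.
    std::lock_guard<std::mutex> l(s.lock);
    auto it = s.index.find(key);
    if (it == s.index.end()) return false;
    s.lru.splice(s.lru.begin(), s.lru, it->second);
    *value = it->second->second;
    return true;
  }

 private:
  typedef std::list<std::pair<std::string, std::string> > LruList;

  struct Shard {
    std::mutex lock;
    LruList lru;  // front = most recently used
    std::unordered_map<std::string, LruList::iterator> index;
  };

  Shard& ShardFor(const std::string& key) {
    return shards_[std::hash<std::string>()(key) % shards_.size()];
  }

  std::vector<Shard> shards_;
  const size_t capacity_per_shard_;
};
```

In this sketch, adding shards spreads keys over more locks, but it does not help when many threads look up the same few keys, since those keys always hash to the same shard.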


