Posted to issues@impala.apache.org by "Joe McDonnell (JIRA)" <ji...@apache.org> on 2018/01/05 23:00:05 UTC
[jira] [Resolved] (IMPALA-6364) Lock contention in FileHandleCache
results in >2x slowdown for remote HDFS reads
[ https://issues.apache.org/jira/browse/IMPALA-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joe McDonnell resolved IMPALA-6364.
-----------------------------------
Resolution: Fixed
Fix Version/s: Impala 2.12.0
commit d1a0510bfe0a168256d37904aca3a30994306454
Author: Joe McDonnell <jo...@cloudera.com>
Date: Wed Jan 3 19:02:19 2018 -0800
IMPALA-6364: Bypass file handle cache for ineligible files
Currently, all HdfsFileHandles are owned and constructed
by the file handle cache. When the file handle cache
is disabled or the file handle is not eligible for
caching, the HdfsFileHandle is stored exclusively in
ScanRange::exclusive_hdfs_fh_, but the HdfsFileHandle still
comes from the file handle cache. It is created via a call to
DiskIoMgr::GetCachedHdfsFileHandle() with 'require_new_handle'
set to true and destroyed via
DiskIoMgr::ReleaseCachedHdfsFileHandle() with 'destroy_handle'
set to true.
Recent testing has revealed that the lock on the file handle
cache is a bottleneck for workloads with many small remote
files. There is no benefit to storing these exclusive file
handles in the file handle cache, as they do not participate
in the caching.
This change introduces DiskIoMgr::GetExclusiveHdfsFileHandle()
and DiskIoMgr::ReleaseExclusiveHdfsFileHandle(). These are
equivalent to the Get/ReleaseCachedHdfsFileHandle() calls, except
they bypass the file handle cache and create/destroy the
file handle directly. ScanRange::Open() and ScanRange::Close(),
which populate and free ScanRange::exclusive_hdfs_fh_, now use
these new calls rather than accessing the file handle cache.
This avoids the cache lock entirely, eliminating the bottleneck.
To draw a distinction between the two codepaths, HdfsFileHandle
is now an abstract class with two subclasses:
- CachedHdfsFileHandles cover all handles that live in file handle
cache. Get/ReleaseCachedHdfsFileHandle() use this subclass.
- ExclusiveHdfsFileHandles cover all cases where a file handle
does not come from the cache. The new
Get/ReleaseExclusiveHdfsFileHandle() use this subclass.
Separately, testing revealed that increasing the number of
partitions for the file handle cache also fixes the contention
problem. This change also makes the number of partitions
configurable via the startup parameter
num_file_handle_cache_partitions, allowing future bottlenecks
to be mitigated without a patch.
Change-Id: I4ab52b0884a909a4faeb6692f32d45878ea2838f
Reviewed-on: http://gerrit.cloudera.org:8080/8945
Reviewed-by: Joe McDonnell <jo...@cloudera.com>
Tested-by: Impala Public Jenkins
> Lock contention in FileHandleCache results in >2x slowdown for remote HDFS reads
> --------------------------------------------------------------------------------
>
> Key: IMPALA-6364
> URL: https://issues.apache.org/jira/browse/IMPALA-6364
> Project: IMPALA
> Issue Type: Bug
> Affects Versions: Impala 2.10.0, Impala 2.11.0
> Reporter: Mostafa Mokhtar
> Assignee: Joe McDonnell
> Priority: Blocker
> Fix For: Impala 2.12.0
>
> Attachments: d2402_cdh5.12_profile.txt, d2402_cdh5.13_profile.txt, remote_hdfs_scan_pstack.txt
>
>
> IMPALA-4623 introduced a locking scheme to the file handle cache with 16 buckets; this results in lock contention between IO threads, which limits system throughput.
> Most IO threads end up in one of these stacks.
> {code}
> #0 0x0000000002085d47 in base::internal::SpinLockDelay(int volatile*, int, int) ()
> #1 0x0000000002085c29 in base::SpinLock::SlowLock() ()
> #2 0x00000000010fa76d in impala::io::FileHandleCache<16ul>::GetFileHandle(hdfs_internal* const&, std::string*, long, bool, bool*) ()
> #3 0x00000000010f6e22 in impala::io::DiskIoMgr::GetCachedHdfsFileHandle(hdfs_internal* const&, std::string*, long, impala::io::RequestContext*, bool) ()
> #4 0x00000000010fd514 in impala::io::ScanRange::Open(bool) ()
> #5 0x00000000010f691f in impala::io::DiskIoMgr::ReadRange(impala::io::DiskIoMgr::DiskQueue*, impala::io::RequestContext*, impala::io::ScanRange*) ()
> #6 0x00000000010f6dc4 in impala::io::DiskIoMgr::WorkLoop(impala::io::DiskIoMgr::DiskQueue*) ()
> #7 0x0000000000d13333 in impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*) ()
> #8 0x0000000000d13a74 in boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*), boost::_bi::list4<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::Promise<long>*> > > >::run() ()
> #9 0x000000000128ea3a in thread_proxy ()
> #10 0x00007f49f2bbadc5 in start_thread () from /lib64/libpthread.so.0
> #11 0x00007f49f28e976d in clone () from /lib64/libc.so.6
> {code}
> {code}
> #0 0x0000000002085d47 in base::internal::SpinLockDelay(int volatile*, int, int) ()
> #1 0x0000000002085c29 in base::SpinLock::SlowLock() ()
> #2 0x00000000010f9929 in impala::io::FileHandleCache<16ul>::ReleaseFileHandle(std::string*, impala::io::HdfsFileHandle*, bool) ()
> #3 0x00000000010fe69e in impala::io::ScanRange::Close() ()
> #4 0x00000000010f6565 in impala::io::DiskIoMgr::HandleReadFinished(impala::io::DiskIoMgr::DiskQueue*, impala::io::RequestContext*, std::unique_ptr<impala::io::BufferDescriptor, std::default_delete<impala::io::BufferDescriptor> >) ()
> #5 0x00000000010f695b in impala::io::DiskIoMgr::ReadRange(impala::io::DiskIoMgr::DiskQueue*, impala::io::RequestContext*, impala::io::ScanRange*) ()
> #6 0x00000000010f6dc4 in impala::io::DiskIoMgr::WorkLoop(impala::io::DiskIoMgr::DiskQueue*) ()
> #7 0x0000000000d13333 in impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*) ()
> #8 0x0000000000d13a74 in boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*), boost::_bi::list4<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::Promise<long>*> > > >::run() ()
> #9 0x000000000128ea3a in thread_proxy ()
> #10 0x00007f49f2bbadc5 in start_thread () from /lib64/libpthread.so.0
> #11 0x00007f49f28e976d in clone () from /lib64/libc.so.6
> {code}
> Increasing the number of partitions to 256 made the contention go away; a simple fix would be to make the number of partitions a startup flag and change the default to 256.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)