Posted to commits@impala.apache.org by jo...@apache.org on 2019/02/07 04:47:27 UTC
[impala] 02/02: IMPALA-7265: Enable caching of remote file handles by default
This is an automated email from the ASF dual-hosted git repository.
joemcdonnell pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git
commit 255ec4687ebe6195b20e5566394f3692c07e3b7f
Author: Joe McDonnell <jo...@cloudera.com>
AuthorDate: Wed Feb 6 12:41:23 2019 -0800
IMPALA-7265: Enable caching of remote file handles by default
This changes the default value of cache_remote_file_handles
from false to true. Testing shows that this setting has a
major impact on performance for clusters that do remote HDFS
reads. Hand testing of the cache did not reveal any problems
with the semantics of caching remote file handles.
Change-Id: I2fc4a69c6bf721017f4adcdc302db9eace5135a4
Reviewed-on: http://gerrit.cloudera.org:8080/12387
Reviewed-by: Philip Zeyliger <ph...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
be/src/runtime/io/disk-io-mgr.cc | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/be/src/runtime/io/disk-io-mgr.cc b/be/src/runtime/io/disk-io-mgr.cc
index cad2e65..ce56be0 100644
--- a/be/src/runtime/io/disk-io-mgr.cc
+++ b/be/src/runtime/io/disk-io-mgr.cc
@@ -127,10 +127,9 @@ DEFINE_uint64(unused_file_handle_timeout_sec, 21600, "Maximum time, in seconds,
 DEFINE_uint64(num_file_handle_cache_partitions, 16, "Number of partitions used by the "
     "file handle cache.");
 
-// Given the extra complexity of remote accesses and semantics, caching for remote HDFS
-// file handles is currently not enabled by default. This parameter enables caching
-// for remote HDFS file handles. It does not impact S3, ADLS, or ABFS file handles.
-DEFINE_bool(cache_remote_file_handles, false, "Enable the file handle cache for "
+// This parameter controls whether remote HDFS file handles are cached. It does not impact
+// S3, ADLS, or ABFS file handles. This is enabled by default.
+DEFINE_bool(cache_remote_file_handles, true, "Enable the file handle cache for "
     "remote HDFS files.");
 
 AtomicInt32 DiskIoMgr::next_disk_id_;
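
The effect of flipping the default can be illustrated with a small standalone sketch. This is not Impala's file handle cache: ToyHandleCache, Access(), and the plain bool standing in for the gflags-defined flag are all hypothetical names invented for illustration, and the sketch assumes for simplicity that local files always go through the cache. It only shows how a boolean setting can decide whether repeated reads of a remote file reuse an open handle or reopen the file each time.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical stand-in for the gflags-defined flag in the diff above.
// A plain bool is used so the sketch has no external dependencies.
static bool cache_remote_file_handles = true;

// Toy cache that only counts how often a file would have to be opened.
// Assumption: local files always use the cache; remote files use it only
// when cache_remote_file_handles is true.
class ToyHandleCache {
 public:
  void Access(const std::string& path, bool is_remote) {
    const bool use_cache = !is_remote || cache_remote_file_handles;
    if (use_cache && cached_.count(path) > 0) {
      ++hits_;  // An already-open handle is reused.
      return;
    }
    ++opens_;  // A new handle has to be opened.
    if (use_cache) cached_.insert(path);
  }

  int opens() const { return opens_; }
  int hits() const { return hits_; }

 private:
  std::unordered_set<std::string> cached_;  // Paths with a cached handle.
  int opens_ = 0;
  int hits_ = 0;
};
```

With the flag set to true, two accesses to the same remote path in this sketch cost one open and one cache hit; with it set to false, both accesses open the file again.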