You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Anonymous Coward (Code Review)" <ge...@cloudera.org> on 2023/02/06 12:20:18 UTC

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

18770832838@163.com has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19475


Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................

IMPALA-11886: Data cache should support asynchronous writes

This patch implements asynchronous write to the data cache to improve scan performance when cache miss happens.
Previously, writes to the data cache are synchronized with hdfs file reads, and both are handled by remote hdfs IO threads. In other words, if a cache miss occurs,  the IO thread needs to take additional responsibility for cache writes,  which will lead to scan performance deterioration.
This patch uses a thread pool for asynchronous writes, and the number of threads in the pool is determined by the new configuration 'data_cache_num_write_threads'. In asynchronous write mode, the IO thread only needs to copy data to the temporary buffer when storing data into the data cache. The additional memory consumption caused by temporary buffers can be limited, depending on the new configuration 'data_cache_write_buffer_limit'.

Testing:
- Add test cases for asynchronous data writing to the original DataCacheTest using different number of threads.
- Add DataCacheTest,#OutOfWriteBufferLimit
Used to test the limit of memory consumed by temporary buffers in the case of asynchronous writes

Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M common/thrift/metrics.json
6 files changed, 330 insertions(+), 68 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/1
-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward <18...@163.com>

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
Hello Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19475

to look at the new patch set (#7).

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................

IMPALA-11886: Data cache should support asynchronous writes

This patch implements asynchronous write to the data cache to improve
scan performance when cache miss happens.
Previously, writes to the data cache are synchronized with hdfs file
reads, and both are handled by remote hdfs IO threads. In other words,
if a cache miss occurs,  the IO thread needs to take additional
responsibility for cache writes,  which will lead to scan performance
deterioration.
This patch uses a thread pool for asynchronous writes, and the number of
threads in the pool is determined by the new configuration
'data_cache_num_write_threads'. In asynchronous write mode, the IO
thread only needs to copy data to the temporary buffer when storing data
into the data cache. The additional memory consumption caused by
temporary buffers can be limited, depending on the new configuration
'data_cache_write_buffer_limit'.

Testing:
- Add test cases for asynchronous data writing to the original
DataCacheTest using different number of threads.
- Add DataCacheTest,#OutOfWriteBufferLimit
Used to test the limit of memory consumed by temporary buffers in the
case of asynchronous writes

Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache-trace.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M common/thrift/metrics.json
8 files changed, 330 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/7
-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 7
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
18770832838@163.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 10:

(2 comments)

Thanks again for your code review and suggestions.

http://gerrit.cloudera.org:8080/#/c/19475/10/be/src/runtime/io/data-cache-test.cc
File be/src/runtime/io/data-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/19475/10/be/src/runtime/io/data-cache-test.cc@107
PS10, Line 107:       usleep(10 * 1000);
> One small thing: Since we call this so frequently, it matters a lot to exec
Done


http://gerrit.cloudera.org:8080/#/c/19475/10/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/19475/10/be/src/runtime/io/data-cache.cc@960
PS10, Line 960:     if (buffer_limit < (1 << 23 /* 8MB */)) {
              :       return Status(Substitute("Configured data cache write buffer limit $0 is too small "
              :           "(less than 8MB)", FLAGS_data_cache_async_write_buffer_limit));
              :     }
> Unfortunately, this causes problems for the OutOfWriteBufferLimit test case
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 10
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 Mar 2023 02:57:13 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19475

to look at the new patch set (#4).

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................

IMPALA-11886: Data cache should support asynchronous writes

This patch implements asynchronous write to the data cache to improve
scan performance when cache miss happens.
Previously, writes to the data cache are synchronized with hdfs file
reads, and both are handled by remote hdfs IO threads. In other words,
if a cache miss occurs,  the IO thread needs to take additional
responsibility for cache writes,  which will lead to scan performance
deterioration.
This patch uses a thread pool for asynchronous writes, and the number of
threads in the pool is determined by the new configuration
'data_cache_num_write_threads'. In asynchronous write mode, the IO
thread only needs to copy data to the temporary buffer when storing data
into the data cache. The additional memory consumption caused by
temporary buffers can be limited, depending on the new configuration
'data_cache_write_buffer_limit'.

Testing:
- Add test cases for asynchronous data writing to the original
DataCacheTest using different number of threads.
- Add DataCacheTest,#OutOfWriteBufferLimit
Used to test the limit of memory consumed by temporary buffers in the
case of asynchronous writes

Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M common/thrift/metrics.json
6 files changed, 332 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/4
-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
Hello Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19475

to look at the new patch set (#5).

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................

IMPALA-11886: Data cache should support asynchronous writes

This patch implements asynchronous write to the data cache to improve
scan performance when cache miss happens.
Previously, writes to the data cache are synchronized with hdfs file
reads, and both are handled by remote hdfs IO threads. In other words,
if a cache miss occurs,  the IO thread needs to take additional
responsibility for cache writes,  which will lead to scan performance
deterioration.
This patch uses a thread pool for asynchronous writes, and the number of
threads in the pool is determined by the new configuration
'data_cache_num_write_threads'. In asynchronous write mode, the IO
thread only needs to copy data to the temporary buffer when storing data
into the data cache. The additional memory consumption caused by
temporary buffers can be limited, depending on the new configuration
'data_cache_write_buffer_limit'.

Testing:
- Add test cases for asynchronous data writing to the original
DataCacheTest using different number of threads.
- Add DataCacheTest,#OutOfWriteBufferLimit
Used to test the limit of memory consumed by temporary buffers in the
case of asynchronous writes

Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M common/thrift/metrics.json
6 files changed, 366 insertions(+), 69 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/5
-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 5
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12384/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 5
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Thu, 16 Feb 2023 03:09:17 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12315/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 06 Feb 2023 12:42:17 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
18770832838@163.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 4:

(9 comments)

Thanks for the code review!

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG@24
PS4, Line 24: Testing:
> What circumstances show better performance?

How can I show that? Maybe post a record of the comparative tests? I have no relevant experience, please give me some advice.


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache-test.cc
File be/src/runtime/io/data-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache-test.cc@106
PS4, Line 106:     while (cache.current_buffer_size_.Load() != 0) continue;
> Let's add a short sleep so that this is not spinning.
Done


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h
File be/src/runtime/io/data-cache.h:

http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@271
PS4, Line 271:     uint8_t* buffer_;
> I think it would be cleaner for this to be a unique_ptr.

Did you mean using unique_ptr<uint8_t[]> instead of uint8_t*? I tried but did not reduce the amount of code, did I misunderstand something?


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@203
PS4, Line 203: 
             :   /// The key used for look up in the cache.
             :   struct CacheKey {
             :    public:
             :     explicit CacheKey(const string& filename, int64_t mtime, int64_t offset)
             :       : key_(filename.size() + sizeof(mtime) + sizeof(offset)) {
             :       DCHECK_GE(mtime, 0);
             :       DCHECK_GE(offset, 0);
             :       key_.append(&mtime, sizeof(mtime));
             :       key_.append(&offset, sizeof(offset));
             :       key_.append(filename);
             :     }
             : 
             :     int64_t Hash() const {
             :       return HashUtil::FastHash64(key_.data(), key_.size(), 0);
             :     }
             : 
             :     Slice filename() const {
             :       return Slice(key_.data() + OFFSETOF_FILENAME, key_.size() - OFFSETOF_FILENAME);
             :     }
             : 
             :     int64_t mtime() const {
             :       return UNALIGNED_LOAD64(key_.data() + OFFSETOF_MTIME);
             :     }
             : 
             :     int64_t offset() const {
             :       return UNALIGNED_LOAD64(key_.data() + OFFSETOF_OFFSET);
             :     }
             : 
             :     Slice ToSlice() const {
             :       return key_;
             :     }
             : 
             :    private:
             :     // Key encoding stored in key_:
             :     //
             :     //  int64_t mtime;
             :     //  int64_t offset;
             :     //  <variable length bytes> filename;
             :     static constexpr int OFFSETOF_MTIME = 0;
             :     static constexpr int OFFSETOF_OFFSET = OFFSETOF_MTIME + sizeof(int64_t);
             :     static constexpr int OFFSETOF_FILENAME = OFFSETOF_OFFSET + sizeof(int64_t);
             :     kudu::faststring key_;
             :   };
             : 
             :   /// The class to abstruct store behavior, including copying the buffer and holding it
             :   /// until store complete.
             :   class StoreTask {
             :    public:
             :     /// Creating a store task requires the filename, mtime, offset that constitutes the
             :     /// cache key, and the buffer and length of the cached data is required too. We
             :     /// allocate a new buffer in the constructor and copy the cache data and update
             :     /// total_size which keeps track of the total buffer size allocate by all store tasks.
             :     explicit StoreTask(const std::string& filename, int64_t mtime, int64_t offset,
             :         const uint8_t* buffer, int64_t buffer_len, AtomicInt64& total_size);
             : 
             :     /// When the store task is destroyed, the allocated buffer is freed and total_size is
             :     /// updated.
             :     ~StoreTask();
             : 
             :     const CacheKey& key() const { return key_; }
             :     const uint8_t* buffer() const { return buffer_; }
             :     int64_t buffer_len() const { return buffer_len_; }
             : 
             :    private:
             :     DISALLOW_COPY_AND_ASSIGN(StoreTask);
             : 
             :     CacheKey key_;
             :     uint8_t* buffer_;
             :     int64_t buffer_len_;
             :     AtomicInt64& total_size_;
             :   };
> Small thing: It would be nice to keep these structures defined in
 > data-cache.cc. The way to do that is to have a forward declaration
 > of CacheKey and StoreTask here in data-cache.h, move these
 > definitions to data-cache.cc, then also move the definition of
 > DataCache's constructor and destructor to data-cache.cc.
 > 
 > Here in data-cache.h:
 > explicit DataCache(const std::string config, bool trace_replay =
 > false);
 > 
 > ~DataCache();
 > 
 > Over in data-cache.cc (a good spot would be right above the
 > definition of DataCache::Init):
 > 
 > DataCache::DataCache(const std::string config, bool trace_replay)
 > : config_(config), trace_replay_(trace_replay) { }
 > 
 > DataCache::~DataCache() { ReleaseResources(); }

That would have been nicer, and I was going to do that initially, but declaring StoreTask forward would have caused unique_ptr later to use an incomplete type that wouldn't compile, so I had to bring the definition up here.
Perhaps pImpl can solve this problem, like this:

in data_cache.h:
class StoreTask;
struct StoreTaskHandle {
	StoreTaskHandle();
	~StoreTaskHandle();
	StoreTaskHandle(StoreTaskHandle&& other);
	StoreTaskHandle& operator=(StoreTaskHandle&& other);
	StoreTaskHandle(const StoreTask* task);
	const StoreTask* operator->() const;
	std::unique_ptr<const StoreTask> task;
};

in data_cache.cc:
DataCache::StoreTaskHandle::StoreTaskHandle() = default;
DataCache::StoreTaskHandle::~StoreTaskHandle() = default;
DataCache::StoreTaskHandle::StoreTaskHandle(StoreTaskHandle&& other) = default;
DataCache::StoreTaskHandle& DataCache::StoreTaskHandle::operator=(StoreTaskHandle&& other) = default;
DataCache::StoreTaskHandle::StoreTaskHandle(const StoreTask* task) : task(task) { }
const DataCache::StoreTask* DataCache::StoreTaskHandle::operator->() const { return task.get(); }

But I don't think it's any more aesthetically pleasing.  What do you think?


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.cc@117
PS4, Line 117: DEFINE_int32(data_cache_num_write_threads, 0,
             :     "Number of data cache write threads. Write threads will write the cache "
             :     "asynchronously after IO thread read data, so IO thread will return more quickly. "
             :     "Write threads need to bound the extra memory consumption for holding the temporary "
             :     "buffer though. If this's 0, then write will be synchronous.");
             : DEFINE_string(data_cache_write_buffer_limit, "1GB",
             :     "Limit of the total buffer size used by asynchronous store tasks.");
> For the startup parameter names, it would be good to include
 > "async" in the name so that it is clear that these only apply to
 > the async write functionality. i.e. data_cache_num_async_write_threads
 > and data_cache_async_write_buffer_limit.
 > 
 > A separate question: How would someone decide what the right values
 > are?

I think a good value for the number of threads is the same as data_cache_write_concurrency. This way we can avoid dropping entries due to concurrency limits. These threads are used by multiple partitions. It is also considered to multiply data_cache_write_concurrency with number of partitions as the config value. How do you think?
I don't have any ideas about how to decide buffer size limit. Do you have any suggestions?


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.cc@353
PS4, Line 353:   if (LIKELY(buffer != nullptr)) {
> If the buffer is nullptr, then there is nothing to store. Let's DCHECK that
Done


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.cc@987
PS4, Line 987:     "current buffer size: $0 size limitation: $1 require: $2",
             :     current_buffer_size_.Load(), store_buffer_capacity_, buffer_len);
> Nit: For these two lines, please indent by 4 spaces
Done


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/util/impalad-metrics.cc
File be/src/util/impalad-metrics.cc:

http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/util/impalad-metrics.cc@84
PS4, Line 84: const char* ImpaladMetricKeys::IO_MGR_REMOTE_DATA_CACHE_ACTIVE_BUFFER_BYTES =
            :     "impala-server.io-mgr.remote-data-cache-active-buffer-bytes";
            : const char* ImpaladMetricKeys::IO_MGR_REMOTE_DATA_CACHE_STORE_TASKS_CREATED =
            :     "impala-server.io-mgr.remote-data-cache-store-tasks-created";
            : const char* ImpaladMetricKeys::IO_MGR_REMOTE_DATA_CACHE_OUT_OF_BUFFER_LIMIT_BYTES =
            :     "impala-server.io-mgr.remote-data-cache-out-of-buffer-limit-bytes";
> From the point of view of naming, it would be nice if the names make it cle
Done


http://gerrit.cloudera.org:8080/#/c/19475/4/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/19475/4/common/thrift/metrics.json@661
PS4, Line 661:  
> Tiny thing: can we remove the trailing spaces? (Here and line 671)
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Thu, 16 Feb 2023 03:24:09 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 11:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12674/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 11
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 Mar 2023 03:12:30 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache-test.cc
File be/src/runtime/io/data-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache-test.cc@822
PS7, Line 822:   EXPECT_LT(count, NUM_CACHE_ENTRIES_NO_EVICT);
On our test jobs, this test fails on this assert. I think we might want to use a lower number for data_cache_async_write_threads for this test (e.g. 1 or 2).



-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 7
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Sun, 12 Mar 2023 06:34:53 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19475/8/bin/start-impala-cluster.py
File bin/start-impala-cluster.py:

http://gerrit.cloudera.org:8080/#/c/19475/8/bin/start-impala-cluster.py@409
PS8, Line 409:       
flake8: W293 blank line contains whitespace


http://gerrit.cloudera.org:8080/#/c/19475/8/bin/start-impala-cluster.py@409
PS8, Line 409:       
line has trailing whitespace



-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 8
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Tue, 14 Mar 2023 07:52:29 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
18770832838@163.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG@24
PS4, Line 24: Testing:
> Thank you for your question, I have done some simple performance tests for 
I modified ThreadFn function in data-cache-test.cc and added a timer for writing, like this:
int64_t write_start_time = UnixMicros();
for (int64_t offset : offsets) {
  cache->Store(custom_fname, MTIME, offset, test_buffer() + offset,
      TEMP_BUFFER_SIZE);
}
*write_time_us = UnixMicros() - start_time;



-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 Mar 2023 10:36:42 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12319/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 07 Feb 2023 02:40:21 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 7:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12503/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 7
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 Mar 2023 10:53:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
Hello Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19475

to look at the new patch set (#8).

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................

IMPALA-11886: Data cache should support asynchronous writes

This patch implements asynchronous writes to the data cache to improve
scan performance when a cache miss happens.
Previously, writes to the data cache are synchronous with hdfs file
reads, and both are handled by remote hdfs IO threads. In other words,
if a cache miss occurs,  the IO thread needs to take additional
responsibility for cache writes,  which will lead to scan performance
deterioration.
This patch uses a thread pool for asynchronous writes, and the number of
threads in the pool is determined by the new configuration
'data_cache_num_write_threads'. In asynchronous write mode, the IO
thread only needs to copy data to the temporary buffer when storing data
into the data cache. The additional memory consumption caused by
temporary buffers can be limited, depending on the new configuration
'data_cache_write_buffer_limit'.

Testing:
- Add test cases for asynchronous data writing to the original
DataCacheTest using different number of threads.
- Add DataCacheTest,#OutOfWriteBufferLimit
Used to test the limit of memory consumed by temporary buffers in the
case of asynchronous writes
- Add a timer to the MultiThreadedReadWrite function to get the average
time of multithreaded writes. Here are some test cases and their time
that differ significantly between synchronous and asynchronous:
Test case                | Policy | Sync/Async | write time in ms
MultiThreadedNoMisses    | LRU    | Sync       |   12.20
MultiThreadedNoMisses    | LRU    | Async      |   20.74
MultiThreadedNoMisses    | LIRS   | Sync       |    9.42
MultiThreadedNoMisses    | LIRS   | Async      |   16.75
MultiThreadedWithMisses  | LRU    | Sync       |  510.87
MultiThreadedWithMisses  | LRU    | Async      |   10.06
MultiThreadedWithMisses  | LIRS   | Sync       | 1872.11
MultiThreadedWithMisses  | LIRS   | Async      |   11.02
MultiPartitions          | LRU    | Sync       |    1.20
MultiPartitions          | LRU    | Async      |    5.23
MultiPartitions          | LIRS   | Sync       |    1.26
MultiPartitions          | LIRS   | Async      |    7.91
AccessTraceAnonymization | LRU    | Sync       | 1963.89
AccessTraceAnonymization | LRU    | Sync       | 2073.62
AccessTraceAnonymization | LRU    | Async      |    9.43
AccessTraceAnonymization | LRU    | Async      |   13.13
AccessTraceAnonymization | LIRS   | Sync       | 1663.93
AccessTraceAnonymization | LIRS   | Sync       | 1501.86
AccessTraceAnonymization | LIRS   | Async      |   12.83
AccessTraceAnonymization | LIRS   | Async      |   12.74

Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache-trace.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M bin/run-all-tests.sh
M bin/start-impala-cluster.py
M common/thrift/metrics.json
10 files changed, 427 insertions(+), 52 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/8
-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 8
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
Hello Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19475

to look at the new patch set (#11).

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................

IMPALA-11886: Data cache should support asynchronous writes

This patch implements asynchronous writes to the data cache to improve
scan performance when a cache miss happens.
Previously, writes to the data cache are synchronous with hdfs file
reads, and both are handled by remote hdfs IO threads. In other words,
if a cache miss occurs,  the IO thread needs to take additional
responsibility for cache writes,  which will lead to scan performance
deterioration.
This patch uses a thread pool for asynchronous writes, and the number of
threads in the pool is determined by the new configuration
'data_cache_num_write_threads'. In asynchronous write mode, the IO
thread only needs to copy data to the temporary buffer when storing data
into the data cache. The additional memory consumption caused by
temporary buffers can be limited, depending on the new configuration
'data_cache_write_buffer_limit'.

Testing:
- Add test cases for asynchronous data writing to the original
DataCacheTest using different number of threads.
- Add DataCacheTest,#OutOfWriteBufferLimit
Used to test the limit of memory consumed by temporary buffers in the
case of asynchronous writes
- Add a timer to the MultiThreadedReadWrite function to get the average
time of multithreaded writes. Here are some test cases and their time
that differ significantly between synchronous and asynchronous:
Test case                | Policy | Sync/Async | write time in ms
MultiThreadedNoMisses    | LRU    | Sync       |   12.20
MultiThreadedNoMisses    | LRU    | Async      |   20.74
MultiThreadedNoMisses    | LIRS   | Sync       |    9.42
MultiThreadedNoMisses    | LIRS   | Async      |   16.75
MultiThreadedWithMisses  | LRU    | Sync       |  510.87
MultiThreadedWithMisses  | LRU    | Async      |   10.06
MultiThreadedWithMisses  | LIRS   | Sync       | 1872.11
MultiThreadedWithMisses  | LIRS   | Async      |   11.02
MultiPartitions          | LRU    | Sync       |    1.20
MultiPartitions          | LRU    | Async      |    5.23
MultiPartitions          | LIRS   | Sync       |    1.26
MultiPartitions          | LIRS   | Async      |    7.91
AccessTraceAnonymization | LRU    | Sync       | 1963.89
AccessTraceAnonymization | LRU    | Sync       | 2073.62
AccessTraceAnonymization | LRU    | Async      |    9.43
AccessTraceAnonymization | LRU    | Async      |   13.13
AccessTraceAnonymization | LIRS   | Sync       | 1663.93
AccessTraceAnonymization | LIRS   | Sync       | 1501.86
AccessTraceAnonymization | LIRS   | Async      |   12.83
AccessTraceAnonymization | LIRS   | Async      |   12.74

Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache-trace.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M bin/run-all-tests.sh
M bin/start-impala-cluster.py
M common/thrift/metrics.json
10 files changed, 427 insertions(+), 52 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/11
-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 11
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache-test.cc
File be/src/runtime/io/data-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache-test.cc@104
PS1, Line 104:   /// before any lookup when running test case in async write mode. 
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.h
File be/src/runtime/io/data-cache.h:

http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.h@499
PS1, Line 499:   /// Limit of the total buffer size used by asynchronous store tasks, when the current 
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.cc@117
PS1, Line 117: DEFINE_int32(data_cache_num_write_threads, 0, 
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.cc@122
PS1, Line 122: DEFINE_string(data_cache_write_buffer_limit, "1GB", 
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.cc@895
PS1, Line 895:     int64_t buffer_limit = ParseUtil::ParseMemSpec(FLAGS_data_cache_write_buffer_limit, 
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/util/impalad-metrics.cc
File be/src/util/impalad-metrics.cc:

http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/util/impalad-metrics.cc@84
PS1, Line 84: const char* ImpaladMetricKeys::IO_MGR_REMOTE_DATA_CACHE_ACTIVE_BUFFER_BYTES = 
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/util/impalad-metrics.cc@86
PS1, Line 86: const char* ImpaladMetricKeys::IO_MGR_REMOTE_DATA_CACHE_STORE_TASKS_CREATED = 
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/util/impalad-metrics.cc@88
PS1, Line 88: const char* ImpaladMetricKeys::IO_MGR_REMOTE_DATA_CACHE_OUT_OF_BUFFER_LIMIT_BYTES = 
line has trailing whitespace



-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 06 Feb 2023 12:21:10 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 4:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG@24
PS4, Line 24: Testing:
What circumstances show better performance?


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache-test.cc
File be/src/runtime/io/data-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache-test.cc@106
PS4, Line 106:     while (cache.current_buffer_size_.Load() != 0) continue;
Let's add a short sleep so that this is not spinning.


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h
File be/src/runtime/io/data-cache.h:

http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@271
PS4, Line 271:     uint8_t* buffer_;
I think it would be cleaner for this to be a unique_ptr.


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@203
PS4, Line 203: 
             :   /// The key used for look up in the cache.
             :   struct CacheKey {
             :    public:
             :     explicit CacheKey(const string& filename, int64_t mtime, int64_t offset)
             :       : key_(filename.size() + sizeof(mtime) + sizeof(offset)) {
             :       DCHECK_GE(mtime, 0);
             :       DCHECK_GE(offset, 0);
             :       key_.append(&mtime, sizeof(mtime));
             :       key_.append(&offset, sizeof(offset));
             :       key_.append(filename);
             :     }
             : 
             :     int64_t Hash() const {
             :       return HashUtil::FastHash64(key_.data(), key_.size(), 0);
             :     }
             : 
             :     Slice filename() const {
             :       return Slice(key_.data() + OFFSETOF_FILENAME, key_.size() - OFFSETOF_FILENAME);
             :     }
             : 
             :     int64_t mtime() const {
             :       return UNALIGNED_LOAD64(key_.data() + OFFSETOF_MTIME);
             :     }
             : 
             :     int64_t offset() const {
             :       return UNALIGNED_LOAD64(key_.data() + OFFSETOF_OFFSET);
             :     }
             : 
             :     Slice ToSlice() const {
             :       return key_;
             :     }
             : 
             :    private:
             :     // Key encoding stored in key_:
             :     //
             :     //  int64_t mtime;
             :     //  int64_t offset;
             :     //  <variable length bytes> filename;
             :     static constexpr int OFFSETOF_MTIME = 0;
             :     static constexpr int OFFSETOF_OFFSET = OFFSETOF_MTIME + sizeof(int64_t);
             :     static constexpr int OFFSETOF_FILENAME = OFFSETOF_OFFSET + sizeof(int64_t);
             :     kudu::faststring key_;
             :   };
             : 
             :   /// The class to abstruct store behavior, including copying the buffer and holding it
             :   /// until store complete.
             :   class StoreTask {
             :    public:
             :     /// Creating a store task requires the filename, mtime, offset that constitutes the
             :     /// cache key, and the buffer and length of the cached data is required too. We
             :     /// allocate a new buffer in the constructor and copy the cache data and update
             :     /// total_size which keeps track of the total buffer size allocate by all store tasks.
             :     explicit StoreTask(const std::string& filename, int64_t mtime, int64_t offset,
             :         const uint8_t* buffer, int64_t buffer_len, AtomicInt64& total_size);
             : 
             :     /// When the store task is destroyed, the allocated buffer is freed and total_size is
             :     /// updated.
             :     ~StoreTask();
             : 
             :     const CacheKey& key() const { return key_; }
             :     const uint8_t* buffer() const { return buffer_; }
             :     int64_t buffer_len() const { return buffer_len_; }
             : 
             :    private:
             :     DISALLOW_COPY_AND_ASSIGN(StoreTask);
             : 
             :     CacheKey key_;
             :     uint8_t* buffer_;
             :     int64_t buffer_len_;
             :     AtomicInt64& total_size_;
             :   };
Small thing: It would be nice to keep these structures defined in data-cache.cc. The way to do that is to have a forward declaration of CacheKey and StoreTask here in data-cache.h, move these definitions to data-cache.cc, then also move the definition of DataCache's constructor and destructor to data-cache.cc.

Here in data-cache.h:
  explicit DataCache(const std::string config, bool trace_replay = false);

  ~DataCache();

Over in data-cache.cc (a good spot would be right above the definition of DataCache::Init):

DataCache::DataCache(const std::string config, bool trace_replay)
  : config_(config), trace_replay_(trace_replay) { }

DataCache::~DataCache() { ReleaseResources(); }


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.cc@117
PS4, Line 117: DEFINE_int32(data_cache_num_write_threads, 0,
             :     "Number of data cache write threads. Write threads will write the cache "
             :     "asynchronously after IO thread read data, so IO thread will return more quickly. "
             :     "Write threads need to bound the extra memory consumption for holding the temporary "
             :     "buffer though. If this's 0, then write will be synchronous.");
             : DEFINE_string(data_cache_write_buffer_limit, "1GB",
             :     "Limit of the total buffer size used by asynchronous store tasks.");
For the startup parameter names, it would be good to include "async" in the name so that it is clear that these only apply to the async write functionality. i.e. data_cache_num_async_write_threads and data_cache_async_write_buffer_limit.

A separate question: How would someone decide what the right values are?


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.cc@353
PS4, Line 353:   if (LIKELY(buffer != nullptr)) {
If the buffer is nullptr, then there is nothing to store. Let's DCHECK that buffer is not nullptr.


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.cc@354
PS4, Line 354:     // TODO: Should we use memory pool instead of piecemeal memory allocate?
             :     buffer_ = new uint8_t[buffer_len];
We are going to want to account for the memory usage via a MemTracker. I have a change that I have been working on that adds a MemTracker to DataCache. I will post that and add a link here.


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.cc@987
PS4, Line 987:     "current buffer size: $0 size limitation: $1 require: $2",
             :     current_buffer_size_.Load(), store_buffer_capacity_, buffer_len);
Nit: For these two lines, please indent by 4 spaces


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/util/impalad-metrics.cc
File be/src/util/impalad-metrics.cc:

http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/util/impalad-metrics.cc@84
PS4, Line 84: const char* ImpaladMetricKeys::IO_MGR_REMOTE_DATA_CACHE_ACTIVE_BUFFER_BYTES =
            :     "impala-server.io-mgr.remote-data-cache-active-buffer-bytes";
            : const char* ImpaladMetricKeys::IO_MGR_REMOTE_DATA_CACHE_STORE_TASKS_CREATED =
            :     "impala-server.io-mgr.remote-data-cache-store-tasks-created";
            : const char* ImpaladMetricKeys::IO_MGR_REMOTE_DATA_CACHE_OUT_OF_BUFFER_LIMIT_BYTES =
            :     "impala-server.io-mgr.remote-data-cache-out-of-buffer-limit-bytes";
From the point of view of naming, it would be nice if the names make it clear that this is related to async stores.

Some suggestions:
remote-data-cache-active-buffer-bytes -> remote-data-cache-async-stores-outstanding-bytes
remote-data-cache-store-tasks-created -> remote-data-cache-num-async-stores
remote-data-cache-out-of-buffer-limit-bytes -> remote-data-cache-async-stores-dropped-bytes

I think it would make sense to have a remote-data-cache-async-stores-dropped metric as well (i.e. the count of drops vs the number of dropped bytes).


http://gerrit.cloudera.org:8080/#/c/19475/4/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/19475/4/common/thrift/metrics.json@661
PS4, Line 661:  
Tiny thing: can we remove the trailing spaces? (Here and line 671)



-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Tue, 14 Feb 2023 20:56:33 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
Hello Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19475

to look at the new patch set (#6).

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................

IMPALA-11886: Data cache should support asynchronous writes

This patch implements asynchronous write to the data cache to improve
scan performance when cache miss happens.
Previously, writes to the data cache are synchronized with hdfs file
reads, and both are handled by remote hdfs IO threads. In other words,
if a cache miss occurs,  the IO thread needs to take additional
responsibility for cache writes,  which will lead to scan performance
deterioration.
This patch uses a thread pool for asynchronous writes, and the number of
threads in the pool is determined by the new configuration
'data_cache_num_write_threads'. In asynchronous write mode, the IO
thread only needs to copy data to the temporary buffer when storing data
into the data cache. The additional memory consumption caused by
temporary buffers can be limited, depending on the new configuration
'data_cache_write_buffer_limit'.

Testing:
- Add test cases for asynchronous data writing to the original
DataCacheTest using different number of threads.
- Add DataCacheTest,#OutOfWriteBufferLimit
Used to test the limit of memory consumed by temporary buffers in the
case of asynchronous writes

Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache-trace.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M common/thrift/metrics.json
8 files changed, 330 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/6
-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 6
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
Hello Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19475

to look at the new patch set (#9).

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................

IMPALA-11886: Data cache should support asynchronous writes

This patch implements asynchronous writes to the data cache to improve
scan performance when a cache miss happens.
Previously, writes to the data cache are synchronous with hdfs file
reads, and both are handled by remote hdfs IO threads. In other words,
if a cache miss occurs,  the IO thread needs to take additional
responsibility for cache writes,  which will lead to scan performance
deterioration.
This patch uses a thread pool for asynchronous writes, and the number of
threads in the pool is determined by the new configuration
'data_cache_num_write_threads'. In asynchronous write mode, the IO
thread only needs to copy data to the temporary buffer when storing data
into the data cache. The additional memory consumption caused by
temporary buffers can be limited, depending on the new configuration
'data_cache_write_buffer_limit'.

Testing:
- Add test cases for asynchronous data writing to the original
DataCacheTest using different number of threads.
- Add DataCacheTest,#OutOfWriteBufferLimit
Used to test the limit of memory consumed by temporary buffers in the
case of asynchronous writes
- Add a timer to the MultiThreadedReadWrite function to get the average
time of multithreaded writes. Here are some test cases and their time
that differ significantly between synchronous and asynchronous:
Test case                | Policy | Sync/Async | write time in ms
MultiThreadedNoMisses    | LRU    | Sync       |   12.20
MultiThreadedNoMisses    | LRU    | Async      |   20.74
MultiThreadedNoMisses    | LIRS   | Sync       |    9.42
MultiThreadedNoMisses    | LIRS   | Async      |   16.75
MultiThreadedWithMisses  | LRU    | Sync       |  510.87
MultiThreadedWithMisses  | LRU    | Async      |   10.06
MultiThreadedWithMisses  | LIRS   | Sync       | 1872.11
MultiThreadedWithMisses  | LIRS   | Async      |   11.02
MultiPartitions          | LRU    | Sync       |    1.20
MultiPartitions          | LRU    | Async      |    5.23
MultiPartitions          | LIRS   | Sync       |    1.26
MultiPartitions          | LIRS   | Async      |    7.91
AccessTraceAnonymization | LRU    | Sync       | 1963.89
AccessTraceAnonymization | LRU    | Sync       | 2073.62
AccessTraceAnonymization | LRU    | Async      |    9.43
AccessTraceAnonymization | LRU    | Async      |   13.13
AccessTraceAnonymization | LIRS   | Sync       | 1663.93
AccessTraceAnonymization | LIRS   | Sync       | 1501.86
AccessTraceAnonymization | LIRS   | Async      |   12.83
AccessTraceAnonymization | LIRS   | Async      |   12.74

Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache-trace.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M bin/run-all-tests.sh
M bin/start-impala-cluster.py
M common/thrift/metrics.json
10 files changed, 427 insertions(+), 52 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/9
-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 9
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 7:

(13 comments)

This is interesting, and I think we should go forward with merging this. I have a few code organization comments, but I think this is getting very close.

I am running our test jobs with the data cache enabled and async writes turned on.

To make that easier in future, can you add code in bin/run-all-tests.sh that defines a new environment variable similar to the one we have for DATA_CACHE_EVICTION_POLICY (e.g. DATA_CACHE_ASYNC_WRITE_THREADS)? See the current code we have here:
https://github.com/apache/impala/blob/master/bin/run-all-tests.sh#L67
https://github.com/apache/impala/blob/master/bin/run-all-tests.sh#L82-L85
That let's us run our full test job with this enabled.

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG@24
PS4, Line 24: Testing:
> Thank you for your question, I have done some simple performance tests for 
Thank you for the performance numbers. This is interesting enough that we should go ahead and plan on checking this in. Can you incorporate a few of the interesting performance results into the commit message?


http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@9
PS7, Line 9: write
Nit: writes


http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@10
PS7, Line 10: when cache miss happens
Nit: "when a cache miss happens"


http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@11
PS7, Line 11: synchronized
Nit: synchronous


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@128
PS7, Line 128: const int MAX_STORE_TASK_QUEUE_SIZE = 1 << 20;
Let's add a comment here saying that this large value for the queue size is harmless because the total size of the entries on the queue are bound by the data_cache_async_write_bufer_limit.


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@385
PS7, Line 385: abstruct
Nit: abstract


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@393
PS7, Line 393:   explicit StoreTask(const std::string& filename, int64_t mtime, int64_t offset,
             :       const uint8_t* buffer, int64_t buffer_len, AtomicInt64& total_size)
             :     : key_(filename, mtime, offset),
I would like to make StoreTask a simple struct with minimal logic, and move the counter update / buffer allocation outside of this class.

Basically, DataCache::SubmitStoreTask() would:
 - Increment the counter atomically with the CompareAndSwap() subject to the limit
 - Allocate the memory buffer and copy into it
 - Increment the async writes outstanding bytes
 - Construct the StoreTask passing in the new buffer and all the other data needed (but not the total_size AtomicInt64 that we have today)
 - Unlike today, StoreTask would take in a pointer to the DataCache object.

Then, we would add a DataCache::CompleteStoreTask(const StoreTask& store_task), which does the work that the StoreTask destructor does today (decrementing the counters).

StoreTask becomes a simple holder of data. StoreTask's destructor calls CompleteStoreTask(this) on the DataCache pointer that was passed into the constructor.

Having the allocation logic in DataCache will make it easier for us to hook up the memory tracking.


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@975
PS7, Line 975:     if (buffer_limit < PAGE_SIZE) {
             :       return Status(Substitute("Configured data cache write buffer limit $0 is too small",
             :           FLAGS_data_cache_async_write_buffer_limit));
             :     }
Let's require the limit to be higher. I think the minimum should be 8MB. If someone configures this to a value that is smaller than most IOs, then nothing will ever get cached. Most IOs we do are smaller than 8MB.


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@1059
PS7, Line 1059:   if (UNLIKELY(current_buffer_size_.Load() + buffer_len > store_buffer_capacity_)) {
              :     VLOG(2) << Substitute("Failed to create store task due to buffer size limitation, "
              :         "current buffer size: $0 size limitation: $1 require: $2",
              :         current_buffer_size_.Load(), store_buffer_capacity_, buffer_len);
              :     ImpaladMetrics::IO_MGR_REMOTE_DATA_CACHE_ASYNC_WRITES_DROPPED_BYTES->
              :         Increment(buffer_len);
              :     ImpaladMetrics::IO_MGR_REMOTE_DATA_CACHE_ASYNC_WRITES_DROPPED_ENTRIES->Increment(1);
              :     return false;
              :   }
This would become a CompareAndSwap loop where either we are at the limit and return false or we successfully bump the current_buffer_size_.


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@1069
PS7, Line 1069:   const StoreTask* task = new StoreTask(filename, mtime, offset, buffer, buffer_len,
              :       current_buffer_size_);
Move the logic from the StoreTask constructor (incrementing counters, allocating the buffer, copying the data) here and then pass in the new buffer directly to StoreTask. StoreTask also gets a pointer to this DataCache object.


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@1074
PS7, Line 1074: 
We would also add a CompleteStoreTask() function here that would be called from StoreTask's destructor (and do the decrements).


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/disk-io-mgr.cc@80
PS7, Line 80: Write threads need to bound the extra memory consumption for holding the "
            :     "temporary buffer though.
Nit: Let's change this sentence to say that the extra memory for temporary buffers is limited by data_cache_async_write_buffer_limit.


http://gerrit.cloudera.org:8080/#/c/19475/7/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/19475/7/common/thrift/metrics.json@653
PS7, Line 653: bytes async
Nit: "bytes of async"



-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 7
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Sat, 11 Mar 2023 21:12:36 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19475

to look at the new patch set (#3).

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................

IMPALA-11886: Data cache should support asynchronous writes

This patch implements asynchronous write to the data cache to improve
scan performance when cache miss happens.
Previously, writes to the data cache are synchronized with hdfs file
reads, and both are handled by remote hdfs IO threads. In other words,
if a cache miss occurs,  the IO thread needs to take additional
responsibility for cache writes,  which will lead to scan performance
deterioration.
This patch uses a thread pool for asynchronous writes, and the number of
threads in the pool is determined by the new configuration
'data_cache_num_write_threads'. In asynchronous write mode, the IO
thread only needs to copy data to the temporary buffer when storing data
into the data cache. The additional memory consumption caused by
temporary buffers can be limited, depending on the new configuration
'data_cache_write_buffer_limit'.

Testing:
- Add test cases for asynchronous data writing to the original
DataCacheTest using different number of threads.
- Add DataCacheTest,#OutOfWriteBufferLimit
Used to test the limit of memory consumed by temporary buffers in the
case of asynchronous writes

Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M common/thrift/metrics.json
6 files changed, 330 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/3
-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
18770832838@163.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG@24
PS4, Line 24: Testing:
> I'm not convinced that this change will improve performance noticeably. Whe
Thank you for your question, I have done some simple performance tests for this. Firstly, I modified ThreadFn function in data-cache-test.cc and added a timer for writing, like this:X Then I calculated the average writing time for each test case. Some of the tests had similar results between synchronous and asynchronous; while there were obvious differences in some cases as shown in the following table:
Test case                | Policy | Sync/Async | write time(ms)
MultiThreadedNoMisses    | LRU    | Sync       |   12.20
MultiThreadedNoMisses    | LRU    | Async      |   20.74
MultiThreadedNoMisses    | LIRS   | Sync       |    9.42
MultiThreadedNoMisses    | LIRS   | Async      |   16.75

MultiThreadedWithMisses  | LRU    | Sync       |  510.87
MultiThreadedWithMisses  | LRU    | Async      |   10.06
MultiThreadedWithMisses  | LIRS   | Sync       | 1872.11
MultiThreadedWithMisses  | LIRS   | Async      |   11.02

MultiPartitions          | LRU    | Sync       |    1.20
MultiPartitions          | LRU    | Async      |    5.23
MultiPartitions          | LIRS   | Sync       |    1.26
MultiPartitions          | LIRS   | Async      |    7.91

AccessTraceAnonymization | LRU    | Sync       | 1963.89
AccessTraceAnonymization | LRU    | Sync       | 2073.62
AccessTraceAnonymization | LRU    | Async      |    9.43
AccessTraceAnonymization | LRU    | Async      |   13.13
AccessTraceAnonymization | LIRS   | Sync       | 1663.93
AccessTraceAnonymization | LIRS   | Sync       | 1501.86
AccessTraceAnonymization | LIRS   | Async      |   12.83
AccessTraceAnonymization | LIRS   | Async      |   12.74

I also conducted some practical application tests on the HDFS data of the production environment. I selected some queries that were bottlenecks for HDFS scans and executed them several times on the newly started cluster, and recorded the time consumption (in seconds) for the HDFS scans. The results are as follows:
Test case | M/H   | No DataCache | Sync DataCache | Async DataCache

case1     | MISS  | 29.06        | 89.00          | 34.15
          | HIT   | 25.23        | 23.13          |
          | HIT   | 30.49        | 22.75          |

case2     | MISS  |  6.92        |  9.63          |  6.33 
          | HIT   |  5.19        |  4.04          |      
          | HIT   |  6.66        |  3.40          |   
         
case3     | MISS  | 11.96        | 36.73          | 17.41
          | HIT   |  7.94        |  5.75          |   
          | HIT   | 15.82        |  5.73          |    

From the test result, I think asynchronous writing can still improve the performance of scanning when cache miss, especially when there is cache entry eviction.


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h
File be/src/runtime/io/data-cache.h:

http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@271
PS4, Line 271:     uint8_t* buffer_;
> Yes, this should be unique_ptr<uint8_t[]>. It will have a similar amount of
Done


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@203
PS4, Line 203: 
             :   /// The key used for look up in the cache.
             :   struct CacheKey {
             :    public:
             :     explicit CacheKey(const string& filename, int64_t mtime, int64_t offset)
             :       : key_(filename.size() + sizeof(mtime) + sizeof(offset)) {
             :       DCHECK_GE(mtime, 0);
             :       DCHECK_GE(offset, 0);
             :       key_.append(&mtime, sizeof(mtime));
             :       key_.append(&offset, sizeof(offset));
             :       key_.append(filename);
             :     }
             : 
             :     int64_t Hash() const {
             :       return HashUtil::FastHash64(key_.data(), key_.size(), 0);
             :     }
             : 
             :     Slice filename() const {
             :       return Slice(key_.data() + OFFSETOF_FILENAME, key_.size() - OFFSETOF_FILENAME);
             :     }
             : 
             :     int64_t mtime() const {
             :       return UNALIGNED_LOAD64(key_.data() + OFFSETOF_MTIME);
             :     }
             : 
             :     int64_t offset() const {
             :       return UNALIGNED_LOAD64(key_.data() + OFFSETOF_OFFSET);
             :     }
             : 
             :     Slice ToSlice() const {
             :       return key_;
             :     }
             : 
             :    private:
             :     // Key encoding stored in key_:
             :     //
             :     //  int64_t mtime;
             :     //  int64_t offset;
             :     //  <variable length bytes> filename;
             :     static constexpr int OFFSETOF_MTIME = 0;
             :     static constexpr int OFFSETOF_OFFSET = OFFSETOF_MTIME + sizeof(int64_t);
             :     static constexpr int OFFSETOF_FILENAME = OFFSETOF_OFFSET + sizeof(int64_t);
             :     kudu::faststring key_;
             :   };
             : 
             :   /// The class to abstruct store behavior, including copying the buffer and holding it
             :   /// until store complete.
             :   class StoreTask {
             :    public:
             :     /// Creating a store task requires the filename, mtime, offset that constitutes the
             :     /// cache key, and the buffer and length of the cached data is required too. We
             :     /// allocate a new buffer in the constructor and copy the cache data and update
             :     /// total_size which keeps track of the total buffer size allocate by all store tasks.
             :     explicit StoreTask(const std::string& filename, int64_t mtime, int64_t offset,
             :         const uint8_t* buffer, int64_t buffer_len, AtomicInt64& total_size);
             : 
             :     /// When the store task is destroyed, the allocated buffer is freed and total_size is
             :     /// updated.
             :     ~StoreTask();
             : 
             :     const CacheKey& key() const { return key_; }
             :     const uint8_t* buffer() const { return buffer_; }
             :     int64_t buffer_len() const { return buffer_len_; }
             : 
             :    private:
             :     DISALLOW_COPY_AND_ASSIGN(StoreTask);
             : 
             :     CacheKey key_;
             :     uint8_t* buffer_;
             :     int64_t buffer_len_;
             :     AtomicInt64& total_size_;
             :   };
> The reason the compiler gives an error about the incomplete type is because
You are right, I forgot to move the c'tor and d'tor of DataCache so it failed to compile. Now it has been fixed.


http://gerrit.cloudera.org:8080/#/c/19475/5/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/19475/5/be/src/runtime/io/data-cache.cc@117
PS5, Line 117: DEFINE_int32(data_cache_num_write_threads, 0,
             :     "Number of data cache write threads. Write threads will write the cache "
             :     "asynchronously after IO thread read data, so IO thread will return more quickly. "
             :     "Write threads need to bound the extra memory consumption for holding the temporary "
             :     "buffer though. If this's 0, then write will be synchronous.");
             : DEFINE_string(data_cache_write_buffer_limit, "1GB",
             :     "Limit of the total buffer size used by asynchronous store tasks.");
> Let's describe these options as experimental by adding an "(Experimental)" 
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 Mar 2023 10:31:57 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 9:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12620/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 9
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Tue, 14 Mar 2023 08:15:43 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 8:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12619/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 8
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Tue, 14 Mar 2023 08:11:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9171/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 12
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 Mar 2023 03:22:50 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG@24
PS4, Line 24: Testing:
> > What circumstances show better performance?
I'm not convinced that this change will improve performance noticeably. When the thread writes to the data cache, the OS should buffer the write in memory, and the writer thread should not need to wait for the actual write to the underlying IO device. That may change when we start to use Direct IO for the data cache (IMPALA-11905).

I'm looking for some circumstance where data_cache_num_write_threads=N performs better than data_cache_num_write_threads=0. IMPALA-11886 mentions that performance deteriorates when using the data cache. I would like to see concrete numbers about how much this code change helps.

In theory, that would apply to the first run of a query when the cache is empty. It could also apply to workloads where the data cache is too small and the cache is being thrashed (e.g. a data cache of 1GB when there is a 5GB working set).


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h
File be/src/runtime/io/data-cache.h:

http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@271
PS4, Line 271:     uint8_t* buffer_;
> > I think it would be cleaner for this to be a unique_ptr.
Yes, this should be unique_ptr<uint8_t[]>. It will have a similar amount of code, but it does automatic cleanup of the buffer when the StoreTask is deleted. For example, if the buffer_ is nullptr, unique_ptr knows not to try to call free on it.

Impala's code style prefers to use unique_ptr (or other smart pointers) wherever possible to avoid dealing with manual allocations / frees.

https://google.github.io/styleguide/cppguide.html#Ownership_and_Smart_Pointers


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@203
PS4, Line 203: 
             :   /// The key used for look up in the cache.
             :   struct CacheKey {
             :    public:
             :     explicit CacheKey(const string& filename, int64_t mtime, int64_t offset)
             :       : key_(filename.size() + sizeof(mtime) + sizeof(offset)) {
             :       DCHECK_GE(mtime, 0);
             :       DCHECK_GE(offset, 0);
             :       key_.append(&mtime, sizeof(mtime));
             :       key_.append(&offset, sizeof(offset));
             :       key_.append(filename);
             :     }
             : 
             :     int64_t Hash() const {
             :       return HashUtil::FastHash64(key_.data(), key_.size(), 0);
             :     }
             : 
             :     Slice filename() const {
             :       return Slice(key_.data() + OFFSETOF_FILENAME, key_.size() - OFFSETOF_FILENAME);
             :     }
             : 
             :     int64_t mtime() const {
             :       return UNALIGNED_LOAD64(key_.data() + OFFSETOF_MTIME);
             :     }
             : 
             :     int64_t offset() const {
             :       return UNALIGNED_LOAD64(key_.data() + OFFSETOF_OFFSET);
             :     }
             : 
             :     Slice ToSlice() const {
             :       return key_;
             :     }
             : 
             :    private:
             :     // Key encoding stored in key_:
             :     //
             :     //  int64_t mtime;
             :     //  int64_t offset;
             :     //  <variable length bytes> filename;
             :     static constexpr int OFFSETOF_MTIME = 0;
             :     static constexpr int OFFSETOF_OFFSET = OFFSETOF_MTIME + sizeof(int64_t);
             :     static constexpr int OFFSETOF_FILENAME = OFFSETOF_OFFSET + sizeof(int64_t);
             :     kudu::faststring key_;
             :   };
             : 
             :   /// The class to abstruct store behavior, including copying the buffer and holding it
             :   /// until store complete.
             :   class StoreTask {
             :    public:
             :     /// Creating a store task requires the filename, mtime, offset that constitutes the
             :     /// cache key, and the buffer and length of the cached data is required too. We
             :     /// allocate a new buffer in the constructor and copy the cache data and update
             :     /// total_size which keeps track of the total buffer size allocate by all store tasks.
             :     explicit StoreTask(const std::string& filename, int64_t mtime, int64_t offset,
             :         const uint8_t* buffer, int64_t buffer_len, AtomicInt64& total_size);
             : 
             :     /// When the store task is destroyed, the allocated buffer is freed and total_size is
             :     /// updated.
             :     ~StoreTask();
             : 
             :     const CacheKey& key() const { return key_; }
             :     const uint8_t* buffer() const { return buffer_; }
             :     int64_t buffer_len() const { return buffer_len_; }
             : 
             :    private:
             :     DISALLOW_COPY_AND_ASSIGN(StoreTask);
             : 
             :     CacheKey key_;
             :     uint8_t* buffer_;
             :     int64_t buffer_len_;
             :     AtomicInt64& total_size_;
             :   };
> > Small thing: It would be nice to keep these structures defined in
The reason the compiler gives an error about the incomplete type is because the constructor and destructor need to know the size of the type. Moving them to the .cc file eliminates the need to know the size in the .h file, so it can use a forward declaration.

The change I suggested allows the code to compile correctly.


http://gerrit.cloudera.org:8080/#/c/19475/5/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/19475/5/be/src/runtime/io/data-cache.cc@117
PS5, Line 117: DEFINE_int32(data_cache_num_async_write_threads, 0,
             :     "Number of data cache async write threads. Write threads will write the cache "
             :     "asynchronously after IO thread read data, so IO thread will return more quickly. "
             :     "Write threads need to bound the extra memory consumption for holding the temporary "
             :     "buffer though. If this's 0, then write will be synchronous.");
             : DEFINE_string(data_cache_async_write_buffer_limit, "1GB",
             :     "Limit of the total buffer size used by asynchronous store tasks.");
Let's describe these options as experimental by adding an "(Experimental)" prefix to the description.

Impala is not officially supporting these startup options yet, and we need flexibility to remove them later if there is a different design for async IO that we prefer.



-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 5
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Wed, 22 Feb 2023 21:42:17 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 10: Code-Review+1

(2 comments)

This is looking good. I only have a couple small things, then I think we are good to go. I will kick off another round of tests.

Thank you for your patience on this review.

http://gerrit.cloudera.org:8080/#/c/19475/10/be/src/runtime/io/data-cache-test.cc
File be/src/runtime/io/data-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/19475/10/be/src/runtime/io/data-cache-test.cc@107
PS10, Line 107:       usleep(10 * 1000);
One small thing: Since we call this so frequently, it matters a lot to execution time of these tests. In my experiments, these tests run much faster if we bring this down to 500 vs 10000.


http://gerrit.cloudera.org:8080/#/c/19475/10/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/19475/10/be/src/runtime/io/data-cache.cc@960
PS10, Line 960:     if (buffer_limit < (1 << 23 /* 8MB */)) {
              :       return Status(Substitute("Configured data cache write buffer limit $0 is too small "
              :           "(less than 8MB)", FLAGS_data_cache_async_write_buffer_limit));
              :     }
Unfortunately, this causes problems for the OutOfWriteBufferLimit test case, because it uses a smaller buffer. As a workaround, let's disable this check for backend tests:

if (!TestInfo::is_be_test() && buffer_limit < (1 << 23 /* 8MB */)) {



-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 10
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Tue, 21 Mar 2023 21:13:17 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 12: Code-Review+2

My internal test run looks good, this looks ready


-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 12
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 Mar 2023 03:22:22 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................

IMPALA-11886: Data cache should support asynchronous writes

This patch implements asynchronous writes to the data cache to improve
scan performance when a cache miss happens.
Previously, writes to the data cache are synchronous with hdfs file
reads, and both are handled by remote hdfs IO threads. In other words,
if a cache miss occurs,  the IO thread needs to take additional
responsibility for cache writes,  which will lead to scan performance
deterioration.
This patch uses a thread pool for asynchronous writes, and the number of
threads in the pool is determined by the new configuration
'data_cache_num_write_threads'. In asynchronous write mode, the IO
thread only needs to copy data to the temporary buffer when storing data
into the data cache. The additional memory consumption caused by
temporary buffers can be limited, depending on the new configuration
'data_cache_write_buffer_limit'.

Testing:
- Add test cases for asynchronous data writing to the original
DataCacheTest using different number of threads.
- Add DataCacheTest,#OutOfWriteBufferLimit
Used to test the limit of memory consumed by temporary buffers in the
case of asynchronous writes
- Add a timer to the MultiThreadedReadWrite function to get the average
time of multithreaded writes. Here are some test cases and their time
that differ significantly between synchronous and asynchronous:
Test case                | Policy | Sync/Async | write time in ms
MultiThreadedNoMisses    | LRU    | Sync       |   12.20
MultiThreadedNoMisses    | LRU    | Async      |   20.74
MultiThreadedNoMisses    | LIRS   | Sync       |    9.42
MultiThreadedNoMisses    | LIRS   | Async      |   16.75
MultiThreadedWithMisses  | LRU    | Sync       |  510.87
MultiThreadedWithMisses  | LRU    | Async      |   10.06
MultiThreadedWithMisses  | LIRS   | Sync       | 1872.11
MultiThreadedWithMisses  | LIRS   | Async      |   11.02
MultiPartitions          | LRU    | Sync       |    1.20
MultiPartitions          | LRU    | Async      |    5.23
MultiPartitions          | LIRS   | Sync       |    1.26
MultiPartitions          | LIRS   | Async      |    7.91
AccessTraceAnonymization | LRU    | Sync       | 1963.89
AccessTraceAnonymization | LRU    | Sync       | 2073.62
AccessTraceAnonymization | LRU    | Async      |    9.43
AccessTraceAnonymization | LRU    | Async      |   13.13
AccessTraceAnonymization | LIRS   | Sync       | 1663.93
AccessTraceAnonymization | LIRS   | Sync       | 1501.86
AccessTraceAnonymization | LIRS   | Async      |   12.83
AccessTraceAnonymization | LIRS   | Async      |   12.74

Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Reviewed-on: http://gerrit.cloudera.org:8080/19475
Reviewed-by: Joe McDonnell <jo...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache-trace.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M bin/run-all-tests.sh
M bin/start-impala-cluster.py
M common/thrift/metrics.json
10 files changed, 427 insertions(+), 52 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 13
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
18770832838@163.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 7:

(14 comments)

Thank you for your suggestion! I have updated the code and commit message, and added time-consuming log printing in the test code for easier replication of test results.

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG@24
PS4, Line 24: Testing:
> Thank you for the performance numbers. This is interesting enough that we s
Done


http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@9
PS7, Line 9: write
> Nit: writes
Done


http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@10
PS7, Line 10: when cache miss happens
> Nit: "when a cache miss happens"
Done


http://gerrit.cloudera.org:8080/#/c/19475/7//COMMIT_MSG@11
PS7, Line 11: synchronized
> Nit: synchronous
Done


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache-test.cc
File be/src/runtime/io/data-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache-test.cc@822
PS7, Line 822:   EXPECT_LT(count, NUM_CACHE_ENTRIES_NO_EVICT);
> On our test jobs, this test fails on this assert. I think we might
 > want to use a lower number for data_cache_async_write_threads for
 > this test (e.g. 1 or 2).

Sorry, this was caused by me missing some code. In order not to interfere with the DataCache in TraceReplayer, I changed the number of asynchronous threads to the constructor parameter of DataCache, but I forgot to make the corresponding modifications in the test code. Now this issue has been fixed and all test cases should pass.


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@128
PS7, Line 128: const int MAX_STORE_TASK_QUEUE_SIZE = 1 << 20;
> Let's add a comment here saying that this large value for the queue size is
Done


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@385
PS7, Line 385: abstruct
> Nit: abstract
Done


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@393
PS7, Line 393:   explicit StoreTask(const std::string& filename, int64_t mtime, int64_t offset,
             :       const uint8_t* buffer, int64_t buffer_len, AtomicInt64& total_size)
             :     : key_(filename, mtime, offset),
> I would like to make StoreTask a simple struct with minimal logic, and move
Done


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@975
PS7, Line 975:     if (buffer_limit < PAGE_SIZE) {
             :       return Status(Substitute("Configured data cache write buffer limit $0 is too small",
             :           FLAGS_data_cache_async_write_buffer_limit));
             :     }
> Let's require the limit to be higher. I think the minimum should be 8MB. If
Done


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@1059
PS7, Line 1059:   if (UNLIKELY(current_buffer_size_.Load() + buffer_len > store_buffer_capacity_)) {
              :     VLOG(2) << Substitute("Failed to create store task due to buffer size limitation, "
              :         "current buffer size: $0 size limitation: $1 require: $2",
              :         current_buffer_size_.Load(), store_buffer_capacity_, buffer_len);
              :     ImpaladMetrics::IO_MGR_REMOTE_DATA_CACHE_ASYNC_WRITES_DROPPED_BYTES->
              :         Increment(buffer_len);
              :     ImpaladMetrics::IO_MGR_REMOTE_DATA_CACHE_ASYNC_WRITES_DROPPED_ENTRIES->Increment(1);
              :     return false;
              :   }
> This would become a CompareAndSwap loop where either we are at the limit an
Done


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@1069
PS7, Line 1069:   const StoreTask* task = new StoreTask(filename, mtime, offset, buffer, buffer_len,
              :       current_buffer_size_);
> Move the logic from the StoreTask constructor (incrementing counters, alloc
Done


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/data-cache.cc@1074
PS7, Line 1074: 
> We would also add a CompleteStoreTask() function here that would be called 
Done


http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/19475/7/be/src/runtime/io/disk-io-mgr.cc@80
PS7, Line 80: Write threads need to bound the extra memory consumption for holding the "
            :     "temporary buffer though.
> Nit: Let's change this sentence to say that the extra memory for temporary 
Done


http://gerrit.cloudera.org:8080/#/c/19475/7/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/19475/7/common/thrift/metrics.json@653
PS7, Line 653: bytes async
> Nit: "bytes of async"
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 7
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Tue, 14 Mar 2023 08:14:22 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19475

to look at the new patch set (#2).

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................

IMPALA-11886: Data cache should support asynchronous writes

This patch implements asynchronous write to the data cache to improve
scan performance when cache miss happens.

Previously, writes to the data cache are synchronized with hdfs file
reads, and both are handled by remote hdfs IO threads. In other words,
if a cache miss occurs,  the IO thread needs to take additional
responsibility for cache writes,  which will lead to scan performance
deterioration.

This patch uses a thread pool for asynchronous writes, and the number of
threads in the pool is determined by the new configuration
'data_cache_num_write_threads'. In asynchronous write mode, the IO
thread only needs to copy data to the temporary buffer when storing data
into the data cache. The additional memory consumption caused by
temporary buffers can be limited, depending on the new configuration
'data_cache_write_buffer_limit'.

Testing:
- Add test cases for asynchronous data writing to the original
DataCacheTest using different number of threads.
- Add DataCacheTest,#OutOfWriteBufferLimit
Used to test the limit of memory consumed by temporary buffers in the
case of asynchronous writes

Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M common/thrift/metrics.json
6 files changed, 330 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/2
-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 6:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/12502/ : Initial code review checks failed. See linked job for details on the failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 6
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 Mar 2023 10:10:57 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
Hello Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19475

to look at the new patch set (#10).

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................

IMPALA-11886: Data cache should support asynchronous writes

This patch implements asynchronous writes to the data cache to improve
scan performance when a cache miss happens.
Previously, writes to the data cache are synchronous with hdfs file
reads, and both are handled by remote hdfs IO threads. In other words,
if a cache miss occurs,  the IO thread needs to take additional
responsibility for cache writes,  which will lead to scan performance
deterioration.
This patch uses a thread pool for asynchronous writes, and the number of
threads in the pool is determined by the new configuration
'data_cache_num_write_threads'. In asynchronous write mode, the IO
thread only needs to copy data to the temporary buffer when storing data
into the data cache. The additional memory consumption caused by
temporary buffers can be limited, depending on the new configuration
'data_cache_write_buffer_limit'.

Testing:
- Add test cases for asynchronous data writing to the original
DataCacheTest using different number of threads.
- Add DataCacheTest,#OutOfWriteBufferLimit
Used to test the limit of memory consumed by temporary buffers in the
case of asynchronous writes
- Add a timer to the MultiThreadedReadWrite function to get the average
time of multithreaded writes. Here are some test cases and their time
that differ significantly between synchronous and asynchronous:
Test case                | Policy | Sync/Async | write time in ms
MultiThreadedNoMisses    | LRU    | Sync       |   12.20
MultiThreadedNoMisses    | LRU    | Async      |   20.74
MultiThreadedNoMisses    | LIRS   | Sync       |    9.42
MultiThreadedNoMisses    | LIRS   | Async      |   16.75
MultiThreadedWithMisses  | LRU    | Sync       |  510.87
MultiThreadedWithMisses  | LRU    | Async      |   10.06
MultiThreadedWithMisses  | LIRS   | Sync       | 1872.11
MultiThreadedWithMisses  | LIRS   | Async      |   11.02
MultiPartitions          | LRU    | Sync       |    1.20
MultiPartitions          | LRU    | Async      |    5.23
MultiPartitions          | LIRS   | Sync       |    1.26
MultiPartitions          | LIRS   | Async      |    7.91
AccessTraceAnonymization | LRU    | Sync       | 1963.89
AccessTraceAnonymization | LRU    | Sync       | 2073.62
AccessTraceAnonymization | LRU    | Async      |    9.43
AccessTraceAnonymization | LRU    | Async      |   13.13
AccessTraceAnonymization | LIRS   | Sync       | 1663.93
AccessTraceAnonymization | LIRS   | Sync       | 1501.86
AccessTraceAnonymization | LIRS   | Async      |   12.83
AccessTraceAnonymization | LIRS   | Async      |   12.74

Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache-trace.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M bin/run-all-tests.sh
M bin/start-impala-cluster.py
M common/thrift/metrics.json
10 files changed, 427 insertions(+), 52 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/10
-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 10
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 10:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12621/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 10
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Tue, 14 Mar 2023 08:21:14 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 12: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 12
Gerrit-Owner: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Anonymous Coward <18...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 Mar 2023 08:32:11 +0000
Gerrit-HasComments: No