Posted to reviews@impala.apache.org by "Yida Wu (Code Review)" <ge...@cloudera.org> on 2021/11/02 00:42:04 UTC

[Impala-ASF-CR] IMPALA-10791 Add batching reading for remote temporary files

Yida Wu has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/17979 )

Change subject: IMPALA-10791 Add batching reading for remote temporary files
......................................................................

IMPALA-10791 Add batching reading for remote temporary files

This patch adds batched reads from remote temporary files to
improve read performance for spilled data stored remotely.

The original design used a local disk file as the buffer for
batched reads from the remote file, but in practice that did not
improve performance. The design was therefore changed to use
memory as the read buffer.

Currently, each TmpFileRemote has two DiskFiles: one for the
remote file and one for the local buffer. The patch adds MemBlocks
to the local buffer file. Each local buffer file is divided evenly
into several MemBlocks, but to guarantee that a page is never
split across two blocks, block sizes may differ slightly from one
another in practice. The default block size is the smaller of 1/4
of the default file size and MAX_REMOTE_READ_MEM_BLOCK_THRESHOLD_MB,
which is 16MB.
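
As a rough illustration of the block sizing described above, here is a
minimal Python sketch (not the patch's C++ code). The function names
and the greedy grouping policy are assumptions made for illustration;
the commit message only states the size formula and that a page is
never cut across two blocks.

```python
MB = 1024 * 1024
# Value stated in the commit message.
MAX_REMOTE_READ_MEM_BLOCK_THRESHOLD_MB = 16


def default_block_size(default_file_size):
    """Smaller of 1/4 of the default file size and the 16MB threshold."""
    return min(default_file_size // 4,
               MAX_REMOTE_READ_MEM_BLOCK_THRESHOLD_MB * MB)


def assign_pages_to_blocks(page_sizes, block_size):
    """Group consecutive pages into blocks without splitting a page.

    Assumed policy: start a new block whenever the next page would push
    the current block past block_size. Blocks can therefore end up with
    slightly different sizes.
    """
    blocks = []
    current = []
    current_bytes = 0
    for size in page_sizes:
        if current and current_bytes + size > block_size:
            blocks.append(current)
            current, current_bytes = [], 0
        current.append(size)
        current_bytes += size
    if current:
        blocks.append(current)
    return blocks
```

With a hypothetical 32MB file the block size would be 8MB, while any
file of 64MB or more would hit the 16MB threshold.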

When pinning a page, the system checks whether there is enough
memory for the block that holds the page. If there is not, the
page is read directly from the remote file and the block is
disabled, which avoids later reading the same content from the
remote filesystem a second time. If the system decides to fetch a
block, the block stays in memory until all of its pages have been
read or the query ends.
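
The pin-time decision above can be sketched roughly as follows. This
is an illustrative Python model, not the patch's implementation:
ReadBufferBudget, the MemBlock fields, the state names and pin_page()
are all hypothetical, each page is assumed to be pinned once, and the
release at query end is not modeled.

```python
class ReadBufferBudget:
    """Tracks bytes reserved for the read buffer (hypothetical helper)."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0

    def try_reserve(self, nbytes):
        # Refuse the reservation if it would exceed the buffer limit.
        if self.used_bytes + nbytes > self.limit_bytes:
            return False
        self.used_bytes += nbytes
        return True

    def release(self, nbytes):
        self.used_bytes -= nbytes


class MemBlock:
    """One block of a local buffer file (fields are assumptions)."""

    def __init__(self, size_bytes, page_count):
        self.size_bytes = size_bytes
        self.pages_left = page_count
        # One of: "unloaded", "in_memory", "disabled", "released".
        self.state = "unloaded"


def pin_page(block, budget):
    """Return how a page in `block` would be served."""
    if block.state == "unloaded":
        if budget.try_reserve(block.size_bytes):
            # The real code would fetch the whole block from remote here.
            block.state = "in_memory"
        else:
            # Not enough memory: never fetch this block later.
            block.state = "disabled"
    if block.state == "disabled":
        return "direct_read"
    block.pages_left -= 1
    if block.pages_left == 0:
        # All pages read: give the memory back.
        budget.release(block.size_bytes)
        block.state = "released"
    return "from_block"
```

Note that in this sketch a disabled block stays disabled even if
memory frees up later, matching the stated goal of not reading the
same remote content twice.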

One challenge of buffering in memory is that the system is
typically short of memory at the point where it needs to spill.
We therefore limit the read buffer to 5% of total memory. Because
the Impala process currently reserves 20% of memory as unused by
default, using 5% for an emergency case such as spilling is
reasonable.

Two startup options have been added for the new feature.

1. remote_batching_read. Default is false. If set to true, batched
reads are enabled.
2. remote_read_memory_buffer_size. Default is 1G. The maximum memory
that the read buffer can use. The value is also capped at 5% of
the total system memory.
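
How the two limits on the read buffer combine can be shown with a
small Python sketch. The function name is hypothetical, and exactly
which memory figure the patch treats as "total" is an assumption
here; the commit message only gives the 1G default and the 5% cap.

```python
GB = 1024 ** 3
# 5% cap stated in the commit message.
READ_BUFFER_MAX_PERCENT = 5


def effective_read_buffer_bytes(configured_bytes, total_memory_bytes):
    """Configured buffer size, capped at 5% of total memory."""
    cap = total_memory_bytes * READ_BUFFER_MAX_PERCENT // 100
    return min(configured_bytes, cap)
```

For example, with the 1G default the configured value would apply on a
64GB host (5% is about 3.2GB), while on a 10GB host the 5% cap of
512MB would apply instead.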

The patch also increases the MAX_REMOTE_TMPFILE_SIZE_THRESHOLD_MB
from 256 to 512.

Tests:
Ran core and exhaustive tests.
Added and ran TmpFileMgrTest::TestBatchingReadFromRemote.
Added e2e test test_scratch_dirs_batch_reading.

Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b
---
M be/src/runtime/io/disk-file.cc
M be/src/runtime/io/disk-file.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr-internal.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M common/thrift/metrics.json
M tests/custom_cluster/test_scratch_disk.py
12 files changed, 1,110 insertions(+), 151 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/17979/4
-- 
To view, visit http://gerrit.cloudera.org:8080/17979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b
Gerrit-Change-Number: 17979
Gerrit-PatchSet: 4
Gerrit-Owner: Yida Wu <wy...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Yida Wu <wy...@gmail.com>