You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Tim Armstrong (Code Review)" <ge...@cloudera.org> on 2016/10/05 16:54:27 UTC

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4631

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................

IMPALA-3202: DiskIoMgr improvements for new buffer pool

The main goal of this patch is to add support to the DiskIoMgr
to read into client-provided buffers, instead of the DiskIoMgr
allocating intermediate buffers. This is important for the new
buffer pool because we want it to entirely manage its own memory,
rather than delegating part of its memory management to the DiskIoMgr.

Also do some cleanup:
* Consolidate some correlated ScanRange parameters into a parameter
  struct.
* Remove the "untracked" buffers mem tracker, which is no longer
  necessary.
* Change the buffer type in DiskIoMgr to use uint8_t* to be consistent
  with the rest of Impala.

Testing:
Added a unit test. We also get coverage from the BufferedBlockMgr unit
tests, any spilling tests and the Parquet tests with large footers.

Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/buffered-block-mgr.cc
M be/src/runtime/disk-io-mgr-internal.h
M be/src/runtime/disk-io-mgr-scan-range.cc
M be/src/runtime/disk-io-mgr-stress-test.cc
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-stress.h
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
15 files changed, 369 insertions(+), 212 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/4631/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4631/3/be/src/runtime/disk-io-mgr-scan-range.cc
File be/src/runtime/disk-io-mgr-scan-range.cc:

PS3, Line 165: external_buffer_tag_ != ExternalBufferTag::CACHED_BUFFER
> !is_cached_buffer()? (either way is okay with me but sometimes you write th
is_cached_buffer() is a method of BufferDescriptor - there isn't an equivalent method of ScanRange.


PS3, Line 450: external_buffer_tag_ != ExternalBufferTag::CACHED_BUFFE
> this could be stronger DCHECK_EQ(..., NO_BUFFER), right?  (Also use the NE/
The NE/EQ macros don't work for strongly-typed "enum class" unions unfortunately. It works for the old-style enums since they decay to ints. I felt like this was the lesser evil.


http://gerrit.cloudera.org:8080/#/c/4631/3/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

Line 641:       && range->external_buffer_tag_ != ScanRange::ExternalBufferTag::CLIENT_BUFFER) {
> or !is_client_buffer() (i'm fine either way, but just a bit shorter and wha
is_client_buffer() is a method of BufferDescriptor so doesn't work here. I could add accessors to ScanRange, but I don't feel like they carry their wieght.


PS3, Line 1066: range->external_buffer_tag_ == ScanRange::ExternalBufferTag::CLIENT_BUFF
> same
See above


-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has uploaded a new patch set (#4).

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................

IMPALA-3202: DiskIoMgr improvements for new buffer pool

The main goal of this patch is to add support to the DiskIoMgr
to read into client-provided buffers, instead of the DiskIoMgr
allocating intermediate buffers. This is important for the new
buffer pool because we want it to entirely manage its own memory,
rather than delegating part of its memory management to the DiskIoMgr.

Also do some cleanup:
* Consolidate some correlated ScanRange parameters into a parameter
  struct.
* Remove the "untracked" buffers mem tracker, which is no longer
  necessary.
* Change the buffer type in DiskIoMgr to use uint8_t* to be consistent
  with the rest of Impala.

Testing:
Added a unit test. We also get coverage from the BufferedBlockMgr unit
tests, any spilling tests and the Parquet tests with large footers.

Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/buffered-block-mgr.cc
M be/src/runtime/disk-io-mgr-internal.h
M be/src/runtime/disk-io-mgr-scan-range.cc
M be/src/runtime/disk-io-mgr-stress-test.cc
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-stress.h
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
15 files changed, 404 insertions(+), 224 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/4631/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

PS1, Line 193: BufferOpts
> const& ?
Done


http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS1, Line 642: without the client providing a "
             :           "buffer"
> is this going to make sense to a user?
Probably not. The old message wasn't actionable either by a user (beyond reporting the bug). Reworded to make it clear that it's an internal error.


Line 708:   DCHECK_LE(buffer_size, max_buffer_size_);
> how do we ensure this is true? is it the check in DiskIoMgr::Read() that wi
DiskIoMgr::ReadRange() checks if there's a client buffer before calling TryAllocateNextBufferForRange().


http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS1, Line 147: these buffers are returned to the caller wrapped
             : /// in BufferDescriptors.
> took a few seconds to parse and understand. how about:
Yes that's better...


Line 150: /// another read. In error cases, the IoMgr will recycle the buffers more promptly but
> is this relevant for all cases, or only cases (a) and (b)?
Tried to clarify. Also removed the last sentence since I think it was confusing and not really useful to the reader.


PS1, Line 392: bool
> maybe make all of these members 'const' since they are "options" that don't
Done


PS1, Line 422: BufferOpts
> const BufferOpts& ?
Done


Line 506:     int64_t client_buffer_len_;
> maybe move this right before cached_buffer_ since it's related in the sense
Switched it to a tagged union. We don't really have much precedent in the codebase for this pattern so I invented some conventions that made sense to me.


-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Posted by "Internal Jenkins (Code Review)" <ge...@cloudera.org>.
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................


IMPALA-3202: DiskIoMgr improvements for new buffer pool

The main goal of this patch is to add support to the DiskIoMgr
to read into client-provided buffers, instead of the DiskIoMgr
allocating intermediate buffers. This is important for the new
buffer pool because we want it to entirely manage its own memory,
rather than delegating part of its memory management to the DiskIoMgr.

Also do some cleanup:
* Consolidate some correlated ScanRange parameters into a parameter
  struct.
* Remove the "untracked" buffers mem tracker, which is no longer
  necessary.
* Change the buffer type in DiskIoMgr to use uint8_t* to be consistent
  with the rest of Impala.

Testing:
Added a unit test. We also get coverage from the BufferedBlockMgr unit
tests, any spilling tests and the Parquet tests with large footers.

Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Reviewed-on: http://gerrit.cloudera.org:8080/4631
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Internal Jenkins
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/buffered-block-mgr.cc
M be/src/runtime/disk-io-mgr-internal.h
M be/src/runtime/disk-io-mgr-scan-range.cc
M be/src/runtime/disk-io-mgr-stress-test.cc
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-stress.h
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
15 files changed, 404 insertions(+), 224 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

PS1, Line 193: BufferOpts
const& ?


http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS1, Line 642: without the client providing a "
             :           "buffer"
is this going to make sense to a user?


Line 708:   DCHECK_LE(buffer_size, max_buffer_size_);
how do we ensure this is true? is it the check in DiskIoMgr::Read() that will always enforce?


http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS1, Line 147: these buffers are returned to the caller wrapped
             : /// in BufferDescriptors.
took a few seconds to parse and understand. how about:

... these buffers are wrapped in BufferDescriptors and return to the caller.


Line 150: /// another read. In error cases, the IoMgr will recycle the buffers more promptly but
is this relevant for all cases, or only cases (a) and (b)?


PS1, Line 392: bool
maybe make all of these members 'const' since they are "options" that don't change, right?


PS1, Line 422: BufferOpts
const BufferOpts& ?


Line 506:     int64_t client_buffer_len_;
maybe move this right before cached_buffer_ since it's related in the sense that it's another possible source of the buffer.  Alternatively, consider even making a union to make it clear that they are mutually exclusive, and the union tag indicates the buffer "mode" (Managed, HdfsCached, Client - or something like that).  Though it could be that doesn't work out cleanly given that HdfsCached can fallback to Managed mode.


-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................


Patch Set 5: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................

IMPALA-3202: DiskIoMgr improvements for new buffer pool

The main goal of this patch is to add support to the DiskIoMgr
to read into client-provided buffers, instead of the DiskIoMgr
allocating intermediate buffers. This is important for the new
buffer pool because we want it to entirely manage its own memory,
rather than delegating part of its memory management to the DiskIoMgr.

Also do some cleanup:
* Consolidate some correlated ScanRange parameters into a parameter
  struct.
* Remove the "untracked" buffers mem tracker, which is no longer
  necessary.
* Change the buffer type in DiskIoMgr to use uint8_t* to be consistent
  with the rest of Impala.

Testing:
Added a unit test. We also get coverage from the BufferedBlockMgr unit
tests, any spilling tests and the Parquet tests with large footers.

Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/buffered-block-mgr.cc
M be/src/runtime/disk-io-mgr-internal.h
M be/src/runtime/disk-io-mgr-scan-range.cc
M be/src/runtime/disk-io-mgr-stress-test.cc
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-stress.h
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
15 files changed, 404 insertions(+), 224 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/4631/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Posted by "Internal Jenkins (Code Review)" <ge...@cloudera.org>.
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................


Patch Set 5: Verified-1

Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/371/

-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................


Patch Set 7: Code-Review+2

Carry

-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Posted by "Internal Jenkins (Code Review)" <ge...@cloudera.org>.
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................


Patch Set 8: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................


Patch Set 8: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................


Patch Set 7:

Fixed the bug found with IMPALA-4414

-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Hello Internal Jenkins, Dan Hecht,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/4631

to look at the new patch set (#6).

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................

IMPALA-3202: DiskIoMgr improvements for new buffer pool

The main goal of this patch is to add support to the DiskIoMgr
to read into client-provided buffers, instead of the DiskIoMgr
allocating intermediate buffers. This is important for the new
buffer pool because we want it to entirely manage its own memory,
rather than delegating part of its memory management to the DiskIoMgr.

Also do some cleanup:
* Consolidate some correlated ScanRange parameters into a parameter
  struct.
* Remove the "untracked" buffers mem tracker, which is no longer
  necessary.
* Change the buffer type in DiskIoMgr to use uint8_t* to be consistent
  with the rest of Impala.

Testing:
Added a unit test. We also get coverage from the BufferedBlockMgr unit
tests, any spilling tests and the Parquet tests with large footers.

Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/buffered-block-mgr.cc
M be/src/runtime/disk-io-mgr-internal.h
M be/src/runtime/disk-io-mgr-scan-range.cc
M be/src/runtime/disk-io-mgr-stress-test.cc
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-stress.h
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
15 files changed, 404 insertions(+), 224 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/4631/6
-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4631/3/be/src/runtime/disk-io-mgr-scan-range.cc
File be/src/runtime/disk-io-mgr-scan-range.cc:

PS3, Line 450: external_buffer_tag_ != ExternalBufferTag::CACHED_BUFFE
> The NE/EQ macros don't work for strongly-typed "enum class" unions unfortun
okay. but still, wouldn't thie be clearer to check:

DCHECK(external_buffer_tag_ == ExternalBufferTag::NO_BUFFER)?

i.e. it's not okay to have a client buffer here either.


-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................


Patch Set 4: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4631/3/be/src/runtime/disk-io-mgr-scan-range.cc
File be/src/runtime/disk-io-mgr-scan-range.cc:

PS3, Line 450: external_buffer_tag_ != ExternalBufferTag::CACHED_BUFFE
> okay. but still, wouldn't thie be clearer to check:
Ah, I understand now. Fixed.


-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
......................................................................


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4631/3/be/src/runtime/disk-io-mgr-scan-range.cc
File be/src/runtime/disk-io-mgr-scan-range.cc:

PS3, Line 165: external_buffer_tag_ != ExternalBufferTag::CACHED_BUFFER
!is_cached_buffer()? (either way is okay with me but sometimes you write that)


PS3, Line 450: external_buffer_tag_ != ExternalBufferTag::CACHED_BUFFE
this could be stronger DCHECK_EQ(..., NO_BUFFER), right?  (Also use the NE/EQ macros)


http://gerrit.cloudera.org:8080/#/c/4631/3/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

Line 641:       && range->external_buffer_tag_ != ScanRange::ExternalBufferTag::CLIENT_BUFFER) {
or !is_client_buffer() (i'm fine either way, but just a bit shorter and what you wrote at line 662).


PS3, Line 1066: range->external_buffer_tag_ == ScanRange::ExternalBufferTag::CLIENT_BUFF
same


-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes