You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Tim Armstrong (Code Review)" <ge...@cloudera.org> on 2017/06/14 20:44:06 UTC

[Impala-ASF-CR] IMPALA-5389: simplify BufferDescriptor lifetime

Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7182

Change subject: IMPALA-5389: simplify BufferDescriptor lifetime
......................................................................

IMPALA-5389: simplify BufferDescriptor lifetime

This is cleanup to make the code easier to understand in anticipation of
some trickier changes to I/O buffer management for IMPALA-4835. I do
not expect any changes in behaviour as a result of the test.

* Use unique_ptr to make BufferDescriptor ownership transfer more
  explicit and allow enforcing that the buffers are not leaked via
  a DCHECK in ~BufferDescriptor.
* Remove 'free_buffer_descs_' cache. TCMalloc is good at caching small
  objects and there will likely be a lot less lock contention compared
  with a single global lock. The cache was did not avoid calls to
  malloc() anyway because appending to std::list<> requires allocating
  a list node.
* Use std::deque instead of std::list in a couple of places - it offers
  O(1) push_*()/pop_*() at both ends and requires fewer calls into
  malloc()/free().

Testing:
Ran ASAN and exhaustive builds.

Change-Id: I007d098e9a1abb1f684be37b7f1ee6c03d3879b2
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/disk-io-mgr-scan-range.cc
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/tmp-file-mgr.cc
11 files changed, 134 insertions(+), 195 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/7182/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7182
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I007d098e9a1abb1f684be37b7f1ee6c03d3879b2
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5389: simplify BufferDescriptor lifetime

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5389: simplify BufferDescriptor lifetime
......................................................................


Patch Set 3:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/741/

-- 
To view, visit http://gerrit.cloudera.org:8080/7182
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I007d098e9a1abb1f684be37b7f1ee6c03d3879b2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5389: simplify BufferDescriptor lifetime

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5389: simplify BufferDescriptor lifetime
......................................................................


Patch Set 1:

(4 comments)

First pass looks good. A few minor comments.

http://gerrit.cloudera.org:8080/#/c/7182/1//COMMIT_MSG
Commit Message:

PS1, Line 11: test.
test?


PS1, Line 18: was did not
did not


http://gerrit.cloudera.org:8080/#/c/7182/1/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS1, Line 150: recycling of the buffer descriptor
Since we no longer reuse the buffer descriptor, I think this comment should change.


http://gerrit.cloudera.org:8080/#/c/7182/1/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

PS1, Line 217: std::unique_ptr<DiskIoMgr::BufferDescriptor>&& buffer
I believe this can be std::unique_ptr<DiskIoMgr::BufferDescriptor>, since we take ownership in all cases. I could be wrong.


-- 
To view, visit http://gerrit.cloudera.org:8080/7182
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I007d098e9a1abb1f684be37b7f1ee6c03d3879b2
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5389: simplify BufferDescriptor lifetime

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5389: simplify BufferDescriptor lifetime
......................................................................


Patch Set 3: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/7182
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I007d098e9a1abb1f684be37b7f1ee6c03d3879b2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5389: simplify BufferDescriptor lifetime

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5389: simplify BufferDescriptor lifetime
......................................................................


Patch Set 1: Code-Review+1

(5 comments)

This does seem more understandable to me. Just some small suggestions.

http://gerrit.cloudera.org:8080/#/c/7182/1/be/src/runtime/disk-io-mgr-scan-range.cc
File be/src/runtime/disk-io-mgr-scan-range.cc:

Line 85:   *buffer = nullptr;
maybe prefer buffer->reset() here (for me it's clearer not to use the overloaded operator).


http://gerrit.cloudera.org:8080/#/c/7182/1/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS1, Line 208: len_(0),
             :     eosr_(false),
             :     scan_range_offset_(0
we've been moving towards initializing members with constants in the header file. If, like me, you prefer that style suggest you do so here to keep the initializer list short.


PS1, Line 700: unique_ptr<BufferDescriptor>
make_unique(this, reader, range, buffer, buffer_size, reader->mem_tracker_)?


PS1, Line 710: std
remove std::?


http://gerrit.cloudera.org:8080/#/c/7182/1/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

PS1, Line 217: std::unique_ptr<DiskIoMgr::BufferDescriptor>&& buffer
> I believe this can be std::unique_ptr<DiskIoMgr::BufferDescriptor>, since w
Right, since you can't pass a unique_ptr<> except by moving it, I think pass-by-value is equivalent. There's an argument in favour of keeping this an rvalue ref though, because if we ever changed the type of the ptr to be copyable, pass-by-value would silently start copying at each callsite - it's the sort of latent bug that could go undetected. From a readability perspective, rvalue-refs are explicit that the argument is going to get consumed. 

I don't feel very strongly though. It would be good to be consistent, since there are other methods that take a unique_ptr by value.


-- 
To view, visit http://gerrit.cloudera.org:8080/7182
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I007d098e9a1abb1f684be37b7f1ee6c03d3879b2
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5389: simplify BufferDescriptor lifetime

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Hello Henry Robinson,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7182

to look at the new patch set (#2).

Change subject: IMPALA-5389: simplify BufferDescriptor lifetime
......................................................................

IMPALA-5389: simplify BufferDescriptor lifetime

This is cleanup to make the code easier to understand in anticipation of
some trickier changes to I/O buffer management for IMPALA-4835. I do
not expect any changes in behaviour as a result of this patch.

* Use unique_ptr to make BufferDescriptor ownership transfer more
  explicit and allow enforcing that the buffers are not leaked via
  a DCHECK in ~BufferDescriptor.
* Remove 'free_buffer_descs_' cache. TCMalloc is good at caching small
  objects and there will likely be a lot less lock contention compared
  with a single global lock. The cache did not avoid calls to
  malloc() anyway because appending to std::list<> requires allocating
  a list node.
* Use std::deque instead of std::list in a couple of places - it offers
  O(1) push_*()/pop_*() at both ends and requires fewer calls into
  malloc()/free().

Testing:
Ran ASAN and exhaustive builds.

Change-Id: I007d098e9a1abb1f684be37b7f1ee6c03d3879b2
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/disk-io-mgr-scan-range.cc
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/tmp-file-mgr.cc
11 files changed, 137 insertions(+), 202 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/7182/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7182
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I007d098e9a1abb1f684be37b7f1ee6c03d3879b2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>

[Impala-ASF-CR] IMPALA-5389: simplify BufferDescriptor lifetime

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5389: simplify BufferDescriptor lifetime
......................................................................


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7182
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I007d098e9a1abb1f684be37b7f1ee6c03d3879b2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5389: simplify BufferDescriptor lifetime

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5389: simplify BufferDescriptor lifetime
......................................................................


Patch Set 2: Code-Review+1

Carry +1

-- 
To view, visit http://gerrit.cloudera.org:8080/7182
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I007d098e9a1abb1f684be37b7f1ee6c03d3879b2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5389: simplify BufferDescriptor lifetime

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5389: simplify BufferDescriptor lifetime
......................................................................


IMPALA-5389: simplify BufferDescriptor lifetime

This is cleanup to make the code easier to understand in anticipation of
some trickier changes to I/O buffer management for IMPALA-4835. I do
not expect any changes in behaviour as a result of this patch.

* Use unique_ptr to make BufferDescriptor ownership transfer more
  explicit and allow enforcing that the buffers are not leaked via
  a DCHECK in ~BufferDescriptor.
* Remove 'free_buffer_descs_' cache. TCMalloc is good at caching small
  objects and there will likely be a lot less lock contention compared
  with a single global lock. The cache did not avoid calls to
  malloc() anyway because appending to std::list<> requires allocating
  a list node.
* Use std::deque instead of std::list in a couple of places - it offers
  O(1) push_*()/pop_*() at both ends and requires fewer calls into
  malloc()/free().

Testing:
Ran ASAN and exhaustive builds.

Change-Id: I007d098e9a1abb1f684be37b7f1ee6c03d3879b2
Reviewed-on: http://gerrit.cloudera.org:8080/7182
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/disk-io-mgr-scan-range.cc
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/tmp-file-mgr.cc
11 files changed, 137 insertions(+), 202 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7182
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I007d098e9a1abb1f684be37b7f1ee6c03d3879b2
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5389: simplify BufferDescriptor lifetime

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5389: simplify BufferDescriptor lifetime
......................................................................


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7182
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I007d098e9a1abb1f684be37b7f1ee6c03d3879b2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5389: simplify BufferDescriptor lifetime

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5389: simplify BufferDescriptor lifetime
......................................................................


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/7182/1//COMMIT_MSG
Commit Message:

PS1, Line 11: test.
> test?
Done


PS1, Line 18: was did not
> did not
Done


http://gerrit.cloudera.org:8080/#/c/7182/1/be/src/runtime/disk-io-mgr-scan-range.cc
File be/src/runtime/disk-io-mgr-scan-range.cc:

Line 85:   *buffer = nullptr;
> maybe prefer buffer->reset() here (for me it's clearer not to use the overl
Oops missed this, it just worked by coincidence. I changed it to a DCHECK, since there's no way passing in a buffer makes sense.


http://gerrit.cloudera.org:8080/#/c/7182/1/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS1, Line 208: len_(0),
             :     eosr_(false),
             :     scan_range_offset_(0
> we've been moving towards initializing members with constants in the header
Done


PS1, Line 700: unique_ptr<BufferDescriptor>
> make_unique(this, reader, range, buffer, buffer_size, reader->mem_tracker_)
It unfortunately doesn't work because the constructor is not accessible to make_unique() (this appears to be a known problem with make_unique()).


PS1, Line 710: std
> remove std::?
Done


http://gerrit.cloudera.org:8080/#/c/7182/1/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS1, Line 150: recycling of the buffer descriptor
> Since we no longer reuse the buffer descriptor, I think this comment should
Done


http://gerrit.cloudera.org:8080/#/c/7182/1/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

PS1, Line 217: std::unique_ptr<DiskIoMgr::BufferDescriptor>&& buffer
> Right, since you can't pass a unique_ptr<> except by moving it, I think pas
Changed it for consistency - I don't see any instances of unique_ptr<>&& in the codebase.

Yeah the advantage of passing by value is that it means that the callee always has to take ownership. For unique_ptr copying the pointer doesn't really matter (with the rvalue ref we're really passing in a pointer to a pointer).


-- 
To view, visit http://gerrit.cloudera.org:8080/7182
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I007d098e9a1abb1f684be37b7f1ee6c03d3879b2
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes