You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@impala.apache.org by "Tim Armstrong (Code Review)" <ge...@cloudera.org> on 2017/03/16 01:08:48 UTC

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6410

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................

IMPALA-3203: Part 1: Free list implementation

We will have a single free list per size class, protected by a
SpinLock. Free buffers are stored in a heap, ordered by the
memory address of the buffer, to help reduce address space
fragmentation.

A follow-up patch will use the free lists in BufferPool.

Testing:
Added unit tests for sanity checks and verification of behaviour
that is trickier to check in integration or system tests.
The cost will be exercised more thoroughly via BufferPool
in Part 2.

Performance:
Includes a benchmark that demonstrates the scalability of
the free lists under concurrency. When measuring pure throughput
of free list operations, having a free list per core is
significantly faster than a shared free list, or allocating
directly from TCMalloc. On 8 cores, if the memory allocated is
actually touched then for 64KB+ buffers, memory access is the
bottleneck rather than lock contention.

This suggests that having a free list per core is more than sufficient
(however, we will need to validate this with system concurrency
benchmarks once we switch to using this during query execution).

Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/free-lists-benchmark.cc
M be/src/runtime/bufferpool/CMakeLists.txt
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
A be/src/runtime/bufferpool/free-list-test.cc
A be/src/runtime/bufferpool/free-list.h
9 files changed, 860 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/6410/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................


IMPALA-3203: Part 1: Free list implementation

We will have a single free list per size class. Free buffers
are stored in a heap, ordered by the memory address of the
buffer, to help reduce address space fragmentation.

A follow-up patch will use the free lists in BufferPool.

Currently TCMalloc has thread-local caches with somewhat similar
purpose. However, these are not suitable for our purposes for
several reasons:
* They are designed for caching small allocations - large allocations
  like most buffers are served from a global page heap protected
  by a global lock.
* We intend to move away from using TCMalloc for buffers: IMPALA-5073
* Thread caches are ineffective for the producer-consumer pattern where
  one thread allocates memory and another thread frees it.
* TCMalloc gives us limited control over how and when memory is
  actually released to the OS.

Testing:
Added unit tests for sanity checks and verification of behaviour
that is trickier to check in integration or system tests.
The cost will be exercised more thoroughly via BufferPool
in Part 2.

Performance:
Includes a benchmark that demonstrates the scalability of
the free lists under concurrency. When measuring pure throughput
of free list operations, having a free list per core is
significantly faster than a shared free list, or allocating
directly from TCMalloc. On 8 cores, if the memory allocated is
actually touched then for 64KB+ buffers, memory access is the
bottleneck rather than lock contention.

The benchmark also showed that non-inlined constructors and move
operators of BufferHandle were taking significant CPU cycles, so
I inlined those.

This suggests that having a free list per core is more than sufficient
(however, we will need to validate this with system concurrency
benchmarks once we switch to using this during query execution).

Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Reviewed-on: http://gerrit.cloudera.org:8080/6410
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/free-lists-benchmark.cc
M be/src/runtime/bufferpool/CMakeLists.txt
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
A be/src/runtime/bufferpool/free-list-test.cc
A be/src/runtime/bufferpool/free-list.h
M be/src/util/benchmark-test.cc
M be/src/util/benchmark.cc
M be/src/util/benchmark.h
12 files changed, 813 insertions(+), 64 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................


Patch Set 10: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.

Dan Hecht has posted comments on this change.

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................


Patch Set 8: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.

Jim Apple has posted comments on this change.

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................


Patch Set 6:

(24 comments)

http://gerrit.cloudera.org:8080/#/c/6410/6/be/src/benchmarks/free-lists-benchmark.cc
File be/src/benchmarks/free-lists-benchmark.cc:

PS6, Line 48: per thread
It might be easier to read if it showed total throughput


PS6, Line 59: i7-4790
Do you have Hyperthreading on?


PS6, Line 61: 0 iters
It might help to explain what this is before these benchmarks


Line 257: 
elide this empty line


PS6, Line 288: MAX_LIST_ENTRIES
How was this chosen?


PS6, Line 303: *
&


Line 308:     lock_guard<SpinLock> l(free_list->first);
In the case where there is one list per thread, why bother with the lock?


Line 350:     int64_t op = uniform_int_distribution<int64_t>(0, 1)(rng);
const


Line 405:         for (int num_threads : {1, 2, 4, CpuInfo::num_cores()}) {
Is it worth trying 2*num_cores?


http://gerrit.cloudera.org:8080/#/c/6410/6/be/src/runtime/bufferpool/free-list-test.cc
File be/src/runtime/bufferpool/free-list-test.cc:

PS6, Line 18: c
also use <cstdint> in free-list.h or use <stdlib.h> here


PS6, Line 50: vector<BufferHandle>* buffers
You can just return this out parameter and the [N]RVO will make it efficient: http://en.cppreference.com/w/cpp/language/copy_elision


PS6, Line 58: void*
const void*


PS6, Line 103: various numbers
Only LIST_SIZE; no need for a loop


Line 112:       vector<void*> addrs = GetSortedAddrs(buffers);
const vector...


PS6, Line 118: small_list.Size() - LIST_SIZE
always 0?


PS6, Line 120: two
LIST_SIZE


http://gerrit.cloudera.org:8080/#/c/6410/6/be/src/runtime/bufferpool/free-list.h
File be/src/runtime/bufferpool/free-list.h:

PS6, Line 50: This helps to consolidate buffers within the address space
Sorry, I don't think I understand this - how does it help consolidate buffers?


PS6, Line 60: Get
Getters are often const methods. Maybe 'PopFreeBuffer'?


PS6, Line 60: bool
Can you just return nullptr on failure instead?


Line 69:   void AddFreeBuffer(BufferHandle buffer) {
If this will always be a moved argument, you can give it type BufferHandle&& buffer


PS6, Line 81: int
free_list_.size() is likely stored as a 64-bit integer type and is below returned as a 64-bit integer type.


PS6, Line 84: heap
minheap


PS6, Line 115: Limited to 'max_capacity_' capacity
Obsolete.


Line 117:   std::vector<BufferHandle> free_list_;
Since you need a double-ended priority queue, did you consider using a std::set?


-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................


Patch Set 9: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/395/

-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................


Patch Set 3:

As discussed offline with dhecht, I removed the "policy" aspects of the  implementation - these will be instead implemented in BufferPOol in part 2

-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Hello Impala Public Jenkins, Jim Apple, Dan Hecht,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6410

to look at the new patch set (#10).

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................

IMPALA-3203: Part 1: Free list implementation

We will have a single free list per size class. Free buffers
are stored in a heap, ordered by the memory address of the
buffer, to help reduce address space fragmentation.

A follow-up patch will use the free lists in BufferPool.

Currently TCMalloc has thread-local caches with somewhat similar
purpose. However, these are not suitable for our purposes for
several reasons:
* They are designed for caching small allocations - large allocations
  like most buffers are served from a global page heap protected
  by a global lock.
* We intend to move away from using TCMalloc for buffers: IMPALA-5073
* Thread caches are ineffective for the producer-consumer pattern where
  one thread allocates memory and another thread frees it.
* TCMalloc gives us limited control over how and when memory is
  actually released to the OS.

Testing:
Added unit tests for sanity checks and verification of behaviour
that is trickier to check in integration or system tests.
The cost will be exercised more thoroughly via BufferPool
in Part 2.

Performance:
Includes a benchmark that demonstrates the scalability of
the free lists under concurrency. When measuring pure throughput
of free list operations, having a free list per core is
significantly faster than a shared free list, or allocating
directly from TCMalloc. On 8 cores, if the memory allocated is
actually touched then for 64KB+ buffers, memory access is the
bottleneck rather than lock contention.

The benchmark also showed that non-inlined constructors and move
operators of BufferHandle were taking significant CPU cycles, so
I inlined those.

This suggests that having a free list per core is more than sufficient
(however, we will need to validate this with system concurrency
benchmarks once we switch to using this during query execution).

Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/free-lists-benchmark.cc
M be/src/runtime/bufferpool/CMakeLists.txt
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
A be/src/runtime/bufferpool/free-list-test.cc
A be/src/runtime/bufferpool/free-list.h
M be/src/util/benchmark-test.cc
M be/src/util/benchmark.cc
M be/src/util/benchmark.h
12 files changed, 813 insertions(+), 64 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/6410/10
-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Hello Jim Apple,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6410

to look at the new patch set (#8).

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................

IMPALA-3203: Part 1: Free list implementation

We will have a single free list per size class. Free buffers
are stored in a heap, ordered by the memory address of the
buffer, to help reduce address space fragmentation.

A follow-up patch will use the free lists in BufferPool.

Currently TCMalloc has thread-local caches with somewhat similar
purpose. However, these are not suitable for our purposes for
several reasons:
* They are designed for caching small allocations - large allocations
  like most buffers are served from a global page heap protected
  by a global lock.
* We intend to move away from using TCMalloc for buffers: IMPALA-5073
* Thread caches are ineffective for the producer-consumer pattern where
  one thread allocates memory and another thread frees it.
* TCMalloc gives us limited control over how and when memory is
  actually released to the OS.

Testing:
Added unit tests for sanity checks and verification of behaviour
that is trickier to check in integration or system tests.
The cost will be exercised more thoroughly via BufferPool
in Part 2.

Performance:
Includes a benchmark that demonstrates the scalability of
the free lists under concurrency. When measuring pure throughput
of free list operations, having a free list per core is
significantly faster than a shared free list, or allocating
directly from TCMalloc. On 8 cores, if the memory allocated is
actually touched then for 64KB+ buffers, memory access is the
bottleneck rather than lock contention.

The benchmark also showed that non-inlined constructors and move
operators of BufferHandle were taking significant CPU cycles, so
I inlined those.

This suggests that having a free list per core is more than sufficient
(however, we will need to validate this with system concurrency
benchmarks once we switch to using this during query execution).

Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/free-lists-benchmark.cc
M be/src/runtime/bufferpool/CMakeLists.txt
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
A be/src/runtime/bufferpool/free-list-test.cc
A be/src/runtime/bufferpool/free-list.h
M be/src/util/benchmark.cc
M be/src/util/benchmark.h
11 files changed, 813 insertions(+), 64 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/6410/8
-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.

Dan Hecht has posted comments on this change.

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................

Patch Set 4:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/6410/2//COMMIT_MSG
Commit Message:

Line 10: are stored in a heap, ordered by the memory address of the
tc-malloc has similar caches. are we adding this because we will stop using tc-malloc, or do we want it even with tc-malloc (since tc-malloc only caches small allocations)? answering these questions in the commit message (or jira) would be helpful to people to understand motivation (now and in the future).

http://gerrit.cloudera.org:8080/#/c/6410/4/be/src/benchmarks/free-lists-benchmark.cc
File be/src/benchmarks/free-lists-benchmark.cc:

PS4, Line 58: 0 iters
what does 0 iterations mean?

PS4, Line 234: 256 kb
won't this usually be 2MB? so why measuring only to 256 kb?

http://gerrit.cloudera.org:8080/#/c/6410/4/be/src/runtime/bufferpool/buffer-allocator.cc
File be/src/runtime/bufferpool/buffer-allocator.cc:

PS4, Line 39: buffer.Reset();
given that 'buffer' is passed by copy, what is this trying to do?

http://gerrit.cloudera.org:8080/#/c/6410/4/be/src/runtime/bufferpool/free-list.h
File be/src/runtime/bufferpool/free-list.h:

Line 36: /// A non-threadsafe list of free buffers.
a couple of things that wasn't clear without reading the code:
- how does buffer allocation work. it's done by the caller, yet the free list does buffer freeing.
- the list itself doesn't know (or care) about the property of the buffers it's storing (e.g. size, numa node). the caller is responsible for maintaining the list properly (this follows from the first bullet, but could be explicit).

PS4, Line 42: which can cause difficulties for the OS in some
: /// cases.
what's an example?

Line 46: /// Gets a free buffer. Returns true and sets 'buffer' to a buffer.
explain the false case.

PS4, Line 99: b1.data() >= b2.data()
why not '!SortCompare(...)' since that's the requirement?

--
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................


Patch Set 10:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/396/

-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................


Patch Set 9:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/395/

-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................


Patch Set 10: Code-Review+2

Missed updating benchmark-test to reflect the change in optional arguments

-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.

Jim Apple has posted comments on this change.

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................


Patch Set 7: Code-Review+1

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6410/7/be/src/benchmarks/free-lists-benchmark.cc
File be/src/benchmarks/free-lists-benchmark.cc:

PS7, Line 64: In the 0 iters case
Thanks, I think this was confusing me


http://gerrit.cloudera.org:8080/#/c/6410/6/be/src/runtime/bufferpool/free-list-test.cc
File be/src/runtime/bufferpool/free-list-test.cc:

Line 112:       vector<const void*> addrs = GetSortedAddrs(buffers);
> Done
can also be a const vector


http://gerrit.cloudera.org:8080/#/c/6410/6/be/src/runtime/bufferpool/free-list.h
File be/src/runtime/bufferpool/free-list.h:

Line 117:     return SortCompare(b2, b1);
> The constant overheads of the balanced binary tree are pretty bad - each no
You might want to leave a note on FreeBuffers that its complexity is Theta(n log n).


http://gerrit.cloudera.org:8080/#/c/6410/7/be/src/runtime/bufferpool/free-list.h
File be/src/runtime/bufferpool/free-list.h:

PS7, Line 51: over
"into"


-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................


Patch Set 8: Code-Review+1

Carry +1

-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.

Dan Hecht has posted comments on this change.

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6410/4/be/src/benchmarks/free-lists-benchmark.cc
File be/src/benchmarks/free-lists-benchmark.cc:

PS4, Line 58: 
> Added a comment to explain - it's the number of passes we do over the memor
Sure, a comment explaining it is fine. It just wasn't clear without reading all the benchmark code.


http://gerrit.cloudera.org:8080/#/c/6410/6/be/src/benchmarks/free-lists-benchmark.cc
File be/src/benchmarks/free-lists-benchmark.cc:

Line 58: //
what conclusion should we draw from this benchmark if we end up memory bound much earlier than the normal 2MB buffer size?


http://gerrit.cloudera.org:8080/#/c/6410/6/be/src/runtime/bufferpool/buffer-allocator.cc
File be/src/runtime/bufferpool/buffer-allocator.cc:

PS6, Line 37: BufferPool::BufferHandle buffer
how does the caller know it must do a move() here? i.e. what prevents the caller from holding on to a dangling buffer.data reference?


-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................


Patch Set 7:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6410/6/be/src/runtime/bufferpool/free-list-test.cc
File be/src/runtime/bufferpool/free-list-test.cc:

Line 112:       vector<const void*> addrs = GetSortedAddrs(buffers);
> can also be a const vector
Done


http://gerrit.cloudera.org:8080/#/c/6410/6/be/src/runtime/bufferpool/free-list.h
File be/src/runtime/bufferpool/free-list.h:

Line 117:     return SortCompare(b2, b1);
> You might want to leave a note on FreeBuffers that its complexity is Theta(
Done. In the caller I'm going to always remove a fixed percentage of the entries p when the list gets full, so overall it should be amortised log n time, since FreeBuffers() will be called only ones every (p * n) operations.


http://gerrit.cloudera.org:8080/#/c/6410/7/be/src/runtime/bufferpool/free-list.h
File be/src/runtime/bufferpool/free-list.h:

PS7, Line 51: over
> "into"
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................


Patch Set 9: Code-Review+2

Rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/6410 )

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1806/


-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-Change-Number: 6410
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Jan 2018 03:05:41 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has uploaded a new patch set (#6).

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................

IMPALA-3203: Part 1: Free list implementation

We will have a single free list per size class. Free buffers
are stored in a heap, ordered by the memory address of the
buffer, to help reduce address space fragmentation.

A follow-up patch will use the free lists in BufferPool.

Currently TCMalloc has thread-local caches with somewhat similar
purpose. However, these are not suitable for our purposes for
several reasons:
* They are designed for caching small allocations - large allocations
  like most buffers are served from a global page heap protected
  by a global lock.
* We intend to move away from using TCMalloc for buffers: IMPALA-5073
* Thread caches are ineffective for the producer-consumer pattern where
  one thread allocates memory and another thread frees it.
* TCMalloc gives us limited control over how and when memory is
  actually released to the OS.

Testing:
Added unit tests for sanity checks and verification of behaviour
that is trickier to check in integration or system tests.
The cost will be exercised more thoroughly via BufferPool
in Part 2.

Performance:
Includes a benchmark that demonstrates the scalability of
the free lists under concurrency. When measuring pure throughput
of free list operations, having a free list per core is
significantly faster than a shared free list, or allocating
directly from TCMalloc. On 8 cores, if the memory allocated is
actually touched then for 64KB+ buffers, memory access is the
bottleneck rather than lock contention.

This suggests that having a free list per core is more than sufficient
(however, we will need to validate this with system concurrency
benchmarks once we switch to using this during query execution).

Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/free-lists-benchmark.cc
M be/src/runtime/bufferpool/CMakeLists.txt
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
A be/src/runtime/bufferpool/free-list-test.cc
A be/src/runtime/bufferpool/free-list.h
9 files changed, 730 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/6410/6
-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has uploaded a new patch set (#5).

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................

IMPALA-3203: Part 1: Free list implementation

We will have a single free list per size class. Free buffers
are stored in a heap, ordered by the memory address of the
buffer, to help reduce address space fragmentation.

A follow-up patch will use the free lists in BufferPool.

Currently TCMalloc has thread-local caches with somewhat similar
purpose. However, these are not suitable for our purposes for
several reasons:
* They are designed for caching small allocations - large allocations
  like most buffers are served from a global page heap protected
  by a global lock.
* We intend to move away from using TCMalloc for buffers: IMPALA-5073
* Thread caches are ineffective for the producer-consumer pattern where
  one thread allocates memory and another thread frees it.
* TCMalloc gives us limited control over how and when memory is
  actually released to the OS.

Testing:
Added unit tests for sanity checks and verification of behaviour
that is trickier to check in integration or system tests.
The cost will be exercised more thoroughly via BufferPool
in Part 2.

Performance:
Includes a benchmark that demonstrates the scalability of
the free lists under concurrency. When measuring pure throughput
of free list operations, having a free list per core is
significantly faster than a shared free list, or allocating
directly from TCMalloc. On 8 cores, if the memory allocated is
actually touched then for 64KB+ buffers, memory access is the
bottleneck rather than lock contention.

This suggests that having a free list per core is more than sufficient
(however, we will need to validate this with system concurrency
benchmarks once we switch to using this during query execution).

Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/free-lists-benchmark.cc
M be/src/runtime/bufferpool/CMakeLists.txt
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
A be/src/runtime/bufferpool/free-list-test.cc
A be/src/runtime/bufferpool/free-list.h
9 files changed, 728 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/6410/5
-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................


Patch Set 4:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/6410/2//COMMIT_MSG
Commit Message:

Line 10: are stored in a heap, ordered by the memory address of the
> tc-malloc has similar caches.  are we adding this because we will stop usin
Done


http://gerrit.cloudera.org:8080/#/c/6410/4/be/src/benchmarks/free-lists-benchmark.cc
File be/src/benchmarks/free-lists-benchmark.cc:

PS4, Line 58: 0 iters 
> what does 0 iterations mean?
Added a comment to explain - it's the number of passes we do over the memory as part of the "work".

Making the label longer messes up the formatting unfortunately.


PS4, Line 234: 256 kb
> won't this usually be 2MB? so why measuring only to 256 kb?
It's not really interesting - even at 256KB throughput is memory-bound.


http://gerrit.cloudera.org:8080/#/c/6410/4/be/src/runtime/bufferpool/buffer-allocator.cc
File be/src/runtime/bufferpool/buffer-allocator.cc:

PS4, Line 39: buffer.Reset();
> given that 'buffer' is passed by copy, what is this trying to do?
There's a DCHECK in ~BufferHandle to detect leaks. Added a comment to explain.


http://gerrit.cloudera.org:8080/#/c/6410/4/be/src/runtime/bufferpool/free-list.h
File be/src/runtime/bufferpool/free-list.h:

Line 36: /// A non-threadsafe list of free buffers.
> a couple of things that wasn't clear without reading the code:
Done


PS4, Line 42: which can cause difficulties for the OS in some
            : /// cases.
> what's an example?
Linux has a limit on the number of non-contiguous mapped regions. Updated the comment.


Line 46:   /// Gets a free buffer. Returns true and sets 'buffer' to a buffer.
> explain the false case.
Done


PS4, Line 99: b1.data() >= b2.data()
> why not '!SortCompare(...)' since that's the requirement?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................


Patch Set 6:

(27 comments)

http://gerrit.cloudera.org:8080/#/c/6410/6/be/src/benchmarks/free-lists-benchmark.cc
File be/src/benchmarks/free-lists-benchmark.cc:

PS6, Line 48: per thread
> It might be easier to read if it showed total throughput
Did this. Agree it's probably a bit easier to interpret. Had to tweak the benchmark setup a bit as a consequence so that it actually ran enough iterations of the benchmark (sometimes it ran 0).

Also found a couple of optimisation opportunites to speed up the benchmark.


Line 58: //
> what conclusion should we draw from this benchmark if we end up memory boun
I wrote up a short summary of the results and conclusions here - it's hard to know what to make of it otherwise.


PS6, Line 59: i7-4790
> Do you have Hyperthreading on?
Yes


PS6, Line 61: 0 iters
> It might help to explain what this is before these benchmarks
it's mentioned on l 57 but I made it clearer.


Line 257: 
> elide this empty line
Done


PS6, Line 288: MAX_LIST_ENTRIES
> How was this chosen?
Added comment.


PS6, Line 303: *
> &
Done


Line 308:     lock_guard<SpinLock> l(free_list->first);
> In the case where there is one list per thread, why bother with the lock?
I wanted to measure throughput including the uncontended lock acquisition - we'd still need the lock in practice since other threads will be able to access the free lists of other threads (either because they're scheduled on the same core or because they are trying to steal a buffer from a different core).


Line 350:     int64_t op = uniform_int_distribution<int64_t>(0, 1)(rng);
> const
Done


PS6, Line 373: new pair<SpinLock, FreeList>
I realised these could end up on the same cache line - I think this may have affected some results.


Line 405:         for (int num_threads : {1, 2, 4, CpuInfo::num_cores()}) {
> Is it worth trying 2*num_cores?
It's probably not very interesting since num_cores already includes the hyperthreads.


http://gerrit.cloudera.org:8080/#/c/6410/6/be/src/runtime/bufferpool/buffer-allocator.cc
File be/src/runtime/bufferpool/buffer-allocator.cc:

PS6, Line 37: BufferPool::BufferHandle buffer
> how does the caller know it must do a move() here? i.e. what prevents the c
BufferHandle isn't copyable so it's impossible to pass in a value without move()ing it. I'm not that familiar with the best practices and trade-offs with passing rvalue references versus values but it seems to work either way and the rvalue ref makes it more explicit, so I changed it.


http://gerrit.cloudera.org:8080/#/c/6410/6/be/src/runtime/bufferpool/free-list-test.cc
File be/src/runtime/bufferpool/free-list-test.cc:

PS6, Line 18: c
> also use <cstdint> in free-list.h or use <stdlib.h> here
Done


PS6, Line 50: vector<BufferHandle>* buffers
> You can just return this out parameter and the [N]RVO will make it efficien
Unfortunately the ASSERT macros only work in functions returning void.


PS6, Line 58: void*
> const void*
Done


PS6, Line 103: various numbers
> Only LIST_SIZE; no need for a loop
I missed changing the loop bounds to test various sizes. Fixed that.


Line 112:       vector<void*> addrs = GetSortedAddrs(buffers);
> const vector...
Done


PS6, Line 118: small_list.Size() - LIST_SIZE
> always 0?
I intended to run for different values of num_buffers but didn't do it - fixed.


PS6, Line 120: two
> LIST_SIZE
Done


http://gerrit.cloudera.org:8080/#/c/6410/6/be/src/runtime/bufferpool/free-list.h
File be/src/runtime/bufferpool/free-list.h:

PS6, Line 50: This helps to consolidate buffers within the address space
> Sorry, I don't think I understand this - how does it help consolidate buffe
I elaborated a bit on the problem - agree this was cryptic.


PS6, Line 60: bool
> Can you just return nullptr on failure instead?
We'd be returning a BufferHandle, not a BufferHandle*. It could return a closed BufferHandle to indicate failure but I feel like that makes it harder to structure the logic in the caller. E.g. currently you can do this.

  BufferHandle handle;
  if (list1.GetFreeBuffer(&handle)) {
  
  } else if (list2.GetFreeBuffer(&handle)) {
  
  }

But if we changed it, it would look more like:

  BufferHandle handle = list1.GetFreeBuffer();
  if (handle.is_open()) {
  
  } else {
    handle = list2.GetFreeBuffer();
    if (handle2.is_open()) {
  
    }
  }

or 

  BufferHandle handle;
  if ((handle = list1.GetFreeBuffer()).is_open()) {
  
  } else if ((handle = list2.GetFreeBuffer()).is_open()) {
  
  }

both of which seem clunkier


PS6, Line 60: Get
> Getters are often const methods. Maybe 'PopFreeBuffer'?
Works for me.


Line 69:   void AddFreeBuffer(BufferHandle buffer) {
> If this will always be a moved argument, you can give it type BufferHandle&
Done


PS6, Line 81: int
> free_list_.size() is likely stored as a 64-bit integer type and is below re
In practice storing over 2 billion entries would be crazy but might as well be consistent with the types.


PS6, Line 84: heap
> minheap
Done


PS6, Line 115: Limited to 'max_capacity_' capacity
> Obsolete.
Done


Line 117:   std::vector<BufferHandle> free_list_;
> Since you need a double-ended priority queue, did you consider using a std:
The constant overheads of the balanced binary tree are pretty bad - each node has a lot of memory overhead, the memory isn't contiguous, and it calls malloc() or free() for most operations. On a previous project I replaced an rbtree priority queue with a binary heap and saw some significant improvements.


-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Hello Jim Apple, Dan Hecht,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6410

to look at the new patch set (#9).

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................

IMPALA-3203: Part 1: Free list implementation

We will have a single free list per size class. Free buffers
are stored in a heap, ordered by the memory address of the
buffer, to help reduce address space fragmentation.

A follow-up patch will use the free lists in BufferPool.

Currently TCMalloc has thread-local caches with somewhat similar
purpose. However, these are not suitable for our purposes for
several reasons:
* They are designed for caching small allocations - large allocations
  like most buffers are served from a global page heap protected
  by a global lock.
* We intend to move away from using TCMalloc for buffers: IMPALA-5073
* Thread caches are ineffective for the producer-consumer pattern where
  one thread allocates memory and another thread frees it.
* TCMalloc gives us limited control over how and when memory is
  actually released to the OS.

Testing:
Added unit tests for sanity checks and verification of behaviour
that is trickier to check in integration or system tests.
The cost will be exercised more thoroughly via BufferPool
in Part 2.

Performance:
Includes a benchmark that demonstrates the scalability of
the free lists under concurrency. When measuring pure throughput
of free list operations, having a free list per core is
significantly faster than a shared free list, or allocating
directly from TCMalloc. On 8 cores, if the memory allocated is
actually touched then for 64KB+ buffers, memory access is the
bottleneck rather than lock contention.

The benchmark also showed that non-inlined constructors and move
operators of BufferHandle were taking significant CPU cycles, so
I inlined those.

This suggests that having a free list per core is more than sufficient
(however, we will need to validate this with system concurrency
benchmarks once we switch to using this during query execution).

Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/free-lists-benchmark.cc
M be/src/runtime/bufferpool/CMakeLists.txt
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
A be/src/runtime/bufferpool/free-list-test.cc
A be/src/runtime/bufferpool/free-list.h
M be/src/util/benchmark.cc
M be/src/util/benchmark.h
11 files changed, 812 insertions(+), 63 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/6410/9
-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has uploaded a new patch set (#3).

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................

IMPALA-3203: Part 1: Free list implementation

We will have a single free list per size class. Free buffers
are stored in a heap, ordered by the memory address of the
buffer, to help reduce address space fragmentation.

A follow-up patch will use the free lists in BufferPool.

Testing:
Added unit tests for sanity checks and verification of behaviour
that is trickier to check in integration or system tests.
The cost will be exercised more thoroughly via BufferPool
in Part 2.

Performance:
Includes a benchmark that demonstrates the scalability of
the free lists under concurrency. When measuring pure throughput
of free list operations, having a free list per core is
significantly faster than a shared free list, or allocating
directly from TCMalloc. On 8 cores, if the memory allocated is
actually touched then for 64KB+ buffers, memory access is the
bottleneck rather than lock contention.

This suggests that having a free list per core is more than sufficient
(however, we will need to validate this with system concurrency
benchmarks once we switch to using this during query execution).

Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/free-lists-benchmark.cc
M be/src/runtime/bufferpool/CMakeLists.txt
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
A be/src/runtime/bufferpool/free-list-test.cc
A be/src/runtime/bufferpool/free-list.h
9 files changed, 713 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/6410/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has uploaded a new patch set (#7).

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................

IMPALA-3203: Part 1: Free list implementation

We will have a single free list per size class. Free buffers
are stored in a heap, ordered by the memory address of the
buffer, to help reduce address space fragmentation.

A follow-up patch will use the free lists in BufferPool.

Currently TCMalloc has thread-local caches with somewhat similar
purpose. However, these are not suitable for our purposes for
several reasons:
* They are designed for caching small allocations - large allocations
  like most buffers are served from a global page heap protected
  by a global lock.
* We intend to move away from using TCMalloc for buffers: IMPALA-5073
* Thread caches are ineffective for the producer-consumer pattern where
  one thread allocates memory and another thread frees it.
* TCMalloc gives us limited control over how and when memory is
  actually released to the OS.

Testing:
Added unit tests for sanity checks and verification of behaviour
that is trickier to check in integration or system tests.
The cost will be exercised more thoroughly via BufferPool
in Part 2.

Performance:
Includes a benchmark that demonstrates the scalability of
the free lists under concurrency. When measuring pure throughput
of free list operations, having a free list per core is
significantly faster than a shared free list, or allocating
directly from TCMalloc. On 8 cores, if the memory allocated is
actually touched then for 64KB+ buffers, memory access is the
bottleneck rather than lock contention.

The benchmark also showed that non-inlined constructors and move
operators of BufferHandle were taking significant CPU cycles, so
I inlined those.

This suggests that having a free list per core is more than sufficient
(however, we will need to validate this with system concurrency
benchmarks once we switch to using this during query execution).

Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/free-lists-benchmark.cc
M be/src/runtime/bufferpool/CMakeLists.txt
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
A be/src/runtime/bufferpool/free-list-test.cc
A be/src/runtime/bufferpool/free-list.h
M be/src/util/benchmark.cc
M be/src/util/benchmark.h
11 files changed, 812 insertions(+), 64 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/6410/7
-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>