You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/03 21:07:07 UTC

[GitHub] [arrow] bkietz opened a new pull request #10233: ARROW-12641: [C++] Provide thread id accessors

bkietz opened a new pull request #10233:
URL: https://github.com/apache/arrow/pull/10233


   Sometimes, a worker thread must identify its thread id within a ThreadPool (for example, to support safe concurrent access to per-thread state). We could use std::this_thread::get_id() for this, but this is less convenient for indexing into a container of per-thread state.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz commented on pull request #10233: ARROW-12641: [C++] Provide thread id accessors

Posted by GitBox <gi...@apache.org>.
bkietz commented on pull request #10233:
URL: https://github.com/apache/arrow/pull/10233#issuecomment-832845569


   A different approach: instead of trying to identify individual threads, provide a set of thread local states which can be safely borrowed by tasks. This approach does not require directly specifying capacity and handles changing capacity fluidly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10233: ARROW-12641: [C++] Provide thread id accessors

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10233:
URL: https://github.com/apache/arrow/pull/10233#discussion_r625656065



##########
File path: cpp/src/arrow/util/future.h
##########
@@ -573,6 +573,15 @@ class ARROW_MUST_USE_TYPE Future {
   FRIEND_TEST(FutureRefTest, HeadRemoved);
 };
 
+namespace internal {

Review comment:
       Probably for use within `ASSERT_OK`, but we already have specific macros for futures.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz commented on pull request #10233: ARROW-12641: [C++] Provide thread local state for tasks in a thread pool

Posted by GitBox <gi...@apache.org>.
bkietz commented on pull request #10233:
URL: https://github.com/apache/arrow/pull/10233#issuecomment-881342861


   Closing for now


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz closed pull request #10233: ARROW-12641: [C++] Provide thread local state for tasks in a thread pool

Posted by GitBox <gi...@apache.org>.
bkietz closed pull request #10233:
URL: https://github.com/apache/arrow/pull/10233


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz commented on pull request #10233: ARROW-12641: [C++] Provide thread id accessors

Posted by GitBox <gi...@apache.org>.
bkietz commented on pull request #10233:
URL: https://github.com/apache/arrow/pull/10233#issuecomment-831991760


   To be honest, I hadn't considered changing capacity in mid workflow (increase or decrease). For now I've added a unit test which exercises the intended use case, I'll give some thought to changing capacity


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz commented on a change in pull request #10233: ARROW-12641: [C++] Provide thread id accessors

Posted by GitBox <gi...@apache.org>.
bkietz commented on a change in pull request #10233:
URL: https://github.com/apache/arrow/pull/10233#discussion_r625846193



##########
File path: cpp/src/arrow/util/thread_pool_test.cc
##########
@@ -512,6 +512,98 @@ TEST_F(TestThreadPool, Submit) {
   }
 }
 
+TEST_F(TestThreadPool, GetCurrentThreadPool) {
+  ASSERT_EQ(ThreadPool::GetCurrentThreadPool(), nullptr);
+
+  auto pool = this->MakeThreadPool(5);
+
+  std::vector<Future<>> futures(1000);
+
+  for (size_t i = 0; i < futures.size(); ++i) {
+    ASSERT_OK_AND_ASSIGN(futures[i], pool->Submit([i, pool] {
+      if (ThreadPool::GetCurrentThreadPool() == pool.get()) {
+        return Status::OK();
+      }
+      return Status::Invalid("Task #", i, " did not point to the associated ThreadPool");
+    }));
+  }
+
+  ASSERT_OK(AllComplete(futures).status());
+  ASSERT_OK(pool->Shutdown());
+}
+
+TEST_F(TestThreadPool, GetCurrentThreadIndex) {
+  ASSERT_EQ(ThreadPool::GetCurrentThreadIndex(), 0);
+
+  constexpr int capacity = 5;
+
+  auto pool = this->MakeThreadPool(capacity);
+
+  std::vector<Future<>> futures(1000);
+  std::vector<util::optional<std::thread::id>> std_ids(capacity);
+
+  for (size_t i = 0; i < futures.size(); ++i) {
+    ASSERT_OK_AND_ASSIGN(futures[i], pool->Submit([&std_ids, i] {
+      auto id = ThreadPool::GetCurrentThreadIndex();
+      if (!std_ids[id].has_value()) {
+        std_ids[id] = std::this_thread::get_id();
+        return Status::OK();
+      }
+
+      if (std_ids[id] == std::this_thread::get_id()) {
+        return Status::OK();
+      }
+
+      return Status::Invalid("Task #", i, " did not point to the associated ThreadPool");
+    }));
+  }
+
+  ASSERT_OK(AllComplete(futures).status());
+  ASSERT_OK(pool->Shutdown());
+}
+
+TEST_F(TestThreadPool, ParallelSummationWithThreadLocalState) {
+  // Sum all integers in [0, 1000000) in parallel using thread local sums.
+  constexpr int kThreadPoolCapacity = 5;
+  constexpr int kBatchSize = 1000;
+  constexpr int kBatchCount = 1000;
+
+  auto pool = this->MakeThreadPool(kThreadPoolCapacity);
+
+  std::vector<std::unique_ptr<int64_t>> local_sums(kThreadPoolCapacity);

Review comment:
       In this example per thread state is artificially cheap; we could also have a per-task sum without much change in performance and without invoking such roundabout storage. In general state can be arbitrarily expensive, for example a hash table used in dictionary encoding. (I can write a parallel dict encode example too, if that'd be of interest.)
   
   Whenever construction/maintenance of state is expensive, we'd prefer to reuse it and keep the number of instances to a minimum




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #10233: ARROW-12641: [C++] Provide thread id accessors

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #10233:
URL: https://github.com/apache/arrow/pull/10233#issuecomment-831894788


   Like Weston I would like to know a bit more about the intended use case.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] westonpace commented on pull request #10233: ARROW-12641: [C++] Provide thread local state for tasks in a thread pool

Posted by GitBox <gi...@apache.org>.
westonpace commented on pull request #10233:
URL: https://github.com/apache/arrow/pull/10233#issuecomment-833010580


   Ok, I think that makes sense then.
   
   * I wouldn't name it `MakeThreadLocalState`.  "Thread local" has a pretty well understood existing definition that isn't really this.  Also, the lifetime of the `BorrowSet` is independent of the lifetime of the thread (it is more tied to the lifetime of a "workflow").  Maybe call it "workflow" state?
   * Outside of a some callbacks I don't think this logic belongs in `thread_pool.h`/`executor.h`.  You could put the `MakeThreadLocalState` function in `borrowed.h` instead (and simply take `Executor*` as an argument).  People who are working on schedulers and thread pools probably don't want to have to know about borrow sets.
   * I can't speak to the merits of any performance benefit / complexity tradeoff but I can see where it could theoretically be advantageous to use a borrow set.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz commented on a change in pull request #10233: ARROW-12641: [C++] Provide thread id accessors

Posted by GitBox <gi...@apache.org>.
bkietz commented on a change in pull request #10233:
URL: https://github.com/apache/arrow/pull/10233#discussion_r625806258



##########
File path: cpp/src/arrow/util/thread_pool.cc
##########
@@ -321,13 +321,28 @@ void ThreadPool::CollectFinishedWorkersUnlocked() {
   state_->finished_workers_.clear();
 }
 
+thread_local ThreadPool* g_current_thread_pool = nullptr;
+
+ThreadPool* ThreadPool::GetCurrentThreadPool() { return g_current_thread_pool; }
+
+thread_local int g_current_thread_id = 0;
+
+int ThreadPool::GetCurrentThreadId() { return g_current_thread_id; }

Review comment:
       Good call, I'll rename to thread index

##########
File path: cpp/src/arrow/util/future.h
##########
@@ -573,6 +573,15 @@ class ARROW_MUST_USE_TYPE Future {
   FRIEND_TEST(FutureRefTest, HeadRemoved);
 };
 
+namespace internal {

Review comment:
       I'll revert this, then




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10233: ARROW-12641: [C++] Provide thread local state for tasks in a thread pool

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10233:
URL: https://github.com/apache/arrow/pull/10233#discussion_r629378984



##########
File path: cpp/src/arrow/util/thread_pool_test.cc
##########
@@ -512,6 +514,74 @@ TEST_F(TestThreadPool, Submit) {
   }
 }
 
+TEST_F(TestThreadPool, ParallelSummationWithPerThreadState) {
+  // Sum all integers in [0, 1000000) in parallel.
+  struct {
+    void WriteInto(std::vector<int64_t>* addends) {
+      constexpr int kBatchSize = 1000;
+
+      addends->resize(kBatchSize);
+      std::iota(addends->begin(), addends->end(), begin_.fetch_add(kBatchSize));
+
+      expected_sum_ += (addends->front() + addends->back()) * kBatchSize / 2;
+    }
+
+    std::atomic<int64_t> begin_{0}, expected_sum_{0};
+  } addend_source;
+
+  constexpr int kThreadPoolCapacity = 5;
+  constexpr int kBatchCount = 1000;
+
+  auto pool = this->MakeThreadPool(kThreadPoolCapacity);
+
+  // Each thread will have a unique local sum
+  auto local_sums = BorrowSet<int64_t>::Make(pool.get());
+
+  // Each task needs a vector into which addends can be written by a source.
+  // Instead of allocating a vector in each task, we can provide a BorrowSet
+  // to allow reuse of a vector.
+  auto local_addend_batches = BorrowSet<std::vector<int64_t>>::Make(pool.get());
+
+  std::atomic<int64_t> sum{0}, state_count{0};
+  local_sums->OnDone([&](int64_t&& local_sum) {
+    // When we're done with a local sum, add it into the global sum.
+    sum += local_sum;
+
+    // Track the total number of constructed states; should == thread count
+    ++state_count;
+  });
+
+  std::vector<Future<>> futures(kBatchCount);
+
+  for (int i = 0; i < kBatchCount; ++i) {
+    ASSERT_OK_AND_ASSIGN(futures[i], pool->Submit([&, local_sums, local_addend_batches] {
+      // Acquire thread local state. Each task may safely mutate since tasks
+      // running in another thread will acquire a different local_sum
+      auto local_sum = local_sums->BorrowOne();
+
+      auto addends = local_addend_batches->BorrowOne();
+      addend_source.WriteInto(addends.get());
+
+      for (int64_t addend : *addends) {
+        *local_sum += addend;
+      }
+
+      // decrease the thread pool capacity halfway through:
+      if (i == kBatchCount / 2) {

Review comment:
       `i` is captured by reference, so chances are it will be equal to `kBatchCount` here?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10233: ARROW-12641: [C++] Provide thread local state for tasks in a thread pool

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10233:
URL: https://github.com/apache/arrow/pull/10233#discussion_r629378247



##########
File path: cpp/src/arrow/util/thread_pool_test.cc
##########
@@ -512,6 +514,74 @@ TEST_F(TestThreadPool, Submit) {
   }
 }
 
+TEST_F(TestThreadPool, ParallelSummationWithPerThreadState) {
+  // Sum all integers in [0, 1000000) in parallel.
+  struct {
+    void WriteInto(std::vector<int64_t>* addends) {
+      constexpr int kBatchSize = 1000;
+
+      addends->resize(kBatchSize);
+      std::iota(addends->begin(), addends->end(), begin_.fetch_add(kBatchSize));
+
+      expected_sum_ += (addends->front() + addends->back()) * kBatchSize / 2;
+    }
+
+    std::atomic<int64_t> begin_{0}, expected_sum_{0};
+  } addend_source;
+
+  constexpr int kThreadPoolCapacity = 5;
+  constexpr int kBatchCount = 1000;
+
+  auto pool = this->MakeThreadPool(kThreadPoolCapacity);
+
+  // Each thread will have a unique local sum
+  auto local_sums = BorrowSet<int64_t>::Make(pool.get());
+
+  // Each task needs a vector into which addends can be written by a source.
+  // Instead of allocating a vector in each task, we can provide a BorrowSet
+  // to allow reuse of a vector.
+  auto local_addend_batches = BorrowSet<std::vector<int64_t>>::Make(pool.get());
+
+  std::atomic<int64_t> sum{0}, state_count{0};
+  local_sums->OnDone([&](int64_t&& local_sum) {
+    // When we're done with a local sum, add it into the global sum.
+    sum += local_sum;
+
+    // Track the total number of constructed states; should == thread count
+    ++state_count;
+  });
+
+  std::vector<Future<>> futures(kBatchCount);
+
+  for (int i = 0; i < kBatchCount; ++i) {
+    ASSERT_OK_AND_ASSIGN(futures[i], pool->Submit([&, local_sums, local_addend_batches] {
+      // Acquire thread local state. Each task may safely mutate since tasks
+      // running in another thread will acquire a different local_sum
+      auto local_sum = local_sums->BorrowOne();
+
+      auto addends = local_addend_batches->BorrowOne();
+      addend_source.WriteInto(addends.get());

Review comment:
       I must admit I don't understand why this is creating a chunk of addends in each thread. I understand it's a test so it doesn't matter much, but it still looks a bit gratuitous.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10233: ARROW-12641: [C++] Provide thread id accessors

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10233:
URL: https://github.com/apache/arrow/pull/10233#issuecomment-831534498


   https://issues.apache.org/jira/browse/ARROW-12641


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz commented on pull request #10233: ARROW-12641: [C++] Provide thread local state for tasks in a thread pool

Posted by GitBox <gi...@apache.org>.
bkietz commented on pull request #10233:
URL: https://github.com/apache/arrow/pull/10233#issuecomment-832989304


   > Why do you need to be so closely tied to the thread pool capacity?
   
   We could certainly simplify by setting a static capacity. This can result in extraneous instances of `State` sitting in the cache and doing nothing until a set of tasks completed in the event of capacity decrease, since we wouldn't be able to react
   
   > What if the tasks are running on a non-thread pool Executor (we have an event loop inspired serial executor now and maybe we will have another kind of executor in the future)?
   
   This could be generalized to support arbitrary Executors without too much trouble, I think. If the general approach of BorrowSet seems acceptable, then I can add the OnThreadAdded etc callbacks and move MakeThreadLocalState to Executor
   
   > Are the construct and destroy functions just adding complexity. Any user that needs to do some special construct/destroy logic could do so in the constructor/destructor it would seem.
   
   Re construct: I was thinking of state like `unique_ptr<Encoder>` or so, whose default constructor just leaves it null. This isn't the end of the world; it'd just defer construction to the task:
   
   ```c++
   auto encoder_set = pool->MakeThreadLocalState<unique_ptr<Encoder>>(...);
   auto encoded_gen = MakeMappedGenerator(
       MakeTransferredGenerator(batch_gen, pool.get()),
       [encoder_set](const std::shared_ptr<RecordBatch>& batch) {
         auto encoder = encoder_set->BorrowOne();
         if (!encoder.get()) encoder.get() = Encoder::Make();
         return encoder.get()->Encode(batch);
       });
   ```
   
   Re destroy: as currently written, this is also the "reduce" step (in the unit test I use it to update the global sum from a local sum). I could just rename it to "reduce" or "gather". If we want to support dynamic reduction of capacity then we'll need to keep this as a callback, otherwise going from 5 threads to 2 would fail to garbage collect 3 unused states until a workflow was completed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10233: ARROW-12641: [C++] Provide thread id accessors

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10233:
URL: https://github.com/apache/arrow/pull/10233#discussion_r625868623



##########
File path: cpp/src/arrow/util/thread_pool_test.cc
##########
@@ -512,6 +512,98 @@ TEST_F(TestThreadPool, Submit) {
   }
 }
 
+TEST_F(TestThreadPool, GetCurrentThreadPool) {
+  ASSERT_EQ(ThreadPool::GetCurrentThreadPool(), nullptr);
+
+  auto pool = this->MakeThreadPool(5);
+
+  std::vector<Future<>> futures(1000);
+
+  for (size_t i = 0; i < futures.size(); ++i) {
+    ASSERT_OK_AND_ASSIGN(futures[i], pool->Submit([i, pool] {
+      if (ThreadPool::GetCurrentThreadPool() == pool.get()) {
+        return Status::OK();
+      }
+      return Status::Invalid("Task #", i, " did not point to the associated ThreadPool");
+    }));
+  }
+
+  ASSERT_OK(AllComplete(futures).status());
+  ASSERT_OK(pool->Shutdown());
+}
+
+TEST_F(TestThreadPool, GetCurrentThreadIndex) {
+  ASSERT_EQ(ThreadPool::GetCurrentThreadIndex(), 0);
+
+  constexpr int capacity = 5;
+
+  auto pool = this->MakeThreadPool(capacity);
+
+  std::vector<Future<>> futures(1000);
+  std::vector<util::optional<std::thread::id>> std_ids(capacity);
+
+  for (size_t i = 0; i < futures.size(); ++i) {
+    ASSERT_OK_AND_ASSIGN(futures[i], pool->Submit([&std_ids, i] {
+      auto id = ThreadPool::GetCurrentThreadIndex();
+      if (!std_ids[id].has_value()) {
+        std_ids[id] = std::this_thread::get_id();
+        return Status::OK();
+      }
+
+      if (std_ids[id] == std::this_thread::get_id()) {
+        return Status::OK();
+      }
+
+      return Status::Invalid("Task #", i, " did not point to the associated ThreadPool");
+    }));
+  }
+
+  ASSERT_OK(AllComplete(futures).status());
+  ASSERT_OK(pool->Shutdown());
+}
+
+TEST_F(TestThreadPool, ParallelSummationWithThreadLocalState) {
+  // Sum all integers in [0, 1000000) in parallel using thread local sums.
+  constexpr int kThreadPoolCapacity = 5;
+  constexpr int kBatchSize = 1000;
+  constexpr int kBatchCount = 1000;
+
+  auto pool = this->MakeThreadPool(kThreadPoolCapacity);
+
+  std::vector<std::unique_ptr<int64_t>> local_sums(kThreadPoolCapacity);

Review comment:
       It would be nice if this test exercised an unknown capacity (which is the same as a changing capacity, I think).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] westonpace commented on a change in pull request #10233: ARROW-12641: [C++] Provide thread id accessors

Posted by GitBox <gi...@apache.org>.
westonpace commented on a change in pull request #10233:
URL: https://github.com/apache/arrow/pull/10233#discussion_r625375258



##########
File path: cpp/src/arrow/util/future.h
##########
@@ -573,6 +573,15 @@ class ARROW_MUST_USE_TYPE Future {
   FRIEND_TEST(FutureRefTest, HeadRemoved);
 };
 
+namespace internal {

Review comment:
       I don't see this used.  What is the intent?  There is already `Future<T>::status`

##########
File path: cpp/src/arrow/util/thread_pool.cc
##########
@@ -321,13 +321,28 @@ void ThreadPool::CollectFinishedWorkersUnlocked() {
   state_->finished_workers_.clear();
 }
 
+thread_local ThreadPool* g_current_thread_pool = nullptr;
+
+ThreadPool* ThreadPool::GetCurrentThreadPool() { return g_current_thread_pool; }
+
+thread_local int g_current_thread_id = 0;
+
+int ThreadPool::GetCurrentThreadId() { return g_current_thread_id; }

Review comment:
       If this is returning the thread's `index` maybe `id` is not the best name?  Especially since it might be confused with the existing concept of a thread id.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10233: ARROW-12641: [C++] Provide thread local state for tasks in a thread pool

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10233:
URL: https://github.com/apache/arrow/pull/10233#discussion_r629371400



##########
File path: cpp/src/arrow/util/borrowed.h
##########
@@ -0,0 +1,165 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include <functional>
+#include <memory>
+#include <utility>
+#include <vector>
+
+#include "arrow/util/macros.h"
+#include "arrow/util/mutex.h"
+#include "arrow/util/thread_pool.h"
+
+namespace arrow {
+namespace internal {
+
+/// A thread safe set of values which can be borrowed (for example by tasks running in a
+/// ThreadPool). Borrowing a value will either lazily construct it or pop from a cache.
+/// Calls to the constructor and destructor are not synchronized.
+template <typename T>
+class BorrowSet : public std::enable_shared_from_this<BorrowSet<T>> {

Review comment:
       Is there precedent in calling this "borrow set"? I would call this a free list, or free list allocator. Or perhaps a "reuse pool".




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org