Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/01/10 19:16:42 UTC

[GitHub] [arrow] jvanstraten opened a new pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

jvanstraten opened a new pull request #12116:
URL: https://github.com/apache/arrow/pull/12116


   Add memory-optimized logic for creating the immutable, zero-filled buffers used by MakeArrayOfNull, reusing them at the MemoryPool level rather than only at the Array level. Includes an additional optimization for Linux that uses mmap to generate read-only zero buffers that, depending on the kernel implementation and the architecture, might not cost physical memory at all and/or might be zeroed lazily.
   
   Draft for now; tests still fail, but the general implementation is there.
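
To make the mmap idea concrete, here is a minimal, Linux-only sketch of a read-only anonymous mapping that reads as zeros. The helper names (`MapReadOnlyZeros`, `UnmapZeros`, `AllZero`) are hypothetical and are not part of Arrow or of this PR; only the `mmap` flags are taken from the diff quoted later in this thread.

```cpp
#include <sys/mman.h>

#include <cstddef>
#include <cstdint>

// Hypothetical helper: map `size` bytes of read-only anonymous memory.
// Anonymous mappings read as zeros. Returns nullptr on failure or if size == 0.
inline const std::uint8_t* MapReadOnlyZeros(std::size_t size) {
  if (size == 0) return nullptr;
  void* ptr = mmap(nullptr, size, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  if (ptr == MAP_FAILED) return nullptr;
  return static_cast<const std::uint8_t*>(ptr);
}

// Hypothetical helper: release a region obtained from MapReadOnlyZeros().
inline void UnmapZeros(const std::uint8_t* data, std::size_t size) {
  if (data != nullptr) {
    munmap(const_cast<std::uint8_t*>(data), size);
  }
}

// Check that every byte in [data, data + size) is zero.
inline bool AllZero(const std::uint8_t* data, std::size_t size) {
  for (std::size_t i = 0; i < size; ++i) {
    if (data[i] != 0) return false;
  }
  return true;
}
```

Whether reading such a mapping consumes physical memory depends on the kernel, as the description says; the sketch only demonstrates that the region is readable and zero-filled.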


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r808222620



##########
File path: cpp/src/arrow/device.h
##########
@@ -223,4 +236,55 @@ class ARROW_EXPORT CPUMemoryManager : public MemoryManager {
 ARROW_EXPORT
 std::shared_ptr<MemoryManager> default_cpu_memory_manager();
 
+/// A memory manager that uses the immutable zeros interface of the given memory pool,
+/// rather than the normal mutable buffer interface.

Review comment:
       Okay, that's fair... but don't you agree with my point that if a `Buffer` is associated with some memory manager, it should be possible to recreate it with that memory manager? Maybe the solution is just to *not* associate a memory manager with immutable zeros buffers, but doesn't that imply that the memory it points to is not owned by a memory pool? I'm generally confused about how to correctly implement this.





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806147259



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.

Review comment:
       Compared to my algorithm:
   
    - `+` In terms of thread primitives, your fast path only involves a memory fence, whereas mine involves a mutex. I'm not sure the mutex is even needed, though: it isn't if you can safely copy a `shared_ptr` while another thread may be updating the contained pointer. That feels like a true statement, at least on x86, but I couldn't figure it out for sure from the C++ docs. If I can remove it, my fast path is just a `shared_ptr` copy (so, an atomic increment), a null check, and a size check, which I'm pretty sure is the fastest way to do this while still reference-counting for deallocation.
    - `+` Your version doesn't allocate unnecessarily small buffers.
    - `+` Your version is more readable, especially given how unnecessarily cryptic I made the reallocation logic.
    - `-` Your version has no way to free buffers, so I would argue that it leaks memory. Granted, the leak is upper-bounded by a bit less than 2x the next larger power of two after the largest buffer allocated, so it won't grow without bound. By comparison, however, my version releases smaller buffers when they are no longer used, and frees its cache when `ReleaseUnused()` is called and there are no other users. I also considered a version where the cache is a `weak_ptr`, in which case `ReleaseUnused()` would not be needed, but decided against it, mostly because `ReleaseUnused()` already existed.
    - `-` Nit, but your version will allocate small buffers regardless of whether a larger buffer is already available, whereas my version returns the largest buffer allocated thus far and automatically frees previously allocated smaller buffers when all their users go out of scope.
    - `-` Also kind of a nit, but rounding up to power-of-two-sized buffers means that you might throw an out-of-memory error even if almost half of the requested memory isn't needed. My algorithm backs off and allocates only as much as is needed if the allocation of 2x the previous size fails.
   
   An inability to free something, especially if that something is large, feels like bad news to me, so I'm hesitant to just copy your version in and call it a day. But if nothing else, I'll add a lower bound for allocation size and try to rewrite the allocation algorithm to be less cryptic tomorrow.
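
The mutex-based scheme described in the comparison above can be sketched roughly as follows. This is a simplified illustration and not the code from the PR: the class name `ZeroBufferCache`, the 4096-byte lower bound, and the use of `std::vector` as backing storage are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <new>
#include <vector>

// Hypothetical cache that hands out one shared, all-zero buffer.
class ZeroBufferCache {
 public:
  using Buffer = std::shared_ptr<const std::vector<std::uint8_t>>;

  // Return a shared all-zero buffer holding at least `size` bytes.
  Buffer Get(std::size_t size) {
    std::lock_guard<std::mutex> lock(mutex_);
    // Fast path: the largest buffer allocated so far is big enough.
    if (cache_ && cache_->size() >= size) return cache_;

    // Grow geometrically, with a lower bound to avoid tiny allocations.
    const std::size_t current = cache_ ? cache_->size() : 0;
    const std::size_t preferred = std::max({size, 2 * current, kMinSize});
    try {
      cache_ = std::make_shared<std::vector<std::uint8_t>>(preferred);
    } catch (const std::bad_alloc&) {
      // Back off: allocate only what was requested (this may still throw).
      cache_ = std::make_shared<std::vector<std::uint8_t>>(size);
    }
    // The previous, smaller buffer is freed once its last user releases it.
    return cache_;
  }

  // Drop the cached buffer if the cache is its only remaining owner.
  void ReleaseUnused() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (cache_ && cache_.use_count() == 1) cache_.reset();
  }

  // Size of the currently cached buffer, or 0 if nothing is cached.
  std::size_t cached_size() {
    std::lock_guard<std::mutex> lock(mutex_);
    return cache_ ? cache_->size() : 0;
  }

 private:
  static constexpr std::size_t kMinSize = 4096;  // assumed lower bound
  std::mutex mutex_;
  Buffer cache_;
};
```

Because callers hold `shared_ptr` copies, a replaced buffer stays alive exactly as long as something still uses it, which is the property the "no way to free buffers" point is about.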





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r802858907



##########
File path: cpp/src/arrow/memory_pool.h
##########
@@ -109,6 +157,14 @@ class ARROW_EXPORT MemoryPool {
 
  protected:
   MemoryPool() = default;
+
+  /// Free a memory region allocated by GetImmutableZeros().
+  ///
+  /// @param buffer Pointer to the start of the allocated memory region
+  /// @param size Allocated size located at buffer. An allocator implementation

Review comment:
       Oops, I think my editor is configured to autocomplete using `@param` and I just didn't notice. a551146

##########
File path: cpp/src/arrow/array/util.cc
##########
@@ -534,6 +533,151 @@ class NullArrayFactory {
   std::shared_ptr<Buffer> buffer_;
 };
 
+// mutable version of ImmutableNullArrayFactory
+class NullArrayFactory {
+ private:
+  Result<std::shared_ptr<Buffer>> CreateZeroByteBuffer(size_t scalar_size_bytes) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer(length_ * scalar_size_bytes, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  Result<std::shared_ptr<Buffer>> CreateZeroOffsetBuffer(size_t index_size_bytes) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer((length_ + 1) * index_size_bytes, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  Result<std::shared_ptr<Buffer>> CreateZeroBitBuffer(size_t scalar_size_bits) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer((length_ * scalar_size_bits + 7) / 8, pool_));

Review comment:
       Changed in 689a8af
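
For reference, the later revision quoted in this thread computes this size with `bit_util::BytesForBits` instead of the manual `(n + 7) / 8`. The rounding itself can be sketched as below; `BytesForBitsSketch` is a hypothetical stand-in written for this example and is not Arrow's implementation.

```cpp
#include <cstdint>

// Hypothetical stand-in: number of bytes needed to hold `bits` bits,
// rounding up to the next whole byte. Assumes bits >= 0.
constexpr std::int64_t BytesForBitsSketch(std::int64_t bits) {
  return (bits + 7) / 8;
}
```

For a validity bitmap of 9 elements at 1 bit each, this yields 2 bytes.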





[GitHub] [arrow] lidavidm commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r805825245



##########
File path: cpp/src/arrow/array/util.cc
##########
@@ -534,6 +533,152 @@ class NullArrayFactory {
   std::shared_ptr<Buffer> buffer_;
 };
 
+// mutable version of NullArrayFactory, i.e. one that doesn't reuse a single buffer
+class MutableNullArrayFactory {
+ private:
+  Result<std::shared_ptr<Buffer>> CreateZeroByteBuffer(size_t scalar_size_bytes) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer(length_ * scalar_size_bytes, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  Result<std::shared_ptr<Buffer>> CreateZeroOffsetBuffer(size_t index_size_bytes) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer((length_ + 1) * index_size_bytes, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  Result<std::shared_ptr<Buffer>> CreateZeroBitBuffer(size_t scalar_size_bits) const {
+    ARROW_ASSIGN_OR_RAISE(
+        auto buffer,
+        AllocateBuffer(bit_util::BytesForBits(length_ * scalar_size_bits), pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  static Result<std::shared_ptr<Buffer>> CreateEmptyBuffer() { return AllocateBuffer(0); }

Review comment:
       nit: AllocateBuffer(0, pool_)?

##########
File path: cpp/src/arrow/device.h
##########
@@ -223,4 +236,53 @@ class ARROW_EXPORT CPUMemoryManager : public MemoryManager {
 ARROW_EXPORT
 std::shared_ptr<MemoryManager> default_cpu_memory_manager();
 
+/// A memory manager that uses the immutable zeros interface of the given memory pool,
+/// rather than the normal mutable buffer interface.
+class ARROW_EXPORT CPUImmutableZerosMemoryManager : public MemoryManager {

Review comment:
       Additionally, I don't see `is_mutable` overridden. But if it's not overridden, is it useful to have?

##########
File path: cpp/src/arrow/device.h
##########
@@ -223,4 +236,53 @@ class ARROW_EXPORT CPUMemoryManager : public MemoryManager {
 ARROW_EXPORT
 std::shared_ptr<MemoryManager> default_cpu_memory_manager();
 
+/// A memory manager that uses the immutable zeros interface of the given memory pool,
+/// rather than the normal mutable buffer interface.
+class ARROW_EXPORT CPUImmutableZerosMemoryManager : public MemoryManager {

Review comment:
       Couldn't this be placed inside device.cc instead?

##########
File path: cpp/src/arrow/device.h
##########
@@ -223,4 +236,53 @@ class ARROW_EXPORT CPUMemoryManager : public MemoryManager {
 ARROW_EXPORT
 std::shared_ptr<MemoryManager> default_cpu_memory_manager();
 
+/// A memory manager that uses the immutable zeros interface of the given memory pool,
+/// rather than the normal mutable buffer interface.
+class ARROW_EXPORT CPUImmutableZerosMemoryManager : public MemoryManager {

Review comment:
       We could also check `Buffer::is_mutable` in `CopyBufferTo` instead above.

##########
File path: cpp/src/arrow/array/util.cc
##########
@@ -534,6 +533,152 @@ class NullArrayFactory {
   std::shared_ptr<Buffer> buffer_;
 };
 
+// mutable version of NullArrayFactory, i.e. one that doesn't reuse a single buffer
+class MutableNullArrayFactory {
+ private:
+  Result<std::shared_ptr<Buffer>> CreateZeroByteBuffer(size_t scalar_size_bytes) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer(length_ * scalar_size_bytes, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  Result<std::shared_ptr<Buffer>> CreateZeroOffsetBuffer(size_t index_size_bytes) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer((length_ + 1) * index_size_bytes, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  Result<std::shared_ptr<Buffer>> CreateZeroBitBuffer(size_t scalar_size_bits) const {
+    ARROW_ASSIGN_OR_RAISE(
+        auto buffer,
+        AllocateBuffer(bit_util::BytesForBits(length_ * scalar_size_bits), pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  static Result<std::shared_ptr<Buffer>> CreateEmptyBuffer() { return AllocateBuffer(0); }
+
+ public:
+  MutableNullArrayFactory(MemoryPool* pool, const std::shared_ptr<DataType>& type,
+                          int64_t length)
+      : pool_(pool), type_(type), length_(length) {}
+
+  Result<std::shared_ptr<ArrayData>> Create() {
+    std::vector<std::shared_ptr<ArrayData>> child_data(type_->num_fields());
+    ARROW_ASSIGN_OR_RAISE(auto validity, CreateZeroBitBuffer(1));
+    out_ = ArrayData::Make(type_, length_, {validity}, child_data, length_, 0);
+    RETURN_NOT_OK(VisitTypeInline(*type_, this));
+    return out_;
+  }
+
+  Status Visit(const NullType&) {
+    out_->buffers.resize(1, nullptr);

Review comment:
       Hmm, should this be `out_->buffers.resize(0)` since a NullArray has no buffers?





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r805894493



##########
File path: cpp/src/arrow/device.h
##########
@@ -223,4 +236,53 @@ class ARROW_EXPORT CPUMemoryManager : public MemoryManager {
 ARROW_EXPORT
 std::shared_ptr<MemoryManager> default_cpu_memory_manager();
 
+/// A memory manager that uses the immutable zeros interface of the given memory pool,
+/// rather than the normal mutable buffer interface.
+class ARROW_EXPORT CPUImmutableZerosMemoryManager : public MemoryManager {

Review comment:
       > Couldn't this be placed inside device.cc instead?
   
   Probably, since it's not used extensively or at all yet. But it seems odd to me to make `CPUMemoryManager` public (and it needs to be, there is code referring to it directly) while hiding its cousin. IMO it should be public for symmetry if nothing else.
   
   > Additionally, I don't see `is_mutable` overridden.
   
   Oops, I completely forgot to do that. 41ebefc
   
   > [...] is it useful to have?
   
   My logic here is that if a buffer is associated with a memory pool that generates immutable buffers whose data corresponds to some pattern (otherwise the buffers wouldn't be very useful in general), one should be able to assume that the contents of this buffer actually correspond to said pattern. However, if views/copies of other buffers can be created within each CPU memory pool, as the previous implementation allowed (in fact, it assumed there is only one CPU memory pool, as it always just used `AllocateBuffer` for the default CPU memory pool), that invariant does not hold. The `is_mutable()` flag was my way of solving this, or at least of catching as many application logic problems related to this as possible. It's certainly not foolproof: a user can just make their own memory pools that ignore the destination memory pool argument in `CopyBufferTo` and `ViewBufferTo`, not unlike what the default CPU memory pool was doing.
   
   However, whether this logic is sound and/or whether this is desirable is certainly disputable. As is the name of the method.
   
   > We could also check `Buffer::is_mutable` in `CopyBufferTo` instead above.
   
   That would be too restrictive per the above reasoning. For example, `MemoryManager::is_mutable()` is/should be checked whenever a view is made of another buffer within the confines of that memory manager; if the memory manager makes immutable buffers and thus is making some assertion about the contents of its buffers, making the view is denied (that's pessimistic, because the data *might* still conform to the pattern, but at this point the user is probably doing something they didn't intend to do regardless of what's actually in the buffer). On the other hand, if the memory pool can make mutable buffers and thus does not make any assertions about buffer contents, it's perfectly fine to return a view of an existing buffer, regardless of that buffer's mutability.
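
The policy argued for above can be illustrated with a small standalone sketch. The types `SketchBuffer` and `SketchMemoryManager` are hypothetical simplifications invented for this example; they only mirror the idea that the *manager's* mutability, not the source buffer's, decides whether a view is allowed.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical, simplified stand-in for a buffer.
struct SketchBuffer {
  const std::uint8_t* data;
  std::size_t size;
  bool is_mutable;
};

// Hypothetical, simplified stand-in for a memory manager.
class SketchMemoryManager {
 public:
  explicit SketchMemoryManager(bool is_mutable) : is_mutable_(is_mutable) {}

  bool is_mutable() const { return is_mutable_; }

  // Return a zero-copy view of `source` under this manager, or nullptr if
  // the manager refuses. An immutable manager asserts something about the
  // contents of its buffers (e.g. "all zeros"), so it cannot vouch for
  // foreign data and denies the view. A mutable manager makes no such
  // assertion and accepts any source, regardless of the source's mutability.
  std::shared_ptr<SketchBuffer> ViewBufferTo(
      const std::shared_ptr<SketchBuffer>& source) const {
    if (!is_mutable_) return nullptr;
    return std::make_shared<SketchBuffer>(*source);
  }

 private:
  bool is_mutable_;
};
```

Checking `Buffer::is_mutable` on the source instead would reject the second case, which is the restriction the comment objects to.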





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r805894197



##########
File path: cpp/src/arrow/array/util.cc
##########
@@ -534,6 +533,152 @@ class NullArrayFactory {
   std::shared_ptr<Buffer> buffer_;
 };
 
+// mutable version of NullArrayFactory, i.e. one that doesn't reuse a single buffer
+class MutableNullArrayFactory {
+ private:
+  Result<std::shared_ptr<Buffer>> CreateZeroByteBuffer(size_t scalar_size_bytes) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer(length_ * scalar_size_bytes, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  Result<std::shared_ptr<Buffer>> CreateZeroOffsetBuffer(size_t index_size_bytes) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer((length_ + 1) * index_size_bytes, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  Result<std::shared_ptr<Buffer>> CreateZeroBitBuffer(size_t scalar_size_bits) const {
+    ARROW_ASSIGN_OR_RAISE(
+        auto buffer,
+        AllocateBuffer(bit_util::BytesForBits(length_ * scalar_size_bits), pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  static Result<std::shared_ptr<Buffer>> CreateEmptyBuffer() { return AllocateBuffer(0); }
+
+ public:
+  MutableNullArrayFactory(MemoryPool* pool, const std::shared_ptr<DataType>& type,
+                          int64_t length)
+      : pool_(pool), type_(type), length_(length) {}
+
+  Result<std::shared_ptr<ArrayData>> Create() {
+    std::vector<std::shared_ptr<ArrayData>> child_data(type_->num_fields());
+    ARROW_ASSIGN_OR_RAISE(auto validity, CreateZeroBitBuffer(1));
+    out_ = ArrayData::Make(type_, length_, {validity}, child_data, length_, 0);
+    RETURN_NOT_OK(VisitTypeInline(*type_, this));
+    return out_;
+  }
+
+  Status Visit(const NullType&) {
+    out_->buffers.resize(1, nullptr);

Review comment:
       I just copied this from the original `NullArrayFactory`. Don't ask me why, but at least `ValidateLayout()` fails when `buffers` is empty. I'm assuming there is code in Arrow that assumes that the `buffers[0]` entry always exists and points to the values buffer, even if a type doesn't actually have a values buffer, as this also explains the `nullptr` entry before the actual buffer pointers for `UnionType`.





[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806979301



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));

Review comment:
       Note that [VirtualAlloc](https://docs.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-virtualalloc) on Windows zero-initializes pages. We can fall back on `calloc` by default.
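
A `calloc`-based fallback could look roughly like the sketch below. The names `AllocateZeroed`, `ZeroedBytes`, and `FreeDeleter` are hypothetical, and the sketch ignores the 64-byte alignment that Arrow's memory pools normally provide, so it illustrates the idea rather than a drop-in implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// Deleter that releases memory obtained from calloc().
struct FreeDeleter {
  void operator()(std::uint8_t* ptr) const { std::free(ptr); }
};

using ZeroedBytes = std::unique_ptr<std::uint8_t[], FreeDeleter>;

// Hypothetical helper: allocate `size` zero-initialized bytes.
// Returns an empty pointer if the allocation fails.
inline ZeroedBytes AllocateZeroed(std::size_t size) {
  // calloc(0, ...) may legally return nullptr, so request at least one byte
  // to keep "nullptr" meaning "allocation failed".
  const std::size_t count = size == 0 ? 1 : size;
  return ZeroedBytes(static_cast<std::uint8_t*>(std::calloc(count, 1)));
}
```

The appeal of `calloc` over `malloc` plus `memset` is that the C library may be able to hand back pages that are already known to be zero without touching them again.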





[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806108396



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));

Review comment:
       Ah, you're right, we don't know that the allocator returned us zero-initialized memory, unfortunately.
   (Note that both jemalloc and mimalloc have zero-initializing allocation APIs, so we could use those.)





[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r807804329



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.

Review comment:
       Ah, sorry. You can just use `arrow/util/atomic_shared_ptr.h` :-)
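
As an illustration of the mutex-free fast path, the sketch below uses the standard free functions `std::atomic_load` and `std::atomic_compare_exchange_weak` for `shared_ptr` directly, rather than Arrow's header. The class name `AtomicZeroCache` is hypothetical and this is not the PR's code. Note that these free functions are deprecated as of C++20 in favor of `std::atomic<std::shared_ptr<T>>`.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

using ZeroBuffer = std::shared_ptr<const std::vector<std::uint8_t>>;

// Hypothetical cache whose current buffer is read and replaced atomically.
class AtomicZeroCache {
 public:
  // Return an all-zero buffer of at least `size` bytes.
  ZeroBuffer Get(std::size_t size) {
    // Fast path: atomically copy the current pointer, then check it.
    ZeroBuffer seen = std::atomic_load(&current_);
    if (seen && seen->size() >= size) return seen;

    // Slow path: allocate a zero-filled buffer and try to publish it.
    ZeroBuffer fresh = std::make_shared<std::vector<std::uint8_t>>(size);
    while (!std::atomic_compare_exchange_weak(&current_, &seen, fresh)) {
      // On failure `seen` holds the latest published buffer (or the exchange
      // failed spuriously). Prefer it if it is already large enough.
      if (seen && seen->size() >= size) return seen;
    }
    return fresh;
  }

 private:
  ZeroBuffer current_;
};
```

Copying a `shared_ptr` that another thread may be overwriting is only well-defined when both sides go through atomic operations like these, which is why a plain copy is not sufficient for the fast path.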





[GitHub] [arrow] lidavidm commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r802665698



##########
File path: cpp/src/arrow/array/util.h
##########
@@ -37,7 +37,20 @@ namespace arrow {
 ARROW_EXPORT
 std::shared_ptr<Array> MakeArray(const std::shared_ptr<ArrayData>& data);
 
-/// \brief Create a strongly-typed Array instance with all elements null
+/// \brief Create a strongly-typed mutable Array instance with all elements initially set
+/// to null
+/// \param[in] type the array type \param[in] length the array length
+/// \param[in] pool the memory pool to allocate memory from
+ARROW_EXPORT
+Result<std::shared_ptr<Array>> MakeMutableArrayOfNull(
+    const std::shared_ptr<DataType>& type, int64_t length,
+    MemoryPool* pool = default_memory_pool());
+
+/// \brief Create a strongly-typed immutable Array instance with all elements null
+///
+/// This function may reuse a single zero buffer, but may also defer to
+/// MakeArrayOfNull().

Review comment:
       Was this meant to be MakeMutableArrayOfNull? It sounds a little odd for a function to defer to itself.

##########
File path: cpp/src/arrow/array/util.cc
##########
@@ -534,6 +533,151 @@ class NullArrayFactory {
   std::shared_ptr<Buffer> buffer_;
 };
 
+// mutable version of ImmutableNullArrayFactory
+class NullArrayFactory {
+ private:
+  Result<std::shared_ptr<Buffer>> CreateZeroByteBuffer(size_t scalar_size_bytes) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer(length_ * scalar_size_bytes, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  Result<std::shared_ptr<Buffer>> CreateZeroOffsetBuffer(size_t index_size_bytes) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer((length_ + 1) * index_size_bytes, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  Result<std::shared_ptr<Buffer>> CreateZeroBitBuffer(size_t scalar_size_bits) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer((length_ * scalar_size_bits + 7) / 8, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  static Result<std::shared_ptr<Buffer>> CreateEmptyBuffer() { return AllocateBuffer(0); }
+
+ public:
+  NullArrayFactory(MemoryPool* pool, const std::shared_ptr<DataType>& type,
+                   int64_t length)
+      : pool_(pool), type_(type), length_(length) {}
+
+  Result<std::shared_ptr<ArrayData>> Create() {
+    std::vector<std::shared_ptr<ArrayData>> child_data(type_->num_fields());
+    ARROW_ASSIGN_OR_RAISE(auto validity, CreateZeroBitBuffer(1));
+    out_ = ArrayData::Make(type_, length_, {validity}, child_data, length_, 0);
+    RETURN_NOT_OK(VisitTypeInline(*type_, this));
+    return out_;
+  }
+
+  Status Visit(const NullType&) {
+    out_->buffers.resize(1, nullptr);
+    return Status::OK();
+  }
+
+  Status Visit(const FixedWidthType& type) {
+    out_->buffers.resize(2);
+    // values
+    ARROW_ASSIGN_OR_RAISE(out_->buffers[1], CreateZeroBitBuffer(type.bit_width()));
+    return Status::OK();
+  }
+
+  template <typename T>
+  enable_if_base_binary<T, Status> Visit(const T&) {
+    out_->buffers.resize(3);
+    // offsets
+    ARROW_ASSIGN_OR_RAISE(out_->buffers[1],
+                          CreateZeroOffsetBuffer(sizeof(typename T::offset_type)));
+    // values
+    ARROW_ASSIGN_OR_RAISE(out_->buffers[2], CreateEmptyBuffer());
+    return Status::OK();
+  }
+
+  template <typename T>
+  enable_if_var_size_list<T, Status> Visit(const T& type) {
+    out_->buffers.resize(2);
+    // offsets
+    ARROW_ASSIGN_OR_RAISE(out_->buffers[1],
+                          CreateZeroOffsetBuffer(sizeof(typename T::offset_type)));
+    // values
+    ARROW_ASSIGN_OR_RAISE(out_->child_data[0], CreateChild(type, 0, /*length=*/0));
+    return Status::OK();
+  }
+
+  Status Visit(const FixedSizeListType& type) {
+    ARROW_ASSIGN_OR_RAISE(out_->child_data[0],
+                          CreateChild(type, 0, length_ * type.list_size()));
+    return Status::OK();
+  }
+
+  Status Visit(const StructType& type) {
+    for (int i = 0; i < type_->num_fields(); ++i) {
+      ARROW_ASSIGN_OR_RAISE(out_->child_data[i], CreateChild(type, i, length_));
+    }
+    return Status::OK();
+  }
+
+  Status Visit(const UnionType& type) {
+    out_->buffers.resize(2);
+
+    // First buffer is always null
+    out_->buffers[0] = nullptr;
+
+    // type ID buffer
+    ARROW_ASSIGN_OR_RAISE(out_->buffers[1], AllocateBuffer(length_, pool_));
+    std::memset(out_->buffers[1]->mutable_data(), type.type_codes()[0], length_);
+
+    // For sparse unions, we now create children with the same length as the
+    // parent
+    int64_t child_length = length_;
+    if (type.mode() == UnionMode::DENSE) {
+      // For dense unions, we set the offsets to all zero and create children
+      // with length 1
+      out_->buffers.resize(3);
+      ARROW_ASSIGN_OR_RAISE(out_->buffers[2], CreateZeroByteBuffer(sizeof(int32_t)));
+
+      child_length = 1;
+    }
+    for (int i = 0; i < type_->num_fields(); ++i) {
+      ARROW_ASSIGN_OR_RAISE(out_->child_data[i], CreateChild(type, i, child_length));
+    }
+    return Status::OK();
+  }
+
+  Status Visit(const DictionaryType& type) {
+    out_->buffers.resize(2);
+    // dictionary indices
+    ARROW_ASSIGN_OR_RAISE(out_->buffers[1], CreateZeroBitBuffer(type.bit_width()));
+    // dictionary data
+    ARROW_ASSIGN_OR_RAISE(auto typed_null_dict, MakeArrayOfNull(type.value_type(), 0));
+    out_->dictionary = typed_null_dict->data();
+    return Status::OK();
+  }
+
+  Status Visit(const ExtensionType& type) {
+    out_->child_data.resize(type.storage_type()->num_fields());
+    RETURN_NOT_OK(VisitTypeInline(*type.storage_type(), this));
+    return Status::OK();
+  }
+
+  Status Visit(const DataType& type) {
+    return Status::NotImplemented("construction of all-null ", type);
+  }
+
+  Result<std::shared_ptr<ArrayData>> CreateChild(const DataType& type, int i,
+                                                 int64_t length) {
+    ImmutableNullArrayFactory child_factory(pool_, type.field(i)->type(), length);

Review comment:
       Should the child also use NullArrayFactory?

##########
File path: cpp/src/arrow/memory_pool.h
##########
@@ -63,6 +63,42 @@ class MemoryPoolStats {
 /// take care of the required 64-byte alignment.
 class ARROW_EXPORT MemoryPool {
  public:
+  class ARROW_EXPORT ImmutableZeros {

Review comment:
       I'm probably missing something, but why is the existing Buffer interface not sufficient for this?

##########
File path: cpp/src/arrow/array/util.cc
##########
@@ -534,6 +533,151 @@ class NullArrayFactory {
   std::shared_ptr<Buffer> buffer_;
 };
 
+// mutable version of ImmutableNullArrayFactory
+class NullArrayFactory {
+ private:
+  Result<std::shared_ptr<Buffer>> CreateZeroByteBuffer(size_t scalar_size_bytes) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer(length_ * scalar_size_bytes, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  Result<std::shared_ptr<Buffer>> CreateZeroOffsetBuffer(size_t index_size_bytes) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer((length_ + 1) * index_size_bytes, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  Result<std::shared_ptr<Buffer>> CreateZeroBitBuffer(size_t scalar_size_bits) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer((length_ * scalar_size_bits + 7) / 8, pool_));

Review comment:
       Could use BytesForBits here: https://github.com/apache/arrow/blob/d59dbbc36c7950e58332d081d47c2d43ea898215/cpp/src/arrow/util/bit_util.h#L82-L86
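
For readers without the link at hand, the rounding that such a helper performs can be sketched as follows. This is a stand-alone illustration rather than Arrow's code: the name `BytesForBitsSketch` is hypothetical, and the sketch assumes a non-negative bit count.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the linked helper: the number of bytes needed to
// hold `bits` bits, rounded up to a whole byte.
inline int64_t BytesForBitsSketch(int64_t bits) {
  assert(bits >= 0);
  // Same result as (bits + 7) / 8, but cannot overflow for large inputs.
  return (bits >> 3) + ((bits & 7) != 0 ? 1 : 0);
}
```

With this, the allocation size in the hunk above would read as `BytesForBitsSketch(length_ * scalar_size_bits)` instead of the hand-written `(… + 7) / 8`.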

##########
File path: cpp/src/arrow/array/util.h
##########
@@ -46,7 +59,20 @@ Result<std::shared_ptr<Array>> MakeArrayOfNull(const std::shared_ptr<DataType>&
                                                int64_t length,
                                                MemoryPool* pool = default_memory_pool());
 
-/// \brief Create an Array instance whose slots are the given scalar
+/// \brief Create a mutable Array instance whose slots are initialized with the given
+/// scalar
+/// \param[in] scalar the value with which to fill the array
+/// \param[in] length the array length
+/// \param[in] pool the memory pool to allocate memory from
+ARROW_EXPORT
+Result<std::shared_ptr<Array>> MakeMutableArrayFromScalar(
+    const Scalar& scalar, int64_t length, MemoryPool* pool = default_memory_pool());
+
+/// \brief Create an immutable Array instance whose slots are set to the given scalar
+///
+/// This function may reuse buffers if they contain the same (repeated) value to save
+/// memory, but may also defer to MakeArrayFromScalar().

Review comment:
       ditto here - was this meant to be MakeMutableArrayFromScalar?

##########
File path: cpp/src/arrow/memory_pool.h
##########
@@ -109,6 +157,14 @@ class ARROW_EXPORT MemoryPool {
 
  protected:
   MemoryPool() = default;
+
+  /// Free a memory region allocated by GetImmutableZeros().
+  ///
+  /// @param buffer Pointer to the start of the allocated memory region
+  /// @param size Allocated size located at buffer. An allocator implementation

Review comment:
       nit, but most docstrings use \param syntax





[GitHub] [arrow] jvanstraten edited a comment on pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten edited a comment on pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#issuecomment-1041823183


   @pitrou @lidavidm ~~I think I've now resolved (or defended my choices for) everything that came up thus far, and CI is happy. Let me know if I missed anything and/or you think there's more that needs to be changed.~~



[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806972002



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));

Review comment:
       @jvanstraten You're right. So let's keep it like this and just explain why it is in a comment.





[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806981356



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,116 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    // TODO: jemalloc and mimalloc support zero-initialized allocations as
+    //  well, which might be faster than allocate + memset.
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.
+    auto current_buffer = atomic_load(&immutable_zeros_cache_);
+
+    // If this buffer satisfies the requirements, return it.
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Acquire the lock for allocating a new buffer.
+    std::lock_guard<std::mutex> gg(immutable_zeros_mutex_);
+
+    // Between our previous atomic load and acquisition of the lock, another
+    // thread may have allocated a buffer. So we need to check again.
+    current_buffer = atomic_load(&immutable_zeros_cache_);
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Let's now figure out a good size to allocate. This is done
+    // heuristically, with the following rules:
+    //  - allocate at least the requested size (obviously);
+    //  - allocate at least 2x the previous size;
+    //  - allocate at least kMinAllocSize bytes (to avoid lots of small
+    //    allocations).
+    static const int64_t kMinAllocSize = 4096;
+    int64_t alloc_size =
+        std::max(size, current_buffer ? (current_buffer->size() * 2) : kMinAllocSize);

Review comment:
       Perhaps `bit_util::RoundUpToPowerOf2(size)` rather than `size`?
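
A minimal sketch of how the sizing rules in the hunk would combine with this suggestion. `NextPowerOf2` and `ChooseAllocSize` are hypothetical names used only for illustration; the former merely stands in for `bit_util::RoundUpToPowerOf2` and is not Arrow's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: smallest power of two >= n, for 1 <= n <= 2^62.
inline int64_t NextPowerOf2(int64_t n) {
  assert(n >= 1 && n <= (int64_t{1} << 62));
  int64_t p = 1;
  while (p < n) p <<= 1;
  return p;
}

// Size to allocate for a request of `size` bytes, given the size of the
// currently cached buffer (0 if there is none yet).
inline int64_t ChooseAllocSize(int64_t size, int64_t current_size) {
  constexpr int64_t kMinAllocSize = 4096;
  // Grow by at least 2x the previous buffer, or start at the minimum size.
  const int64_t grown = current_size > 0 ? current_size * 2 : kMinAllocSize;
  // Round the request itself up to a power of two, as suggested above.
  const int64_t rounded = NextPowerOf2(std::max<int64_t>(size, 1));
  return std::max({rounded, grown, kMinAllocSize});
}
```

For example, a 5000-byte request with no cached buffer would allocate 8192 bytes, and the same request against an existing 8192-byte buffer would allocate 16384 bytes.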





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806166409



##########
File path: cpp/src/arrow/device.h
##########
@@ -223,4 +236,55 @@ class ARROW_EXPORT CPUMemoryManager : public MemoryManager {
 ARROW_EXPORT
 std::shared_ptr<MemoryManager> default_cpu_memory_manager();
 
+/// A memory manager that uses the immutable zeros interface of the given memory pool,
+/// rather than the normal mutable buffer interface.

Review comment:
       In my mind, if a memory manager is associated with a buffer, it should be safe to assume that that memory manager actually *is* managing that buffer, i.e. at the very least it should be possible to create a buffer with the same allocation behavior by calling the associated memory manager's `AllocateBuffer` method. So, I ended up implementing this special memory manager in the end because that otherwise wouldn't be the case for these buffers.
   
   In general, I'm too new to the project to have a good understanding of why there are so many layers of abstractions and methods for creating a buffer and doing memory management... but surely there *are* good reasons for them, so wouldn't implementing them inconsistently lead to issues down the line?
   
   I will yield to the idea that a lot of this complexity can be removed if never being able to safely free these buffers is acceptable, because then they'd just become like any other `Buffer` that has no ownership information associated with it. But IMO that's a slippery slope at best.





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r805892998



##########
File path: cpp/src/arrow/array/util.cc
##########
@@ -534,6 +533,152 @@ class NullArrayFactory {
   std::shared_ptr<Buffer> buffer_;
 };
 
+// mutable version of NullArrayFactory, i.e. one that doesn't reuse a single buffer
+class MutableNullArrayFactory {
+ private:
+  Result<std::shared_ptr<Buffer>> CreateZeroByteBuffer(size_t scalar_size_bytes) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer(length_ * scalar_size_bytes, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  Result<std::shared_ptr<Buffer>> CreateZeroOffsetBuffer(size_t index_size_bytes) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer((length_ + 1) * index_size_bytes, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  Result<std::shared_ptr<Buffer>> CreateZeroBitBuffer(size_t scalar_size_bits) const {
+    ARROW_ASSIGN_OR_RAISE(
+        auto buffer,
+        AllocateBuffer(bit_util::BytesForBits(length_ * scalar_size_bits), pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  static Result<std::shared_ptr<Buffer>> CreateEmptyBuffer() { return AllocateBuffer(0); }

Review comment:
       a4c3413





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806151531



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -536,6 +543,39 @@ int64_t MemoryPool::max_memory() const { return -1; }
 // MemoryPool implementation that delegates its core duty
 // to an Allocator class.
 
+class ImmutableZeros : public Buffer {
+ public:
+  explicit ImmutableZeros(uint8_t* data, int64_t size, MemoryPool* pool)
+      : Buffer(data, size, CPUDevice::immutable_zeros_memory_manager(pool)),
+        pool_(pool) {}
+
+  ImmutableZeros() : Buffer(nullptr, 0), pool_(nullptr) {}
+
+  ~ImmutableZeros() override;
+
+  // Prevent copies and handle moves explicitly to avoid double free

Review comment:
       I'm not sure why or what you mean by that? It's analogous to `PoolBuffer` in that it uses RAII to free memory when it is no longer needed, except two `Buffer` classes are needed to correctly model shared ownership, i.e. `ImmutableZeros` models the data, `ImmutableZerosPoolBuffer` models a shared reference to that data (even in contexts where a `unique_ptr` or raw pointer/reference to a `Buffer` is needed). Writing it down like this though, the names could use some refactoring, especially now that they both implement `Buffer`.
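
The two-class split described here can be sketched in isolation as follows. `ZerosOwner` and `ZerosView` are hypothetical names that only mirror the roles of `ImmutableZeros` (models the data) and `ImmutableZerosPoolBuffer` (models a shared reference to it); the sketch uses `calloc`/`free` instead of a memory pool and does not derive from `Buffer`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <utility>

// Hypothetical owner: holds the zeroed region and frees it exactly once.
class ZerosOwner {
 public:
  explicit ZerosOwner(size_t size)
      : data_(static_cast<uint8_t*>(std::calloc(size, 1))), size_(size) {}
  ~ZerosOwner() { std::free(data_); }

  // Copying would lead to a double free, so it is disabled.
  ZerosOwner(const ZerosOwner&) = delete;
  ZerosOwner& operator=(const ZerosOwner&) = delete;

  const uint8_t* data() const { return data_; }
  size_t size() const { return size_; }

 private:
  uint8_t* data_;
  size_t size_;
};

// Hypothetical view: a cheap handle exposing a prefix of the owner's region.
// It keeps the owner alive for as long as the view itself exists.
class ZerosView {
 public:
  ZerosView(std::shared_ptr<ZerosOwner> owner, size_t size)
      : owner_(std::move(owner)), size_(size) {
    assert(size_ <= owner_->size());
  }

  const uint8_t* data() const { return owner_->data(); }
  size_t size() const { return size_; }

 private:
  std::shared_ptr<ZerosOwner> owner_;
  size_t size_;
};
```

Any number of views can be handed out by value or through a `unique_ptr`, and the region is released when the last of them (and any cache reference to the owner) goes away.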





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r807845759



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.

Review comment:
       Ah, yeah, that's certainly a lot cleaner. dee486a





[GitHub] [arrow] lidavidm commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r808372492



##########
File path: cpp/src/arrow/device.h
##########
@@ -223,4 +236,55 @@ class ARROW_EXPORT CPUMemoryManager : public MemoryManager {
 ARROW_EXPORT
 std::shared_ptr<MemoryManager> default_cpu_memory_manager();
 
+/// A memory manager that uses the immutable zeros interface of the given memory pool,
+/// rather than the normal mutable buffer interface.

Review comment:
       I guess the memory manager APIs are fairly new/unused and the contract hasn't been pinned down here. That said, I also don't see the issue with sticking to the regular CPU device/memory manager even for immutable buffers. (After all, regular buffers can also be immutable already.) And to me 'managing' means 'can read the data backed by the buffer', not necessarily 'will allocate you exactly the same buffer'. (For instance: PyBuffer is part of the CPU memory manager, since it can read the data in that buffer, even if it can't allocate you a new Python object.)





[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806992382



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,116 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    // TODO: jemalloc and mimalloc support zero-initialized allocations as
+    //  well, which might be faster than allocate + memset.
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS

Review comment:
       Same here: we can hide this behind a compatibility function.





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806098292



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));

Review comment:
       The difference is that, because of the flags passed to `mmap`, the kernel doesn't have to allocate (or clear) physical memory at all, as opposed to the allocate + memset that the alternative implementation would do to satisfy mutability. You could allocate several terabytes of virtual memory with this call if you wanted to, and it would cost you zero physical bytes (page table structures aside). At least, it works that way on my kernel.
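
A stand-alone sketch of the call being discussed, for anyone who wants to try it outside Arrow. It assumes Linux (for `MAP_ANONYMOUS` and `MAP_NORESERVE`); `MapZeros` and `UnmapZeros` are hypothetical names, and whether the mapping really costs no physical memory depends on the kernel, as noted above.

```cpp
#include <sys/mman.h>

#include <cassert>
#include <cstddef>
#include <cstdint>

// Map `size` bytes of read-only, zero-filled anonymous memory, using the same
// flags as the hunk above. Returns nullptr on failure. `size` must be nonzero,
// because mmap rejects zero-length mappings.
inline const uint8_t* MapZeros(size_t size) {
  assert(size > 0);
  void* p = mmap(nullptr, size, PROT_READ,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  return p == MAP_FAILED ? nullptr : static_cast<const uint8_t*>(p);
}

// Release a region previously returned by MapZeros with the same size.
inline void UnmapZeros(const uint8_t* data, size_t size) {
  munmap(const_cast<uint8_t*>(data), size);
}
```

Because the mapping is `PROT_READ`, reads return zeros while any write attempt faults, which is what makes it safe to share one region between many arrays.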

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.

Review comment:
       Compared to my algorithm:
   
    - `+` In terms of thread primitives, your fast path only involves a memory fence, whereas mine involves a mutex. I'm not sure the mutex is even needed, though: it isn't if you can safely make a copy of a `shared_ptr` while another thread may be updating the contained pointer. That feels like a true statement, at least on x86, but I couldn't confirm it from the C++ docs. If I can remove the mutex, my fast path is just a `shared_ptr` copy (so, an atomic increment), a null check, and a size check, which I'm pretty sure is the fastest way to do it while still reference counting for deallocation.
    - `+` Your version doesn't allocate unnecessarily small buffers.
    - `+` Your version is more readable, especially given how unnecessarily cryptic my reallocation logic is.
    - `-` Your version has no way to free buffers, so I would argue that it leaks memory. Granted, it's upper-bounded by a bit less than 2x the next larger power-of-two of the largest buffer allocated, so it won't grow without bound. By comparison however, my version will release smaller buffers when they are no longer used, and will free its cache when `ReleaseUnused()` is called and there are no other users. I also considered a version where the cache is a `weak_ptr`, in which case the `ReleaseUnused()` would not be needed, but decided against it mostly because `ReleaseUnused()` already existed.
    - `-` Nit, but your version will allocate small buffers regardless of whether a larger buffer is already available, whereas my version will return the largest buffer allocated thus far, and will automatically free previously allocated smaller buffers when all their users go out of scope.
    - `-` Also kind of a nit, but rounding up to power-of-two-sized buffers means that you might throw an out of memory error even if almost half of the requested memory isn't needed. My algorithm will back off and allocate only as much as is needed if the 2 * previous size allocation fails.
   
   An inability to free something, especially if that something is large, feels like bad news to me, so I'm hesitant to just copy your version in and call it a day. But if nothing else, I'll add a lower bound for allocation size and try to rewrite the allocation algorithm to be less cryptic tomorrow.
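
The `weak_ptr` variant mentioned in the list above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the code in this PR: `WeakZerosCache` is a hypothetical name, the zeros live in a `std::vector` rather than pool memory, and every call takes the mutex (there is no lock-free fast path).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical cache of a shared, read-only block of zeros. The cache holds
// only a weak reference, so a block is freed as soon as its last user drops
// it; no explicit "release unused" call is needed.
class WeakZerosCache {
 public:
  using Zeros = const std::vector<uint8_t>;

  std::shared_ptr<Zeros> Get(size_t size) {
    std::lock_guard<std::mutex> lock(mutex_);
    // Reuse the cached block if it is still alive and large enough.
    std::shared_ptr<Zeros> current = cache_.lock();
    if (current && current->size() >= size) return current;
    // Otherwise allocate a new block and remember it weakly. Users of the
    // old, smaller block keep it alive until they are done with it.
    std::shared_ptr<Zeros> fresh =
        std::make_shared<std::vector<uint8_t>>(size, uint8_t{0});
    cache_ = fresh;
    return fresh;
  }

 private:
  std::mutex mutex_;
  std::weak_ptr<Zeros> cache_;
};
```

The trade-off is that a large block can be freed and reallocated repeatedly if users come and go one at a time, which a strongly held cache combined with `ReleaseUnused()` avoids.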

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));

Review comment:
       In fact, `Allocate()` generally does not return zero-initialized memory, due to the poisoning mechanic. I would rather not dive into this rabbit hole as well right now, so I just added a TODO comment in the code. I can file a followup JIRA for it too, if you want.

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.

Review comment:
       > That is true. However, it shouldn't be a concern if we can ensure that the pages don't actually allocate physical memory (or almost none of it, such as `/dev/zero`).
   
   I don't think we can in general, unfortunately. I would have no idea how to do it on something as ubiquitous as Windows (or if it can be done at all), and I'm sure that in general there are more exotic operating systems and architectures that simply can't do it. Also, for 32-bit systems/builds (if Arrow supports those) virtual memory is also in relatively short supply.
   
   > However, changing the pointer requires use of dedicated atomic access functions: [...]
   
   Ah, great, those functions are exactly what I had missed!
   
   I improved my allocation algorithm accordingly here: e628688
   
   ETA: and d89c297 (old habits die hard)

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,116 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    // TODO: jemalloc and mimalloc support zero-initialized allocations as
+    //  well, which might be faster than allocate + memset.
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.
+    auto current_buffer = atomic_load(&immutable_zeros_cache_);
+
+    // If this buffer satisfies the requirements, return it.
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Acquire the lock for allocating a new buffer.
+    std::lock_guard<std::mutex> gg(immutable_zeros_mutex_);
+
+    // Between our previous atomic load and acquisition of the lock, another
+    // thread may have allocated a buffer. So we need to check again.
+    current_buffer = atomic_load(&immutable_zeros_cache_);
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Let's now figure out a good size to allocate. This is done
+    // heuristically, with the following rules:
+    //  - allocate at least the requested size (obviously);
+    //  - allocate at least 2x the previous size;
+    //  - allocate at least kMinAllocSize bytes (to avoid lots of small
+    //    allocations).
+    static const int64_t kMinAllocSize = 4096;
+    int64_t alloc_size =
+        std::max(size, current_buffer ? (current_buffer->size() * 2) : kMinAllocSize);
+
+    // Attempt to allocate the block.
+    uint8_t* data = nullptr;
+    auto result = AllocateImmutableZeros(alloc_size, &data);
+
+    // If we fail to do so, fall back to trying to allocate the requested size
+    // exactly as a last-ditch effort.
+    if (!result.ok() || data == nullptr) {

Review comment:
       You shouldn't be able to in the default implementation, but `AllocateImmutableZeros` is virtual, so someone could override it in a custom implementation. Still, I suppose I should have gotten rid of that. As a first pass I had it throw an error properly when data is set to null or not modified, then changed my mind and did a `DCHECK` instead. Now the null check there indeed doesn't really do anything useful anymore. 3dda474

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,116 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    // TODO: jemalloc and mimalloc support zero-initialized allocations as
+    //  well, which might be faster than allocate + memset.
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.
+    auto current_buffer = atomic_load(&immutable_zeros_cache_);
+
+    // If this buffer satisfies the requirements, return it.
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Acquire the lock for allocating a new buffer.
+    std::lock_guard<std::mutex> gg(immutable_zeros_mutex_);
+
+    // Between our previous atomic load and acquisition of the lock, another
+    // thread may have allocated a buffer. So we need to check again.
+    current_buffer = atomic_load(&immutable_zeros_cache_);
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Let's now figure out a good size to allocate. This is done
+    // heuristically, with the following rules:
+    //  - allocate at least the requested size (obviously);
+    //  - allocate at least 2x the previous size;
+    //  - allocate at least kMinAllocSize bytes (to avoid lots of small
+    //    allocations).
+    static const int64_t kMinAllocSize = 4096;
+    int64_t alloc_size =
+        std::max(size, current_buffer ? (current_buffer->size() * 2) : kMinAllocSize);
+
+    // Attempt to allocate the block.
+    uint8_t* data = nullptr;
+    auto result = AllocateImmutableZeros(alloc_size, &data);
+
+    // If we fail to do so, fall back to trying to allocate the requested size
+    // exactly as a last-ditch effort.
+    if (!result.ok() || data == nullptr) {
+      alloc_size = size;
+      RETURN_NOT_OK(AllocateImmutableZeros(alloc_size, &data));
+    }
+    DCHECK_NE(data, nullptr);
+
+    // Move ownership of the data block into an ImmutableZeros object. It will
+    // free the block when destroyed, i.e. when all shared_ptr references to it
+    // are reset or go out of scope.
+    current_buffer = std::make_shared<ImmutableZeros>(data, alloc_size, this);
+
+    // Store a reference to the new block in the cache, so subsequent calls to
+    // this function (from this thread or from other threads) can use it, too.
+    atomic_store(&immutable_zeros_cache_, current_buffer);
+
+    return std::move(current_buffer);
+  }
+
+  void ReleaseUnused() override {
+    // Get rid of the ImmutableZeros cache if we're the only one using it. If
+    // there are other pieces of code using it, getting rid of the cache won't
+    // deallocate it anyway, so it's better to hold onto it.
+    {
+      auto cache = atomic_load(&immutable_zeros_cache_);
+
+      // Because we now have a copy in our thread, the use count will be 2 if
+      // nothing else is using it.
+      if (cache.use_count() <= 2) {
+        atomic_store(&immutable_zeros_cache_, std::shared_ptr<ImmutableZeros>());

Review comment:
       It's not entirely thread-safe in the sense that if other threads are doing stuff with the cache while `ReleaseUnused()` is called, `ReleaseUnused()` may or may not flush the cache due to a data race in `use_count`. But it doesn't break anything in either case, and `ReleaseUnused()` is already documented to be best-effort.
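
The `use_count()` reasoning above can be illustrated with a minimal sketch. It assumes a stand-alone `std::shared_ptr<int>` slot in place of the pool's `immutable_zeros_cache_`; the names `CacheSlot` and `ReleaseIfUnused` are hypothetical and not part of the PR.

```cpp
#include <memory>

// Hypothetical stand-in for the pool's cache slot; not Arrow code.
using CacheSlot = std::shared_ptr<int>;

// Drops the cached object if nobody else appears to hold it.
// Returns true when the slot was cleared.
bool ReleaseIfUnused(CacheSlot* slot) {
  // The local copy adds one reference, so "unused" means a count of 2.
  CacheSlot local = std::atomic_load(slot);
  if (local && local.use_count() <= 2) {
    // Best effort: another thread may have copied the pointer in between,
    // in which case the object simply lives on until that copy is dropped.
    std::atomic_store(slot, CacheSlot());
    return true;
  }
  return false;
}
```

Whichever way the check goes, the shared object stays valid for anyone still holding a reference, which is why a stale `use_count()` only affects whether the cache is flushed.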





[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806152099



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.

Review comment:
       > * I'm not sure if it's even needed, though: it isn't if you can safely make a copy of a `shared_ptr` while another thread may be updating the contained pointer.
   
   You can safely update the object _pointed to_ (well, except that accessing that object from different threads then becomes racy). However, changing the _pointer_ requires use of dedicated atomic access functions: https://en.cppreference.com/w/cpp/memory/shared_ptr/atomic
   
   I guess this means you should favour changing the pointer, since all accesses will be confined in this utility function.
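
As a rough illustration of the distinction drawn here, the sketch below swaps the pointer through the atomic free functions while a reader keeps using its earlier snapshot. The slot `g_slot` and the helpers `Load` and `Store` are hypothetical names for this example only. These free functions are deprecated since C++20 in favour of `std::atomic<std::shared_ptr<T>>`, so this assumes a C++11 to C++17 build.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical shared slot. Every read and write of the pointer itself goes
// through the atomic free functions; the pointed-to string is never mutated.
static std::shared_ptr<const std::string> g_slot;

// Takes a snapshot of the current pointer.
std::shared_ptr<const std::string> Load() { return std::atomic_load(&g_slot); }

// Publishes a new pointer; existing snapshots keep the old object alive.
void Store(std::shared_ptr<const std::string> value) {
  std::atomic_store(&g_slot, std::move(value));
}
```

A reader that called `Load()` before a `Store()` continues to see the old object, which is the property the zero-buffer cache relies on when a larger buffer replaces a smaller one.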





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806689894



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.

Review comment:
       > That is true. However, it shouldn't be a concern if we can ensure that the pages don't actually allocate physical memory (or almost none of it, such as `/dev/zero`).
   
   I don't think we can in general, unfortunately. I would have no idea how to do it on something as ubiquitous as Windows (or if it can be done at all), and I'm sure that in general there are more exotic operating systems and architectures that simply can't do it. Also, for 32-bit systems/builds (if Arrow supports those) virtual memory is also in relatively short supply.
   
   > However, changing the pointer requires use of dedicated atomic access functions: [...]
   
   Ah, great, those functions are exactly what I had missed!
   
   I improved my allocation algorithm accordingly here: e628688
   
   ETA: and d89c297 (old habits die hard)





[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806981356



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,116 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    // TODO: jemalloc and mimalloc support zero-initialized allocations as
+    //  well, which might be faster than allocate + memset.
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.
+    auto current_buffer = atomic_load(&immutable_zeros_cache_);
+
+    // If this buffer satisfies the requirements, return it.
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Acquire the lock for allocating a new buffer.
+    std::lock_guard<std::mutex> gg(immutable_zeros_mutex_);
+
+    // Between our previous atomic load and acquisition of the lock, another
+    // thread may have allocated a buffer. So we need to check again.
+    current_buffer = atomic_load(&immutable_zeros_cache_);
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Let's now figure out a good size to allocate. This is done
+    // heuristically, with the following rules:
+    //  - allocate at least the requested size (obviously);
+    //  - allocate at least 2x the previous size;
+    //  - allocate at least kMinAllocSize bytes (to avoid lots of small
+    //    allocations).
+    static const int64_t kMinAllocSize = 4096;
+    int64_t alloc_size =
+        std::max(size, current_buffer ? (current_buffer->size() * 2) : kMinAllocSize);

Review comment:
       Perhaps `bit_util::RoundUpToPowerOf2(size)` rather than `size`?
   
   Edit: perhaps that would be detrimental for large allocations. Never mind.
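
To make the concern about large allocations concrete, here is a small sketch of how much extra memory power-of-two rounding can request. `RoundUpPow2` and `Waste` are local stand-ins written for this example; they are not Arrow's `bit_util` implementation, and they assume `1 <= n <= 2^62`.

```cpp
#include <cstdint>

// Local stand-in for a round-up helper; not Arrow's bit_util implementation.
// Assumes 1 <= n <= 2^62 so the shift cannot overflow.
constexpr int64_t RoundUpPow2(int64_t n) {
  int64_t p = 1;
  while (p < n) p <<= 1;
  return p;
}

// Bytes allocated beyond what was requested.
constexpr int64_t Waste(int64_t n) { return RoundUpPow2(n) - n; }
```

A request one byte above a power of two nearly doubles the allocation, so for a request just over 1 GiB the overhead approaches another 1 GiB.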





[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806977230



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));

Review comment:
       Can you hide the entire mmap() handling behind a compatibility wrapper in `arrow/util/io_util.{h,cc}`, so that we can easily add Windows and macOS support?
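
One possible shape for such a wrapper is sketched below. The function names `MapZeroPages` and `UnmapZeroPages` are hypothetical and do not exist in `arrow/util/io_util.h`. Only the Linux branch uses `mmap`; every other platform falls back to `calloc` as a placeholder, and no Windows or macOS specific branch is attempted here.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#if defined(__linux__)
#include <sys/mman.h>
#endif

// Hypothetical wrapper: returns `size` bytes of read-only zeros, or nullptr
// on failure. Not part of Arrow.
uint8_t* MapZeroPages(std::size_t size) {
#if defined(__linux__)
  // Anonymous private mappings are zero-filled by the kernel.
  void* p = mmap(nullptr, size, PROT_READ,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  return p == MAP_FAILED ? nullptr : static_cast<uint8_t*>(p);
#else
  // Placeholder fallback: zeroed heap memory (writable, unlike the mapping).
  return static_cast<uint8_t*>(std::calloc(size, 1));
#endif
}

// Releases memory obtained from MapZeroPages with the same size.
void UnmapZeroPages(uint8_t* data, std::size_t size) {
#if defined(__linux__)
  munmap(data, size);
#else
  (void)size;
  std::free(data);
#endif
}
```

Keeping the platform checks inside the two functions means callers such as `AllocateImmutableZeros` would need no `#ifdef` of their own.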
   





[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806979301



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));

Review comment:
       Note that [VirtualAlloc](https://docs.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-virtualalloc) on Windows zero-initializes pages. We can fall back on [calloc](https://pubs.opengroup.org/onlinepubs/9699919799/functions/calloc.html) by default.
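
A minimal sketch of the `calloc` fallback mentioned above follows. The helper names `AllocateZeroed` and `AllZero` are hypothetical. Note the assumption that plain `calloc` alignment is acceptable; Arrow's pools normally hand out 64-byte-aligned memory, which this sketch does not guarantee.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Returns `size` zero bytes or nullptr on failure; release with std::free().
// Alignment is whatever calloc provides (an assumption of this sketch).
uint8_t* AllocateZeroed(std::size_t size) {
  return static_cast<uint8_t*>(std::calloc(size, 1));
}

// Checks that every byte of the buffer is zero.
bool AllZero(const uint8_t* data, std::size_t size) {
  for (std::size_t i = 0; i < size; ++i) {
    if (data[i] != 0) return false;
  }
  return true;
}
```

Unlike an allocate-then-`memset` sequence, `calloc` lets the allocator skip the explicit clearing when it already knows the pages are fresh from the operating system.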





[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r805997122



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));

Review comment:
       I'm not sure this is useful, because the underlying allocator probably already relies on mmap for large-ish allocations.

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.

Review comment:
       This algorithm seems quite complicated and costly (in terms of inter-thread synchronization).
   Instead, how about maintaining lazily-allocated power-of-two-sized entries (perhaps starting with 4096 to avoid having a ton of tiny size classes)? Something like:
   ```c++
   
   static constexpr int kMinPower2Size = 12;  // 4096 bytes
   static constexpr int kNumAllocClasses = 64 - kMinPower2Size;
   // Addresses of allocated zeros, for each size class (0 means not allocated yet)
   std::array<std::atomic<uintptr_t>, kNumAllocClasses> allocated_;
   std::mutex allocation_mutex_;
   
   Result<const uint8_t*> GetImmutableZeros(int64_t size) override {
     const auto size_class = std::max(0, bit_util::Log2(size) - kMinPower2Size);
     auto addr = allocated_[size_class].load();
     if (ARROW_PREDICT_TRUE(addr != 0)) {
       // Fast path: already allocated
       return reinterpret_cast<const uint8_t*>(addr);
     }
     // Slow path: allocate if not already done
     std::lock_guard<std::mutex> lock(allocation_mutex_);
     // Re-read under the lock; another thread may have allocated in the meantime
     addr = allocated_[size_class].load();
     if (addr == 0) {
       // Allocate the whole size class, not just the requested size
       const int64_t alloc_size = static_cast<int64_t>(1) << (size_class + kMinPower2Size);
       ARROW_ASSIGN_OR_RAISE(const uint8_t* data, AllocateImmutableZeros(alloc_size));
       allocated_[size_class] = addr = reinterpret_cast<uintptr_t>(data);
     }
     return reinterpret_cast<const uint8_t*>(addr);
   }
   ```
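
The size-class arithmetic in this proposal can be checked in isolation. The sketch below assumes a ceiling-style base-2 logarithm and uses local helpers (`CeilLog2`, `SizeClass`, `ClassBytes`) that are hypothetical stand-ins, not Arrow functions. It assumes sizes between 1 and 2^62.

```cpp
#include <algorithm>
#include <cstdint>

constexpr int kMinPower2Size = 12;  // smallest class holds 4096 bytes

// Smallest r such that 2^r >= x. Local stand-in; assumes 1 <= x <= 2^62.
inline int CeilLog2(uint64_t x) {
  int r = 0;
  while ((uint64_t{1} << r) < x) ++r;
  return r;
}

// Index of the size class that can serve a request of `size` bytes.
inline int SizeClass(int64_t size) {
  return std::max(0, CeilLog2(static_cast<uint64_t>(size)) - kMinPower2Size);
}

// Number of bytes a given size class holds.
inline int64_t ClassBytes(int size_class) {
  return int64_t{1} << (size_class + kMinPower2Size);
}
```

Every request up to 4096 bytes lands in class 0, and each later class doubles the capacity, so `ClassBytes(SizeClass(n)) >= n` holds for all sizes in the assumed range.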

##########
File path: cpp/src/arrow/device.h
##########
@@ -223,4 +236,55 @@ class ARROW_EXPORT CPUMemoryManager : public MemoryManager {
 ARROW_EXPORT
 std::shared_ptr<MemoryManager> default_cpu_memory_manager();
 
+/// A memory manager that uses the immutable zeros interface of the given memory pool,
+/// rather than the normal mutable buffer interface.

Review comment:
       Hmm, I honestly don't understand why it would be useful to implement this, while we're already exposing additional methods to `MemoryPool`.

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -536,6 +543,39 @@ int64_t MemoryPool::max_memory() const { return -1; }
 // MemoryPool implementation that delegates its core duty
 // to an Allocator class.
 
+class ImmutableZeros : public Buffer {
+ public:
+  explicit ImmutableZeros(uint8_t* data, int64_t size, MemoryPool* pool)
+      : Buffer(data, size, CPUDevice::immutable_zeros_memory_manager(pool)),
+        pool_(pool) {}
+
+  ImmutableZeros() : Buffer(nullptr, 0), pool_(nullptr) {}
+
+  ~ImmutableZeros() override;
+
+  // Prevent copies and handle moves explicitly to avoid double free

Review comment:
       It sounds hackish to have a separate ImmutableZeros class, IMHO.

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));

Review comment:
       Ah, you're right, we don't know whether the allocator returned zero-initialized memory, unfortunately.
   (Note that both jemalloc and mimalloc have zero-initializing allocation APIs, so we could use those.)

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.

Review comment:
       > An inability to free something, especially if that something is large, feels like bad news to me, so I'm hesitant to just copy your version in and call it a day.
   
   That is true. However, it shouldn't be a concern if we can ensure that the pages don't actually allocate physical memory (or almost none of it, such as `/dev/zero`).

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.

Review comment:
       > * I'm not sure if it's even needed, though: it isn't if you can safely make a copy of a `shared_ptr` while another thread may be updating the contained pointer.
   
   You can safely update the object _pointed to_ (well, except that accessing that object from different threads then becomes racy). However, changing the _pointer_ requires use of dedicated atomic access functions: https://en.cppreference.com/w/cpp/memory/shared_ptr/atomic

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.

Review comment:
       > * I'm not sure if it's even needed, though: it isn't if you can safely make a copy of a `shared_ptr` while another thread may be updating the contained pointer.
   
   You can safely update the object _pointed to_ (well, except that accessing that object from different threads then becomes racy). However, changing the _pointer_ requires use of dedicated atomic access functions: https://en.cppreference.com/w/cpp/memory/shared_ptr/atomic
   
   I guess this means you should favour changing the pointer, since all accesses will be confined to this utility function.
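
For reference, a minimal sketch of the access pattern those functions allow. The names `ZeroBlock`, `ZeroCache`, `Publish`, and `Snapshot` are hypothetical and not taken from the PR. The free functions `std::atomic_load` and `std::atomic_store` for `shared_ptr` are standard since C++11 (C++20 deprecates them in favour of `std::atomic<std::shared_ptr<T>>`).

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the pool's cached buffer of zeros.
using ZeroBlock = std::vector<uint8_t>;

class ZeroCache {
 public:
  // Replace the cached pointer. A plain assignment to cache_ would race with
  // concurrent readers; std::atomic_store makes the pointer swap itself atomic.
  void Publish(std::size_t size) {
    std::shared_ptr<const ZeroBlock> fresh =
        std::make_shared<ZeroBlock>(size, uint8_t{0});
    std::atomic_store(&cache_, std::move(fresh));
  }

  // Take a private copy of the pointer. The returned shared_ptr keeps its
  // block alive even if another thread publishes a replacement afterwards.
  std::shared_ptr<const ZeroBlock> Snapshot() const {
    return std::atomic_load(&cache_);
  }

 private:
  std::shared_ptr<const ZeroBlock> cache_;
};
```

Because the block is never written to after construction, readers holding a snapshot need no further synchronization; only the pointer itself is contended.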

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));

Review comment:
       @jvanstraten You're right. So let's keep it like this and just add a comment explaining why.

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));

Review comment:
       Can you hide the entire mmap() handling behind a compatibility wrapper in `arrow/util/io_util.{h,cc}`, so that we can easily add Windows and macOS support?
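
A rough sketch of what such a wrapper could look like, under stated assumptions: the names `AllocateZeroPages` and `FreeZeroPages` are hypothetical (they are not existing functions in `arrow/util/io_util.h`), and the platform split is only illustrative. It uses `mmap` on Linux as in the diff, `VirtualAlloc` on Windows, and `calloc` everywhere else.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

#if defined(_WIN32)
#include <windows.h>
#elif defined(__linux__)
#include <sys/mman.h>
#endif

// Hypothetical helper: returns `size` bytes of read-only zeros, or nullptr on
// failure (and for size == 0).
inline const uint8_t* AllocateZeroPages(std::size_t size) {
  if (size == 0) return nullptr;
#if defined(_WIN32)
  // Pages committed by VirtualAlloc are zero-initialized.
  return static_cast<const uint8_t*>(
      VirtualAlloc(nullptr, size, MEM_RESERVE | MEM_COMMIT, PAGE_READONLY));
#elif defined(__linux__)
  // Same flags as in the diff: anonymous, private, read-only mapping.
  void* p = mmap(nullptr, size, PROT_READ,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  return p == MAP_FAILED ? nullptr : static_cast<const uint8_t*>(p);
#else
  // Portable fallback: zero-initialized heap memory.
  return static_cast<const uint8_t*>(std::calloc(size, 1));
#endif
}

// Hypothetical counterpart; `size` must match the value passed at allocation.
inline void FreeZeroPages(const uint8_t* data, std::size_t size) {
  if (data == nullptr) return;
#if defined(_WIN32)
  (void)size;
  VirtualFree(const_cast<uint8_t*>(data), 0, MEM_RELEASE);
#elif defined(__linux__)
  munmap(const_cast<uint8_t*>(data), size);
#else
  (void)size;
  std::free(const_cast<uint8_t*>(data));
#endif
}
```

With something along these lines, `AllocateImmutableZeros` and `FreeImmutableZeros` would no longer need their own `#ifdef` blocks.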
   

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));

Review comment:
       Note that [VirtualAlloc](https://docs.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-virtualalloc) on Windows zero-initializes pages. We can fall back on `calloc` by default.

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,116 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    // TODO: jemalloc and mimalloc support zero-initialized allocations as
+    //  well, which might be faster than allocate + memset.
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.
+    auto current_buffer = atomic_load(&immutable_zeros_cache_);
+
+    // If this buffer satisfies the requirements, return it.
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Acquire the lock for allocating a new buffer.
+    std::lock_guard<std::mutex> gg(immutable_zeros_mutex_);
+
+    // Between our previous atomic load and acquisition of the lock, another
+    // thread may have allocated a buffer. So we need to check again.
+    current_buffer = atomic_load(&immutable_zeros_cache_);
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Let's now figure out a good size to allocate. This is done
+    // heuristically, with the following rules:
+    //  - allocate at least the requested size (obviously);
+    //  - allocate at least 2x the previous size;
+    //  - allocate at least kMinAllocSize bytes (to avoid lots of small
+    //    allocations).
+    static const int64_t kMinAllocSize = 4096;
+    int64_t alloc_size =
+        std::max(size, current_buffer ? (current_buffer->size() * 2) : kMinAllocSize);

Review comment:
       Perhaps `bit_util::RoundUpToPowerOf2(size)` rather than `size`?

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,116 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    // TODO: jemalloc and mimalloc support zero-initialized allocations as
+    //  well, which might be faster than allocate + memset.
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.
+    auto current_buffer = atomic_load(&immutable_zeros_cache_);
+
+    // If this buffer satisfies the requirements, return it.
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Acquire the lock for allocating a new buffer.
+    std::lock_guard<std::mutex> gg(immutable_zeros_mutex_);
+
+    // Between our previous atomic load and acquisition of the lock, another
+    // thread may have allocated a buffer. So we need to check again.
+    current_buffer = atomic_load(&immutable_zeros_cache_);
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Let's now figure out a good size to allocate. This is done
+    // heuristically, with the following rules:
+    //  - allocate at least the requested size (obviously);
+    //  - allocate at least 2x the previous size;
+    //  - allocate at least kMinAllocSize bytes (to avoid lots of small
+    //    allocations).
+    static const int64_t kMinAllocSize = 4096;
+    int64_t alloc_size =
+        std::max(size, current_buffer ? (current_buffer->size() * 2) : kMinAllocSize);

Review comment:
       Perhaps `bit_util::RoundUpToPowerOf2(size)` rather than `size`?
   
   Edit: perhaps that would be detrimental for large allocations. Never mind.
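
To make the trade-off concrete, here is a small sketch comparing the two sizing rules. `NextAllocSize` mirrors the expression in the diff; `RoundUpPow2` is a local stand-in written for this example, not Arrow's `bit_util` implementation.

```cpp
#include <algorithm>
#include <cstdint>

constexpr int64_t kMinAllocSize = 4096;

// Mirrors the expression in the diff. previous_size == 0 means that no
// buffer has been cached yet.
inline int64_t NextAllocSize(int64_t requested, int64_t previous_size) {
  return std::max(requested,
                  previous_size > 0 ? previous_size * 2 : kMinAllocSize);
}

// Local stand-in for a round-up-to-power-of-two helper.
inline int64_t RoundUpPow2(int64_t n) {
  int64_t p = 1;
  while (p < n) p <<= 1;
  return p;
}
```

For a request of 1 GiB plus one byte with no cached buffer, `NextAllocSize` asks for exactly that amount, whereas rounding up to a power of two would ask for 2 GiB, which is nearly double.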

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,116 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    // TODO: jemalloc and mimalloc support zero-initialized allocations as
+    //  well, which might be faster than allocate + memset.
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.
+    auto current_buffer = atomic_load(&immutable_zeros_cache_);
+
+    // If this buffer satisfies the requirements, return it.
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Acquire the lock for allocating a new buffer.
+    std::lock_guard<std::mutex> gg(immutable_zeros_mutex_);
+
+    // Between our previous atomic load and acquisition of the lock, another
+    // thread may have allocated a buffer. So we need to check again.
+    current_buffer = atomic_load(&immutable_zeros_cache_);
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Let's now figure out a good size to allocate. This is done
+    // heuristically, with the following rules:
+    //  - allocate at least the requested size (obviously);
+    //  - allocate at least 2x the previous size;
+    //  - allocate at least kMinAllocSize bytes (to avoid lots of small
+    //    allocations).
+    static const int64_t kMinAllocSize = 4096;
+    int64_t alloc_size =
+        std::max(size, current_buffer ? (current_buffer->size() * 2) : kMinAllocSize);
+
+    // Attempt to allocate the block.
+    uint8_t* data = nullptr;
+    auto result = AllocateImmutableZeros(alloc_size, &data);
+
+    // If we fail to do so, fall back to trying to allocate the requested size
+    // exactly as a last-ditch effort.
+    if (!result.ok() || data == nullptr) {

Review comment:
       Hmm, why is it possible to get `nullptr` here?

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,116 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    // TODO: jemalloc and mimalloc support zero-initialized allocations as
+    //  well, which might be faster than allocate + memset.
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.
+    auto current_buffer = atomic_load(&immutable_zeros_cache_);
+
+    // If this buffer satisfies the requirements, return it.
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Acquire the lock for allocating a new buffer.
+    std::lock_guard<std::mutex> gg(immutable_zeros_mutex_);
+
+    // Between our previous atomic load and acquisition of the lock, another
+    // thread may have allocated a buffer. So we need to check again.
+    current_buffer = atomic_load(&immutable_zeros_cache_);
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Let's now figure out a good size to allocate. This is done
+    // heuristically, with the following rules:
+    //  - allocate at least the requested size (obviously);
+    //  - allocate at least 2x the previous size;
+    //  - allocate at least kMinAllocSize bytes (to avoid lots of small
+    //    allocations).
+    static const int64_t kMinAllocSize = 4096;
+    int64_t alloc_size =
+        std::max(size, current_buffer ? (current_buffer->size() * 2) : kMinAllocSize);
+
+    // Attempt to allocate the block.
+    uint8_t* data = nullptr;
+    auto result = AllocateImmutableZeros(alloc_size, &data);
+
+    // If we fail to do so, fall back to trying to allocate the requested size
+    // exactly as a last-ditch effort.
+    if (!result.ok() || data == nullptr) {
+      alloc_size = size;
+      RETURN_NOT_OK(AllocateImmutableZeros(alloc_size, &data));
+    }
+    DCHECK_NE(data, nullptr);
+
+    // Move ownership of the data block into an ImmutableZeros object. It will
+    // free the block when destroyed, i.e. when all shared_ptr references to it
+    // are reset or go out of scope.
+    current_buffer = std::make_shared<ImmutableZeros>(data, alloc_size, this);
+
+    // Store a reference to the new block in the cache, so subsequent calls to
+    // this function (from this thread or from other threads) can use it, too.
+    atomic_store(&immutable_zeros_cache_, current_buffer);
+
+    return std::move(current_buffer);
+  }
+
+  void ReleaseUnused() override {
+    // Get rid of the ImmutableZeros cache if we're the only one using it. If
+    // there are other pieces of code using it, getting rid of the cache won't
+    // deallocate it anyway, so it's better to hold onto it.
+    {
+      auto cache = atomic_load(&immutable_zeros_cache_);
+
+      // Because we now have a copy in our thread, the use count will be 2 if
+      // nothing else is using it.
+      if (cache.use_count() <= 2) {
+        atomic_store(&immutable_zeros_cache_, std::shared_ptr<ImmutableZeros>());

Review comment:
       Is it a problem if `immutable_zeros_cache_` was modified in the meantime? Probably not, just checking.

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));

Review comment:
       Note that [VirtualAlloc](https://docs.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-virtualalloc) on Windows zero-initializes pages. We can fall back on [calloc](https://pubs.opengroup.org/onlinepubs/9699919799/functions/calloc.html) by default.

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,116 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    // TODO: jemalloc and mimalloc support zero-initialized allocations as
+    //  well, which might be faster than allocate + memset.
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS

Review comment:
       Same here: we can hide this behind a compatibility function.

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,116 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    // TODO: jemalloc and mimalloc support zero-initialized allocations as
+    //  well, which might be faster than allocate + memset.
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS

Review comment:
       Same here: we can hide this behind a compatibility function.





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806689894



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.

Review comment:
       > That is true. However, it shouldn't be a concern if we can ensure that the pages don't actually allocate physical memory (or almost none of it, such as `/dev/zero`).
   
   I don't think we can in general, unfortunately. I would have no idea how to do it on something as ubiquitous as Windows (or if it can be done at all), and I'm sure that in general there are more exotic operating systems and architectures that simply can't do it. Also, for 32-bit systems/builds (if Arrow supports those) virtual memory is also in relatively short supply.
   
   > However, changing the pointer requires use of dedicated atomic access functions: [...]
   
   Ah, great, those functions are exactly what I had missed!
   
   I improved my allocation algorithm accordingly here: e628688





[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r805997122



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));

Review comment:
       I'm not sure this is useful, because the underlying allocator probably already relies on mmap for large-ish allocations.

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.

Review comment:
       This algorithm seems quite complicated and costly (in terms of inter-thread synchronization).
   Instead, how about maintaining lazily-allocated power-of-two-sized entries (perhaps starting with 4096 to avoid having a ton of tiny size classes)? Something like:
   ```c++
   
   static constexpr int kMinPower2Size = 12;  // 4096 bytes
   static constexpr int kNumAllocClasses = 64 - kMinPower2Size;
   // Addresses of allocated zeros, for each size class (0 means "not allocated yet")
   std::array<std::atomic<uintptr_t>, kNumAllocClasses> allocated_;
   std::mutex allocation_mutex_;
   
   Result<const uint8_t*> GetImmutableZeros(int64_t size) override {
     const auto size_class = std::max(0, bit_util::Log2(size) - kMinPower2Size);
     auto addr = allocated_[size_class].load();
     if (ARROW_PREDICT_TRUE(addr != 0)) {
       // Fast path: already allocated
       return reinterpret_cast<const uint8_t*>(addr);
     }
     // Slow path: allocate if not already done
     std::lock_guard<std::mutex> lock(allocation_mutex_);
     // Re-check under the lock: another thread may have allocated in the meantime
     addr = allocated_[size_class].load();
     if (addr == 0) {
       const int64_t alloc_size = static_cast<int64_t>(1) << (size_class + kMinPower2Size);
       // Allocate the full size class, not just the requested size
       ARROW_ASSIGN_OR_RAISE(const uint8_t* data, AllocateImmutableZeros(alloc_size));
       allocated_[size_class] = addr = reinterpret_cast<uintptr_t>(data);
     }
     return reinterpret_cast<const uint8_t*>(addr);
   }
   ```
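
For anyone who wants to experiment with the idea outside of Arrow, here is a self-contained sketch of the same size-class scheme. It is only an approximation under stated assumptions: `ZeroClassCache` and `SizeClass` are hypothetical names, `std::calloc` stands in for `AllocateImmutableZeros`, failure is reported as `nullptr` rather than through `Result`, and the ceiling log2 is computed with a simple loop.

```cpp
#include <algorithm>
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <mutex>

class ZeroClassCache {
 public:
  static constexpr int kMinPower2Size = 12;  // 4096 bytes
  static constexpr int kNumAllocClasses = 64 - kMinPower2Size;

  ZeroClassCache() {
    // Before C++20, std::atomic's default constructor leaves the value
    // uninitialized, so mark every class as "not allocated" explicitly.
    for (auto& slot : allocated_) slot.store(0);
  }

  ~ZeroClassCache() {
    for (auto& slot : allocated_) {
      const uintptr_t addr = slot.load();
      if (addr != 0) std::free(reinterpret_cast<void*>(addr));
    }
  }

  // Size class 0 covers everything up to 4096 bytes, class 1 up to 8192, etc.
  static int SizeClass(int64_t size) {
    int bits = 0;  // ceil(log2(size)) for size >= 1
    while ((int64_t{1} << bits) < size) ++bits;
    return std::max(0, bits - kMinPower2Size);
  }

  // Returns at least `size` zero bytes, or nullptr if allocation fails.
  const uint8_t* GetImmutableZeros(int64_t size) {
    const int size_class = SizeClass(size);
    uintptr_t addr = allocated_[size_class].load(std::memory_order_acquire);
    if (addr != 0) return reinterpret_cast<const uint8_t*>(addr);  // fast path

    // Slow path: allocate under the lock unless another thread already did.
    std::lock_guard<std::mutex> lock(allocation_mutex_);
    addr = allocated_[size_class].load(std::memory_order_acquire);
    if (addr == 0) {
      const std::size_t alloc_size = std::size_t{1}
                                     << (size_class + kMinPower2Size);
      addr = reinterpret_cast<uintptr_t>(std::calloc(alloc_size, 1));
      allocated_[size_class].store(addr, std::memory_order_release);
    }
    return reinterpret_cast<const uint8_t*>(addr);
  }

 private:
  std::array<std::atomic<uintptr_t>, kNumAllocClasses> allocated_;
  std::mutex allocation_mutex_;
};
```

Note the cost of this design: entries are never released before the cache is destroyed, and up to one block per size class can stay allocated, whereas the `shared_ptr`-based version keeps only the largest block alive.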

##########
File path: cpp/src/arrow/device.h
##########
@@ -223,4 +236,55 @@ class ARROW_EXPORT CPUMemoryManager : public MemoryManager {
 ARROW_EXPORT
 std::shared_ptr<MemoryManager> default_cpu_memory_manager();
 
+/// A memory manager that uses the immutable zeros interface of the given memory pool,
+/// rather than the normal mutable buffer interface.

Review comment:
       Hmm, I honestly don't understand why it would be useful to implement this, while we're already exposing additional methods to `MemoryPool`.

##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -536,6 +543,39 @@ int64_t MemoryPool::max_memory() const { return -1; }
 // MemoryPool implementation that delegates its core duty
 // to an Allocator class.
 
+class ImmutableZeros : public Buffer {
+ public:
+  explicit ImmutableZeros(uint8_t* data, int64_t size, MemoryPool* pool)
+      : Buffer(data, size, CPUDevice::immutable_zeros_memory_manager(pool)),
+        pool_(pool) {}
+
+  ImmutableZeros() : Buffer(nullptr, 0), pool_(nullptr) {}
+
+  ~ImmutableZeros() override;
+
+  // Prevent copies and handle moves explicitly to avoid double free

Review comment:
       It sounds hackish to have a separate ImmutableZeros class, IMHO.





[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806985523



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,116 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    // TODO: jemalloc and mimalloc support zero-initialized allocations as
+    //  well, which might be faster than allocate + memset.
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.
+    auto current_buffer = atomic_load(&immutable_zeros_cache_);
+
+    // If this buffer satisfies the requirements, return it.
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Acquire the lock for allocating a new buffer.
+    std::lock_guard<std::mutex> gg(immutable_zeros_mutex_);
+
+    // Between our previous atomic load and acquisition of the lock, another
+    // thread may have allocated a buffer. So we need to check again.
+    current_buffer = atomic_load(&immutable_zeros_cache_);
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Let's now figure out a good size to allocate. This is done
+    // heuristically, with the following rules:
+    //  - allocate at least the requested size (obviously);
+    //  - allocate at least 2x the previous size;
+    //  - allocate at least kMinAllocSize bytes (to avoid lots of small
+    //    allocations).
+    static const int64_t kMinAllocSize = 4096;
+    int64_t alloc_size =
+        std::max(size, current_buffer ? (current_buffer->size() * 2) : kMinAllocSize);
+
+    // Attempt to allocate the block.
+    uint8_t* data = nullptr;
+    auto result = AllocateImmutableZeros(alloc_size, &data);
+
+    // If we fail to do so, fall back to trying to allocate the requested size
+    // exactly as a last-ditch effort.
+    if (!result.ok() || data == nullptr) {
+      alloc_size = size;
+      RETURN_NOT_OK(AllocateImmutableZeros(alloc_size, &data));
+    }
+    DCHECK_NE(data, nullptr);
+
+    // Move ownership of the data block into an ImmutableZeros object. It will
+    // free the block when destroyed, i.e. when all shared_ptr references to it
+    // are reset or go out of scope.
+    current_buffer = std::make_shared<ImmutableZeros>(data, alloc_size, this);
+
+    // Store a reference to the new block in the cache, so subsequent calls to
+    // this function (from this thread or from other threads) can use it, too.
+    atomic_store(&immutable_zeros_cache_, current_buffer);
+
+    return std::move(current_buffer);
+  }
+
+  void ReleaseUnused() override {
+    // Get rid of the ImmutableZeros cache if we're the only one using it. If
+    // there are other pieces of code using it, getting rid of the cache won't
+    // deallocate it anyway, so it's better to hold onto it.
+    {
+      auto cache = atomic_load(&immutable_zeros_cache_);
+
+      // Because we now have a copy in our thread, the use count will be 2 if
+      // nothing else is using it.
+      if (cache.use_count() <= 2) {
+        atomic_store(&immutable_zeros_cache_, std::shared_ptr<ImmutableZeros>());

Review comment:
       Is it a problem if `immutable_zeros_cache_` was modified in the meantime? Probably not, just checking.
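
If a concurrent update between the load and the store were a concern, one option is a compare-and-exchange that clears the cache only when it still holds the snapshot that was inspected. The following is a minimal sketch under that assumption, not the PR's code; `Block` and `ResetIfUnchanged` are hypothetical names, and `use_count()` is only approximate when other threads are active.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the cached zero buffer.
struct Block {};

// Clears *cache only if nobody else holds the block and the cache still
// refers to the block we inspected. Returns true if the cache was cleared.
inline bool ResetIfUnchanged(std::shared_ptr<Block>* cache) {
  assert(cache != nullptr);
  auto snapshot = std::atomic_load(cache);
  // The cache and our snapshot account for two references.
  if (!snapshot || snapshot.use_count() > 2) return false;
  // Fails (and leaves the cache alone) if another thread replaced it.
  return std::atomic_compare_exchange_strong(cache, &snapshot,
                                             std::shared_ptr<Block>());
}
```

With this shape, a buffer that another thread published after the load is left in place rather than being overwritten with an empty pointer.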





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r807160841



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,116 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    // TODO: jemalloc and mimalloc support zero-initialized allocations as
+    //  well, which might be faster than allocate + memset.
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.
+    auto current_buffer = atomic_load(&immutable_zeros_cache_);
+
+    // If this buffer satisfies the requirements, return it.
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Acquire the lock for allocating a new buffer.
+    std::lock_guard<std::mutex> gg(immutable_zeros_mutex_);
+
+    // Between our previous atomic load and acquisition of the lock, another
+    // thread may have allocated a buffer. So we need to check again.
+    current_buffer = atomic_load(&immutable_zeros_cache_);
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Let's now figure out a good size to allocate. This is done
+    // heuristically, with the following rules:
+    //  - allocate at least the requested size (obviously);
+    //  - allocate at least 2x the previous size;
+    //  - allocate at least kMinAllocSize bytes (to avoid lots of small
+    //    allocations).
+    static const int64_t kMinAllocSize = 4096;
+    int64_t alloc_size =
+        std::max(size, current_buffer ? (current_buffer->size() * 2) : kMinAllocSize);
+
+    // Attempt to allocate the block.
+    uint8_t* data = nullptr;
+    auto result = AllocateImmutableZeros(alloc_size, &data);
+
+    // If we fail to do so, fall back to trying to allocate the requested size
+    // exactly as a last-ditch effort.
+    if (!result.ok() || data == nullptr) {

Review comment:
       You shouldn't be able to get `nullptr` in the default implementation, but `AllocateImmutableZeros` is virtual, so someone could override it in a custom implementation. Still, I suppose I should have gotten rid of that check. As a first pass I had it report an error properly when `data` was set to null or left unmodified, then changed my mind and used a `DCHECK` instead. With the `DCHECK` in place, the null check indeed doesn't do anything useful anymore. 3dda474





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r807797367



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.

Review comment:
       Unfortunately, [the atomic `shared_ptr` operations don't exist in libstdc++ 4.9 and earlier](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57250), which is still used in CI via RTools 35. So I guess in that case my algorithm has to resort to a mutex after all. f67f75c
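
For illustration, here is a minimal sketch of what a mutex-only variant of the cache could look like. It follows the three sizing rules listed in the diff comments (at least the requested size, at least twice the previous size, at least `kMinAllocSize`). `ZeroCache` and its use of `std::vector` as backing storage are assumptions made for the example; the PR allocates through the memory pool instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <vector>

constexpr int64_t kMinAllocSize = 4096;

// Hypothetical cache of shared zero blocks, guarded entirely by a mutex.
class ZeroCache {
 public:
  using Block = std::vector<uint8_t>;

  // Returns a shared block holding at least `size` zero bytes.
  std::shared_ptr<const Block> Get(int64_t size) {
    assert(size >= 0);
    std::lock_guard<std::mutex> lock(mutex_);
    const int64_t prev = cache_ ? static_cast<int64_t>(cache_->size()) : 0;
    if (cache_ && prev >= size) return cache_;
    // Grow geometrically so repeated requests rarely reallocate.
    const int64_t alloc_size = std::max({size, prev * 2, kMinAllocSize});
    // Value-initialization zero-fills the vector.
    cache_ = std::make_shared<Block>(static_cast<std::size_t>(alloc_size));
    return cache_;
  }

 private:
  std::mutex mutex_;
  std::shared_ptr<const Block> cache_;
};
```

Because every access takes the lock, no atomic `shared_ptr` functions are needed; the cost is contention on the fast path that the atomic load was meant to avoid.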





[GitHub] [arrow] lidavidm commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

lidavidm commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r809179785



##########
File path: cpp/src/arrow/device.h
##########
@@ -223,4 +236,55 @@ class ARROW_EXPORT CPUMemoryManager : public MemoryManager {
 ARROW_EXPORT
 std::shared_ptr<MemoryManager> default_cpu_memory_manager();
 
+/// A memory manager that uses the immutable zeros interface of the given memory pool,
+/// rather than the normal mutable buffer interface.

Review comment:
       You do have good points. These abstractions are rather new, or at least under-used, and could be refined.
   
   Most APIs do indeed work with MemoryPool, not MemoryManager; MemoryPool has been there for quite a long time, and MemoryManager was only added relatively recently, to support non-CPU memory. The right place to discuss this would be JIRA and/or the mailing list, I think. (Or just a mailing list post referencing this thread; GitHub is OK for discussion, we might just need more eyes.)





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r803686136



##########
File path: cpp/src/arrow/memory_pool.h
##########
@@ -63,6 +63,42 @@ class MemoryPoolStats {
 /// take care of the required 64-byte alignment.
 class ARROW_EXPORT MemoryPool {
  public:
+  class ARROW_EXPORT ImmutableZeros {

Review comment:
       Okay, `ImmutableZeros` now implements `Buffer` and has its implementation hidden.
   
   Looking at the `Buffer` interface more closely, I also figured a specialized memory manager was probably in order, since these buffers have different, managed allocate operations. With that, I *think* the interface for using immutable zero CPU buffers is now fully symmetric to the one for regular, mutable CPU buffers.
   
   81aa078





[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806983055



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,116 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    // TODO: jemalloc and mimalloc support zero-initialized allocations as
+    //  well, which might be faster than allocate + memset.
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.
+    auto current_buffer = atomic_load(&immutable_zeros_cache_);
+
+    // If this buffer satisfies the requirements, return it.
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Acquire the lock for allocating a new buffer.
+    std::lock_guard<std::mutex> gg(immutable_zeros_mutex_);
+
+    // Between our previous atomic load and acquisition of the lock, another
+    // thread may have allocated a buffer. So we need to check again.
+    current_buffer = atomic_load(&immutable_zeros_cache_);
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Let's now figure out a good size to allocate. This is done
+    // heuristically, with the following rules:
+    //  - allocate at least the requested size (obviously);
+    //  - allocate at least 2x the previous size;
+    //  - allocate at least kMinAllocSize bytes (to avoid lots of small
+    //    allocations).
+    static const int64_t kMinAllocSize = 4096;
+    int64_t alloc_size =
+        std::max(size, current_buffer ? (current_buffer->size() * 2) : kMinAllocSize);
+
+    // Attempt to allocate the block.
+    uint8_t* data = nullptr;
+    auto result = AllocateImmutableZeros(alloc_size, &data);
+
+    // If we fail to do so, fall back to trying to allocate the requested size
+    // exactly as a last-ditch effort.
+    if (!result.ok() || data == nullptr) {

Review comment:
       Hmm, why is it possible to get `nullptr` here?





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806098292



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));

Review comment:
       The difference is that, because of the flags passed to mmap, the kernel doesn't have to allocate (or clear) physical memory at all, as opposed to the allocate + memset that the alternative implementation would do to satisfy mutability. You could allocate several terabytes of virtual memory with this call if you wanted to, and it would cost you zero physical bytes (page table structures aside). At least, it works that way on my kernel.
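
A minimal, Linux-oriented sketch of the mapping described above, using the same flags as the diff. `MapZeros`, `UnmapZeros`, and `AllZero` are hypothetical helper names, not Arrow functions. Whether reads are backed by a shared zero page rather than freshly allocated memory depends on the kernel and architecture, as noted in the comment.

```cpp
#include <sys/mman.h>

#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: map `size` bytes of read-only, zero-filled anonymous
// memory. Returns nullptr on failure.
inline const uint8_t* MapZeros(std::size_t size) {
  assert(size > 0);  // mmap rejects zero-length mappings
  void* ptr = mmap(nullptr, size, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  return ptr == MAP_FAILED ? nullptr : static_cast<const uint8_t*>(ptr);
}

// Hypothetical helper: release a mapping created by MapZeros.
inline void UnmapZeros(const uint8_t* data, std::size_t size) {
  munmap(const_cast<uint8_t*>(data), size);
}

// Returns true if every byte of the region reads as zero.
inline bool AllZero(const uint8_t* data, std::size_t size) {
  for (std::size_t i = 0; i < size; ++i) {
    if (data[i] != 0) return false;
  }
  return true;
}
```

Anonymous mappings are zero-filled by definition, and `PROT_READ` means a stray write faults instead of silently corrupting a buffer shared by many arrays.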





[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806148832



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.

Review comment:
       > An inability to free something, especially if that something is large, feels like bad news to me, so I'm hesitant to just copy your version in and call it a day.
   
   That is true. However, it shouldn't be a concern if we can ensure that the pages don't actually allocate physical memory (or almost none of it, such as `/dev/zero`).





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r802865182



##########
File path: cpp/src/arrow/memory_pool.h
##########
@@ -63,6 +63,42 @@ class MemoryPoolStats {
 /// take care of the required 64-byte alignment.
 class ARROW_EXPORT MemoryPool {
  public:
+  class ARROW_EXPORT ImmutableZeros {

Review comment:
       I think you're right, actually. My reasoning for introducing it was that I needed something to model shared ownership of a region of zero-allocated (virtual) memory for a bunch of Buffers (`ImmutableZerosPoolBuffer` specifically, as created using `MakeBufferOfZeros()`), so some RAII object inside a `shared_ptr` would do the trick. I didn't even consider that that RAII object could also just be a Buffer implementation, with its implementation details hidden that way. It felt very wrong to define a public class for something this specific in such a prominent place to begin with, so I'm glad to get rid of it... The current `ImmutableZerosPoolBuffer` class could then be used much more generically for shared-ownership views of some buffer, unless that already exists (?), in which case I can just reuse that.
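
The ownership model described here can be sketched without Arrow types: one RAII object owns the zeroed region, and each view holds a `shared_ptr` to it, so the region is freed only when the last view goes away. `ZeroBlock` and `ZeroView` are hypothetical names for this example, and `calloc` stands in for whatever allocator the pool would use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <utility>

// Hypothetical RAII owner of a zero-initialized region.
class ZeroBlock {
 public:
  explicit ZeroBlock(int64_t size)
      : data_(static_cast<uint8_t*>(
            std::calloc(static_cast<std::size_t>(size), 1))),
        size_(size) {}
  ~ZeroBlock() { std::free(data_); }
  ZeroBlock(const ZeroBlock&) = delete;
  ZeroBlock& operator=(const ZeroBlock&) = delete;

  const uint8_t* data() const { return data_; }
  int64_t size() const { return size_; }

 private:
  uint8_t* data_;
  int64_t size_;
};

// Hypothetical view over a prefix of the block; keeps the block alive.
class ZeroView {
 public:
  ZeroView(std::shared_ptr<ZeroBlock> parent, int64_t size)
      : parent_(std::move(parent)), size_(size) {
    assert(size_ <= parent_->size());
  }
  const uint8_t* data() const { return parent_->data(); }
  int64_t size() const { return size_; }

 private:
  std::shared_ptr<ZeroBlock> parent_;
  int64_t size_;
};
```

Views of different lengths can then share one allocation, which is the property that lets many all-null arrays reuse a single zero buffer.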
   
   I don't think I'll get around to refactoring this today, but I'll take a closer look tomorrow (and then rebase after that).





[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806152099



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.

Review comment:
       > * I'm not sure if it's even needed, though: it isn't if you can safely make a copy of a `shared_ptr` while another thread may be updating the contained pointer.
   
   You can safely update the object _pointed to_. However, changing the _pointer_ requires use of dedicated atomic access functions: https://en.cppreference.com/w/cpp/memory/shared_ptr/atomic
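
As a minimal sketch of the dedicated access functions from the linked page: readers copy the shared pointer with `std::atomic_load`, and writers replace it with `std::atomic_store`. `ZeroBlock` and `SharedCache` are hypothetical names for illustration. These free functions are deprecated since C++20 in favor of `std::atomic<std::shared_ptr<T>>`.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the cached zero buffer.
struct ZeroBlock {
  long size;
};

// Hypothetical cache whose pointer may be replaced while others read it.
class SharedCache {
 public:
  // Safe to call while another thread is inside Publish().
  std::shared_ptr<ZeroBlock> Snapshot() const {
    return std::atomic_load(&current_);
  }

  // Replaces the pointer itself; existing snapshots stay valid.
  void Publish(long size) {
    assert(size > 0);
    auto fresh = std::make_shared<ZeroBlock>();
    fresh->size = size;
    std::atomic_store(&current_, fresh);
  }

 private:
  std::shared_ptr<ZeroBlock> current_;
};
```

A plain copy of `current_` while another thread assigns to it would be a data race; routing every access through the atomic functions avoids that.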





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r802807048



##########
File path: cpp/src/arrow/array/util.h
##########
@@ -37,7 +37,20 @@ namespace arrow {
 ARROW_EXPORT
 std::shared_ptr<Array> MakeArray(const std::shared_ptr<ArrayData>& data);
 
-/// \brief Create a strongly-typed Array instance with all elements null
+/// \brief Create a strongly-typed mutable Array instance with all elements initially set
+/// to null
+/// \param[in] type the array type \param[in] length the array length
+/// \param[in] pool the memory pool to allocate memory from
+ARROW_EXPORT
+Result<std::shared_ptr<Array>> MakeMutableArrayOfNull(
+    const std::shared_ptr<DataType>& type, int64_t length,
+    MemoryPool* pool = default_memory_pool());
+
+/// \brief Create a strongly-typed immutable Array instance with all elements null
+///
+/// This function may reuse a single zero buffer, but may also defer to
+/// MakeArrayOfNull().

Review comment:
       Yeah, it was. I was on the fence about whether `MakeArrayOfNull()` should be the immutable or mutable variant and refactored it once or twice, eventually choosing the former since most existing users of the function didn't need mutability. I guess that one slipped through the cracks. e878edb

##########
File path: cpp/src/arrow/array/util.h
##########
@@ -46,7 +59,20 @@ Result<std::shared_ptr<Array>> MakeArrayOfNull(const std::shared_ptr<DataType>&
                                                int64_t length,
                                                MemoryPool* pool = default_memory_pool());
 
-/// \brief Create an Array instance whose slots are the given scalar
+/// \brief Create a mutable Array instance whose slots are initialized with the given
+/// scalar
+/// \param[in] scalar the value with which to fill the array
+/// \param[in] length the array length
+/// \param[in] pool the memory pool to allocate memory from
+ARROW_EXPORT
+Result<std::shared_ptr<Array>> MakeMutableArrayFromScalar(
+    const Scalar& scalar, int64_t length, MemoryPool* pool = default_memory_pool());
+
+/// \brief Create an immutable Array instance whose slots are set to the given scalar
+///
+/// This function may reuse buffers if they contain the same (repeated) value to save
+/// memory, but may also defer to MakeArrayFromScalar().

Review comment:
       Also in e878edb





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r802860477



##########
File path: cpp/src/arrow/array/util.cc
##########
@@ -534,6 +533,151 @@ class NullArrayFactory {
   std::shared_ptr<Buffer> buffer_;
 };
 
+// mutable version of ImmutableNullArrayFactory
+class NullArrayFactory {
+ private:
+  Result<std::shared_ptr<Buffer>> CreateZeroByteBuffer(size_t scalar_size_bytes) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer(length_ * scalar_size_bytes, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  Result<std::shared_ptr<Buffer>> CreateZeroOffsetBuffer(size_t index_size_bytes) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer((length_ + 1) * index_size_bytes, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  Result<std::shared_ptr<Buffer>> CreateZeroBitBuffer(size_t scalar_size_bits) const {
+    ARROW_ASSIGN_OR_RAISE(auto buffer,
+                          AllocateBuffer((length_ * scalar_size_bits + 7) / 8, pool_));
+    std::memset(buffer->mutable_data(), 0, buffer->size());
+    return std::shared_ptr<Buffer>(std::move(buffer));
+  }
+
+  static Result<std::shared_ptr<Buffer>> CreateEmptyBuffer() { return AllocateBuffer(0); }
+
+ public:
+  NullArrayFactory(MemoryPool* pool, const std::shared_ptr<DataType>& type,
+                   int64_t length)
+      : pool_(pool), type_(type), length_(length) {}
+
+  Result<std::shared_ptr<ArrayData>> Create() {
+    std::vector<std::shared_ptr<ArrayData>> child_data(type_->num_fields());
+    ARROW_ASSIGN_OR_RAISE(auto validity, CreateZeroBitBuffer(1));
+    out_ = ArrayData::Make(type_, length_, {validity}, child_data, length_, 0);
+    RETURN_NOT_OK(VisitTypeInline(*type_, this));
+    return out_;
+  }
+
+  Status Visit(const NullType&) {
+    out_->buffers.resize(1, nullptr);
+    return Status::OK();
+  }
+
+  Status Visit(const FixedWidthType& type) {
+    out_->buffers.resize(2);
+    // values
+    ARROW_ASSIGN_OR_RAISE(out_->buffers[1], CreateZeroBitBuffer(type.bit_width()));
+    return Status::OK();
+  }
+
+  template <typename T>
+  enable_if_base_binary<T, Status> Visit(const T&) {
+    out_->buffers.resize(3);
+    // offsets
+    ARROW_ASSIGN_OR_RAISE(out_->buffers[1],
+                          CreateZeroOffsetBuffer(sizeof(typename T::offset_type)));
+    // values
+    ARROW_ASSIGN_OR_RAISE(out_->buffers[2], CreateEmptyBuffer());
+    return Status::OK();
+  }
+
+  template <typename T>
+  enable_if_var_size_list<T, Status> Visit(const T& type) {
+    out_->buffers.resize(2);
+    // offsets
+    ARROW_ASSIGN_OR_RAISE(out_->buffers[1],
+                          CreateZeroOffsetBuffer(sizeof(typename T::offset_type)));
+    // values
+    ARROW_ASSIGN_OR_RAISE(out_->child_data[0], CreateChild(type, 0, /*length=*/0));
+    return Status::OK();
+  }
+
+  Status Visit(const FixedSizeListType& type) {
+    ARROW_ASSIGN_OR_RAISE(out_->child_data[0],
+                          CreateChild(type, 0, length_ * type.list_size()));
+    return Status::OK();
+  }
+
+  Status Visit(const StructType& type) {
+    for (int i = 0; i < type_->num_fields(); ++i) {
+      ARROW_ASSIGN_OR_RAISE(out_->child_data[i], CreateChild(type, i, length_));
+    }
+    return Status::OK();
+  }
+
+  Status Visit(const UnionType& type) {
+    out_->buffers.resize(2);
+
+    // First buffer is always null
+    out_->buffers[0] = nullptr;
+
+    // type ID buffer
+    ARROW_ASSIGN_OR_RAISE(out_->buffers[1], AllocateBuffer(length_, pool_));
+    std::memset(out_->buffers[1]->mutable_data(), type.type_codes()[0], length_);
+
+    // For sparse unions, we now create children with the same length as the
+    // parent
+    int64_t child_length = length_;
+    if (type.mode() == UnionMode::DENSE) {
+      // For dense unions, we set the offsets to all zero and create children
+      // with length 1
+      out_->buffers.resize(3);
+      ARROW_ASSIGN_OR_RAISE(out_->buffers[2], CreateZeroByteBuffer(sizeof(int32_t)));
+
+      child_length = 1;
+    }
+    for (int i = 0; i < type_->num_fields(); ++i) {
+      ARROW_ASSIGN_OR_RAISE(out_->child_data[i], CreateChild(type, i, child_length));
+    }
+    return Status::OK();
+  }
+
+  Status Visit(const DictionaryType& type) {
+    out_->buffers.resize(2);
+    // dictionary indices
+    ARROW_ASSIGN_OR_RAISE(out_->buffers[1], CreateZeroBitBuffer(type.bit_width()));
+    // dictionary data
+    ARROW_ASSIGN_OR_RAISE(auto typed_null_dict, MakeArrayOfNull(type.value_type(), 0));
+    out_->dictionary = typed_null_dict->data();
+    return Status::OK();
+  }
+
+  Status Visit(const ExtensionType& type) {
+    out_->child_data.resize(type.storage_type()->num_fields());
+    RETURN_NOT_OK(VisitTypeInline(*type.storage_type(), this));
+    return Status::OK();
+  }
+
+  Status Visit(const DataType& type) {
+    return Status::NotImplemented("construction of all-null ", type);
+  }
+
+  Result<std::shared_ptr<ArrayData>> CreateChild(const DataType& type, int i,
+                                                 int64_t length) {
+    ImmutableNullArrayFactory child_factory(pool_, type.field(i)->type(), length);

Review comment:
       Yes it does, good catch. In fact, it looks like I completely forgot to refactor this part of the code for mutable vs immutable as the implied version. Fixed in b30526c and then refactored in 21507bf.





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r807233943



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));

Review comment:
       I think 3738819 does what you want in terms of refactoring? I left the CMake option to disable the OS-specific stuff and use the regular memory pool allocator in, so people can still opt to use that allocator (and the statistics tracking thereof) in favor of `calloc()`.
   
   I don't have a Windows dev environment set up, though, so I'd prefer to open a follow-up issue for `VirtualAlloc` instead, so someone else can pick it up if they want. The same goes for macOS: I guess `mmap` will probably work, but I can't test it.
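
For illustration only, here is a minimal sketch of the two strategies discussed in this comment: a read-only anonymous `mmap` on Linux and a plain `calloc()` fallback elsewhere. It is not the code from 3738819; `AllocateZeros`, `FreeZeros` and `SKETCH_USE_MMAP` are hypothetical names, and no statistics tracking is attempted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

#if defined(__linux__)
#include <sys/mman.h>
#define SKETCH_USE_MMAP 1  // hypothetical switch, standing in for the CMake option
#endif

// Hypothetical helper (not an Arrow API): returns `size` zero bytes that the
// caller must treat as read-only, or nullptr on failure or when size == 0.
inline const uint8_t* AllocateZeros(std::size_t size) {
  if (size == 0) return nullptr;
#ifdef SKETCH_USE_MMAP
  // Anonymous private mappings read as zeros; PROT_READ makes writes fault.
  void* ptr = mmap(nullptr, size, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  return ptr == MAP_FAILED ? nullptr : static_cast<const uint8_t*>(ptr);
#else
  // Portable fallback: calloc() returns zero-initialized (but writable) memory.
  return static_cast<const uint8_t*>(std::calloc(size, 1));
#endif
}

// Releases a block obtained from AllocateZeros() with the same `size`.
inline void FreeZeros(const uint8_t* data, std::size_t size) {
  if (data == nullptr) return;
#ifdef SKETCH_USE_MMAP
  munmap(const_cast<uint8_t*>(data), size);
#else
  (void)size;  // only needed by the mmap path
  std::free(const_cast<uint8_t*>(data));
#endif
}
```

A `VirtualAlloc`-based branch for Windows would slot in next to the `mmap` one, but that is left out here for the same reason as above: it cannot be tested without the platform.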





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r808167167



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,116 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    // TODO: jemalloc and mimalloc support zero-initialized allocations as
+    //  well, which might be faster than allocate + memset.
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS

Review comment:
       3738819 (already done earlier, forgot to mention in this thread)





[GitHub] [arrow] jvanstraten commented on pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#issuecomment-1041823183


   @pitrou @lidavidm I think I've now resolved (or defended my choices for) everything that came up thus far, and CI is happy. Let me know if I missed anything and/or you think there's more that needs to be changed.



[GitHub] [arrow] github-actions[bot] commented on pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

github-actions[bot] commented on pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#issuecomment-1009261823


   https://issues.apache.org/jira/browse/ARROW-7051



[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r808162182



##########
File path: cpp/src/arrow/device.h
##########
@@ -223,4 +236,55 @@ class ARROW_EXPORT CPUMemoryManager : public MemoryManager {
 ARROW_EXPORT
 std::shared_ptr<MemoryManager> default_cpu_memory_manager();
 
+/// A memory manager that uses the immutable zeros interface of the given memory pool,
+/// rather than the normal mutable buffer interface.

Review comment:
       `CPUImmutableZerosMemoryManager` doesn't really respect the `MemoryManager` contract, since it only allocates immutable memory. So I don't think it's useful to implement it: you can't pass a `CPUImmutableZerosMemoryManager` to arbitrary Arrow APIs and expect it to work.





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r809153416



##########
File path: cpp/src/arrow/device.h
##########
@@ -223,4 +236,55 @@ class ARROW_EXPORT CPUMemoryManager : public MemoryManager {
 ARROW_EXPORT
 std::shared_ptr<MemoryManager> default_cpu_memory_manager();
 
+/// A memory manager that uses the immutable zeros interface of the given memory pool,
+/// rather than the normal mutable buffer interface.

Review comment:
       > And to me 'managing' means 'can read the data backed by the buffer', not necessarily 'will allocate you exactly the same buffer'.
   
   But that's what `Device` represents according to its [docstring](https://github.com/apache/arrow/blob/74f512260fa69903feac61e1287f6954a3d98204/cpp/src/arrow/device.h#L36-L39)? In contrast, you can have multiple memory managers per device, expressly intended to [provide memory management primitives](https://github.com/apache/arrow/blob/74f512260fa69903feac61e1287f6954a3d98204/cpp/src/arrow/device.h#L81-L85) for some particular allocation method, such as for some particular memory pool. The allocation method for immutable zero buffers is fundamentally different, so I stand by my interpretation of those docstrings that immutable zero buffers warrant a new memory manager class. However, I certainly can't vouch for the correctness of those docstrings or the usefulness/applicability of this layer if defined as such.
   
   Furthermore, if the intention of associating a `MemoryManager` with a buffer is only to specify which device can actually access it and how (which seems very sensible to me) rather than also allowing similar buffers to be allocated (which indeed does not seem very useful to me), the association shouldn't have been a memory manager but a `Device`. After all, the only other information provided by `MemoryManager` on top of its associated `Device` is how to allocate buffers. The rest of its methods should IMO be moved to `Device`. If `Device` should be kept clean of memory-related stuff, there should be an intermediate layer called `MemoryRegion` or `AddressSpace` or something that decouples allocation strategies from address space/the significance and accessibility of a pointer. In general a device can have multiple address spaces, anyway (such as NUMA nodes on CPU, different memory banks on an accelerator, etc.). Whether `MemoryManager` is still relevant on top of `MemoryPool` at that point is disputable, but I guess the biggest difference is that `MemoryPool` operates on pointers to bytes, whereas `MemoryManager` yields buffers that clean themselves up via RAII.
   
   Nevertheless, I feel like this discussion is getting to the point where it shouldn't be held in a GitHub thread on a mostly unrelated PR (what *would* be the right place? A JIRA issue?), and in general I don't really have the bandwidth or the use cases right now to dive further into this rabbit hole. So, for now, I've just reverted the additions I made relating to `MemoryManager` stuff, and we (or other people) can delve into this if/when it becomes relevant. 976dcf6
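
To make the pool-versus-manager distinction above concrete, here is a rough sketch of a pool-style interface that hands out raw pointers and a wrapper that returns self-cleaning buffers on top of it. `RawPool`, `OwnedBuffer` and `AllocateBuffer` are hypothetical names for illustration; they do not mirror the actual Arrow `MemoryPool` or `MemoryManager` classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// Hypothetical pool-style interface: raw pointers, and the caller is
// responsible for pairing every Allocate() with a Free() of the same size.
class RawPool {
 public:
  uint8_t* Allocate(std::size_t size) {
    auto* data = static_cast<uint8_t*>(std::malloc(size));
    if (data != nullptr) bytes_allocated_ += size;
    return data;
  }
  void Free(uint8_t* data, std::size_t size) {
    std::free(data);
    bytes_allocated_ -= size;
  }
  std::size_t bytes_allocated() const { return bytes_allocated_; }

 private:
  std::size_t bytes_allocated_ = 0;
};

// Hypothetical manager-style result: a buffer that returns its memory to the
// pool when it is destroyed (RAII), so callers never call Free() themselves.
class OwnedBuffer {
 public:
  OwnedBuffer(RawPool* pool, uint8_t* data, std::size_t size)
      : pool_(pool), data_(data), size_(size) {}
  ~OwnedBuffer() { pool_->Free(data_, size_); }
  OwnedBuffer(const OwnedBuffer&) = delete;
  OwnedBuffer& operator=(const OwnedBuffer&) = delete;

  const uint8_t* data() const { return data_; }
  std::size_t size() const { return size_; }

 private:
  RawPool* pool_;
  uint8_t* data_;
  std::size_t size_;
};

// Allocates from the pool and wraps the block; returns nullptr on failure.
inline std::unique_ptr<OwnedBuffer> AllocateBuffer(RawPool* pool, std::size_t size) {
  uint8_t* data = pool->Allocate(size);
  if (data == nullptr) return nullptr;
  return std::unique_ptr<OwnedBuffer>(new OwnedBuffer(pool, data, size));
}
```

In this sketch the wrapper adds nothing about *where* the memory lives, which is the point made above: information about accessibility would belong to a separate device or address-space concept.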





[GitHub] [arrow] pitrou commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806152099



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.

Review comment:
       > * I'm not sure if it's even needed, though: it isn't if you can safely make a copy of a `shared_ptr` while another thread may be updating the contained pointer.
   
   You can safely update the object _pointed to_ (well, except that accessing that object from different threads then becomes racy). However, changing the _pointer_ requires use of dedicated atomic access functions: https://en.cppreference.com/w/cpp/memory/shared_ptr/atomic
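
As a hedged illustration of the atomic access functions linked above, the sketch below copies and replaces a shared `std::shared_ptr` with `std::atomic_load` / `std::atomic_store` and serializes allocation with a mutex. `ZeroCache` is a hypothetical stand-in backed by a `std::vector`, not the PR's `ImmutableZeros` class. Note that C++20 deprecates these free functions in favor of `std::atomic<std::shared_ptr<T>>`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical cache of a zero-filled block shared between threads.
class ZeroCache {
 public:
  using Block = std::vector<unsigned char>;

  std::shared_ptr<const Block> Get(std::size_t size) {
    // Fast path: atomically copy the shared_ptr itself. A plain copy would be
    // a data race if another thread replaced cache_ at the same time.
    auto current = std::atomic_load(&cache_);
    if (current && current->size() >= size) return current;

    // Slow path: only one thread at a time may allocate a replacement.
    std::lock_guard<std::mutex> lock(mutex_);

    // Another thread may have grown the cache while we waited for the lock.
    current = std::atomic_load(&cache_);
    if (current && current->size() >= size) return current;

    // Grow to at least the request, and at least 2x the previous block.
    const std::size_t kMinAllocSize = 4096;
    const std::size_t alloc_size =
        std::max(size, current ? current->size() * 2 : kMinAllocSize);

    // Block(alloc_size) value-initializes every byte to zero.
    std::shared_ptr<const Block> fresh = std::make_shared<Block>(alloc_size);

    // Publish the new block; readers on the fast path see old or new, never a
    // torn pointer. Holders of the old block keep it alive via their copies.
    std::atomic_store(&cache_, fresh);
    return fresh;
  }

 private:
  std::mutex mutex_;
  std::shared_ptr<const Block> cache_;
};
```

Usage is a single call such as `auto zeros = cache.Get(1000);`. Callers that already hold a smaller block are unaffected when the cache is later replaced by a larger one.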





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r807164882



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,116 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));
+      if (*out == MAP_FAILED) {
+        auto err = errno;
+        return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ",
+                                   strerror(err));
+      }
+      return Status::OK();
+    }
+#endif
+    // TODO: jemalloc and mimalloc support zero-initialized allocations as
+    //  well, which might be faster than allocate + memset.
+    RETURN_NOT_OK(Allocate(size, out));
+    std::memset(*out, 0, size);
+    return Status::OK();
+  }
+
+  void FreeImmutableZeros(uint8_t* buffer, int64_t size) override {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      munmap(buffer, size);
+      return;
+    }
+#endif
+    Free(buffer, size);
+  }
+
+ public:
+  Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override {
+    // Thread-safely get the current largest buffer of zeros.
+    auto current_buffer = atomic_load(&immutable_zeros_cache_);
+
+    // If this buffer satisfies the requirements, return it.
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Acquire the lock for allocating a new buffer.
+    std::lock_guard<std::mutex> gg(immutable_zeros_mutex_);
+
+    // Between our previous atomic load and acquisition of the lock, another
+    // thread may have allocated a buffer. So we need to check again.
+    current_buffer = atomic_load(&immutable_zeros_cache_);
+    if (current_buffer && current_buffer->size() >= size) {
+      return std::move(current_buffer);
+    }
+
+    // Let's now figure out a good size to allocate. This is done
+    // heuristically, with the following rules:
+    //  - allocate at least the requested size (obviously);
+    //  - allocate at least 2x the previous size;
+    //  - allocate at least kMinAllocSize bytes (to avoid lots of small
+    //    allocations).
+    static const int64_t kMinAllocSize = 4096;
+    int64_t alloc_size =
+        std::max(size, current_buffer ? (current_buffer->size() * 2) : kMinAllocSize);
+
+    // Attempt to allocate the block.
+    uint8_t* data = nullptr;
+    auto result = AllocateImmutableZeros(alloc_size, &data);
+
+    // If we fail to do so, fall back to trying to allocate the requested size
+    // exactly as a last-ditch effort.
+    if (!result.ok() || data == nullptr) {
+      alloc_size = size;
+      RETURN_NOT_OK(AllocateImmutableZeros(alloc_size, &data));
+    }
+    DCHECK_NE(data, nullptr);
+
+    // Move ownership of the data block into an ImmutableZeros object. It will
+    // free the block when destroyed, i.e. when all shared_ptr references to it
+    // are reset or go out of scope.
+    current_buffer = std::make_shared<ImmutableZeros>(data, alloc_size, this);
+
+    // Store a reference to the new block in the cache, so subsequent calls to
+    // this function (from this thread or from other threads) can use it, too.
+    atomic_store(&immutable_zeros_cache_, current_buffer);
+
+    return std::move(current_buffer);
+  }
+
+  void ReleaseUnused() override {
+    // Get rid of the ImmutableZeros cache if we're the only one using it. If
+    // there are other pieces of code using it, getting rid of the cache won't
+    // deallocate it anyway, so it's better to hold onto it.
+    {
+      auto cache = atomic_load(&immutable_zeros_cache_);
+
+      // Because we now have a copy in our thread, the use count will be 2 if
+      // nothing else is using it.
+      if (cache.use_count() <= 2) {
+        atomic_store(&immutable_zeros_cache_, std::shared_ptr<ImmutableZeros>());

Review comment:
       It's not entirely thread-safe in the sense that if other threads are doing stuff with the cache while `ReleaseUnused()` is called, `ReleaseUnused()` may or may not flush the cache due to a data race in `use_count`. But it doesn't break anything in either case, and `ReleaseUnused()` is already documented to be best-effort.
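
The reference counting behind this check can be sketched in isolation. The helper below is hypothetical (`ReleaseIfUnused` is not a function in the PR); it only shows why the threshold is 2 and why the result is a best-effort snapshot.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper: clears `*cache` only if nobody besides the cache itself
// appears to hold the block. Returns true if the cache was cleared.
template <typename T>
bool ReleaseIfUnused(std::shared_ptr<T>* cache) {
  // Taking a local copy adds one reference on top of the cache's own.
  auto local = std::atomic_load(cache);
  if (!local) return false;

  // cache + local == 2 means no other user right now. use_count() is only a
  // snapshot: another thread may copy the pointer right after this check, in
  // which case the block simply stays alive until that copy is dropped.
  if (local.use_count() <= 2) {
    std::atomic_store(cache, std::shared_ptr<T>());
    return true;
  }
  return false;
}
```

Either outcome of the race is harmless in this sketch: if the cache is cleared while a new user has just appeared, that user still owns a valid block and the next request allocates a fresh one.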





[GitHub] [arrow] jvanstraten commented on a change in pull request #12116: ARROW-7051: [C++] Improve MakeArrayOfNull to support creation of multiple arrays

Posted by GitBox <gi...@apache.org>.

jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r806700264



##########
File path: cpp/src/arrow/memory_pool.cc
##########
@@ -603,14 +643,109 @@ class BaseMemoryPoolImpl : public MemoryPool {
     stats_.UpdateAllocatedBytes(-size);
   }
 
-  void ReleaseUnused() override { Allocator::ReleaseUnused(); }
+ protected:
+  virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) {
+#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS
+    if (size > 0) {
+      *out = static_cast<uint8_t*>(mmap(
+          nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0));

Review comment:
       In fact, `Allocate()` generally does not return zero-initialized memory, due to the poisoning mechanic. I would rather not dive into this rabbit hole as well right now, so I just added a TODO comment in the code. I can file a followup JIRA for it too, if you want.
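
A small sketch of why the explicit `memset` is needed on the generic path, assuming an allocator that fills fresh memory with a non-zero pattern. `PoisoningAllocate` and its `0xAB` fill byte are made up for illustration and are not Arrow's actual poisoning behavior; `AllocateZeroed` shows the kind of shortcut the TODO hints at, using `calloc()` as a stand-in for an allocator-provided zeroed allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a pool whose Allocate() poisons fresh memory.
// The 0xAB fill byte is an assumption made for this example.
inline uint8_t* PoisoningAllocate(std::size_t size) {
  auto* data = static_cast<uint8_t*>(std::malloc(size));
  if (data != nullptr) std::memset(data, 0xAB, size);
  return data;
}

// Generic path, as in the diff: allocate first, then zero explicitly.
inline uint8_t* AllocateThenZero(std::size_t size) {
  uint8_t* data = PoisoningAllocate(size);
  if (data != nullptr) std::memset(data, 0, size);
  return data;
}

// Possible shortcut: ask the allocator for zeroed memory directly, which can
// avoid touching pages the OS already hands out as zeros.
inline uint8_t* AllocateZeroed(std::size_t size) {
  return static_cast<uint8_t*>(std::calloc(size, 1));
}
```

Both functions return memory with identical contents; the difference is only in how much work is done to get there, which is why it can reasonably be deferred to a follow-up.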



