You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/26 16:02:19 UTC

[GitHub] [arrow] lwhite1 commented on a diff in pull request #14506: ARROW-16673: [Java] WIP: proper handling of imported buffers

lwhite1 commented on code in PR #14506:
URL: https://github.com/apache/arrow/pull/14506#discussion_r1005584924


##########
java/memory/memory-core/src/main/java/org/apache/arrow/memory/ForeignAllocation.java:
##########
@@ -0,0 +1,30 @@
+package org.apache.arrow.memory;
+
+/**
+ * An allocation of memory that does not come from a BufferAllocator, but rather an outside source (like JNI).
+ */
+public abstract class ForeignAllocation extends AllocationManager {

Review Comment:
   For consistency (and to make more clear the extension relationship) should the class name be ForeignAllocationManager?



##########
java/memory/memory-core/src/main/java/org/apache/arrow/memory/ReferenceManager.java:
##########
@@ -18,8 +18,10 @@
 package org.apache.arrow.memory;
 
 /**
- * Reference Manager manages one or more ArrowBufs that share the
- * reference count for the underlying memory chunk.
+ * ReferenceManager is the reference count for one or more allocations.

Review Comment:
   I'm not sure "ReferenceManager is the reference count..." is completely accurate. Maybe "ReferenceManager is the reference count manager..." or "ReferenceManager tracks the reference count..."



##########
java/c/src/main/java/org/apache/arrow/c/ArrayImporter.java:
##########
@@ -44,7 +46,7 @@ final class ArrayImporter {
   private final FieldVector vector;
   private final DictionaryProvider dictionaryProvider;
 
-  private CDataReferenceManager referenceManager;
+  private CDataArrayHandle underlyingAllocation;

Review Comment:
   Where is CDataArrayHandle defined?



##########
java/memory/memory-core/src/main/java/org/apache/arrow/memory/AllocationManager.java:
##########
@@ -22,43 +22,35 @@
 import org.apache.arrow.util.Preconditions;
 
 /**
- * The abstract base class of AllocationManager.
+ * An AllocationManager is the implementation of a physical memory allocation.
  *
- * <p>Manages the relationship between one or more allocators and a particular UDLE. Ensures that
- * one allocator owns the
- * memory that multiple allocators may be referencing. Manages a BufferLedger between each of its
- * associated allocators.
+ * <p>Manages the relationship between the allocators and a particular memory allocation. Ensures that
+ * one allocator owns the memory that multiple allocators may be referencing. Manages a BufferLedger between
+ * each of its associated allocators. It does not track the reference count; that is the role of {@link BufferLedger}
+ * (aka {@link ReferenceManager}).
  *
- * <p>The only reason that this isn't package private is we're forced to put ArrowBuf in Netty's
- * package which need access
- * to these objects or methods.
+ * <p>This is a public interface implemented by concrete allocator implementations (e.g. Netty or Unsafe).
  *
  * <p>Threading: AllocationManager manages thread-safety internally. Operations within the context
- * of a single BufferLedger
- * are lockless in nature and can be leveraged by multiple threads. Operations that cross the
- * context of two ledgers
- * will acquire a lock on the AllocationManager instance. Important note, there is one
- * AllocationManager per
- * UnsafeDirectLittleEndian buffer allocation. As such, there will be thousands of these in a
- * typical query. The
- * contention of acquiring a lock on AllocationManager should be very low.
+ * of a single BufferLedger are lockless in nature and can be leveraged by multiple threads. Operations that cross the
+ * context of two ledgers will acquire a lock on the AllocationManager instance. Important note, there is one
+ * AllocationManager per physical buffer allocation. As such, there will be thousands of these in a
+ * typical query. The contention of acquiring a lock on AllocationManager should be very low.
  */
 public abstract class AllocationManager {
-
-  private static final AtomicLong MANAGER_ID_GENERATOR = new AtomicLong(0);
-
+  // The RootAllocator we are associated with. An allocation can only ever be associated with a single RootAllocator.
   private final BufferAllocator root;
-  private final long allocatorManagerId = MANAGER_ID_GENERATOR.incrementAndGet();
-  // ARROW-1627 Trying to minimize memory overhead caused by previously used IdentityHashMap
-  // see JIRA for details
+  // An allocation can be tracked by multiple allocators. (This is because an allocator is more like a ledger.)
+  // All such allocators track reference counts individually, via BufferLedger instances. When an individual
+  // reference count reaches zero, the allocator will be dissociated from this allocation. If that was via the
+  // owningLedger, then no more allocators should be tracking this allocation, and the allocation will be freed.
+  // ARROW-1627: Trying to minimize memory overhead caused by previously used IdentityHashMap
   private final LowCostIdentityHashMap<BufferAllocator, BufferLedger> map = new LowCostIdentityHashMap<>();
-  private final long amCreationTime = System.nanoTime();
-
-  // The ReferenceManager created at the time of creation of this AllocationManager
-  // is treated as the owning reference manager for the underlying chunk of memory
-  // managed by this allocation manager
+  // The primary BufferLedger (i.e. reference count) tracking this allocation.
+  // This is mostly a semantic constraint on the API user: if the reference count reaches 0 in the owningLedger, then
+  // there are not supposed to be any references through other allocators. In practice, this doesn't do anything
+  // as the implementation just forces ownership to be transferred to one of the other extant references.

Review Comment:
   nit: ... one of the other extent reference managers.



##########
java/memory/memory-core/src/main/java/org/apache/arrow/memory/ReferenceManager.java:
##########
@@ -91,7 +95,7 @@ public interface ReferenceManager {
   /**
    * Transfer the memory accounting ownership of this ArrowBuf to another allocator.
    * This will generate a new ArrowBuf that carries an association with the underlying memory
-   * for the given ArrowBuf

Review Comment:
   I think this method comment is misleading. While it _can_ transfer the memory accounting ownership ... to another allocator, it can also create a new ArrowBuf in the same allocator if the allocator argument is the same as the existing allocator. This can be useful for creating a new Vector over the same underlying memory. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org