You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@arrow.apache.org by "hu6360567 (via GitHub)" <gi...@apache.org> on 2023/06/05 06:51:17 UTC

[GitHub] [arrow] hu6360567 opened a new issue, #35905: [Python] pyarrow.jvm.record_batch may cause memory leak with scoped allocator on java side

hu6360567 opened a new issue, #35905:
URL: https://github.com/apache/arrow/issues/35905

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   I was wrapping JDBC to ARROW for python scripts, then the life scoped `RootAllocator` raised `java.lang.IllegalStateException: java.lang.IllegalStateException: Allocator[ROOT] closed with outstanding buffers allocated`, while the exactly same code in Java works as excepted.
   Here is the snippet to reproduce error:
   ```python
   import jpype.imports
   from jpype.types import *
   
   jpype.startJVM("-Darrow.memory.debug.allocator=true", "-Dorg.slf4j.simpleLogger.defaultLogLevel=trace",
                  classpath=["classpath/*"])
   
   import java
   import org
   import pyarrow.jvm
   
   if __name__ == "__main__":
       java.lang.Class.forName("org.mariadb.jdbc.Driver")
   
       connection = java.sql.DriverManager.getConnection(
           "jdbc:mariadb://127.0.0.1:3306/db_test",
           "root", "passwd")
   
       pStmt = connection.prepareStatement("SELECT * FROM test_table LIMIT 10000")
       pStmt.execute()
   
       rs = pStmt.getResultSet()
   
       total = 0
   
       with org.apache.arrow.memory.RootAllocator() as allocator:
           with org.apache.arrow.adapter.jdbc.JdbcToArrow.sqlToArrowVectorIterator(rs, allocator) as it:
               while it.hasNext():
                   with it.next() as rb:
                       # should we wrap the following code with a contextmanager to make it immediately delete when it goes out of scope
                       record_batch = pyarrow.jvm.record_batch(rb)
                       total += record_batch.num_rows
                       print(total)
                       # record_batch is still referenced by python, unwarp the following line works as expected.
                       # del record_batch
   ```
   The output is:
   ```logs
   ...
   [main] INFO org.apache.arrow.memory.BaseAllocator - Debug mode enabled.
   [main] INFO org.apache.arrow.memory.DefaultAllocationManagerOption - allocation manager type not specified, using netty as the default type
   [main] INFO org.apache.arrow.memory.CheckAllocator - Using DefaultAllocationManager at memory-netty-12.0.0.jar!/org/apache/arrow/memory/DefaultAllocationManagerFactory.class
   [main] DEBUG io.netty.util.internal.logging.InternalLoggerFactory - Using SLF4J as the default logging framework
   [main] DEBUG io.netty.util.ResourceLeakDetector - -Dio.netty.leakDetection.level: simple
   [main] DEBUG io.netty.util.ResourceLeakDetector - -Dio.netty.leakDetection.targetRecords: 4
   [main] DEBUG io.netty.util.internal.PlatformDependent0 - -Dio.netty.noUnsafe: false
   [main] DEBUG io.netty.util.internal.PlatformDependent0 - Java version: 11
   [main] DEBUG io.netty.util.internal.PlatformDependent0 - sun.misc.Unsafe.theUnsafe: available
   [main] DEBUG io.netty.util.internal.PlatformDependent0 - sun.misc.Unsafe.copyMemory: available
   [main] DEBUG io.netty.util.internal.PlatformDependent0 - sun.misc.Unsafe.storeFence: available
   [main] DEBUG io.netty.util.internal.PlatformDependent0 - java.nio.Buffer.address: available
   [main] DEBUG io.netty.util.internal.PlatformDependent0 - direct buffer constructor: unavailable
   java.lang.UnsupportedOperationException: Reflective setAccessible(true) disabled
   	at io.netty.util.internal.ReflectionUtil.trySetAccessible(ReflectionUtil.java:31)
   	at io.netty.util.internal.PlatformDependent0$5.run(PlatformDependent0.java:288)
   	at java.base/java.security.AccessController.doPrivileged(Native Method)
   	at io.netty.util.internal.PlatformDependent0.<clinit>(PlatformDependent0.java:282)
   	at io.netty.util.internal.PlatformDependent.isAndroid(PlatformDependent.java:333)
   	at io.netty.util.internal.PlatformDependent.<clinit>(PlatformDependent.java:88)
   	at io.netty.buffer.PooledByteBufAllocator.<clinit>(PooledByteBufAllocator.java:116)
   	at io.netty.buffer.PooledByteBufAllocatorL.<init>(PooledByteBufAllocatorL.java:49)
   	at org.apache.arrow.memory.NettyAllocationManager.<clinit>(NettyAllocationManager.java:51)
   	at org.apache.arrow.memory.DefaultAllocationManagerFactory.<clinit>(DefaultAllocationManagerFactory.java:26)
   	at java.base/java.lang.Class.forName0(Native Method)
   	at java.base/java.lang.Class.forName(Class.java:315)
   	at org.apache.arrow.memory.DefaultAllocationManagerOption.getFactory(DefaultAllocationManagerOption.java:108)
   	at org.apache.arrow.memory.DefaultAllocationManagerOption.getDefaultAllocationManagerFactory(DefaultAllocationManagerOption.java:98)
   	at org.apache.arrow.memory.BaseAllocator$Config.getAllocationManagerFactory(BaseAllocator.java:772)
   	at org.apache.arrow.memory.ImmutableConfig.access$801(ImmutableConfig.java:24)
   	at org.apache.arrow.memory.ImmutableConfig$InitShim.getAllocationManagerFactory(ImmutableConfig.java:83)
   	at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:47)
   	at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:24)
   	at org.apache.arrow.memory.ImmutableConfig$Builder.build(ImmutableConfig.java:485)
   	at org.apache.arrow.memory.BaseAllocator.<clinit>(BaseAllocator.java:61)
   	at java.base/java.lang.Class.forName0(Native Method)
   	at java.base/java.lang.Class.forName(Class.java:398)
   	at org.jpype.pkg.JPypePackage.getObject(Unknown Source)
   [main] DEBUG io.netty.util.internal.PlatformDependent0 - java.nio.Bits.unaligned: available, true
   [main] DEBUG io.netty.util.internal.PlatformDependent0 - jdk.internal.misc.Unsafe.allocateUninitializedArray(int): unavailable
   java.lang.IllegalAccessException: class io.netty.util.internal.PlatformDependent0$7 cannot access class jdk.internal.misc.Unsafe (in module java.base) because module java.base does not export jdk.internal.misc to unnamed module @14028087
   	at java.base/jdk.internal.reflect.Reflection.newIllegalAccessException(Reflection.java:361)
   	at java.base/java.lang.reflect.AccessibleObject.checkAccess(AccessibleObject.java:591)
   	at java.base/java.lang.reflect.Method.invoke(Method.java:558)
   	at io.netty.util.internal.PlatformDependent0$7.run(PlatformDependent0.java:410)
   	at java.base/java.security.AccessController.doPrivileged(Native Method)
   	at io.netty.util.internal.PlatformDependent0.<clinit>(PlatformDependent0.java:401)
   	at io.netty.util.internal.PlatformDependent.isAndroid(PlatformDependent.java:333)
   	at io.netty.util.internal.PlatformDependent.<clinit>(PlatformDependent.java:88)
   	at io.netty.buffer.PooledByteBufAllocator.<clinit>(PooledByteBufAllocator.java:116)
   	at io.netty.buffer.PooledByteBufAllocatorL.<init>(PooledByteBufAllocatorL.java:49)
   	at org.apache.arrow.memory.NettyAllocationManager.<clinit>(NettyAllocationManager.java:51)
   	at org.apache.arrow.memory.DefaultAllocationManagerFactory.<clinit>(DefaultAllocationManagerFactory.java:26)
   	at java.base/java.lang.Class.forName0(Native Method)
   	at java.base/java.lang.Class.forName(Class.java:315)
   	at org.apache.arrow.memory.DefaultAllocationManagerOption.getFactory(DefaultAllocationManagerOption.java:108)
   	at org.apache.arrow.memory.DefaultAllocationManagerOption.getDefaultAllocationManagerFactory(DefaultAllocationManagerOption.java:98)
   	at org.apache.arrow.memory.BaseAllocator$Config.getAllocationManagerFactory(BaseAllocator.java:772)
   	at org.apache.arrow.memory.ImmutableConfig.access$801(ImmutableConfig.java:24)
   	at org.apache.arrow.memory.ImmutableConfig$InitShim.getAllocationManagerFactory(ImmutableConfig.java:83)
   	at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:47)
   	at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:24)
   	at org.apache.arrow.memory.ImmutableConfig$Builder.build(ImmutableConfig.java:485)
   	at org.apache.arrow.memory.BaseAllocator.<clinit>(BaseAllocator.java:61)
   	at java.base/java.lang.Class.forName0(Native Method)
   	at java.base/java.lang.Class.forName(Class.java:398)
   	at org.jpype.pkg.JPypePackage.getObject(Unknown Source)
   [main] DEBUG io.netty.util.internal.PlatformDependent0 - java.nio.DirectByteBuffer.<init>(long, int): unavailable
   [main] DEBUG io.netty.util.internal.PlatformDependent - sun.misc.Unsafe: available
   [main] DEBUG io.netty.util.internal.PlatformDependent - maxDirectMemory: 4294967296 bytes (maybe)
   [main] DEBUG io.netty.util.internal.PlatformDependent - -Dio.netty.tmpdir: /var/folders/j4/s9dpn3612j70kyt4mv1yzmkm0000gp/T (java.io.tmpdir)
   [main] DEBUG io.netty.util.internal.PlatformDependent - -Dio.netty.bitMode: 64 (sun.arch.data.model)
   [main] DEBUG io.netty.util.internal.PlatformDependent - Platform: MacOS
   [main] DEBUG io.netty.util.internal.PlatformDependent - -Dio.netty.maxDirectMemory: -1 bytes
   [main] DEBUG io.netty.util.internal.PlatformDependent - -Dio.netty.uninitializedArrayAllocationThreshold: -1
   [main] DEBUG io.netty.util.internal.CleanerJava9 - java.nio.ByteBuffer.cleaner(): available
   [main] DEBUG io.netty.util.internal.PlatformDependent - -Dio.netty.noPreferDirect: false
   [main] DEBUG io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.numHeapArenas: 16
   [main] DEBUG io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.numDirectArenas: 16
   [main] DEBUG io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.pageSize: 8192
   [main] DEBUG io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.maxOrder: 9
   [main] DEBUG io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.chunkSize: 4194304
   [main] DEBUG io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.smallCacheSize: 256
   [main] DEBUG io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.normalCacheSize: 64
   [main] DEBUG io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.maxCachedBufferCapacity: 32768
   [main] DEBUG io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.cacheTrimInterval: 8192
   [main] DEBUG io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.cacheTrimIntervalMillis: 0
   [main] DEBUG io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.useCacheForAllThreads: false
   [main] DEBUG io.netty.buffer.PooledByteBufAllocator - -Dio.netty.allocator.maxCachedByteBuffersPerChunk: 1023
   [main] DEBUG io.netty.util.internal.InternalThreadLocalMap - -Dio.netty.threadLocalMap.stringBuilder.initialSize: 1024
   [main] DEBUG io.netty.util.internal.InternalThreadLocalMap - -Dio.netty.threadLocalMap.stringBuilder.maxSize: 4096
   [allocation.logger] TRACE arrow.allocator - Memory Usage: 
   io.netty.buffer.PooledByteBufAllocatorL@3726977f
   [main] DEBUG io.netty.buffer.AbstractByteBuf - -Dio.netty.buffer.checkAccessible: true
   [main] DEBUG io.netty.buffer.AbstractByteBuf - -Dio.netty.buffer.checkBounds: true
   [main] DEBUG io.netty.util.ResourceLeakDetectorFactory - Loaded default ResourceLeakDetector: io.netty.util.ResourceLeakDetector@1807f5a7
   [main] DEBUG org.apache.arrow.memory.rounding.DefaultRoundingPolicy - -Dorg.apache.memory.allocator.pageSize: 8192
   [main] DEBUG org.apache.arrow.memory.rounding.DefaultRoundingPolicy - -Dorg.apache.memory.allocator.maxOrder: 11
   [main] DEBUG io.netty.util.Recycler - -Dio.netty.recycler.maxCapacityPerThread: 4096
   [main] DEBUG io.netty.util.Recycler - -Dio.netty.recycler.ratio: 8
   [main] DEBUG io.netty.util.Recycler - -Dio.netty.recycler.chunkSize: 32
   [main] DEBUG io.netty.util.Recycler - -Dio.netty.recycler.blocking: false
   [main] DEBUG io.netty.util.internal.PlatformDependent - org.jctools-core.MpscChunkedArrayQueue: available
   WARNING: An illegal reflective access operation has occurred
   WARNING: Illegal reflective access by org.apache.arrow.memory.util.MemoryUtil (file:/Users/lvlng/Code/fair-compute/panama/classpath/arrow-memory-core-12.0.0.jar) to field java.nio.Buffer.address
   WARNING: Please consider reporting this to the maintainers of org.apache.arrow.memory.util.MemoryUtil
   WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
   WARNING: All illegal access operations will be denied in a future release
   [main] DEBUG org.apache.arrow.memory.util.MemoryUtil - Constructor for direct buffer found and made accessible
   [main] DEBUG org.apache.arrow.memory.util.MemoryUtil - direct buffer constructor: available
   1024
   2048
   3072
   4096
   5120
   6144
   7168
   8192
   9216
   10000
   Traceback (most recent call last):
     File "RootAllocator.java", line 29, in org.apache.arrow.memory.RootAllocator.close
   Exception: Java Exception
   
   The above exception was the direct cause of the following exception:
   
   Traceback (most recent call last):
     File "/Users/lvlng/Library/Application Support/JetBrains/PyCharm2023.1/scratches/jpype_scratch.py", line 46, in <module>
       print(total)
     File "/Users/lvlng/miniconda3/envs/panama_dev/lib/python3.8/site-packages/jpype/_jio.py", line 55, in __exit__
       raise jex
     File "/Users/lvlng/miniconda3/envs/panama_dev/lib/python3.8/site-packages/jpype/_jio.py", line 51, in __exit__
       self.close()
   java.lang.IllegalStateException: java.lang.IllegalStateException: Allocator[ROOT] closed with outstanding buffers allocated (2).
   Allocator(ROOT) 0/49152/114688/9223372036854775807 (res/actual/peak/limit)
     child allocators: 0
     ledgers: 2
       ledger[120] allocator: ROOT), isOwning: , size: , references: 1, life: 18336056157133..0, allocatorManager: [, life: ] holds 1 buffers. 
           ArrowBuf[141], address:140681044344832, capacity:32768
        event log for: ArrowBuf[141]
          18336056242655 create()
                 at org.apache.arrow.memory.util.HistoricalLog$Event.<init>(HistoricalLog.java:175)
                 at org.apache.arrow.memory.util.HistoricalLog.recordEvent(HistoricalLog.java:83)
                 at org.apache.arrow.memory.ArrowBuf.<init>(ArrowBuf.java:97)
                 at org.apache.arrow.memory.BufferLedger.newArrowBuf(BufferLedger.java:271)
                 at org.apache.arrow.memory.BaseAllocator.bufferWithoutReservation(BaseAllocator.java:339)
                 at org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:315)
                 at org.apache.arrow.memory.RootAllocator.buffer(RootAllocator.java:29)
                 at org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:279)
                 at org.apache.arrow.memory.RootAllocator.buffer(RootAllocator.java:29)
                 at org.apache.arrow.vector.BaseVariableWidthVector.reallocDataBuffer(BaseVariableWidthVector.java:544)
                 at org.apache.arrow.vector.BaseVariableWidthVector.handleSafe(BaseVariableWidthVector.java:1278)
                 at org.apache.arrow.vector.BaseVariableWidthVector.setSafe(BaseVariableWidthVector.java:1039)
                 at org.apache.arrow.adapter.jdbc.consumer.VarCharConsumer$NonNullableVarCharConsumer.consume(VarCharConsumer.java:82)
                 at org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
                 at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:106)
                 at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.load(ArrowVectorIterator.java:159)
                 at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:177)
   
       ledger[109] allocator: ROOT), isOwning: , size: , references: 1, life: 18336053574582..0, allocatorManager: [, life: ] holds 3 buffers. 
           ArrowBuf[128], address:140681044189184, capacity:16384
        event log for: ArrowBuf[128]
          18336053853304 create()
                 at org.apache.arrow.memory.util.HistoricalLog$Event.<init>(HistoricalLog.java:175)
                 at org.apache.arrow.memory.util.HistoricalLog.recordEvent(HistoricalLog.java:83)
                 at org.apache.arrow.memory.ArrowBuf.<init>(ArrowBuf.java:97)
                 at org.apache.arrow.memory.BufferLedger.newArrowBuf(BufferLedger.java:271)
                 at org.apache.arrow.memory.BaseAllocator.bufferWithoutReservation(BaseAllocator.java:339)
                 at org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:315)
                 at org.apache.arrow.memory.RootAllocator.buffer(RootAllocator.java:29)
                 at org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:279)
                 at org.apache.arrow.memory.RootAllocator.buffer(RootAllocator.java:29)
                 at org.apache.arrow.vector.BaseValueVector.allocFixedDataAndValidityBufs(BaseValueVector.java:192)
                 at org.apache.arrow.vector.BaseVariableWidthVector.reallocValidityAndOffsetBuffers(BaseVariableWidthVector.java:585)
                 at org.apache.arrow.vector.BaseVariableWidthVector.handleSafe(BaseVariableWidthVector.java:1273)
                 at org.apache.arrow.vector.BaseVariableWidthVector.setSafe(BaseVariableWidthVector.java:1039)
                 at org.apache.arrow.adapter.jdbc.consumer.VarCharConsumer$NonNullableVarCharConsumer.consume(VarCharConsumer.java:82)
                 at org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
                 at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:106)
                 at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.load(ArrowVectorIterator.java:159)
                 at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:177)
   
           ArrowBuf[129], address:140681044189184, capacity:15880
        event log for: ArrowBuf[129]
          18336054028211 create()
                 at org.apache.arrow.memory.util.HistoricalLog$Event.<init>(HistoricalLog.java:175)
                 at org.apache.arrow.memory.util.HistoricalLog.recordEvent(HistoricalLog.java:83)
                 at org.apache.arrow.memory.ArrowBuf.<init>(ArrowBuf.java:97)
                 at org.apache.arrow.memory.BufferLedger.deriveBuffer(BufferLedger.java:228)
                 at org.apache.arrow.memory.ArrowBuf.slice(ArrowBuf.java:206)
                 at org.apache.arrow.vector.BaseValueVector.allocFixedDataAndValidityBufs(BaseValueVector.java:200)
                 at org.apache.arrow.vector.BaseVariableWidthVector.reallocValidityAndOffsetBuffers(BaseVariableWidthVector.java:585)
                 at org.apache.arrow.vector.BaseVariableWidthVector.handleSafe(BaseVariableWidthVector.java:1273)
                 at org.apache.arrow.vector.BaseVariableWidthVector.setSafe(BaseVariableWidthVector.java:1039)
                 at org.apache.arrow.adapter.jdbc.consumer.VarCharConsumer$NonNullableVarCharConsumer.consume(VarCharConsumer.java:82)
                 at org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
                 at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:106)
                 at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.load(ArrowVectorIterator.java:159)
                 at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:177)
   
           ArrowBuf[130], address:140681044205064, capacity:504
        event log for: ArrowBuf[130]
          18336054156038 create()
                 at org.apache.arrow.memory.util.HistoricalLog$Event.<init>(HistoricalLog.java:175)
                 at org.apache.arrow.memory.util.HistoricalLog.recordEvent(HistoricalLog.java:83)
                 at org.apache.arrow.memory.ArrowBuf.<init>(ArrowBuf.java:97)
                 at org.apache.arrow.memory.BufferLedger.deriveBuffer(BufferLedger.java:228)
                 at org.apache.arrow.memory.ArrowBuf.slice(ArrowBuf.java:206)
                 at org.apache.arrow.vector.BaseValueVector.allocFixedDataAndValidityBufs(BaseValueVector.java:200)
                 at org.apache.arrow.vector.BaseVariableWidthVector.reallocValidityAndOffsetBuffers(BaseVariableWidthVector.java:585)
                 at org.apache.arrow.vector.BaseVariableWidthVector.handleSafe(BaseVariableWidthVector.java:1273)
                 at org.apache.arrow.vector.BaseVariableWidthVector.setSafe(BaseVariableWidthVector.java:1039)
                 at org.apache.arrow.adapter.jdbc.consumer.VarCharConsumer$NonNullableVarCharConsumer.consume(VarCharConsumer.java:82)
                 at org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
                 at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:106)
                 at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.load(ArrowVectorIterator.java:159)
                 at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:177)
   
     reservations: 0
   ```
   
   ### Component(s)
   
   Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org