Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/26 09:20:57 UTC
[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #14516: ARROW-18164: [Python] Honor default memory pool in Dataset scanning
jorisvandenbossche commented on code in PR #14516:
URL: https://github.com/apache/arrow/pull/14516#discussion_r1005433867
##########
python/pyarrow/tests/test_dataset.py:
##########
@@ -491,6 +491,23 @@ def test_scanner(dataset, dataset_reader):
     assert sorted_table['__last_in_fragment'].to_pylist() == [True] * 10
+@pytest.mark.parquet
+def test_scanner_memory_pool(dataset):
+    # honor default pool - https://issues.apache.org/jira/browse/ARROW-18164
+    old_pool = pa.default_memory_pool()
+    # pool = pa.proxy_memory_pool(old_pool)
+    pool = pa.system_memory_pool()
Review Comment:
It seems that using the proxy memory pool with datasets crashes. I can't reproduce this in an interactive session, but the following test segfaults:
```
@pytest.mark.parquet
def test_scanner_proxy_memory_pool(dataset):
    proxy_pool = pa.proxy_memory_pool(pa.default_memory_pool())
    _ = dataset.to_table(memory_pool=proxy_pool)
```
gdb backtrace:
```
Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007ffff37bec9e in arrow::PoolBuffer::~PoolBuffer (this=0x7fff9c001f30, __in_chrg=<optimized out>) at /home/joris/scipy/repos/arrow/cpp/src/arrow/memory_pool.cc:817
817 pool_->Free(ptr, capacity_);
(gdb) bt
#0 0x00007ffff37bec9e in arrow::PoolBuffer::~PoolBuffer (this=0x7fff9c001f30, __in_chrg=<optimized out>) at /home/joris/scipy/repos/arrow/cpp/src/arrow/memory_pool.cc:817
#1 0x00007ffff37becc8 in arrow::PoolBuffer::~PoolBuffer (this=0x7fff9c001f30, __in_chrg=<optimized out>) at /home/joris/scipy/repos/arrow/cpp/src/arrow/memory_pool.cc:819
#2 0x00007ffff5c2545a in std::default_delete<arrow::ResizableBuffer>::operator() (this=0x7fff9c001fa0, __ptr=0x7fff9c001f30)
...
```
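Since the segfault takes down the whole pytest process, it may help to run the reproduction in a child interpreter and inspect its exit status. The following is only a minimal sketch, assuming a POSIX system: `run_isolated`, `died_from_segfault` and `REPRO` are hypothetical names that are not part of this PR, and the standalone `REPRO` script is an untested guess at what the fixture-based test does, so it may well not crash (as noted above, an interactive session did not reproduce it).

```python
import signal
import subprocess
import sys
import textwrap


def run_isolated(code, timeout=120):
    """Run `code` in a fresh interpreter and return the CompletedProcess."""
    return subprocess.run(
        # -X faulthandler makes the child dump a Python traceback on a crash.
        [sys.executable, "-X", "faulthandler", "-c", textwrap.dedent(code)],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def died_from_segfault(result):
    # On POSIX, a negative return code is the negated number of the signal
    # that terminated the child process.
    return result.returncode == -signal.SIGSEGV


# Hypothetical standalone variant of the failing test. It builds its own
# small Parquet dataset instead of using the `dataset` fixture.
REPRO = """
    import os
    import tempfile

    import pyarrow as pa
    import pyarrow.dataset as ds
    import pyarrow.parquet as pq

    with tempfile.TemporaryDirectory() as tmp:
        table = pa.table({"a": list(range(10))})
        pq.write_table(table, os.path.join(tmp, "part.parquet"))
        dataset = ds.dataset(tmp, format="parquet")
        proxy_pool = pa.proxy_memory_pool(pa.default_memory_pool())
        _ = dataset.to_table(memory_pool=proxy_pool)
"""

if __name__ == "__main__":
    result = run_isolated(REPRO)
    if died_from_segfault(result):
        print("child segfaulted")
    else:
        print(f"child exited with code {result.returncode}")
    print(result.stderr)
```

The parent process survives either way, so the same helper could wrap other candidate reproductions without losing the rest of a test run.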