You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/11 16:13:54 UTC

[GitHub] [arrow] pitrou opened a new pull request #9161: ARROW-11049: [Python] Expose alternate memory pools

pitrou opened a new pull request #9161:
URL: https://github.com/apache/arrow/pull/9161


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] kszucs commented on a change in pull request #9161: ARROW-11049: [Python] Expose alternate memory pools

Posted by GitBox <gi...@apache.org>.
kszucs commented on a change in pull request #9161:
URL: https://github.com/apache/arrow/pull/9161#discussion_r555263751



##########
File path: python/pyarrow/tests/test_memory.py
##########
@@ -133,3 +133,24 @@ def test_env_var():
     if should_have_mimalloc:
         check_env_var("mimalloc", ["mimalloc"])
     check_env_var("nonexistent", possible_backends, expect_warning=True)
+
+
+def test_specific_memory_pools():

Review comment:
       Will this case fail if we build a wheel without the specific optional allocator enabled (based on the should have flag [definitions](https://github.com/apache/arrow/blob/053f4742c7fde9c1e3b3a0858c4de6dc05efec98/python/pyarrow/tests/test_memory.py#L29-L30))?
   
   If yes we may want to find a better way to retrieve the built allocators in a follow-up.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #9161: ARROW-11049: [Python] Expose alternate memory pools

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #9161:
URL: https://github.com/apache/arrow/pull/9161#issuecomment-758666562


   Ok, I've improved the docs a bit. Will merge if green.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #9161: ARROW-11049: [Python] Expose alternate memory pools

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9161:
URL: https://github.com/apache/arrow/pull/9161#issuecomment-758120190


   https://issues.apache.org/jira/browse/ARROW-11049


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] kszucs commented on a change in pull request #9161: ARROW-11049: [Python] Expose alternate memory pools

Posted by GitBox <gi...@apache.org>.
kszucs commented on a change in pull request #9161:
URL: https://github.com/apache/arrow/pull/9161#discussion_r555245120



##########
File path: python/pyarrow/memory.pxi
##########
@@ -127,6 +127,44 @@ def logging_memory_pool(MemoryPool parent):
     return out
 
 
+def system_memory_pool():
+    """
+    Return a memory pool based on the C malloc heap.
+    """
+    cdef:
+        MemoryPool pool = MemoryPool.__new__(MemoryPool)
+    pool.init(c_system_memory_pool())
+    return pool
+
+
+def jemalloc_memory_pool():
+    """
+    Return a memory pool based on the jemalloc heap.
+
+    NotImplemented is raised if this Arrow build does not enable jemalloc.

Review comment:
       *unix packages

##########
File path: python/pyarrow/memory.pxi
##########
@@ -127,6 +127,44 @@ def logging_memory_pool(MemoryPool parent):
     return out
 
 
+def system_memory_pool():
+    """
+    Return a memory pool based on the C malloc heap.
+    """
+    cdef:
+        MemoryPool pool = MemoryPool.__new__(MemoryPool)
+    pool.init(c_system_memory_pool())
+    return pool
+
+
+def jemalloc_memory_pool():
+    """
+    Return a memory pool based on the jemalloc heap.
+
+    NotImplemented is raised if this Arrow build does not enable jemalloc.

Review comment:
       in our *unix packages




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou closed pull request #9161: ARROW-11049: [Python] Expose alternate memory pools

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #9161:
URL: https://github.com/apache/arrow/pull/9161


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] kszucs commented on a change in pull request #9161: ARROW-11049: [Python] Expose alternate memory pools

Posted by GitBox <gi...@apache.org>.
kszucs commented on a change in pull request #9161:
URL: https://github.com/apache/arrow/pull/9161#discussion_r555260332



##########
File path: python/pyarrow/memory.pxi
##########
@@ -127,6 +127,44 @@ def logging_memory_pool(MemoryPool parent):
     return out
 
 
+def system_memory_pool():
+    """
+    Return a memory pool based on the C malloc heap.
+    """
+    cdef:
+        MemoryPool pool = MemoryPool.__new__(MemoryPool)
+    pool.init(c_system_memory_pool())
+    return pool
+
+
+def jemalloc_memory_pool():
+    """
+    Return a memory pool based on the jemalloc heap.
+
+    NotImplemented is raised if this Arrow build does not enable jemalloc.

Review comment:
       Right




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on pull request #9161: ARROW-11049: [Python] Expose alternate memory pools

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on pull request #9161:
URL: https://github.com/apache/arrow/pull/9161#issuecomment-758151625


   Looks good, no comments on the Python code. Maybe not critical for *this* PR, but I have some suggestions (or questions) for expanded docs:
   
   - I would add some more text to the `python/memory.rst`. There is now "PyArrow uses a default built-in memory pool, but depending on how Arrow was built, there may be additional memory pools to choose from."  
     -> Could mention which different options are possible (system, jemalloc, mimalloc), referencing the API page with the overview of the related functions  
     -> Is there a way to see interactively which one is being used?
     -> Can you switch the pool being used? (eg do `pa.set_memory_pool(pa.mimalloc_memory_pool())` ?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on pull request #9161: ARROW-11049: [Python] Expose alternate memory pools

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on pull request #9161:
URL: https://github.com/apache/arrow/pull/9161#issuecomment-758780764


   Thanks for the doc update!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #9161: ARROW-11049: [Python] Expose alternate memory pools

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #9161:
URL: https://github.com/apache/arrow/pull/9161#discussion_r555265463



##########
File path: python/pyarrow/tests/test_memory.py
##########
@@ -133,3 +133,24 @@ def test_env_var():
     if should_have_mimalloc:
         check_env_var("mimalloc", ["mimalloc"])
     check_env_var("nonexistent", possible_backends, expect_warning=True)
+
+
+def test_specific_memory_pools():

Review comment:
       Unless the case happens, I don't think we should worry.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] kszucs commented on a change in pull request #9161: ARROW-11049: [Python] Expose alternate memory pools

Posted by GitBox <gi...@apache.org>.
kszucs commented on a change in pull request #9161:
URL: https://github.com/apache/arrow/pull/9161#discussion_r555185906



##########
File path: python/pyarrow/memory.pxi
##########
@@ -127,6 +127,44 @@ def logging_memory_pool(MemoryPool parent):
     return out
 
 
+def system_memory_pool():
+    """
+    Return a memory pool based on the C malloc heap.
+    """
+    cdef:
+        MemoryPool pool = MemoryPool.__new__(MemoryPool)
+    pool.init(c_system_memory_pool())
+    return pool
+
+
+def jemalloc_memory_pool():
+    """
+    Return a memory pool based on the jemalloc heap.
+
+    NotImplemented is raised if this Arrow build does not enable jemalloc.

Review comment:
       Perhaps you should add a bit more context about where we use the allocators. Like jemalloc is used and enabled in our packages. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #9161: ARROW-11049: [Python] Expose alternate memory pools

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #9161:
URL: https://github.com/apache/arrow/pull/9161#discussion_r555202401



##########
File path: python/pyarrow/memory.pxi
##########
@@ -127,6 +127,44 @@ def logging_memory_pool(MemoryPool parent):
     return out
 
 
+def system_memory_pool():
+    """
+    Return a memory pool based on the C malloc heap.
+    """
+    cdef:
+        MemoryPool pool = MemoryPool.__new__(MemoryPool)
+    pool.init(c_system_memory_pool())
+    return pool
+
+
+def jemalloc_memory_pool():
+    """
+    Return a memory pool based on the jemalloc heap.
+
+    NotImplemented is raised if this Arrow build does not enable jemalloc.

Review comment:
       I don't think jemalloc is enabled on all platforms, is it?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] kszucs commented on a change in pull request #9161: ARROW-11049: [Python] Expose alternate memory pools

Posted by GitBox <gi...@apache.org>.
kszucs commented on a change in pull request #9161:
URL: https://github.com/apache/arrow/pull/9161#discussion_r555263751



##########
File path: python/pyarrow/tests/test_memory.py
##########
@@ -133,3 +133,24 @@ def test_env_var():
     if should_have_mimalloc:
         check_env_var("mimalloc", ["mimalloc"])
     check_env_var("nonexistent", possible_backends, expect_warning=True)
+
+
+def test_specific_memory_pools():

Review comment:
       Will this case fail if we build a wheel without the specific optional allocator enabled (based on the should have flag [definitions](https://github.com/apache/arrow/blob/053f4742c7fde9c1e3b3a0858c4de6dc05efec98/python/pyarrow/tests/test_memory.py#L29-L30))?
   
   If yes we may want to find a better way to retrieve the built allocator in a follow-up.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #9161: ARROW-11049: [Python] Expose alternate memory pools

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #9161:
URL: https://github.com/apache/arrow/pull/9161#discussion_r555254647



##########
File path: python/pyarrow/memory.pxi
##########
@@ -127,6 +127,44 @@ def logging_memory_pool(MemoryPool parent):
     return out
 
 
+def system_memory_pool():
+    """
+    Return a memory pool based on the C malloc heap.
+    """
+    cdef:
+        MemoryPool pool = MemoryPool.__new__(MemoryPool)
+    pool.init(c_system_memory_pool())
+    return pool
+
+
+def jemalloc_memory_pool():
+    """
+    Return a memory pool based on the jemalloc heap.
+
+    NotImplemented is raised if this Arrow build does not enable jemalloc.

Review comment:
       Well, we'll certain forgot to update this docstring if we change the build settings, so I'd rather not be more precise.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org