Posted to github@arrow.apache.org by "assignUser (via GitHub)" <gi...@apache.org> on 2023/06/10 03:25:09 UTC

[GitHub] [arrow] assignUser commented on issue #36019: Cannot perform gc on pyarrow object

assignUser commented on issue #36019:
URL: https://github.com/apache/arrow/issues/36019#issuecomment-1585444814

   The issue is simply that memory is not released instantly after `del` + `gc`. If you add a delay, you can easily see the usage drop:
   ```python
       sleep(1)
       show_memory_info('after 1s:')
   
       sleep(1)
       show_memory_info('after 2s:')
       
       sleep(1)
       show_memory_info('after 3s:')
   ```
   ```
   after 1s: -- current(MB): 352.238
   after 1s: -- total(MB): 31827.660
   after 1s: -- account(MB): 26.000
   
   after 2s: -- current(MB): 131.945
   after 2s: -- total(MB): 31827.660
   after 2s: -- account(MB): 24.800
   
   after 3s: -- current(MB): 68.965
   after 3s: -- total(MB): 31827.660
   after 3s: -- account(MB): 24.600
   ```
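
A minimal sketch of how such delayed readings could be collected in a reusable way. `read_rss_mb` and `poll_memory` are hypothetical names, not part of the original script, and the default reader assumes the third-party `psutil` package is installed:

```python
import gc
import time


def read_rss_mb():
    """Resident set size of this process in MB (assumes psutil is installed)."""
    import psutil  # third-party; imported lazily so poll_memory works without it
    return psutil.Process().memory_info().rss / 1024 ** 2


def poll_memory(read_mb=read_rss_mb, samples=3, interval=1.0):
    """Run a GC pass, then sample memory `samples` times, `interval` seconds apart."""
    gc.collect()
    readings = []
    for i in range(1, samples + 1):
        time.sleep(interval)
        mb = read_mb()
        print(f"after {i * interval:g}s: -- current(MB): {mb:.3f}")
        readings.append(mb)
    return readings
```

Calling `poll_memory()` right after `del`-ing the large objects should show the resident size falling over a few seconds rather than all at once, if the behaviour matches the numbers above.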
   
   Also, if you have to process data across a bunch of files, maybe the dataset API could be useful for you: https://arrow.apache.org/docs/python/dataset.html
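
A rough sketch of what that could look like, assuming the files are Parquet and live under one directory (both are assumptions, since the file layout is not described here). `iter_batches` and `total_rows` are hypothetical helper names; `pyarrow.dataset.dataset` and `Dataset.to_batches` are the actual dataset API calls:

```python
def iter_batches(path, columns=None):
    """Stream record batches from every file under `path`, one batch at a time."""
    import pyarrow.dataset as ds  # imported lazily; requires pyarrow

    dataset = ds.dataset(path, format="parquet")  # assumption: Parquet files
    yield from dataset.to_batches(columns=columns)


def total_rows(batches):
    """Example per-batch processing: count rows without holding all data in memory."""
    return sum(batch.num_rows for batch in batches)
```

For example, `total_rows(iter_batches("data/"))` (with a hypothetical `data/` directory) would scan all files while only keeping one batch alive at a time, instead of loading and deleting whole tables per file.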
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org