Posted to github@arrow.apache.org by "assignUser (via GitHub)" <gi...@apache.org> on 2023/06/10 03:25:09 UTC
[GitHub] [arrow] assignUser commented on issue #36019: Cannot perform gc on pyarrow object
assignUser commented on issue #36019:
URL: https://github.com/apache/arrow/issues/36019#issuecomment-1585444814
The issue is simply that the effect of `del` + `gc` is not instant. If you add a delay, you can easily see memory usage keep dropping:
```python
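from time import sleep

# Run after the del + gc step; show_memory_info is the reporter's helper (not defined here).
sleep(1)
show_memory_info('after 1s:')
sleep(1)
show_memory_info('after 2s:')
sleep(1)
show_memory_info('after 3s:')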
```
```
after 1s: -- current(MB): 352.238
after 1s: -- total(MB): 31827.660
after 1s: -- account(MB): 26.000
after 2s: -- current(MB): 131.945
after 2s: -- total(MB): 31827.660
after 2s: -- account(MB): 24.800
after 3s: -- current(MB): 68.965
after 3s: -- total(MB): 31827.660
after 3s: -- account(MB): 24.600
```
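If you want to repeat this check without copy-pasting the sleep calls, a small polling helper along these lines might do. This is only a sketch: `poll_memory` and its `probe` argument are hypothetical names, and `probe` stands for whatever zero-argument function reports the memory figure you care about.

```python
import time


def poll_memory(probe, interval=1.0, samples=3, sleep=time.sleep):
    """Call `probe()` after each `interval` seconds and collect the readings.

    `probe` is any zero-argument callable returning a memory figure;
    `sleep` is injectable so the helper can be exercised without waiting.
    """
    readings = []
    for i in range(1, samples + 1):
        sleep(interval)  # give the allocator time to hand memory back
        value = probe()
        print(f"after {i * interval:g}s: {value}")
        readings.append(value)
    return readings
```

For example, `poll_memory(pyarrow.total_allocated_bytes)` would sample Arrow's own memory pool once per second, though that figure is the pool's allocation and not the resident size of the process.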
Also, if you have to process data across a bunch of files, maybe the dataset API could be useful for you: https://arrow.apache.org/docs/python/dataset.html