Posted to github@arrow.apache.org by "paleolimbot (via GitHub)" <gi...@apache.org> on 2023/06/19 19:28:53 UTC

[GitHub] [arrow] paleolimbot commented on issue #36161: R - Problem retrieving memory used after gc() using arrow library

paleolimbot commented on issue #36161:
URL: https://github.com/apache/arrow/issues/36161#issuecomment-1597661645

   The Windows Task Manager, `memory.size()`, and `gc()` all get their numbers from different places, so I'm not surprised that they disagree (although I'm not familiar with the details on Windows). I do know that allocations made by Arrow C++ won't show up in `gc()`; you can track those with `default_memory_pool()$bytes_allocated` instead. Note that some references to Arrow memory are not always apparent: for example, when converting a Table to a data.frame, some columns may be zero-copy shells around Arrow arrays, so that memory can stay allocated for as long as the data.frame is alive.
   
   ``` r
   library(arrow, warn.conflicts = FALSE)
   default_memory_pool()$bytes_allocated
   #> [1] 0
   default_memory_pool()$max_memory
   #> [1] 0
   
   # no bytes allocated because it has re-used R's memory
   array <- as_arrow_array(1:10)
   default_memory_pool()$bytes_allocated
   #> [1] 0
   default_memory_pool()$max_memory
   #> [1] 0
   
   # Can't re-use R memory for decimal type, so this will trigger an Arrow allocation
   array <- as_arrow_array(1:10, type = decimal(10, 3))
   default_memory_pool()$bytes_allocated
   #> [1] 192
   default_memory_pool()$max_memory
   #> [1] 256
   
   rm(array)
   gc()
   #>           used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
   #> Ncells  803037 42.9    1418702 75.8         NA  1418702 75.8
   #> Vcells 1370077 10.5    8388608 64.0      16384  2707166 20.7
   default_memory_pool()$bytes_allocated
   #> [1] 0
   default_memory_pool()$max_memory
   #> [1] 256
   ```
   
   <sup>Created on 2023-06-19 with [reprex v2.0.2](https://reprex.tidyverse.org)</sup>
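
   If it helps to see both sets of numbers together, here is a rough sketch that records R's own accounting next to the Arrow pool counters. The helper name `memory_snapshot()` and its column names are made up for illustration (they are not part of arrow); it only relies on `gc()` and the `default_memory_pool()` fields used above.

   ``` r
   library(arrow, warn.conflicts = FALSE)

   # Hypothetical helper (not part of arrow): one row per call, combining
   # R's own accounting with the Arrow C++ memory pool counters.
   memory_snapshot <- function(label) {
     r_mem <- gc()  # matrix; column 2 is the "used" size in Mb for Ncells/Vcells
     pool <- default_memory_pool()
     data.frame(
       label = label,
       r_used_mb = sum(r_mem[, 2]),
       arrow_bytes = pool$bytes_allocated,
       arrow_peak_bytes = pool$max_memory
     )
   }

   before <- memory_snapshot("before")

   # Decimal storage can't re-use R memory, so this allocates from the Arrow pool
   array <- as_arrow_array(1:10, type = decimal(10, 3))
   during <- memory_snapshot("array alive")

   rm(array)
   after <- memory_snapshot("after rm() + gc()")  # gc() runs inside the helper

   rbind(before, during, after)
   ```

   The `arrow_bytes` column should rise while the array is alive and fall again once it has been removed and collected, while `arrow_peak_bytes` keeps the high-water mark. Neither is expected to line up with what Task Manager reports for the whole process.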

