Posted to github@arrow.apache.org by "paleolimbot (via GitHub)" <gi...@apache.org> on 2023/06/19 19:28:53 UTC
[GitHub] [arrow] paleolimbot commented on issue #36161: R - Problem retrieving memory used after gc() using arrow library
paleolimbot commented on issue #36161:
URL: https://github.com/apache/arrow/issues/36161#issuecomment-1597661645
The Windows Task Manager, `memory.size()`, and `gc()` all get their numbers from different places, so I'm not surprised that there are differences (although I'm not familiar with the details on Windows). I do know that allocations made by Arrow C++ won't show up in `gc()`; however, you can track them using `default_memory_pool()$bytes_allocated`. Note that there are some hidden references to objects that are not always apparent: for example, when converting a Table to a data.frame, some columns may be zero-copy shells around Arrow arrays.
``` r
library(arrow, warn.conflicts = FALSE)
default_memory_pool()$bytes_allocated
#> [1] 0
default_memory_pool()$max_memory
#> [1] 0
# No bytes are allocated by Arrow here because it re-uses R's memory
array <- as_arrow_array(1:10)
default_memory_pool()$bytes_allocated
#> [1] 0
default_memory_pool()$max_memory
#> [1] 0
# Can't re-use R memory for decimal type, so this will trigger an Arrow allocation
array <- as_arrow_array(1:10, type = decimal(10, 3))
default_memory_pool()$bytes_allocated
#> [1] 192
default_memory_pool()$max_memory
#> [1] 256
rm(array)
gc()
#>           used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
#> Ncells  803037 42.9    1418702 75.8         NA  1418702 75.8
#> Vcells 1370077 10.5    8388608 64.0      16384  2707166 20.7
default_memory_pool()$bytes_allocated
#> [1] 0
default_memory_pool()$max_memory
#> [1] 256
```
<sup>Created on 2023-06-19 with [reprex v2.0.2](https://reprex.tidyverse.org)</sup>
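
If it helps, the before/after check in the reprex above could be wrapped in a small helper. This is only a sketch: `arrow_alloc_delta()` is a hypothetical name and not part of the arrow package. It relies only on `default_memory_pool()$bytes_allocated` as used above, and it says nothing about memory that R itself allocates.

``` r
library(arrow, warn.conflicts = FALSE)

# Hypothetical helper (not part of arrow): evaluate `expr` and report how
# bytes_allocated on the default Arrow memory pool changed around it.
arrow_alloc_delta <- function(expr) {
  pool <- default_memory_pool()
  before <- pool$bytes_allocated
  force(expr)  # evaluate the expression in the caller's environment
  after <- pool$bytes_allocated
  c(before = before, after = after, delta = after - before)
}

# Same decimal conversion as in the reprex above; the assignment keeps
# the array alive so the allocation is still counted afterwards.
res <- arrow_alloc_delta(
  arr <- as_arrow_array(1:10, type = decimal(10, 3))
)
res[["delta"]]

# Drop the only reference and collect, then check the pool again
rm(arr)
invisible(gc())
default_memory_pool()$bytes_allocated
```

A delta of zero would suggest Arrow re-used R's memory for that conversion, as in the integer case above. If `bytes_allocated` stays above zero after `rm()` and `gc()`, something is probably still holding a reference to the Arrow buffers.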