Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/23 15:39:34 UTC

[GitHub] [arrow-rs] alamb opened a new issue, #1729: Add memory size estimation for `ParquetMetadata`

alamb opened a new issue, #1729:
URL: https://github.com/apache/arrow-rs/issues/1729

   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   In https://github.com/influxdata/influxdb_iox, caching is very important to performance. Part of what is cached in memory is a `ParquetMetadata` structure. To cache this data effectively (and free it under memory pressure), we need an accurate estimate of how much heap memory it uses.
   
   
   **Describe the solution you'd like**
   I would like a function such as the following that accurately estimates the memory usage of parquet metadata:
   
   ```rust
   impl ParquetMetadata {
       ...
   
       /// In-memory size in bytes, including `self`.
       pub fn size(&self) -> usize {
           // recursively sum the sizes of all nested fields
       }
   
       ...
   }
   ```
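   
   To make the recursive idea concrete, here is a minimal sketch of one way such an estimate could be structured. The `HeapSize` trait and the `ColumnMeta`, `RowGroupMeta`, and `FileMeta` structs are hypothetical stand-ins invented for illustration; they are not the actual types or fields of the parquet crate. The sketch counts allocated capacity rather than length, and it ignores allocator overhead and shared (`Arc`) data, so the result is an estimate rather than an exact figure.
   
   ```rust
   use std::mem::size_of;
   
   /// Hypothetical helper trait: bytes owned on the heap, excluding `self`.
   trait HeapSize {
       fn heap_size(&self) -> usize;
   }
   
   impl HeapSize for String {
       fn heap_size(&self) -> usize {
           // Count the allocated capacity, not just the bytes in use.
           self.capacity()
       }
   }
   
   impl<T: HeapSize> HeapSize for Vec<T> {
       fn heap_size(&self) -> usize {
           // The allocated buffer plus whatever each element owns in turn.
           self.capacity() * size_of::<T>()
               + self.iter().map(|item| item.heap_size()).sum::<usize>()
       }
   }
   
   impl<T: HeapSize> HeapSize for Option<T> {
       fn heap_size(&self) -> usize {
           self.as_ref().map_or(0, |value| value.heap_size())
       }
   }
   
   // Stand-in metadata types (assumptions for illustration only).
   #[allow(dead_code)]
   struct ColumnMeta {
       path: String,
       num_values: u64,
   }
   
   #[allow(dead_code)]
   struct RowGroupMeta {
       columns: Vec<ColumnMeta>,
       total_byte_size: u64,
   }
   
   struct FileMeta {
       created_by: Option<String>,
       row_groups: Vec<RowGroupMeta>,
   }
   
   impl HeapSize for ColumnMeta {
       fn heap_size(&self) -> usize {
           self.path.heap_size()
       }
   }
   
   impl HeapSize for RowGroupMeta {
       fn heap_size(&self) -> usize {
           self.columns.heap_size()
       }
   }
   
   impl HeapSize for FileMeta {
       fn heap_size(&self) -> usize {
           self.created_by.heap_size() + self.row_groups.heap_size()
       }
   }
   
   impl FileMeta {
       /// In-memory size in bytes, including `self`.
       pub fn size(&self) -> usize {
           size_of::<Self>() + self.heap_size()
       }
   }
   ```
   
   Splitting the heap portion from `size_of::<Self>()` keeps inline fields from being counted twice: a `Vec<RowGroupMeta>` already accounts for the inline bytes of each element through its buffer, so each element only needs to report what it owns beyond that.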
   
   **Describe alternatives you've considered**
   I believe @domodwyer is considering caching only the parts that we need (rather than the `ParquetMetadata` object itself), in which case we would likely not need this feature in IOx. I think it would still be generally helpful, though.
   
   **Additional context**
   See https://github.com/influxdata/influxdb_iox/pull/4661 for an example.
   

