Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/30 22:18:11 UTC

[GitHub] [arrow-rs] Tudyx commented on issue #1760: Best way to convert arrow to Rust native type

Tudyx commented on issue #1760:
URL: https://github.com/apache/arrow-rs/issues/1760#issuecomment-1141513104

   Thanks a lot for your response, it helped me understand much better how to deal with the `arrow` format.
   
   - Concerning the workload, it can be quite big. I'm currently working in my spare time on a port of the `PyTorch` `dataloader` to Rust. I've implemented all the base functionality. I want to play with datasets from [Hugging Face](https://huggingface.co/datasets), which hosts a ton of `arrow` datasets, to run more advanced tests with my library. The typical workflow is to process a few contiguous rows at a time, so I think slicing is an important operation.
   The idea is to offer an option to either load the dataset into RAM or use `arrow` memory mapping, depending on the size of the dataset.
   
   - About the data representation, I have a small question that may sound stupid. When you say that `Arrow` also supports zero-copy slicing of arrays, something which cannot be performed with `Vec`: isn't taking a slice of a vector also zero-copy slicing? Like in this example:
   ```rust
   let vector = vec!["foo", "bar", "bax"];
   let slice = &vector[1..2];
   assert_eq!(slice[0], "bar");
   ```
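
   To make the question more concrete, here is a minimal sketch using only the standard library. It first checks that the borrowed slice really points into the vector's memory, then contrasts it with an owned view that keeps the buffer alive through reference counting. My assumption is that the difference you mention is about this kind of ownership, but I may be wrong. `SharedSlice` is a hypothetical type written for illustration only; it is not part of the `arrow` crate and says nothing about how `arrow` is actually implemented.

   ```rust
   use std::sync::Arc;

   /// Hypothetical owned view over a shared buffer (illustration only, not an `arrow` type).
   struct SharedSlice<T> {
       buf: Arc<Vec<T>>,
       offset: usize,
       len: usize,
   }

   impl<T> SharedSlice<T> {
       fn new(data: Vec<T>) -> Self {
           let len = data.len();
           Self { buf: Arc::new(data), offset: 0, len }
       }

       /// Narrow the view without copying any element; panics if out of range.
       fn slice(&self, offset: usize, len: usize) -> Self {
           assert!(offset + len <= self.len, "slice out of bounds");
           Self {
               buf: Arc::clone(&self.buf), // bumps the reference count only
               offset: self.offset + offset,
               len,
           }
       }

       fn as_slice(&self) -> &[T] {
           &self.buf[self.offset..self.offset + self.len]
       }
   }

   /// A borrowed slice points into the vector's own allocation: no copy is made.
   fn borrowed_is_zero_copy() -> bool {
       let vector = vec!["foo", "bar", "bax"];
       let slice = &vector[1..2];
       std::ptr::eq(&slice[0], &vector[1])
   }

   /// The owned view can be returned even though `all` is dropped at the end
   /// of this function, because the view holds its own `Arc` to the buffer.
   /// A plain `&vector[1..2]` could not be returned like this.
   fn owned_view_outlives_original() -> SharedSlice<&'static str> {
       let all = SharedSlice::new(vec!["foo", "bar", "bax"]);
       all.slice(1, 1)
   }

   fn main() {
       assert!(borrowed_is_zero_copy());
       let view = owned_view_outlives_original();
       assert_eq!(view.as_slice()[0], "bar");
   }
   ```

   So `&vector[1..2]` does not copy either, but it borrows from `vector` and cannot outlive it, whereas the owned view can be stored or moved around freely.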
   
   
   

