Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/06/28 08:36:39 UTC

[GitHub] [arrow-rs] tustvold commented on issue #1955: Support multi diskRanges for ChunkReader

tustvold commented on issue #1955:
URL: https://github.com/apache/arrow-rs/issues/1955#issuecomment-1168412426

   Why not just call `get_read` for each page instead of for the entire column chunk? There is no requirement for `get_read` to delimit column chunks; after all, the same trait is used to read the footer, etc.
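
A rough sketch of the per-page idea, under stated assumptions: `RangeRead` is a hypothetical stand-in with the same general shape as a range-based `get_read`, not the parquet crate's actual `ChunkReader` trait, and `PageLocation`, `InMemoryFile` and `read_pages` are illustrative names only.

```rust
use std::io::{Cursor, Error, ErrorKind, Read, Result};

/// Hypothetical stand-in for a range-based reader (NOT the parquet crate's trait).
trait RangeRead {
    type T: Read;
    /// Returns a reader over `length` bytes starting at absolute offset `start`.
    fn get_read(&self, start: u64, length: usize) -> Result<Self::T>;
}

/// Toy "file" backed by a byte vector, used only for illustration.
struct InMemoryFile(Vec<u8>);

impl RangeRead for InMemoryFile {
    type T = Cursor<Vec<u8>>;

    fn get_read(&self, start: u64, length: usize) -> Result<Self::T> {
        let start = start as usize;
        // Reject ranges that overflow or run past the end of the data.
        let end = start
            .checked_add(length)
            .filter(|end| *end <= self.0.len())
            .ok_or_else(|| Error::new(ErrorKind::UnexpectedEof, "range out of bounds"))?;
        Ok(Cursor::new(self.0[start..end].to_vec()))
    }
}

/// Hypothetical page location: absolute byte offset and size of one page.
struct PageLocation {
    offset: u64,
    size: usize,
}

/// Issues one `get_read` call per page rather than one for the whole column chunk.
fn read_pages<R: RangeRead>(reader: &R, pages: &[PageLocation]) -> Result<Vec<Vec<u8>>> {
    let mut out = Vec::with_capacity(pages.len());
    for page in pages {
        let mut buf = Vec::with_capacity(page.size);
        reader.get_read(page.offset, page.size)?.read_to_end(&mut buf)?;
        out.push(buf);
    }
    Ok(out)
}
```

Nothing in this shape ties a `get_read` call to a column chunk boundary; the caller picks whichever ranges it needs.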
   
   Somewhat related, something to keep in mind is how this will work with `ParquetRecordBatchStream`. It does not make use of `ChunkReader`; it is instead push-based and needs to know the ranges to fetch up front. It should just be a case of making `InMemoryColumnChunk` sparse and teaching `InMemoryColumnChunkReader` to read it correctly, but it is probably worth thinking through how this will work.
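
To make the "sparse" idea concrete, here is a minimal sketch, assuming the fetched ranges are known up front and do not overlap. `SparseChunk` is a hypothetical type for illustration; it is not the existing `InMemoryColumnChunk` and says nothing about how that type is actually laid out.

```rust
use std::ops::Range;

/// Hypothetical sparse buffer: holds only the byte ranges that were fetched.
struct SparseChunk {
    /// (absolute file offset, bytes), sorted by offset; assumed non-overlapping.
    ranges: Vec<(u64, Vec<u8>)>,
}

impl SparseChunk {
    fn new(mut ranges: Vec<(u64, Vec<u8>)>) -> Self {
        ranges.sort_by_key(|(offset, _)| *offset);
        Self { ranges }
    }

    /// Returns the bytes for `range` if a single fetched range fully covers it,
    /// and `None` if any part of `range` was never fetched.
    fn get(&self, range: Range<u64>) -> Option<&[u8]> {
        // Number of fetched ranges that start at or before `range.start`.
        let idx = self.ranges.partition_point(|(offset, _)| *offset <= range.start);
        // The last such range is the only candidate that can contain `range`.
        let (offset, data) = self.ranges.get(idx.checked_sub(1)?)?;
        let start = (range.start - offset) as usize;
        let end = range.end.checked_sub(*offset)? as usize;
        data.get(start..end)
    }
}
```

A reader built on something like this would ask for each page's range and treat `None` as "this page was not fetched", rather than assuming the whole column chunk is resident.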

