
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/07/22 08:40:39 UTC

[GitHub] [arrow] zhztheplayer commented on pull request #13673: ARROW-17159: [C++][JAVA] Dataset: Support reading from fixed offset of a file for Parquet format

zhztheplayer commented on PR #13673:
URL: https://github.com/apache/arrow/pull/13673#issuecomment-1192330080

   @lidavidm We are opening a new PR replacing this. Please see https://github.com/apache/arrow/pull/13688.
   
   > Do you have a little more context for what's going on here? A byte-based offset/length seems quite strange to me (even if it is in Substrait…)
   
   We originally added this small feature to make the Dataset API work properly with Spark: Spark splits a single file into multiple partitions by byte offset. This is common behavior in Spark across different file formats.
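
   To illustrate why a byte-based offset/length is still meaningful for Parquet, here is a minimal sketch of how a reader could map a byte range onto whole row groups. It is not the code from this PR: `RowGroup`, `split_byte_ranges`, and `row_groups_for_split` are hypothetical names, and the "row group belongs to the split that contains its midpoint" rule is an assumption modeled on how Hadoop-style Parquet readers commonly resolve splits.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RowGroup:
    """Hypothetical stand-in for the row-group info in a Parquet footer."""
    start: int  # byte offset of the row group within the file
    size: int   # total compressed size of the row group in bytes


def split_byte_ranges(file_size: int, split_size: int) -> List[Tuple[int, int]]:
    """Cut a file into (offset, length) byte ranges, as a Spark-like planner might."""
    return [
        (offset, min(split_size, file_size - offset))
        for offset in range(0, file_size, split_size)
    ]


def row_groups_for_split(
    row_groups: List[RowGroup], offset: int, length: int
) -> List[int]:
    """Return indices of row groups whose midpoint lies in [offset, offset + length).

    Assumption: ownership is decided by the midpoint, so every row group
    is read by exactly one split even when it straddles a boundary.
    """
    end = offset + length
    selected = []
    for index, rg in enumerate(row_groups):
        midpoint = rg.start + rg.size // 2
        if offset <= midpoint < end:
            selected.append(index)
    return selected


# Example: three row groups in a 262-byte file, planned as 128-byte splits.
example_row_groups = [RowGroup(4, 100), RowGroup(104, 100), RowGroup(204, 50)]
example_splits = split_byte_ranges(262, 128)
example_assignment = [
    row_groups_for_split(example_row_groups, offset, length)
    for offset, length in example_splits
]
```

   With a rule like this, a split may select no row groups at all (the last 6-byte split above), which is expected: the byte range only decides ownership, and rows are still read in whole row groups.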

