Posted to github@arrow.apache.org by "suremarc (via GitHub)" <gi...@apache.org> on 2023/03/23 19:27:03 UTC

[GitHub] [arrow-rs] suremarc opened a new issue, #3922: Support reverse order for Parquet streams

suremarc opened a new issue, #3922:
URL: https://github.com/apache/arrow-rs/issues/3922

   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   
   I have been evaluating Parquet and Arrow DataFusion for use at my company. While testing DataFusion I noticed that queries like `SELECT * FROM table ORDER BY field DESC LIMIT n` cause it to read the whole file, even though the existing data was sorted in ascending order.
   
   Upon further investigation, this made sense because the Parquet reader can't return data in any order except the ordering that it was stored in. But this makes it hard to minimize the amount of work done while executing certain queries, e.g. get the last N events sorted by time before a certain known timestamp (especially for small N).
   
   **Describe the solution you'd like**
   
   A new function added to `ArrowReaderBuilder`, something like this:
   ```rust
   pub fn with_reverse(self, reverse: bool) -> Self {
       Self { reverse, ..self }
   }
   ```
   
   which would cause the data to be streamed in the reverse of its native order as well as individual record batches being reversed, respecting limits all the while. E.g. `with_limit(100).with_reverse(true)` would return the last 100 rows satisfying the query. 
   
   Setting `with_reverse` should probably not affect the order of the row groups, since there are no guarantees on the organization of Parquet row groups anyway. 
   
   **Describe alternatives you've considered**
   
   After realizing that implementing this feature would be non-trivial, I tried implementing my own querying code by fetching a whole row group at a time, using the existing query builder, then reversing the entire row group. See [here](https://github.com/suremarc/polygon-arrow-rs/blob/master/src/main.rs). It works, but it has to deserialize the entire row group, even though the limit might be 1. A more sophisticated implementation would deserialize only the minimum number of pages before stopping early, as the existing code in the Parquet library does. 
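
   The difference between the workaround and the early-stopping approach described above can be sketched on a toy model. This is a minimal sketch, not code from the parquet crate: `Page`, `last_rows_full_decode` and `last_rows_early_stop` are hypothetical names, and a page is assumed to be a plain vector of already-decodable values that can be fetched independently of its neighbours.

   ```rust
   /// Hypothetical stand-in for a data page: rows in their stored (ascending) order.
   type Page = Vec<i64>;

   /// Baseline, as in the workaround: decode every page of the row group,
   /// reverse the rows, then apply the limit.
   /// Returns the rows in descending order and the number of pages decoded.
   fn last_rows_full_decode(pages: &[Page], limit: usize) -> (Vec<i64>, usize) {
       let mut rows: Vec<i64> = pages.iter().flatten().copied().collect();
       rows.reverse();
       rows.truncate(limit);
       (rows, pages.len())
   }

   /// Early-stopping variant: visit pages from last to first, take each page's
   /// rows back to front, and stop as soon as `limit` rows are collected.
   fn last_rows_early_stop(pages: &[Page], limit: usize) -> (Vec<i64>, usize) {
       let mut rows = Vec::with_capacity(limit);
       let mut decoded = 0;
       for page in pages.iter().rev() {
           if rows.len() >= limit {
               break;
           }
           decoded += 1;
           let remaining = limit - rows.len();
           rows.extend(page.iter().rev().take(remaining).copied());
       }
       (rows, decoded)
   }
   ```

   With three pages of three rows each and a limit of 2, both functions return the same two rows, but the second touches one page instead of three.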
   
   If the library had a lower-level API, it might be possible to support specific use cases like reverse ordering without overloading the existing logic (which is already quite complex by the looks of it). However, I am not sure what such an API would look like.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on issue #3922: Support reverse order for Parquet streams

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #3922:
URL: https://github.com/apache/arrow-rs/issues/3922#issuecomment-1482375773

   > Unless I am misunderstanding, I do not think it is possible to select the last N rows subject to a predicate with a RowSelection.
   
   This is actually possible, see https://arrow.apache.org/blog/2022/12/26/querying-parquet-with-millisecond-latency/ for more background on how predicate pushdown works for parquet




[GitHub] [arrow-rs] tustvold commented on issue #3922: Support reverse order for Parquet streams

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #3922:
URL: https://github.com/apache/arrow-rs/issues/3922#issuecomment-1482890381

   >  I do not think it is possible to select the last N rows subject to a predicate with a RowSelection.
   
   Correct, [RowFilter](https://docs.rs/parquet/latest/parquet/arrow/arrow_reader/struct.RowFilter.html) is the machinery that underpins late materialization, which is what would be necessary here. The output of this process is a `RowSelection` that is then used to read the output columns.
   
   Although now that I write this, whilst there is a mechanism to limit this final computed `RowSelection` to the [first N rows](https://docs.rs/parquet/latest/parquet/arrow/arrow_reader/struct.ArrowReaderBuilder.html#method.with_limit), there isn't a public API to limit this to the last N rows. This should be a very simple addition, I'll add it to my list.
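
   To make "limit this to the last N rows" concrete, here is a minimal sketch of the trimming step on a run-length selection. It is an illustration under stated assumptions, not the crate's implementation: `Selector` is a local stand-in loosely modeled on the crate's `RowSelector` (a `row_count` plus a `skip` flag), and `limit_to_last` is a hypothetical name, not an existing API.

   ```rust
   /// Local stand-in for a run of consecutive rows that is either skipped or selected.
   #[derive(Debug, Clone, Copy, PartialEq, Eq)]
   struct Selector {
       row_count: usize,
       skip: bool,
   }

   impl Selector {
       fn select(row_count: usize) -> Self {
           Selector { row_count, skip: false }
       }
       fn skip(row_count: usize) -> Self {
           Selector { row_count, skip: true }
       }
   }

   /// Keep only the last `n` selected rows; earlier selected rows become skipped.
   fn limit_to_last(selection: &[Selector], n: usize) -> Vec<Selector> {
       let mut remaining = n;
       let mut out: Vec<Selector> = Vec::with_capacity(selection.len() + 1);
       // Walk the runs back to front so the rows nearest the end are kept first.
       for s in selection.iter().rev() {
           if s.skip || remaining == 0 {
               out.push(Selector::skip(s.row_count));
           } else if s.row_count <= remaining {
               remaining -= s.row_count;
               out.push(*s);
           } else {
               // Split the run: keep its tail, skip its head (pushed in reverse order).
               out.push(Selector::select(remaining));
               out.push(Selector::skip(s.row_count - remaining));
               remaining = 0;
           }
       }
       out.reverse();
       // Merge adjacent runs that carry the same flag.
       let mut merged: Vec<Selector> = Vec::with_capacity(out.len());
       for s in out {
           if let Some(last) = merged.last_mut() {
               if last.skip == s.skip {
                   last.row_count += s.row_count;
                   continue;
               }
           }
           merged.push(s);
       }
       merged
   }
   ```

   The total number of rows covered by the selection is unchanged; only the split between selected and skipped runs moves.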




[GitHub] [arrow-rs] tustvold commented on issue #3922: Support reverse order for Parquet streams

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #3922:
URL: https://github.com/apache/arrow-rs/issues/3922#issuecomment-1481983517

   > which would cause the data to be streamed in the reverse of its native order as well as individual record batches being reversed, respecting limits all the while
   
   Unfortunately, the way Parquet data is encoded would likely make doing this at anything below the row group level impractical, for a couple of reasons:
   
   * With the exception of PLAIN encoding, there is no easy way to decode pages in reverse order, as the underlying [encodings](https://github.com/apache/parquet-format/blob/master/Encodings.md) are length-prefixed blocks
   * The Dremel record shredding, especially for repetition levels, is order-sensitive
   
   That being said, it is possible to just decode the last n rows using [`RowFilter`](https://docs.rs/parquet/latest/parquet/arrow/arrow_reader/struct.RowFilter.html), potentially reducing this to a query optimisation problem in DataFusion, as opposed to something needing new functionality in the parquet crate.
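
   The point about length-prefixed blocks can be illustrated with a toy format. This is a sketch only and does not reproduce any actual Parquet encoding: it assumes a made-up layout in which every block is a one-byte length followed by that many payload bytes, and `encode_blocks`, `block_offsets` and `last_block` are hypothetical helpers.

   ```rust
   /// Toy format: each block is a one-byte length followed by its payload.
   fn encode_blocks(blocks: &[&[u8]]) -> Vec<u8> {
       let mut buf = Vec::new();
       for block in blocks {
           assert!(block.len() <= u8::MAX as usize, "toy format: block too long");
           buf.push(block.len() as u8);
           buf.extend_from_slice(block);
       }
       buf
   }

   /// Walk the buffer forwards and return the (start, end) payload range of every
   /// block, or `None` if a length prefix points past the end of the buffer.
   fn block_offsets(buf: &[u8]) -> Option<Vec<(usize, usize)>> {
       let mut offsets = Vec::new();
       let mut pos = 0;
       while pos < buf.len() {
           let len = buf[pos] as usize;
           let start = pos + 1;
           let end = start.checked_add(len)?;
           if end > buf.len() {
               return None;
           }
           offsets.push((start, end));
           pos = end;
       }
       Some(offsets)
   }

   /// Finding the last block still requires reading every earlier length prefix:
   /// the final byte of the buffer is payload, so nothing marks where the last
   /// block begins when scanning from the end.
   fn last_block(buf: &[u8]) -> Option<&[u8]> {
       let (start, end) = *block_offsets(buf)?.last()?;
       Some(&buf[start..end])
   }
   ```

   In this layout a payload byte can hold the same value as a length prefix, which is why a backwards scan cannot reliably find block boundaries.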




[GitHub] [arrow-rs] suremarc commented on issue #3922: Support reverse order for Parquet streams

Posted by "suremarc (via GitHub)" <gi...@apache.org>.
suremarc commented on issue #3922:
URL: https://github.com/apache/arrow-rs/issues/3922#issuecomment-1482076248

   Thank you for the speedy reply. It sounds like this feature doesn't really agree with Parquet very much, unfortunately. 
   
   > That being said, it is possible to just decode the last n rows using [`RowSelection`](https://docs.rs/parquet/latest/parquet/arrow/arrow_reader/struct.RowSelection.html). This means if the DataFusion query optimiser could be taught to push this down, it should work without requiring any additional functionality in the parquet crate.
   
   This makes sense, but I think I should have been more specific: the queries I was testing also involved filtering, e.g. `SELECT * FROM table WHERE attribute = 'value' ORDER BY field DESC LIMIT n`. Unless I am misunderstanding, I do not think it is possible to select the last N rows subject to a predicate with a `RowSelection`.
   
   I am starting to think maybe Parquet and DataFusion are not ideal for my company's use case. We are interested in their analytical capabilities, but our existing products support queries of the form described above (essentially, filter + limit + sort ascending/descending on time only). Do you think it would be worth opening an issue on DataFusion about this, though?




[GitHub] [arrow-rs] suremarc commented on issue #3922: Support reverse order for Parquet streams

Posted by "suremarc (via GitHub)" <gi...@apache.org>.
suremarc commented on issue #3922:
URL: https://github.com/apache/arrow-rs/issues/3922#issuecomment-1482865710

   > > Unless I am misunderstanding, I do not think it is possible to select the last N rows subject to a predicate with a RowSelection.
   > 
   > This is actually possible, see https://arrow.apache.org/blog/2022/12/26/querying-parquet-with-millisecond-latency/ for more background on how predicate pushdown works for parquet
   
   I have read this article and am familiar with the RowSelection API. To my understanding, a RowSelection generated by a filter can rule out ranges based on the page statistics, but it cannot tell you how many rows in each page actually match a predicate; it can only tell you that a page definitely has zero matches. In the worst case, there might be only one match in each page that wasn't pruned. So if I wanted to retrieve exactly N rows satisfying my predicate, I would have to include offsets from the last N pages of the column in the RowSelection, which is maximally pessimistic.
   
   I apologize if I'm wrong, in which case I probably will look like a fool... nonetheless, I would love to be wrong on this particular issue. 
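
   The distinction drawn above, between pruning on page statistics and counting actual matches, can be sketched on a toy model. Everything here is an assumption made for illustration: `Page`, `candidate_pages` and `last_n_matches` are hypothetical, the predicate is a plain equality test, and the model ignores how the real reader evaluates predicates.

   ```rust
   /// Hypothetical stand-in for a page of the filter column with min/max statistics.
   struct Page {
       min: i64,
       max: i64,
       values: Vec<i64>,
   }

   /// Statistics-only pruning: a page whose range excludes `target` is ruled out,
   /// but a surviving page may still contain no matching row at all.
   fn candidate_pages(pages: &[Page], target: i64) -> Vec<usize> {
       pages
           .iter()
           .enumerate()
           .filter(|(_, p)| p.min <= target && target <= p.max)
           .map(|(i, _)| i)
           .collect()
   }

   /// Scan the candidate pages from the end, test the decoded values, and stop
   /// once `n` matches are found. Returns the matching row indexes (descending)
   /// and the number of pages that had to be decoded.
   fn last_n_matches(pages: &[Page], target: i64, n: usize) -> (Vec<usize>, usize) {
       // Index of the first row of each page within the column.
       let mut starts = Vec::with_capacity(pages.len());
       let mut next = 0;
       for p in pages {
           starts.push(next);
           next += p.values.len();
       }
       let mut rows = Vec::new();
       let mut decoded = 0;
       for &i in candidate_pages(pages, target).iter().rev() {
           if rows.len() >= n {
               break;
           }
           decoded += 1;
           for (offset, &v) in pages[i].values.iter().enumerate().rev() {
               if v == target && rows.len() < n {
                   rows.push(starts[i] + offset);
               }
           }
       }
       (rows, decoded)
   }
   ```

   How many pages get decoded depends on where the matches fall, which the statistics alone cannot reveal: a page can survive pruning and still contribute nothing.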

