Posted to dev@arrow.apache.org by SHI BEI <sh...@foxmail.com> on 2023/01/11 02:34:02 UTC
Predicate Pushdown/Arrow-rs Usage Question
Hi arrow community,
I'm new to the Arrow project and am trying to use Arrow and Parquet in a C/C++ project. To improve query performance, I plan to take advantage of Parquet row-group-level and page-level statistics when querying data, but the GLib/C++ SDK lacks an implementation of Parquet predicate pushdown. I have noticed that some work is in progress to support Parquet predicate pushdown, but it will take some time. So I would like to know whether it is possible to use arrow-rs instead, and whether anyone has experience with the same scenario. Any help would be appreciated!
SHI BEI
shibei.lh@foxmail.com
Re: Predicate Pushdown/Arrow-rs Usage Question
Posted by Adam Lippai <ad...@rigo.sk>.
Row group level predicate pushdowns should be supported in both C++ and
Rust. What’s the use case / query you want to speed up?
Page index and bloom filters are brand new and low level in arrow-rs, but
there is support for them. AFAIK C++ doesn’t have full standard coverage
for either.
Best regards,
Adam Lippai
Re: Predicate Pushdown/Arrow-rs Usage Question
Posted by Raphael Taylor-Davies <r....@googlemail.com.INVALID>.
Hi Shi
Arrow-rs has full support for predicate pushdown and late materialisation. You can find some more information about it here [1].
You may also be able to use DataFusion for inspiration [2].
Feel free to get in touch should you run into any issues.
Kind Regards,
Raphael
[1]: https://arrow.apache.org/blog/2022/12/26/querying-parquet-with-millisecond-latency/
[2]: https://github.com/apache/arrow-datafusion/blob/master/datafusion/core/src/physical_plan/file_format/parquet.rs
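For readers unfamiliar with the idea discussed in this thread, row-group-level pruning can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the arrow-rs or parquet crate API: the names RowGroupStats, Predicate, may_match and select_row_groups are hypothetical, and it assumes a single integer column whose per-row-group min/max statistics have already been read from the file footer.

```rust
// Hypothetical types for illustration only; these are NOT the parquet crate's API.

/// Min/max statistics for one integer column in one row group
/// (assumed to have been read from the Parquet footer already).
#[derive(Debug, Clone, Copy)]
struct RowGroupStats {
    min: i64,
    max: i64,
}

/// A simple comparison predicate on that single column.
#[derive(Debug, Clone, Copy)]
enum Predicate {
    Eq(i64),
    Lt(i64),
    Gt(i64),
}

/// Returns true if the row group MIGHT contain a matching row.
/// A false result proves no row can match, so the group can be skipped
/// without reading or decoding any of its pages.
fn may_match(stats: RowGroupStats, pred: Predicate) -> bool {
    match pred {
        // The value must lie inside the [min, max] range.
        Predicate::Eq(v) => stats.min <= v && v <= stats.max,
        // Some value below v exists only if the minimum is below v.
        Predicate::Lt(v) => stats.min < v,
        // Some value above v exists only if the maximum is above v.
        Predicate::Gt(v) => stats.max > v,
    }
}

/// Returns the indices of the row groups that still need to be scanned.
fn select_row_groups(stats: &[RowGroupStats], pred: Predicate) -> Vec<usize> {
    stats
        .iter()
        .enumerate()
        .filter(|(_, s)| may_match(**s, pred))
        .map(|(i, _)| i)
        .collect()
}
```

With three row groups covering 0..=99, 100..=199 and 200..=299, the predicate Eq(150) leaves only the second group to scan. Pruning is conservative: a surviving row group may still contain no matching rows, so the predicate has to be evaluated again on the decoded data. Page-level pruning applies the same test to per-page statistics.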