Posted to jira@arrow.apache.org by "David Li (Jira)" <ji...@apache.org> on 2021/11/09 15:55:00 UTC

[jira] [Commented] (ARROW-14642) [C++] ScanNode is not using the filter expression

    [ https://issues.apache.org/jira/browse/ARROW-14642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17441243#comment-17441243 ] 

David Li commented on ARROW-14642:
----------------------------------

Is this the same as ARROW-13498, at least for the filter part?

The scan can't guarantee that it applies the filter, or that it applies it fully, so in general you need to filter again afterwards. (For example, with CSV you can't push down any filter at all; the only filtering would be based on partitioning info.)
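
To illustrate why the second filter is needed, here is a minimal, library-free sketch. None of these names are Arrow APIs: Row, Fragment, Filter, BestEffortScan, FilterRows and ScanThenFilter are hypothetical stand-ins, and the data is assumed to be partitioned by a "year" column. The scan prunes whole fragments using partition info only (roughly the CSV case), so rows that fail the rest of the predicate still come through until an explicit filter step runs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not Arrow types.
struct Row { int year; int value; };
struct Fragment { int year; std::vector<Row> rows; };  // one fragment per partition

// Toy predicate: row.year == year && row.value > min_value.
struct Filter {
  int year;
  int min_value;
  bool Matches(const Row& r) const { return r.year == year && r.value > min_value; }
};

// Best-effort scan: it can only skip whole fragments based on the partition
// key. It never looks at individual rows, so its output may still contain
// rows that do not satisfy the full filter.
std::vector<Row> BestEffortScan(const std::vector<Fragment>& fragments, const Filter& f) {
  std::vector<Row> out;
  for (const auto& frag : fragments) {
    if (frag.year != f.year) continue;  // partition pruning only
    out.insert(out.end(), frag.rows.begin(), frag.rows.end());
  }
  return out;
}

// Exact row-level filter, applied after the scan.
std::vector<Row> FilterRows(const std::vector<Row>& rows, const Filter& f) {
  std::vector<Row> out;
  for (const auto& r : rows) {
    if (f.Matches(r)) out.push_back(r);
  }
  return out;
}

// Scan, then filter again: only after this second step does every row
// satisfy the predicate.
std::vector<Row> ScanThenFilter(const std::vector<Fragment>& fragments, const Filter& f) {
  std::vector<Row> result = FilterRows(BestEffortScan(fragments, f), f);
  for (const auto& r : result) assert(f.Matches(r));
  return result;
}
```

In an actual plan the equivalent would presumably be a filter node placed downstream of the scan node, using the same expression that was set on the ScanOptions.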

The crash should be fixed, though.

> [C++] ScanNode is not using the filter expression
> -------------------------------------------------
>
>                 Key: ARROW-14642
>                 URL: https://issues.apache.org/jira/browse/ARROW-14642
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>    Affects Versions: 7.0.0
>            Reporter: Percy Camilo Triveño Aucahuasi
>            Assignee: Percy Camilo Triveño Aucahuasi
>            Priority: Major
>              Labels: dataset, query-engine
>             Fix For: 7.0.0
>
>
> The ScanNode cannot apply predicate push-down; it seems there are some fragments that are not able to use, or do not properly set up, the filter expression.
> {code:c++}
> auto options = std::make_shared<ScanOptions>();
> options->filter = filter_expression;
> compute::MakeExecNode("scan", plan.get(), {}, ScanNodeOptions{dataset, options});
> {code}
> Also, if the ScanOptions doesn't have a projection expression, the code just crashes; I think I should create a default expression that includes all the fields from the dataset schema.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)