Posted to user@arrow.apache.org by 1057445597 <10...@qq.com> on 2022/10/24 15:01:53 UTC

Does Scanner Filter support the expression column_name != NULL?

My filter expression:
expression->ToString() returns: (predict_model != null[string])
This is how I built the expression:


auto null_expr = arrow::compute::Expression(MakeNullScalar(arrow::utf8()));

call(not_equal(field_ref("predict_model"), null_expr))


I then use this expression as the scanner filter, but end up with an empty batch:


if (!dataset()->filter_.empty()) {
  auto scanner_builder =
      arrow::dataset::ScannerBuilder::FromRecordBatchReader(
          batch_reader);
  scanner_builder->Filter(dataset()->filter_expr_);
  auto scanner_result = scanner_builder->Finish();
  if (!scanner_result.ok()) {
    res = errors::Internal(scanner_result.status().ToString());
    break;
  }
  auto scanner = scanner_result.ValueOrDie();
  auto batch_reader_result = scanner->ToRecordBatchReader();
  if (!batch_reader_result.ok()) {
    res = errors::Internal(batch_reader_result.status().ToString());
    break;
  }
  batch_reader = batch_reader_result.ValueOrDie();
}

arrow_status = batch_reader->ReadNext(&batch);




batch == nullptr


Is there another way to keep only the rows where the column is not null?
1057445597
1057445597@qq.com

Re: Does Scanner Filter support the expression column_name != NULL?

Posted by Weston Pace <we...@gmail.com>.
To check for null you can use the `is_null` function:

```
import pyarrow as pa
import pyarrow.compute as pc
import pyarrow.dataset as ds

tab = pa.Table.from_pydict({"x": [1, 2, 3, None], "y": ["a", "b", "c", "d"]})
filtered = ds.dataset(tab).to_table(filter=pc.is_null(pc.field("x")))
print(filtered)
```

Does that help?
