Posted to user@arrow.apache.org by 1057445597 <10...@qq.com> on 2022/10/24 15:01:53 UTC
Does the Scanner filter support the expression column_name != NULL?
My filter expression:
expression->ToString() returns: (predict_model != null[string])
This is how I built the expression:
auto null_expr = arrow::compute::Expression(MakeNullScalar(arrow::utf8()));
call(not_equal(field_ref("predict_model"), null_expr))
I then use this expression as the scanner filter, but end up with an empty batch:
if (!dataset()->filter_.empty()) {
  auto scanner_builder =
      arrow::dataset::ScannerBuilder::FromRecordBatchReader(batch_reader);
  scanner_builder->Filter(dataset()->filter_expr_);
  auto scanner_result = scanner_builder->Finish();
  if (!scanner_result.ok()) {
    res = errors::Internal(scanner_result.status().ToString());
    break;
  }
  auto scanner = scanner_result.ValueOrDie();
  auto batch_reader_result = scanner->ToRecordBatchReader();
  if (!batch_reader_result.ok()) {
    res = errors::Internal(batch_reader_result.status().ToString());
    break;
  }
  batch_reader = batch_reader_result.ValueOrDie();
}
arrow_status = batch_reader->ReadNext(&batch);
Result: batch == nullptr
Is there another way to keep only the rows that are not null?
1057445597
1057445597@qq.com
Re: Does the Scanner filter support the expression column_name != NULL?
Posted by Weston Pace <we...@gmail.com>.
To check for null you can use the `is_null` function:
```
import pyarrow as pa
import pyarrow.compute as pc
import pyarrow.dataset as ds
tab = pa.Table.from_pydict({"x": [1, 2, 3, None], "y": ["a", "b", "c", "d"]})
filtered = ds.dataset(tab).to_table(filter=pc.is_null(pc.field("x")))
print(filtered)
```
Does that help?