Posted to jira@arrow.apache.org by "Andrew Lamb (Jira)" <ji...@apache.org> on 2021/04/26 12:49:02 UTC
[jira] [Closed] (ARROW-11846) [Rust] Specify behavior of filter kernel on `null`
[ https://issues.apache.org/jira/browse/ARROW-11846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Lamb closed ARROW-11846.
-------------------------------
Resolution: Invalid
> [Rust] Specify behavior of filter kernel on `null`
> --------------------------------------------------
>
> Key: ARROW-11846
> URL: https://issues.apache.org/jira/browse/ARROW-11846
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Rust
> Reporter: Ben Chambers
> Priority: Minor
>
> Currently, the behavior of `filter` is undefined when the filter contains null values.
> This causes problems when the `boolean` filter array contains `null` values. For instance, I wrote a `null_to_false` helper that has to manipulate the underlying buffers to combine the null bits with false. The C++ `filter` kernel allows specifying the behavior on nulls. Thoughts on adding a method that takes an additional parameter to configure the behavior, and then picking a "default" behavior for the existing implementation?
> {code:java}
> pub enum NullFilterBehavior {
>     // Include values where the filter was NULL.
>     EMIT,
>     // Exclude values where the filter was NULL.
>     SKIP,
>     // Ignore the null bits. Behavior is undefined.
>     UNDEFINED,
> }
>
> pub struct FilterConfig {
>     null_behavior: NullFilterBehavior,
> }
>
> impl Default for FilterConfig {
>     fn default() -> Self {
>         Self {
>             null_behavior: NullFilterBehavior::UNDEFINED,
>         }
>     }
> }
>
> // Existing entry point: delegates to the configurable variant with the default behavior.
> pub fn filter(array: &Array, filter: &BooleanArray) -> Result<ArrayRef> {
>     filter_config(array, filter, FilterConfig::default())
> }
>
> // Proposed entry point that takes an explicit null-handling configuration.
> pub fn filter_config(array: &Array, filter: &BooleanArray, config: FilterConfig) -> Result<ArrayRef> {
>     ...
> }
> {code}
> It seems such a method could be implemented by letting the `BitChunksIterator` AND / OR each chunk before passing it to the `BitSlices` iterator.
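
The AND / OR idea above can be sketched on plain 64-bit words. This is only an illustration under stated assumptions: it does not use the Arrow `BitChunksIterator` or `BitSlices` types, the names `NullBehavior`, `combine_chunk`, and `combine_chunks` are hypothetical, and validity bits are assumed to follow the Arrow convention (1 = valid, 0 = null).

```rust
/// Hypothetical stand-in for the proposed `NullFilterBehavior`
/// (the `UNDEFINED` case is omitted because it needs no combining).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum NullBehavior {
    /// Treat a null filter slot as `true` (keep the row).
    Emit,
    /// Treat a null filter slot as `false` (drop the row).
    Skip,
}

/// Combine one 64-bit chunk of filter values with the matching chunk of
/// validity bits (1 = valid, 0 = null) into an effective selection mask.
pub fn combine_chunk(values: u64, validity: u64, behavior: NullBehavior) -> u64 {
    match behavior {
        // Selected only if the slot is valid AND its value is true.
        NullBehavior::Skip => values & validity,
        // Selected if the slot is null OR its value is true.
        // Note: bits past the end of the array also become 1 here, so a
        // real implementation would have to mask the final chunk.
        NullBehavior::Emit => values | !validity,
    }
}

/// Apply `combine_chunk` pairwise over two equally long chunk slices.
pub fn combine_chunks(values: &[u64], validity: &[u64], behavior: NullBehavior) -> Vec<u64> {
    values
        .iter()
        .zip(validity.iter())
        .map(|(&v, &n)| combine_chunk(v, n, behavior))
        .collect()
}
```

With filter values `0b1010` and validity `0b0110`, `Skip` gives `0b0010`, which matches what a `null_to_false` helper would produce, while `Emit` gives `0b1011` in the low four bits. The combined mask could then be handed to whatever iterates over runs of set bits.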
--
This message was sent by Atlassian Jira
(v8.3.4#803005)