Posted to issues@arrow.apache.org by "Nicolas Trinquier (JIRA)" <ji...@apache.org> on 2019/02/18 21:38:00 UTC
[jira] [Comment Edited] (ARROW-4605) [Rust] Move filter and limit
code from DataFusion into compute module
[ https://issues.apache.org/jira/browse/ARROW-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16771388#comment-16771388 ]
Nicolas Trinquier edited comment on ARROW-4605 at 2/18/19 9:37 PM:
-------------------------------------------------------------------
[~andygrove] did you mean to move the functions as they are? Both of them filter the data in some way, so what do you think of re-implementing them on top of one more generic function?
Something along these lines:
{code:java}
// Keeps the values of `a` whose index satisfies `predicate`.
fn filter(a: &Array, predicate: impl Fn(usize) -> bool) -> Result<ArrayRef> {
    ...
    for i in 0..a.len() {
        if predicate(i) {
            builder.append_value(a.value(i))?;
        }
    }
    ...
}
{code}
Predicates would look like this:
{code:java}
let limit_predicate = |index: usize| index < limit_value;
let filter_predicate = |index: usize| filter_bools.value(index);
{code}
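To make the idea concrete without depending on the Arrow types, here is a minimal sketch that uses plain slices and a Vec as stand-ins for Array and the builder. The names filter_by_index, limit and filter_mask are hypothetical and only illustrate how both operations could share one predicate-driven loop; treating a mask that is shorter than the data as "false" for the missing entries is my assumption, not something the issue specifies.

```rust
/// Hypothetical generic helper: keeps the values whose index satisfies `predicate`.
/// A slice stands in for an Arrow array and a Vec for the array builder.
fn filter_by_index<T: Clone>(values: &[T], predicate: impl Fn(usize) -> bool) -> Vec<T> {
    let mut kept = Vec::new();
    for (index, value) in values.iter().enumerate() {
        if predicate(index) {
            kept.push(value.clone());
        }
    }
    kept
}

/// Limit expressed as an index predicate.
fn limit(values: &[i32], limit_value: usize) -> Vec<i32> {
    filter_by_index(values, |index| index < limit_value)
}

/// Boolean-mask filter expressed as an index predicate.
/// Assumption: indices past the end of the mask are treated as `false`.
fn filter_mask(values: &[i32], filter_bools: &[bool]) -> Vec<i32> {
    filter_by_index(values, |index| filter_bools.get(index).copied().unwrap_or(false))
}
```

With this sketch, limit(&[10, 20, 30, 40], 2) gives [10, 20], and filter_mask(&[10, 20, 30, 40], &[true, false, true, false]) gives [10, 30].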
I do not know a/ whether this pattern is very rustacean, and b/ whether the abstraction is worth it (e.g. in the case of limit, the generic version would still allocate a buffer for the full size and iterate through all the elements, whereas a dedicated implementation could save space and return early).
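The cost concern for limit can be sketched as follows, again with plain slices instead of Arrow arrays. All names here (filter_full_scan, limit_early, count_predicate_calls) are hypothetical, and the full-size allocation is written out deliberately to mirror the worry above rather than to describe how the compute module behaves.

```rust
use std::cell::Cell;

/// Hypothetical generic version: allocates for the full input length and
/// evaluates the predicate once per element, whatever the predicate is.
fn filter_full_scan<T: Clone>(values: &[T], predicate: impl Fn(usize) -> bool) -> Vec<T> {
    let mut kept = Vec::with_capacity(values.len());
    for (index, value) in values.iter().enumerate() {
        if predicate(index) {
            kept.push(value.clone());
        }
    }
    kept
}

/// Hypothetical dedicated limit: copies only the first `limit_value` elements
/// and never looks at the rest of the input.
fn limit_early<T: Clone>(values: &[T], limit_value: usize) -> Vec<T> {
    let n = limit_value.min(values.len());
    values[..n].to_vec()
}

/// Counts how often the limit predicate runs when it goes through the generic path.
fn count_predicate_calls(values: &[i32], limit_value: usize) -> usize {
    let calls = Cell::new(0usize);
    let _ = filter_full_scan(values, |index| {
        calls.set(calls.get() + 1);
        index < limit_value
    });
    calls.get()
}
```

For a five-element input and a limit of 2, the generic path evaluates the predicate five times, while limit_early copies two elements and stops. Whether that difference matters in practice depends on typical array sizes, which this sketch does not measure.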
> [Rust] Move filter and limit code from DataFusion into compute module
> ---------------------------------------------------------------------
>
> Key: ARROW-4605
> URL: https://issues.apache.org/jira/browse/ARROW-4605
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Rust
> Affects Versions: 0.12.0
> Reporter: Andy Grove
> Priority: Major
> Fix For: 0.13.0
>
>
> FilterRelation and the new LimitRelation (in ARROW-4464) contain code for filtering and limiting arrays that could now be pushed down into the compute module.