You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@arrow.apache.org by "Arthur Maciejewicz (Jira)" <ji...@apache.org> on 2019/09/17 15:28:00 UTC

[jira] [Created] (ARROW-6583) Question and Request for Examples of Array Operations

Arthur Maciejewicz created ARROW-6583:
-----------------------------------------

             Summary: Question and Request for Examples of Array Operations
                 Key: ARROW-6583
                 URL: https://issues.apache.org/jira/browse/ARROW-6583
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Rust
            Reporter: Arthur Maciejewicz


Hi all, thank you for your excellent work on Arrow.

As I was going through the example for the Rust Arrow implementation, specifically the read_csv example [https://github.com/apache/arrow/blob/master/rust/arrow/examples/read_csv.rs] , as well as the generated Rustdocs, and unit tests, it was not quite clear what the intended usage is for operations such as filtering and masking over Arrays.

One particular use-case I'm interested in is finding all values in an Array such that x >= N for all x. I came across arrow::compute::array_ops::filter, which seems to be similar to what I want, although it's expecting a mask to already be constructed before performing the filter operation, and it was not obviously visible in the documentation, leading me to believe this might not be idiomatic usage.

More generally, is the expectation for Arrays on the Rust side that they are just simple data abstractions, without exposing higher-order methods such as filtering/masking? Is the intent to leave that to users? If I missed some piece of documentation, please let me know. For my use-case I ended up trying something like:


{code:java}
let column = batch.column(0).as_any().downcast_ref::<Float64Array>().unwrap();
let mut builder = BooleanBuilder::new(batch.num_rows());
let N = 5.0;
for i in 0..batch.num_rows() {
   if column.value(i).unwrap() > N {
      builder.append_value(true).unwrap();
   } else {
      builder.append_value(false).unwrap();
   }
}

let mask = builder.finish();
let filtered_column = filter(column, mask);{code}

If possible, could you provide examples of intended usage of Arrays? Thank you!

 

 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)