You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/09/18 02:01:32 UTC

[GitHub] [arrow] jorgecarleitao opened a new pull request #8215: ARROW-9977: [Rust] Added min/max of [Large]StringArray

jorgecarleitao opened a new pull request #8215:
URL: https://github.com/apache/arrow/pull/8215


   Note that, contrarily to horizontal operations, that we can represent as `f(&ArrayRef) -> ArrayRef`, in vertical operations like `min` and `sum`, we do not have a dynamically-typed struct to return, and thus need to have a function signature for each type. :(
   
   I wonder if we could have in Rust a dynamically struct for single values, like DataFusion has ScalarValue.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #8215: ARROW-9977: [Rust] Added min/max of [Large]StringArray

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8215:
URL: https://github.com/apache/arrow/pull/8215#issuecomment-694605111


   https://issues.apache.org/jira/browse/ARROW-9977


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on pull request #8215: ARROW-9977: [Rust] Added min/max of [Large]StringArray

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #8215:
URL: https://github.com/apache/arrow/pull/8215#issuecomment-695812820


   This was merged as part of #8172 and will thus be closed. I also marked the respective Jira issue as done.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao closed pull request #8215: ARROW-9977: [Rust] Added min/max of [Large]StringArray

Posted by GitBox <gi...@apache.org>.
jorgecarleitao closed pull request #8215:
URL: https://github.com/apache/arrow/pull/8215


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on pull request #8215: ARROW-9977: [Rust] Added min/max of [Large]StringArray

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #8215:
URL: https://github.com/apache/arrow/pull/8215#issuecomment-695812820


   This was merged as part of #8172 and will thus be closed. I also marked the respective Jira issue as done.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao closed pull request #8215: ARROW-9977: [Rust] Added min/max of [Large]StringArray

Posted by GitBox <gi...@apache.org>.
jorgecarleitao closed pull request #8215:
URL: https://github.com/apache/arrow/pull/8215


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on pull request #8215: ARROW-9977: [Rust] Added min/max of [Large]StringArray

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #8215:
URL: https://github.com/apache/arrow/pull/8215#issuecomment-694614886


   Rebased.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on a change in pull request #8215: ARROW-9977: [Rust] Added min/max of [Large]StringArray

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #8215:
URL: https://github.com/apache/arrow/pull/8215#discussion_r491188026



##########
File path: rust/arrow/src/compute/kernels/aggregate.rs
##########
@@ -19,9 +19,44 @@
 
 use std::ops::Add;
 
-use crate::array::{Array, PrimitiveArray};
+use crate::array::{
+    Array, LargeStringArray, PrimitiveArray, StringArray, StringArrayOps,
+};
 use crate::datatypes::ArrowNumericType;
 
+/// Helper macro to perform min/max of strings
+macro_rules! min_max_string_helper {
+    ($array:expr, $cmp:tt) => {{
+        let null_count = $array.null_count();
+
+        if null_count == $array.len() {
+            return None
+        }
+        let mut n = "";
+        let mut has_value = false;

Review comment:
       I think you could also use an option here rather than a flag + value.
   
   So something like `let mut n = None;`
   
   And then instead of 
   
   ```
                       has_value = true;
                       n = item;
   ``` 
   something like
   ```
                       n = Some(item);
   ```

##########
File path: rust/arrow/src/compute/kernels/aggregate.rs
##########
@@ -19,9 +19,44 @@
 
 use std::ops::Add;
 
-use crate::array::{Array, PrimitiveArray};
+use crate::array::{
+    Array, LargeStringArray, PrimitiveArray, StringArray, StringArrayOps,
+};
 use crate::datatypes::ArrowNumericType;
 
+/// Helper macro to perform min/max of strings
+macro_rules! min_max_string_helper {
+    ($array:expr, $cmp:tt) => {{
+        let null_count = $array.null_count();
+
+        if null_count == $array.len() {
+            return None
+        }
+        let mut n = "";
+        let mut has_value = false;
+        let data = $array.data();
+
+        if null_count == 0 {
+            for i in 0..data.len() {
+                let item = $array.value(i);
+                if !has_value || (&n $cmp &item) {
+                    has_value = true;
+                    n = item;
+                }
+            }
+        } else {
+            for i in 0..data.len() {
+                let item = $array.value(i);
+                if data.is_valid(i) && (!has_value || (&n $cmp &item)) {
+                    has_value = true;
+                    n = item;
+                }
+            }
+        }
+        Some(n)

Review comment:
       then you could return n directly




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org