You are viewing a plain text version of this content. The canonical link for it is here.

Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/07/27 18:11:29 UTC

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #786: Remove GroupByScalar and use ScalarValue in preparation for supporting null values in GroupBy

alamb commented on a change in pull request #786:
URL: https://github.com/apache/arrow-datafusion/pull/786#discussion_r677687067



##########
File path: datafusion/src/physical_plan/hash_aggregate.rs
##########
@@ -1088,132 +1086,17 @@ fn finalize_aggregation(
     }
 }
 
-/// Extract the value in `col[row]` from a dictionary a GroupByScalar
-fn dictionary_create_group_by_value<K: ArrowDictionaryKeyType>(
-    col: &ArrayRef,
-    row: usize,
-) -> Result<GroupByScalar> {
-    let dict_col = col.as_any().downcast_ref::<DictionaryArray<K>>().unwrap();
-
-    // look up the index in the values dictionary
-    let keys_col = dict_col.keys();
-    let values_index = keys_col.value(row).to_usize().ok_or_else(|| {
-        DataFusionError::Internal(format!(
-            "Can not convert index to usize in dictionary of type creating group by value {:?}",
-            keys_col.data_type()
-        ))
-    })?;
-
-    create_group_by_value(dict_col.values(), values_index)
-}
-
 /// Extract the value in `col[row]` as a GroupByScalar
-fn create_group_by_value(col: &ArrayRef, row: usize) -> Result<GroupByScalar> {
-    match col.data_type() {

Review comment:
       Note that none of this code that created `GroupByScalar` checks `col.is_valid(row)` and thus is using whatever value happens to be in the array slots when they are NULL




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org