Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/11/17 16:55:38 UTC

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #4255: Support parquet page filtering on min_max for `decimal128` and `string` columns

alamb commented on code in PR #4255:
URL: https://github.com/apache/arrow-datafusion/pull/4255#discussion_r1025451489


##########
datafusion/core/src/physical_plan/file_format/parquet/page_filter.rs:
##########
@@ -390,16 +405,50 @@ macro_rules! get_min_max_values_for_page_index {
         match $self.col_page_indexes {
             Index::NONE => None,
             Index::INT32(index) => {
-                let vec = &index.indexes;
-                Some(Arc::new(Int32Array::from_iter(
-                    vec.iter().map(|x| x.$func().cloned()),
-                )))
+                match $self.target_type {
+                    // int32 to decimal with the precision and scale
+                    Some(DataType::Decimal128(precision, scale)) => {
+                        let vec = &index.indexes;
+                        if let Ok(arr) = Decimal128Array::from_iter_values(
+                            vec.iter().map(|x| *x.$func().unwrap() as i128),
+                        )
+                        .with_precision_and_scale(*precision, *scale)
+                        {
+                            return Some(Arc::new(arr));
+                        } else {
+                            return None;
+                        }
+                    }
+                    _ => {
+                        let vec = &index.indexes;
+                        Some(Arc::new(Int32Array::from_iter(
+                            vec.iter().map(|x| x.$func().cloned()),
+                        )))
+                    }
+                }
             }
             Index::INT64(index) => {
-                let vec = &index.indexes;
-                Some(Arc::new(Int64Array::from_iter(
-                    vec.iter().map(|x| x.$func().cloned()),
-                )))
+                match $self.target_type {
+                    // int64 to decimal with the precision and scale
+                    Some(DataType::Decimal128(precision, scale)) => {
+                        let vec = &index.indexes;
+                        if let Ok(arr) = Decimal128Array::from_iter_values(
+                            vec.iter().map(|x| *x.$func().unwrap() as i128),

Review Comment:
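   I wonder if it would be better to follow the model of the code below and, rather than unwrapping (which panics when a page has no min/max value), turn a missing value into `NULL`.
   
   Something like this 🤔 (the array would then likely need to be built with `from_iter` rather than `from_iter_values`, since the items become `Option<i128>`):
   
   ```suggestion
                               vec.iter().map(|x| x.$func().map(|v| *v as i128)),
   ```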
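   
   To make the idea concrete, here is a minimal standalone sketch of mapping a missing page statistic to `None` instead of unwrapping it. `PageStats` and `mins_as_i128` are hypothetical names used only for illustration; they are not the types in the `parquet` crate. The sketch assumes only that the accessor returns `Option<&T>`, as the `.cloned()` call in the original code implies.
   
   ```rust
   // Hypothetical stand-in for one page index entry (not the parquet crate type).
   struct PageStats {
       min: Option<i64>,
   }
   
   impl PageStats {
       // Same shape as an accessor that returns Option<&T>.
       fn min(&self) -> Option<&i64> {
           self.min.as_ref()
       }
   }
   
   // Widen each page minimum to i128. A page without statistics stays None
   // (a NULL slot in the resulting array) instead of panicking on unwrap().
   fn mins_as_i128(pages: &[PageStats]) -> Vec<Option<i128>> {
       pages.iter().map(|p| p.min().map(|v| *v as i128)).collect()
   }
   ```
   
   With this shape, a page that lacks a min/max value yields a NULL entry, so page pruning can treat that page as "unknown" rather than crashing the query.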


