You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/10 16:44:26 UTC

[GitHub] [arrow-rs] nevi-me commented on a change in pull request #258: support full u32 and u64 roundtrip through parquet

nevi-me commented on a change in pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258#discussion_r629513251



##########
File path: parquet/src/column/writer.rs
##########
@@ -925,31 +934,51 @@ impl<T: DataType> ColumnWriterImpl<T> {
     fn update_page_min_max(&mut self, val: &T::T) {
         // simple "isNaN" check that works for all types
         if val == val {
-            if self.min_page_value.as_ref().map_or(true, |min| min > val) {
+            if self
+                .min_page_value
+                .as_ref()
+                .map_or(true, |min| self.compare_greater(min, val))
+            {
                 self.min_page_value = Some(val.clone());
             }
-            if self.max_page_value.as_ref().map_or(true, |max| max < val) {
+            if self
+                .max_page_value
+                .as_ref()
+                .map_or(true, |max| self.compare_greater(val, max))
+            {
                 self.max_page_value = Some(val.clone());
             }
         }
     }
 
     fn update_column_min_max(&mut self) {
-        if self
-            .min_column_value
-            .as_ref()
-            .map_or(true, |min| min > self.min_page_value.as_ref().unwrap())
-        {
+        let update_min = self.min_column_value.as_ref().map_or(true, |min| {
+            let page_value = self.min_page_value.as_ref().unwrap();
+            self.compare_greater(min, page_value)
+        });
+        if update_min {
             self.min_column_value = self.min_page_value.clone();
         }
-        if self
-            .max_column_value
-            .as_ref()
-            .map_or(true, |max| max < self.max_page_value.as_ref().unwrap())
-        {
+
+        let update_max = self.max_column_value.as_ref().map_or(true, |max| {
+            let page_value = self.max_page_value.as_ref().unwrap();
+            self.compare_greater(page_value, max)
+        });
+        if update_max {
             self.max_column_value = self.max_page_value.clone();
         }
     }
+
+    /// Evaluate `a > b` according to underlying logical type.
+    fn compare_greater(&self, a: &T::T, b: &T::T) -> bool {
+        if let Some(LogicalType::INTEGER(int_type)) = self.descr.logical_type() {

Review comment:
       Note: not all implementations might write LogicalType, even though ConvertedType is deprecated. I had to think a bit about this, but it's fine because it's on the write-side, where we will always write LogicalType going forward.
   
   If this was on the read side, we'd have to also check `ConvertedType` as a fallback.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org