Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/06/02 14:58:31 UTC

[GitHub] [arrow-datafusion] Ted-Jiang commented on a diff in pull request #2671: If statistics of column Max/Min value does not exists in parquet file, sent Min/Max to None

Ted-Jiang commented on code in PR #2671:
URL: https://github.com/apache/arrow-datafusion/pull/2671#discussion_r888051780


##########
datafusion/core/src/datasource/file_format/parquet.rs:
##########
@@ -344,6 +362,10 @@ fn fetch_statistics(
                             table_idx,
                             stats,
                         )
+                    } else {
+                        // If none statistics of current column exists, set the Max/Min Accumulator to None.

Review Comment:
   > > If I missed something, please tell me.
   > 
   > Once max_values and min_values are set to None, even if column statistics exist in the next row group, summarize_min_max will do nothing because max_values and min_values are None.
   
   Ohh, thanks, I will check later.
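
A minimal sketch of the behaviour described in the quoted comment, for illustration only. It does not use DataFusion's actual types: `summarize_max` and `fold_column_max` are hypothetical stand-ins, and a plain `Option<i64>` replaces the real Max accumulator. The assumption is that, as the quoted comment says, the summarizing step does nothing once the accumulator slot is `None`.

```rust
/// Hypothetical stand-in for the max half of `summarize_min_max`.
/// If the slot for `col` is `None`, nothing is updated.
fn summarize_max(max_values: &mut [Option<i64>], col: usize, row_group_max: i64) {
    if let Some(current) = max_values[col].as_mut() {
        *current = (*current).max(row_group_max);
    }
    // `None` falls through: the slot stays `None`.
}

/// Folds the per-row-group max of a single column.
/// Each entry of `row_groups` is `Some(max)` when that row group carries
/// statistics for the column, or `None` when it does not.
fn fold_column_max(row_groups: &[Option<i64>]) -> Option<i64> {
    // Seeded with the smallest value so the first real statistic always wins.
    let mut max_values: Vec<Option<i64>> = vec![Some(i64::MIN)];
    for stats in row_groups {
        match stats {
            Some(v) => summarize_max(&mut max_values, 0, *v),
            // Mirrors the `else` branch in the diff: no statistics for this
            // column, so the accumulator is set to `None`.
            None => max_values[0] = None,
        }
    }
    max_values[0]
}
```

Under these assumptions, `fold_column_max(&[Some(3), None, Some(100)])` yields `None`: the third row group has statistics, but the slot was already cleared by the second one, so the later value is never folded in.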
   


