Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/12/27 12:26:14 UTC

[GitHub] [arrow] alamb commented on a change in pull request #8998: ARROW-11018: [Rust][DataFusion] Add support for column-level statistics, null count.

alamb commented on a change in pull request #8998:
URL: https://github.com/apache/arrow/pull/8998#discussion_r549106965



##########
File path: rust/datafusion/src/datasource/datasource.rs
##########
@@ -33,6 +33,14 @@ pub struct Statistics {
     pub num_rows: Option<usize>,
     /// total byte of the table rows
     pub total_byte_size: Option<usize>,
+    /// Statistics on a column level
+    pub column_statistics: Option<Vec<ColumnStatistics>>,
+}
+/// This table statistics are estimates about column

Review comment:
       Eventually these statistics will probably be useful beyond datasources (e.g. for a cost-based optimizer we would probably want estimates attached to the output of every LogicalPlan node).
   
   But this is a good start for now!
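
   To illustrate the idea of attaching estimates to every plan node, here is a rough sketch. It is not DataFusion's API: `PlanNode` and `estimated_statistics` are hypothetical names, the plan enum is heavily simplified, and the `null_count` field on `ColumnStatistics` is assumed from the PR title rather than copied from the diff.

```rust
/// Per-column statistics. The `null_count` field is an assumption based on
/// the PR title, not a copy of the actual struct.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ColumnStatistics {
    pub null_count: Option<usize>,
}

/// Table-level statistics, with the same three fields as the diff above.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Statistics {
    pub num_rows: Option<usize>,
    pub total_byte_size: Option<usize>,
    pub column_statistics: Option<Vec<ColumnStatistics>>,
}

/// Hypothetical, heavily simplified plan node (NOT DataFusion's LogicalPlan).
pub enum PlanNode {
    /// Leaf node: statistics come straight from the datasource.
    Scan { stats: Statistics },
    /// Keeps at most `n` rows of its input.
    Limit { n: usize, input: Box<PlanNode> },
    /// Keeps the input columns at the given indices.
    Projection { columns: Vec<usize>, input: Box<PlanNode> },
}

impl PlanNode {
    /// Estimate the statistics of this node's output from its input's estimates.
    /// Anything that cannot be derived is reported as `None` (unknown).
    pub fn estimated_statistics(&self) -> Statistics {
        match self {
            PlanNode::Scan { stats } => stats.clone(),
            PlanNode::Limit { n, input } => {
                let input_stats = input.estimated_statistics();
                Statistics {
                    // At most `n` rows survive; an unknown input stays unknown.
                    num_rows: input_stats.num_rows.map(|rows| rows.min(*n)),
                    // Byte size and null counts of the kept rows are not derivable here.
                    total_byte_size: None,
                    column_statistics: None,
                }
            }
            PlanNode::Projection { columns, input } => {
                let input_stats = input.estimated_statistics();
                let num_rows = input_stats.num_rows;
                // Carry over statistics for the selected columns only; if any
                // index is out of range, report the column statistics as unknown.
                let column_statistics = input_stats.column_statistics.and_then(|cols| {
                    columns
                        .iter()
                        .map(|&i| cols.get(i).cloned())
                        .collect::<Option<Vec<_>>>()
                });
                Statistics {
                    num_rows,
                    total_byte_size: None,
                    column_statistics,
                }
            }
        }
    }
}
```

   In this sketch each node derives its estimate from its children, so an optimizer could ask any node in the tree for statistics, not just the leaf scans.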



