Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/12/08 13:33:36 UTC

[GitHub] [arrow] alamb commented on a change in pull request #8866: ARROW-10781:[Rust] [DataFusion] add the 'Statistics' interface in data source

alamb commented on a change in pull request #8866:
URL: https://github.com/apache/arrow/pull/8866#discussion_r538371128



##########
File path: rust/datafusion/src/datasource/datasource.rs
##########
@@ -24,6 +24,15 @@ use crate::arrow::datatypes::SchemaRef;
 use crate::error::Result;
 use crate::physical_plan::ExecutionPlan;
 
+/// The table statistics
+#[derive(Clone)]
+pub struct Statistics {
+    /// The number of table rows
+    pub num_rows: i64,

Review comment:
   ```suggestion
       /// The number of table rows
       pub num_rows: u64,
   ```

##########
File path: rust/datafusion/src/datasource/datasource.rs
##########
@@ -24,6 +24,15 @@ use crate::arrow::datatypes::SchemaRef;
 use crate::error::Result;
 use crate::physical_plan::ExecutionPlan;
 
+/// The table statistics

Review comment:
   I suggest clarifying in the doc comment whether the statistics are meant to be a hint or accurate. Specifically, it would help to know whether other parts of the system can rely on them being correct or should treat them simply as a hint.
   
   Maybe @andygrove has some thoughts, as he was the one who filed https://issues.apache.org/jira/browse/ARROW-10781
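
To illustrate the hint-versus-accurate distinction, here is a rough sketch of how the contract could be spelled out in the type itself. It is not a request for this exact shape: the `Option` wrappers, the `is_exact` flag, and the `exact_row_count` helper are hypothetical names that are not part of this PR.

```rust
/// Table statistics (hypothetical sketch, not the definition in this PR).
///
/// Unless `is_exact` is true, every value is a best-effort estimate:
/// callers may use it for planning decisions but must not rely on it
/// to produce query results.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct Statistics {
    /// The number of table rows, or `None` if unknown.
    pub num_rows: Option<u64>,
    /// Total size of the table in bytes, or `None` if unknown.
    pub total_byte_size: Option<u64>,
    /// Assumption for illustration: true only when the source guarantees
    /// that the values above are accurate.
    pub is_exact: bool,
}

/// Returns the row count only when it is safe to treat it as correct.
/// An estimate (or a missing value) yields `None`.
pub fn exact_row_count(stats: &Statistics) -> Option<u64> {
    if stats.is_exact {
        stats.num_rows
    } else {
        None
    }
}
```

With something like this, an optimizer could still read `num_rows` directly as a planning hint, while any code that wants to skip a scan would go through `exact_row_count` and fall back to reading the data when it returns `None`.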

##########
File path: rust/datafusion/src/datasource/datasource.rs
##########
@@ -24,6 +24,15 @@ use crate::arrow::datatypes::SchemaRef;
 use crate::error::Result;
 use crate::physical_plan::ExecutionPlan;
 
+/// The table statistics
+#[derive(Clone)]
+pub struct Statistics {
+    /// The number of table rows
+    pub num_rows: i64,
+    /// total byte of the table rows
+    pub total_byte_size: i64,

Review comment:
   ```suggestion
       /// total size of the table, in bytes
       pub total_byte_size: u64,
   ```



