Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/06/26 13:10:48 UTC

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #2791: Add config option for coalesce_batches physical optimization rule, make optional

alamb commented on code in PR #2791:
URL: https://github.com/apache/arrow-datafusion/pull/2791#discussion_r906813144


##########
datafusion/core/src/execution/context.rs:
##########
@@ -1247,16 +1250,26 @@ impl SessionState {
         rules.push(Arc::new(LimitPushDown::new()));
         rules.push(Arc::new(SingleDistinctToGroupBy::new()));
 
+        let mut physical_optimizers: Vec<Arc<dyn PhysicalOptimizerRule + Sync + Send>> = vec![
+            Arc::new(AggregateStatistics::new()),
+            Arc::new(HashBuildProbeOrder::new()),
+        ];
+        if config.config_options.get_bool(OPT_COALESCE_BATCHES) {

Review Comment:
   I wonder if it would be helpful to add a test to ensure the option to disable coalescing batches doesn't get broken in the future.
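
A rough sketch of what such a test could check, written against stand-in types rather than the real DataFusion API: `ConfigOptions`, `set_bool`, and `physical_optimizer_names` here are hypothetical, and the rule is tracked by a plain string name (`"CoalesceBatches"` is an assumed name for the rule this PR gates). Only the option key and the `get_bool` gate are taken from the diff. A real test would presumably build a `SessionState` with the option off and inspect its physical optimizer list.

```rust
use std::collections::HashMap;

/// Option key as documented in the diff to config.rs.
pub const OPT_COALESCE_BATCHES: &str = "datafusion.execution.coalesce_batches";

/// Hypothetical stand-in for the config options type; not the real API.
#[derive(Default)]
pub struct ConfigOptions {
    bools: HashMap<String, bool>,
}

impl ConfigOptions {
    pub fn set_bool(&mut self, key: &str, value: bool) {
        self.bools.insert(key.to_string(), value);
    }

    /// In this sketch a missing key reads as `false`; the real default may differ.
    pub fn get_bool(&self, key: &str) -> bool {
        self.bools.get(key).copied().unwrap_or(false)
    }
}

/// Mirrors the shape of the diff: two rules are always present, and the
/// coalesce rule is appended only when the option is enabled.
/// Rules are represented by illustrative names instead of trait objects.
pub fn physical_optimizer_names(config: &ConfigOptions) -> Vec<&'static str> {
    let mut rules = vec!["AggregateStatistics", "HashBuildProbeOrder"];
    if config.get_bool(OPT_COALESCE_BATCHES) {
        rules.push("CoalesceBatches");
    }
    rules
}

#[test]
fn coalesce_batches_rule_follows_config() {
    let mut config = ConfigOptions::default();

    // Disabled: the rule must not be registered.
    config.set_bool(OPT_COALESCE_BATCHES, false);
    assert!(!physical_optimizer_names(&config).contains(&"CoalesceBatches"));

    // Enabled: the rule must be registered.
    config.set_bool(OPT_COALESCE_BATCHES, true);
    assert!(physical_optimizer_names(&config).contains(&"CoalesceBatches"));
}
```

Asserting both the enabled and disabled cases guards against the flag silently becoming a no-op in either direction.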



##########
datafusion/core/src/config.rs:
##########
@@ -27,6 +28,13 @@ pub const OPT_FILTER_NULL_JOIN_KEYS: &str = "datafusion.optimizer.filter_null_jo
 /// Configuration option "datafusion.execution.batch_size"
 pub const OPT_BATCH_SIZE: &str = "datafusion.execution.batch_size";
 
+/// Configuration option "datafusion.execution.coalesce_batches"

Review Comment:
   I really like how this new option framework is coming together ❤️ 


