Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/04/26 13:31:27 UTC

[GitHub] [arrow-datafusion] alamb opened a new issue #179: Move SortExec partition check to constructor

alamb opened a new issue #179:
URL: https://github.com/apache/arrow-datafusion/issues/179


   *Note*: migrated from original JIRA: https://issues.apache.org/jira/browse/ARROW-11625
   
   SortExec performs the following error check at execution time. The check could be moved into the try_new constructor so that the error is raised at planning time instead.
   
    
   {code:java}
   if 1 != self.input.output_partitioning().partition_count() {
       return Err(DataFusionError::Internal(
           "SortExec requires a single input partition".to_owned(),
       ));
   }
   {code}
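
For illustration, the proposal might look roughly like the sketch below. It is a self-contained toy model, not DataFusion code: `PlanError`, the `Plan` trait, `Scan`, and `Sort` are hypothetical stand-ins, and the real `SortExec::try_new` takes different arguments.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for DataFusionError (assumption, not the real type).
#[derive(Debug, PartialEq)]
enum PlanError {
    Internal(String),
}

/// Hypothetical, minimal plan trait; the real ExecutionPlan trait is larger.
trait Plan {
    fn partition_count(&self) -> usize;
}

/// Toy leaf node that reports a fixed number of output partitions.
struct Scan {
    partitions: usize,
}

impl Plan for Scan {
    fn partition_count(&self) -> usize {
        self.partitions
    }
}

/// Toy sort node that validates its input when it is constructed.
struct Sort {
    input: Arc<dyn Plan>,
}

impl Sort {
    /// Performs the single-partition check at construction (planning) time
    /// rather than at execution time.
    fn try_new(input: Arc<dyn Plan>) -> Result<Self, PlanError> {
        if input.partition_count() != 1 {
            return Err(PlanError::Internal(
                "SortExec requires a single input partition".to_owned(),
            ));
        }
        Ok(Sort { input })
    }
}

impl Plan for Sort {
    /// The input was validated in try_new, so this always reports one partition.
    fn partition_count(&self) -> usize {
        self.input.partition_count()
    }
}
```

With this shape, `Sort::try_new(Arc::new(Scan { partitions: 4 }))` returns an error as soon as the plan is built, while an input with a single partition is accepted.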


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on issue #179: Move SortExec partition check to constructor

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #179:
URL: https://github.com/apache/arrow-datafusion/issues/179#issuecomment-826837720


   Comment from Hendrik Makait(hendrik.makait) @ 2021-02-14T17:46:12.067+0000:
   <pre>I'd love to check that out tomorrow.</pre>
   
   Comment from Hendrik Makait(hendrik.makait) @ 2021-02-15T20:05:17.782+0000:
   <pre>Moving this check into the constructor leads to failing tests. As far as I can see, this is because the planner does an optimization step that inserts a merge for children that contain multiple partitions.
   {code:java}
   match plan.required_child_distribution() {
       Distribution::UnspecifiedDistribution => plan.with_new_children(children),
       Distribution::SinglePartition => plan.with_new_children(
           children
               .iter()
               .map(|child| {
                   if child.output_partitioning().partition_count() == 1 {
                       child.clone()
                   } else {
                       Arc::new(MergeExec::new(child.clone()))
                   }
               })
               .collect(),
       ),
   }
   {code}
    What's the reason for moving this check into planning time? How should I proceed?</pre>
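
The ordering problem described above can be reproduced in a small toy model. This is only a sketch under stated assumptions: `Node`, `sort_input_is_valid`, `add_merge`, and `sort_input_partitions` are hypothetical names, and `add_merge` only loosely imitates the planner step quoted in the comment. It shows that a sort built directly over a multi-partition child fails an eager check, even though the later merge-insertion pass would have made the same plan valid.

```rust
use std::sync::Arc;

/// Toy version of the distribution requirement from the quoted snippet.
#[derive(Debug, PartialEq)]
enum Distribution {
    Unspecified,
    SinglePartition,
}

/// Hypothetical, heavily simplified plan nodes (not DataFusion types).
#[derive(Debug)]
enum Node {
    Scan { partitions: usize },
    Merge { input: Arc<Node> },
    Sort { input: Arc<Node> },
}

impl Node {
    /// Merge and Sort always emit one partition in this toy model.
    fn partition_count(&self) -> usize {
        match self {
            Node::Scan { partitions } => *partitions,
            Node::Merge { .. } | Node::Sort { .. } => 1,
        }
    }

    fn required_child_distribution(&self) -> Distribution {
        match self {
            Node::Sort { .. } => Distribution::SinglePartition,
            _ => Distribution::Unspecified,
        }
    }
}

/// The check the issue proposes to run in the constructor.
fn sort_input_is_valid(input: &Node) -> bool {
    input.partition_count() == 1
}

/// Planner pass modeled on the quoted snippet: wrap a multi-partition
/// child in a Merge when the parent requires a single partition.
fn add_merge(node: Node) -> Node {
    match (node.required_child_distribution(), node) {
        (Distribution::SinglePartition, Node::Sort { input }) => {
            let child = if input.partition_count() == 1 {
                input
            } else {
                Arc::new(Node::Merge { input })
            };
            Node::Sort { input: child }
        }
        (_, other) => other,
    }
}

/// Helper for inspecting the partition count of a sort's input.
fn sort_input_partitions(node: &Node) -> Option<usize> {
    match node {
        Node::Sort { input } => Some(input.partition_count()),
        _ => None,
    }
}
```

In this model, a `Node::Sort` over a four-partition `Node::Scan` fails `sort_input_is_valid` when first built, but after `add_merge` its input reports a single partition. An eager constructor check would therefore reject plans that only become valid once the pass has run.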
   
   Comment from Andy Grove(andygrove) @ 2021-02-16T14:44:11.773+0000:
   <pre>I see. Ok, maybe we can't do this at planning time then.</pre>

