Posted to issues@arrow.apache.org by "Andy Grove (Jira)" <ji...@apache.org> on 2020/08/22 20:06:00 UTC
[jira] [Created] (ARROW-9832) [Rust] [DataFusion] Refactor PhysicalPlan to remove Partition
Andy Grove created ARROW-9832:
---------------------------------
Summary: [Rust] [DataFusion] Refactor PhysicalPlan to remove Partition
Key: ARROW-9832
URL: https://issues.apache.org/jira/browse/ARROW-9832
Project: Apache Arrow
Issue Type: Improvement
Components: Rust, Rust - DataFusion
Reporter: Andy Grove
Assignee: Andy Grove
Fix For: 2.0.0
As a step towards supporting an improved threading model, I would like to remove the redundant `Partition` trait. Its implementations duplicate the state of their operator and add only the partition number. It would be simpler to pass the partition number directly to the execute() method of the physical plan trait (ExecutionPlan).
This also requires each ExecutionPlan to declare its output partitioning, which the physical optimizer will need for other reasons as well.
Proposed trait:
{code:java}
/// Partition-aware execution plan for a relation
pub trait ExecutionPlan: Debug {
    /// Get the schema for this execution plan
    fn schema(&self) -> SchemaRef;
    /// Specifies the output partitioning of this execution plan
    fn output_partitioning(&self) -> Partitioning;
    /// Execute this plan for a single partition and return a stream of results
    fn execute(&self, partition: usize) -> Result<Arc<Mutex<dyn RecordBatchReader + Send + Sync>>>;
}

/// Partitioning schemes supported by operators.
#[derive(Debug, Clone)]
pub enum Partitioning {
    UnknownPartitioning(usize),
}
{code}
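To illustrate how an operator might look under the proposed trait, here is a minimal, self-contained sketch. It is not the DataFusion implementation. `SchemaRef`, `Batch`, and `Result` are simplified stand-ins for the Arrow types, `execute` returns a boxed iterator instead of `Arc<Mutex<dyn RecordBatchReader + Send + Sync>>`, and `MemoryScan`, `partition_count`, and `sum_per_partition` are hypothetical names introduced only for this example. The sketch also assumes that the `usize` in `UnknownPartitioning` is the number of partitions, and it adds `Send + Sync` bounds so the plan can be shared across threads.

```rust
use std::fmt::Debug;
use std::sync::Arc;
use std::thread;

// Stand-ins for the Arrow types named in the proposal (assumptions, not the real API).
type SchemaRef = Arc<Vec<String>>;
type Batch = Vec<i32>;
type Result<T> = std::result::Result<T, String>;

/// Partitioning schemes supported by operators (as in the proposal).
#[derive(Debug, Clone)]
pub enum Partitioning {
    UnknownPartitioning(usize),
}

impl Partitioning {
    /// Hypothetical helper; assumes the usize is the partition count.
    pub fn partition_count(&self) -> usize {
        match self {
            Partitioning::UnknownPartitioning(n) => *n,
        }
    }
}

/// Simplified form of the proposed trait. `Send + Sync` is an added assumption.
pub trait ExecutionPlan: Debug + Send + Sync {
    fn schema(&self) -> SchemaRef;
    fn output_partitioning(&self) -> Partitioning;
    fn execute(&self, partition: usize) -> Result<Box<dyn Iterator<Item = Batch> + Send>>;
}

/// Hypothetical in-memory scan holding one list of batches per partition.
/// There is no separate per-partition struct duplicating this state.
#[derive(Debug)]
pub struct MemoryScan {
    schema: SchemaRef,
    partitions: Vec<Vec<Batch>>,
}

impl MemoryScan {
    pub fn new(schema: SchemaRef, partitions: Vec<Vec<Batch>>) -> Self {
        Self { schema, partitions }
    }
}

impl ExecutionPlan for MemoryScan {
    fn schema(&self) -> SchemaRef {
        self.schema.clone()
    }

    fn output_partitioning(&self) -> Partitioning {
        Partitioning::UnknownPartitioning(self.partitions.len())
    }

    fn execute(&self, partition: usize) -> Result<Box<dyn Iterator<Item = Batch> + Send>> {
        // The partition index selects the data; an out-of-range index is an error.
        let batches = self
            .partitions
            .get(partition)
            .cloned()
            .ok_or_else(|| format!("invalid partition {}", partition))?;
        Ok(Box::new(batches.into_iter()))
    }
}

/// Runs each partition on its own thread and sums the values it produces.
pub fn sum_per_partition(plan: Arc<dyn ExecutionPlan>) -> Result<Vec<i32>> {
    let n = plan.output_partitioning().partition_count();
    let handles: Vec<_> = (0..n)
        .map(|p| {
            let plan = Arc::clone(&plan);
            thread::spawn(move || -> Result<i32> { Ok(plan.execute(p)?.flatten().sum()) })
        })
        .collect();

    let mut sums = Vec::with_capacity(n);
    for handle in handles {
        // First `?` handles a panicked worker, second `?` handles an execute error.
        let sum = handle
            .join()
            .map_err(|_| "worker thread panicked".to_string())??;
        sums.push(sum);
    }
    Ok(sums)
}
```

In this sketch the caller only needs the plan and a partition index: `output_partitioning()` tells it how many times to call `execute`, and each call can run on a different thread.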
--
This message was sent by Atlassian Jira
(v8.3.4#803005)