Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/06/05 18:28:00 UTC

[GitHub] [arrow-ballista] thinkharderdev commented on a diff in pull request #59: [Draft] Support for multi-scheduler deployments

thinkharderdev commented on code in PR #59:
URL: https://github.com/apache/arrow-ballista/pull/59#discussion_r889722012


##########
ballista/rust/core/proto/ballista.proto:
##########
@@ -622,6 +622,37 @@ enum JoinSide{
 ///////////////////////////////////////////////////////////////////////////////////////////////////
 // Ballista Scheduling
 ///////////////////////////////////////////////////////////////////////////////////////////////////
+message TaskInputPartitions {
+  uint32 partition = 1;
+  repeated PartitionLocation partition_location = 2;
+}
+
+message GraphStageInput {
+  uint32 stage_id = 1;
+  repeated TaskInputPartitions partition_locations = 2;
+  bool complete = 3;
+}
+
+
+message ExecutionGraphStage {
+  uint64 stage_id = 1;
+  uint32 partitions = 2;
+  PhysicalHashRepartition output_partitioning = 3;
+  repeated  GraphStageInput inputs = 4;
+  bytes plan = 5;
+  repeated TaskStatus task_statuses = 6;
+  uint32 output_link = 7;
+  bool resolved = 8;
+}
+
+message ExecutionGraph {
+  string job_id = 1;
+  string session_id = 2;
+  JobStatus status = 3;
+  repeated ExecutionGraphStage stages = 4;

Review Comment:
   No, sorry, I should have explained that. I'll add better docs to this struct, but for now each stage has an `output_link: Option<usize>` which specifies where it sends its output. If `output_link` is `None`, the stage is final and sends its output to the `ExecutionGraph`'s `output_locations`. Likewise, each stage has an `inputs: HashMap<usize, StageOutput>` which "collects" input locations from its input stages.
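
To make the linking concrete, here is a rough sketch of how a finished stage's output could be routed. It is an illustration only, not the code in this PR: the field names `output_link`, `inputs`, and `output_locations` come from the comment above, while the simplified `StageOutput`, the `Stage` struct, the `complete_stage` method, and the use of plain strings for partition locations are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the real `StageOutput` type.
#[derive(Debug, Default)]
struct StageOutput {
    /// Locations reported by the producing stage (plain strings in this sketch).
    locations: Vec<String>,
    /// Set once the producing stage has finished.
    complete: bool,
}

/// Hypothetical, simplified stage: only the two linking fields are modeled.
#[derive(Debug)]
struct Stage {
    /// Id of the stage that consumes this stage's output; `None` marks the final stage.
    output_link: Option<usize>,
    /// Collected inputs, keyed by the id of the producing (input) stage.
    inputs: HashMap<usize, StageOutput>,
}

#[derive(Debug, Default)]
struct ExecutionGraph {
    stages: HashMap<usize, Stage>,
    /// Output of the final stage ends up here.
    output_locations: Vec<String>,
}

impl ExecutionGraph {
    /// Assumed helper: route the output of `stage_id` to wherever its `output_link` points.
    fn complete_stage(&mut self, stage_id: usize, locations: Vec<String>) {
        let link = match self.stages.get(&stage_id) {
            Some(stage) => stage.output_link,
            None => return, // unknown stage: nothing to route
        };
        match link {
            // Intermediate stage: the consumer "collects" the locations in its `inputs`.
            Some(next) => {
                if let Some(consumer) = self.stages.get_mut(&next) {
                    let input = consumer.inputs.entry(stage_id).or_default();
                    input.locations.extend(locations);
                    input.complete = true;
                }
            }
            // Final stage: the output belongs to the graph itself.
            None => self.output_locations.extend(locations),
        }
    }
}
```

With stage 1 linked to stage 2 and stage 2 having `output_link: None`, completing stage 1 fills `stages[2].inputs[1]`, and completing stage 2 fills the graph's `output_locations`.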


