Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2022/09/16 04:23:36 UTC

[GitHub] [druid] LakshSingla commented on a diff in pull request #13062: Use worker number instead of task id in MSQ for communication to/from workers.

LakshSingla commented on code in PR #13062:
URL: https://github.com/apache/druid/pull/13062#discussion_r972600501


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/shuffle/DurableStorageInputChannelFactory.java:
##########
@@ -112,11 +119,33 @@ public ReadableFrameChannel openChannel(StageId stageId, int workerNumber, int p
     catch (Exception e) {
       throw new IOE(
           e,
-          "Could not find remote output of worker task[%s] stage[%d] partition[%d]",
-          workerTaskId,
+          "Could not find remote output of worker task[%d] stage[%d] partition[%d]",
+          workerNumber,
           stageId.getStageNumber(),
           partitionNumber
       );
     }
   }
+
+  @Nullable
+  public String findSuccessfulPartitionOutput(
+      final String controllerTaskId,
+      final int workerNo,
+      final int stageNumber,
+      final int partitionNumber
+  ) throws IOException
+  {
+    List<String> fileNames = storageConnector.lsFiles(
+        DurableStorageOutputChannelFactory.getPartitionOutputsFolderName(
+            controllerTaskId,
+            workerNo,
+            stageNumber,
+            partitionNumber
+        )
+    );
+    Optional<String> maybeFileName = fileNames.stream()

Review Comment:
   > As rows in both of them may not follow the same order we might be in a soup if worker one reads zombie_task files and worker 2 reads good_id files.
   
   I was under the assumption that the row ordering in both of the successful writes should be the same.
   
   In a non-shuffling case, the write doesn't change the row order, so the output of a successful write should be identical to the input (which, if we trace back to the original input source, should have a fixed row order).
   
   In a shuffling case, we might sort the rows, so the output order should match the sort that was applied. There might be some nondeterminism there (which I doubt, since the FrameChannelMerger for both workers should produce the same output), but when we read the sorted data we pass it through the `FrameChannelMerger`, so as long as the rows in the files are identical (even if their order isn't), we shouldn't have an issue.
   
   All of this is only relevant if the successful files have different outputs, which I don't think should be the case 🤔. WDYT?
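
A minimal, hypothetical sketch of the selection idea discussed above: among candidate output files written by possibly-duplicate worker tasks for the same partition, pick any one that completed successfully. The class, the "__success" suffix, and the file names below are illustrative assumptions, not the actual convention or code from this PR.

import java.util.List;
import java.util.Optional;

public class SuccessfulOutputPicker
{
  // Assumed success-marker suffix, for illustration only; the PR may use a different convention.
  private static final String SUCCESS_SUFFIX = "__success";

  // Returns any one successfully-written output file name, or empty if none exist.
  // Per the discussion above, every successful file is expected to hold the same set of rows,
  // so picking the first match is assumed to be safe.
  public static Optional<String> pickSuccessfulOutput(final List<String> fileNames)
  {
    return fileNames.stream()
                    .filter(name -> name.endsWith(SUCCESS_SUFFIX))
                    .findFirst();
  }

  public static void main(String[] args)
  {
    // Two worker tasks (one of them a zombie) may both have written the same partition.
    final List<String> candidates = List.of(
        "part0_worker0_zombie-task__success",
        "part0_worker0_good-task__success",
        "part0_worker0_failed-task.partial"
    );
    System.out.println(pickSuccessfulOutput(candidates).orElse("none found"));
  }
}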



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

