Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/01/03 03:08:52 UTC

[GitHub] jon-wei commented on a change in pull request #6724: Fix issue that tasks failed because of no sink for identifier

URL: https://github.com/apache/incubator-druid/pull/6724#discussion_r244906073
 
 

 ##########
 File path: server/src/main/java/org/apache/druid/segment/realtime/appenderator/AppenderatorImpl.java
 ##########
 @@ -485,8 +486,12 @@ public void clear() throws InterruptedException
     final List<Pair<FireHydrant, SegmentIdentifier>> indexesToPersist = new ArrayList<>();
     int numPersistedRows = 0;
     long bytesPersisted = 0L;
-    for (SegmentIdentifier identifier : sinks.keySet()) {
-      final Sink sink = sinks.get(identifier);
+    Iterator<Map.Entry<SegmentIdentifier, Sink>> iterator = sinks.entrySet().iterator();
+
+    while (iterator.hasNext()) {
 
 Review comment:
   @QiuMM 
   
   > I have never observed any exceptions caused by this. And I think there is no need to worry about it because the program will wait for any outstanding pushes to finish, then abandon the segment inside the persist thread:
   
   I believe you're right that `push` is okay.
   
   In the `persistAll` case, I'm guessing what's happening is something like this:
   1. Task does an incremental publish for some sequence
   2. The publish completes, and `driver.registerHandoff` registers a callback that, once handoff finishes, calls `appenderator.drop` -> `abandonSegment`
   3. Task does another incremental publish for another sequence
   4. Task calls `persistAll`, which iterates through the sinks; during this iteration, the handoff callback from the previous incremental publish finishes and removes a sink, so the subsequent `sinks.get(identifier)` lookup returns null (see the sketch below)
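   
   To make step 4 concrete, here is a minimal, self-contained sketch of the lookup pattern the old loop used (`keySet()` followed by a separate `get()`). This is not Druid code: plain `String`/`Object` stand in for `SegmentIdentifier`/`Sink`, and the removal is done inline rather than from a real handoff thread so the null result is deterministic.
   
   ```java
   import java.util.concurrent.ConcurrentHashMap;
   import java.util.concurrent.ConcurrentMap;
   
   public class SinkLookupRaceSketch
   {
     public static void main(String[] args)
     {
       // Stand-in for the SegmentIdentifier -> Sink map; plain Strings and
       // Objects are used here instead of the real Druid types.
       final ConcurrentMap<String, Object> sinks = new ConcurrentHashMap<>();
       sinks.put("segment-1", new Object());
       sinks.put("segment-2", new Object());
   
       for (String identifier : sinks.keySet()) {
         // Simulate the handoff callback dropping this sink after the iterator
         // handed out the identifier but before the lookup below. In the real
         // scenario this happens from another thread; it is done inline here
         // so the outcome is deterministic.
         if ("segment-2".equals(identifier)) {
           sinks.remove("segment-2");
         }
   
         final Object sink = sinks.get(identifier);
         // Prints "segment-2 -> null": the identifier no longer maps to a sink,
         // i.e. the "no sink for identifier" situation this PR is about.
         System.out.println(identifier + " -> " + sink);
       }
     }
   }
   ```
   
   This also seems to be why the diff above switches to iterating `entrySet()`: the `Sink` then comes from the map entry itself rather than from a second lookup that can observe the concurrent removal.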
   
   For `push`, the input Sinks are read from the Sequence->Segments map instead of the entire set of Sinks, and `StreamAppenderatorDriver.publish` removes the sequence's segments after the publish. So on the next incremental publish, `push` doesn't hit the same problem of accessing Sinks that were handled/dropped as part of the previous incremental publish.
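   
   As a rough illustration of that difference (hypothetical names and a deliberately simplified structure, not the actual `StreamAppenderatorDriver` API), a per-sequence map that is consumed at publish time naturally scopes each publish to its own segments:
   
   ```java
   import java.util.ArrayList;
   import java.util.HashMap;
   import java.util.List;
   import java.util.Map;
   
   public class PerSequencePushSketch
   {
     // Hypothetical stand-in for the Sequence -> Segments bookkeeping.
     private final Map<String, List<String>> segmentsBySequence = new HashMap<>();
   
     public void add(String sequenceName, String segmentIdentifier)
     {
       segmentsBySequence.computeIfAbsent(sequenceName, k -> new ArrayList<>())
                         .add(segmentIdentifier);
     }
   
     public List<String> publish(String sequenceName)
     {
       // Push only the segments recorded for this sequence, then forget them,
       // so a later incremental publish never sees sinks that were already
       // handled (and possibly dropped) by an earlier publish.
       final List<String> toPush = segmentsBySequence.remove(sequenceName);
       return toPush == null ? new ArrayList<>() : toPush;
     }
   
     public static void main(String[] args)
     {
       final PerSequencePushSketch driver = new PerSequencePushSketch();
       driver.add("sequence-0", "segment-1");
       driver.add("sequence-1", "segment-2");
   
       System.out.println(driver.publish("sequence-0"));  // [segment-1]
       System.out.println(driver.publish("sequence-0"));  // [] -- already consumed
       System.out.println(driver.publish("sequence-1"));  // [segment-2]
     }
   }
   ```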
   
   Does that sound correct given what you're seeing?
