Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/04/26 23:45:28 UTC

[GitHub] [incubator-druid] jihoonson commented on a change in pull request #7558: BaseAppenderatorDriver: Fix potentially overeager segment cleanup.

URL: https://github.com/apache/incubator-druid/pull/7558#discussion_r279131777
 
 

 ##########
 File path: server/src/main/java/org/apache/druid/segment/realtime/appenderator/BaseAppenderatorDriver.java
 ##########
 @@ -554,39 +554,49 @@ protected AppenderatorDriverAddResult append(
 
             try {
               final Object metadata = segmentsAndMetadata.getCommitMetadata();
+              final ImmutableSet<DataSegment> ourSegments = ImmutableSet.copyOf(segmentsAndMetadata.getSegments());
               final SegmentPublishResult publishResult = publisher.publishSegments(
-                  ImmutableSet.copyOf(segmentsAndMetadata.getSegments()),
+                  ourSegments,
                   metadata == null ? null : ((AppenderatorDriverMetadata) metadata).getCallerMetadata()
               );
 
               if (publishResult.isSuccess()) {
                 log.info("Published segments.");
               } else {
-                if (publishResult.getErrorMsg() == null) {
-                  log.warn(
-                      "Transaction failure while publishing segments. Please check the overlord log."
-                      + " Removing them from deep storage and checking if someone else beat us to publishing."
-                  );
-                } else {
-                  log.warn(
-                      "Transaction failure while publishing segments because of [%s]. Please check the overlord log."
-                      + " Removing them from deep storage and checking if someone else beat us to publishing.",
-                      publishResult.getErrorMsg()
-                  );
-                }
-
-                segmentsAndMetadata.getSegments().forEach(dataSegmentKiller::killQuietly);
+                // Publishing didn't affirmatively succeed. However, segments with our identifiers may still be active
+                // now after all, for two possible reasons:
+                //
+                // 1) A replica may have beat us to publishing these segments. In this case we want to delete the
+                //    segments we pushed (if they had unique paths) to avoid wasting space on deep storage.
+                // 2) We may have actually succeeded, but not realized it due to missing the confirmation response
+                //    from the overlord. In this case we do not want to delete the segments we pushed, since they are
+                //    now live!
 
                 final Set<SegmentIdWithShardSpec> segmentsIdentifiers = segmentsAndMetadata
                     .getSegments()
                     .stream()
                     .map(SegmentIdWithShardSpec::fromDataSegment)
                     .collect(Collectors.toSet());
 
-                if (usedSegmentChecker.findUsedSegments(segmentsIdentifiers)
-                                      .equals(Sets.newHashSet(segmentsAndMetadata.getSegments()))) {
-                  log.info("Our segments really do exist, awaiting handoff.");
+                final Set<DataSegment> activeSegments = usedSegmentChecker.findUsedSegments(segmentsIdentifiers);
+
+                if (activeSegments.equals(ourSegments)) {
+                  log.info("Could not publish segments, but checked and found them already published. Continuing.");
+
+                  // Clean up pushed segments if they are physically disjoint from the published ones (this means
+                  // they were probably pushed by a replica, and with the unique paths option).
+                  final boolean physicallyDisjoint = Sets.intersection(
+                      activeSegments.stream().map(DataSegment::getLoadSpec).collect(Collectors.toSet()),
+                      ourSegments.stream().map(DataSegment::getLoadSpec).collect(Collectors.toSet())
+                  ).isEmpty();
+
+                  if (physicallyDisjoint) {
+                    segmentsAndMetadata.getSegments().forEach(dataSegmentKiller::killQuietly);
+                  }
                 } else {
+                  // Our segments aren't active. Publish failed for some reason. Clean them up and then throw an error.
 
 Review comment:
  Is it still worth printing the error message from `publishResult`, as was done at https://github.com/apache/incubator-druid/pull/7558/files#diff-f18d4b23ee86a1255737ab0b2da247b5L574?
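For context on the question: the branch removed in the diff distinguished a null error message from a populated one when logging the transaction failure. A standalone sketch of that pattern, with a hypothetical `PublishResult` record standing in for Druid's `SegmentPublishResult` and a plain `String` return standing in for the `log.warn` call:

```java
public class PublishFailureLog {
    // Hypothetical stand-in for SegmentPublishResult's failure side.
    record PublishResult(boolean success, String errorMsg) {}

    // Builds the warning text the removed branch logged, including the
    // transaction error message only when one is available.
    static String describeFailure(PublishResult result) {
        if (result.errorMsg() == null) {
            return "Transaction failure while publishing segments."
                + " Please check the overlord log.";
        }
        return String.format(
            "Transaction failure while publishing segments because of [%s]."
            + " Please check the overlord log.",
            result.errorMsg()
        );
    }

    public static void main(String[] args) {
        System.out.println(describeFailure(new PublishResult(false, null)));
        System.out.println(describeFailure(new PublishResult(false, "version mismatch")));
    }
}
```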
