You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by jl...@apache.org on 2022/12/02 22:10:48 UTC

[pinot] branch fix-segment-upload-race-condition updated (f8f6a9fdfe -> d5f494503c)

This is an automated email from the ASF dual-hosted git repository.

jlli pushed a change to branch fix-segment-upload-race-condition
in repository https://gitbox.apache.org/repos/asf/pinot.git


 discard f8f6a9fdfe Fix race condition when 2 segment upload occurred for the same segment
     new d5f494503c Fix race condition when 2 segment upload occurred for the same segment

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (f8f6a9fdfe)
            \
             N -- N -- N   refs/heads/fix-segment-upload-race-condition (d5f494503c)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../java/org/apache/pinot/controller/api/upload/ZKOperator.java     | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[pinot] 01/01: Fix race condition when 2 segment upload occurred for the same segment

Posted by jl...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

jlli pushed a commit to branch fix-segment-upload-race-condition
in repository https://gitbox.apache.org/repos/asf/pinot.git

commit d5f494503ce0a9988a7664c04fa5c52b30b6d9d8
Author: Jack Li(Analytics Engineering) <jl...@jlli-mn1.linkedin.biz>
AuthorDate: Fri Dec 2 14:06:07 2022 -0800

    Fix race condition when 2 segment upload occurred for the same segment
---
 .../pinot/controller/api/upload/ZKOperator.java    | 28 ++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/pinot-controller/src/main/java/org/apache/pinot/controller/api/upload/ZKOperator.java b/pinot-controller/src/main/java/org/apache/pinot/controller/api/upload/ZKOperator.java
index 3d33a358f3..3089a072ea 100644
--- a/pinot-controller/src/main/java/org/apache/pinot/controller/api/upload/ZKOperator.java
+++ b/pinot-controller/src/main/java/org/apache/pinot/controller/api/upload/ZKOperator.java
@@ -261,6 +261,9 @@ public class ZKOperator {
   }
 
   private void checkCRC(HttpHeaders headers, String tableNameWithType, String segmentName, long existingCrc) {
+    if (headers == null) {
+      return;
+    }
     String expectedCrcStr = headers.getHeaderString(HttpHeaders.IF_MATCH);
     if (expectedCrcStr != null) {
       long expectedCrc;
@@ -344,10 +347,27 @@ public class ZKOperator {
       // Release lock. Expected version will be 0 as we hold a lock and no updates could take place meanwhile.
       newSegmentZKMetadata.setSegmentUploadStartTime(-1);
       if (!_pinotHelixResourceManager.updateZkMetadata(tableNameWithType, newSegmentZKMetadata, 0)) {
-        _pinotHelixResourceManager.deleteSegment(tableNameWithType, segmentName);
-        LOGGER.info("Deleted zk entry and segment {} for table {}.", segmentName, tableNameWithType);
-        throw new RuntimeException(
-            String.format("Failed to update ZK metadata for segment: %s of table: %s", segmentFile, tableNameWithType));
+        // There is a race condition when it took too much time for the 1st segment upload to process (due to slow
+        // PinotFS access), which leads to the 2nd attempt of segment upload, and the 2nd segment upload succeeded.
+        // In this case, when the 1st upload comes back, it shouldn't blindly delete the segment when it failed to
+        // update the zk metadata. Instead, the 1st attempt should validate the crc one more time. If crc remains the
+        // same, segment deletion should be skipped.
+        ZNRecord existingSegmentMetadataZNRecord =
+            _pinotHelixResourceManager.getSegmentMetadataZnRecord(tableNameWithType, segmentName);
+        // Check if CRC match when IF-MATCH header is set
+        SegmentZKMetadata segmentZKMetadata = new SegmentZKMetadata(existingSegmentMetadataZNRecord);
+        long existingCrc = segmentZKMetadata.getCrc();
+        try {
+          checkCRC(headers, tableNameWithType, segmentName, existingCrc);
+          LOGGER.info("CRC is the same as the one in ZK. Skip updating the zk metadata for segment: " + segmentName);
+        } catch (ControllerApplicationException e) {
+          LOGGER.error("Failed to validate CRC for segment: " + segmentName, e);
+          _pinotHelixResourceManager.deleteSegment(tableNameWithType, segmentName);
+          LOGGER.info("Deleted zk entry and segment {} for table {}.", segmentName, tableNameWithType);
+          throw new RuntimeException(
+              String.format("Failed to update ZK metadata for segment: %s of table: %s", segmentFile,
+                  tableNameWithType));
+        }
       }
     }
   }


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org