Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/03/15 15:32:29 UTC

[GitHub] [pulsar] dave2wave opened a new issue #14693: JClouds Offloader Fails to Delete S3 Objects

dave2wave opened a new issue #14693:
URL: https://github.com/apache/pulsar/issues/14693


   **Description**
   When an offloaded ledger's retention period has expired, the JClouds call to remove a blob from the BlobStore silently fails.
   
   **To Reproduce**
   Steps to reproduce the behavior:
   1. Set up `broker.conf` to include the following aggressive offloading options:
      ```
      # configure offloading to S3
      managedLedgerOffloadDriver=aws-s3
      managedLedgerMinLedgerRolloverTimeMinutes=1
      managedLedgerMaxLedgerRolloverTimeMinutes=2
      managedLedgerMaxEntriesPerLedger=500000
      managedLedgerOffloadAutoTriggerSizeThresholdBytes=0
      managedLedgerOffloadedReadPriority=tiered-storage-first
      # Default message retention time - assure offloaded s3 bucket files are deleted.
      defaultRetentionTimeInMinutes=10
      offloadersDirectory=./offloaders
      s3ManagedLedgerOffloadBucket={{ s3_bucket }}
      s3ManagedLedgerOffloadRegion={{ s3_region }}
      s3ManagedLedgerOffloadServiceEndpoint={{ s3_url }}
      loadBalancerAutoUnloadSplitBundlesEnabled=false
      ```
   2. Run a Pulsar workload with [OpenMessaging Benchmark](https://github.com/datastax/openmessaging-benchmark) in AWS using the broker configuration above.
   3. Wait for 30 minutes after the workload completes.
   4. Visit your [AWS S3 Console](https://s3.console.aws.amazon.com/s3/buckets) and see that the offloaded blobs are still in your S3 bucket.
   
   **Expected behavior**
   All of the S3 blobs except for the most recent should be removed.
   
   **Additional context**
   I have a fix and will be submitting a PR shortly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] dave2wave commented on issue #14693: JClouds Offloader Fails to Delete S3 Objects

Posted by GitBox <gi...@apache.org>.
dave2wave commented on issue #14693:
URL: https://github.com/apache/pulsar/issues/14693#issuecomment-1068469569


   @michaeljmarshall when I say it silently fails, I mean that the API call to `BlobStore.removeBlobs` fails without throwing an exception. That means the managed ledger thinks the S3 blob is removed when in fact it is orphaned, so the `ManagedLedgerImpl` retry mechanism is never invoked. I proved this by putting `log.info` calls around the `removeBlobs` call and running a full test. I watched the logs, and after Pulsar thought the S3 blobs were gone I observed that all of the blobs were still in the S3 bucket.
   
   Once I applied the changes in the PR, I ran the same test, and when I saw the `log.info` output I was able to confirm in the S3 bucket that the blobs were removed.
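   
   To illustrate the failure mode described above, here is a minimal sketch of a "delete, then verify" guard. It is not the change in the PR and it does not use the JClouds API: `SimpleStore`, `FakeStore`, and `deleteAndVerify` are hypothetical names invented for this example. The idea is only that if a delete call can return normally while the blob is still present, checking for the blob afterwards and throwing turns the silent failure into one that a caller's retry logic can see.
   
   ```java
   import java.util.HashSet;
   import java.util.List;
   import java.util.Set;
   
   public class VerifiedDelete {
       /** Hypothetical stand-in for a blob store; NOT the JClouds BlobStore interface. */
       interface SimpleStore {
           void removeAll(String bucket, List<String> keys); // may fail without throwing
           boolean exists(String bucket, String key);
       }
   
       /** Deletes the keys, then throws if any of them is still present. */
       static void deleteAndVerify(SimpleStore store, String bucket, List<String> keys) {
           store.removeAll(bucket, keys);
           for (String key : keys) {
               if (store.exists(bucket, key)) {
                   throw new IllegalStateException(
                           "Blob still present after delete: " + bucket + "/" + key);
               }
           }
       }
   
       /** In-memory store whose delete can be turned into a silent no-op. */
       static class FakeStore implements SimpleStore {
           final Set<String> blobs = new HashSet<>();
           final boolean silentlyFail;
   
           FakeStore(boolean silentlyFail) {
               this.silentlyFail = silentlyFail;
           }
   
           @Override
           public void removeAll(String bucket, List<String> keys) {
               if (silentlyFail) {
                   return; // simulates a delete that returns normally but removes nothing
               }
               for (String key : keys) {
                   blobs.remove(bucket + "/" + key);
               }
           }
   
           @Override
           public boolean exists(String bucket, String key) {
               return blobs.contains(bucket + "/" + key);
           }
       }
   
       public static void main(String[] args) {
           FakeStore broken = new FakeStore(true);
           broken.blobs.add("bucket/ledger-1");
           try {
               deleteAndVerify(broken, "bucket", List.of("ledger-1"));
               System.out.println("delete reported success");
           } catch (IllegalStateException e) {
               System.out.println("delete failure surfaced: " + e.getMessage());
           }
       }
   }
   ```
   
   The extra existence check costs one more request per blob, so whether it is worth it in a real offloader is a design decision this sketch does not settle.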





[GitHub] [pulsar] michaeljmarshall commented on issue #14693: JClouds Offloader Fails to Delete S3 Objects

Posted by GitBox <gi...@apache.org>.
michaeljmarshall commented on issue #14693:
URL: https://github.com/apache/pulsar/issues/14693#issuecomment-1068444706


   @dave2wave - great find. When you say that it silently fails, how long did you let the broker try to delete the S3 object? I just noticed this retry code:
   
   https://github.com/apache/pulsar/blob/9a88508426cd2f6baf8d25a225f5f27de5bc1a7a/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L3169-L3177
   
   If I am reading it correctly, it will take an average of 5 hours before it considers an offloaded ledger to have failed deletion, and I don't think we log anything until the 10 retries have been attempted. Does that interpretation align with the behavior you're observing?

