Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/03/17 01:46:51 UTC

[GitHub] [druid] zachjsh opened a new pull request #9523: Ability to Delete task logs and segments from Azure Storage

zachjsh opened a new pull request #9523: Ability to Delete task logs and segments from Azure Storage
URL: https://github.com/apache/druid/pull/9523
 
 
   ### Description
   
   * implement ability to delete all task logs, or only the task logs
     written before a particular date, when they are written to Azure storage
   
   * implement ability to delete all segments from Azure deep storage
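
Both features can be pictured as one prefix scan with a filter: deleting everything passes a filter that always matches, and deleting before a date passes a filter on the last-modified time. The sketch below is a minimal in-memory illustration of that idea, not the PR's implementation. `Blob`, the `Map`-backed store, and this three-argument `deleteObjectsInPath` are hypothetical stand-ins for the Azure client and for `AzureUtils`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class DeleteByPredicateSketch
{
  /** Hypothetical stand-in for a stored blob: a name plus a last-modified time in epoch millis. */
  static final class Blob
  {
    final String name;
    final long lastModifiedMillis;

    Blob(String name, long lastModifiedMillis)
    {
      this.name = name;
      this.lastModifiedMillis = lastModifiedMillis;
    }
  }

  /** Deletes every blob under the prefix that matches the filter and returns the deleted names. */
  static List<String> deleteObjectsInPath(Map<String, Blob> store, String prefix, Predicate<Blob> filter)
  {
    List<String> deleted = new ArrayList<>();
    // Iterate over a copy so that removing from the store is safe.
    for (Blob blob : new ArrayList<>(store.values())) {
      if (blob.name.startsWith(prefix) && filter.test(blob)) {
        store.remove(blob.name);
        deleted.add(blob.name);
      }
    }
    return deleted;
  }

  /** Deletes blobs under the prefix whose last-modified time is strictly before the timestamp. */
  static List<String> killOlderThan(Map<String, Blob> store, String prefix, long timestamp)
  {
    return deleteObjectsInPath(store, prefix, blob -> blob.lastModifiedMillis < timestamp);
  }

  /** Deletes every blob under the prefix by using a filter that always matches. */
  static List<String> killAll(Map<String, Blob> store, String prefix)
  {
    return deleteObjectsInPath(store, prefix, blob -> true);
  }

  public static void main(String[] args)
  {
    Map<String, Blob> store = new LinkedHashMap<>();
    store.put("logs/a", new Blob("logs/a", 100L));
    store.put("logs/b", new Blob("logs/b", 200L));
    store.put("segments/x", new Blob("segments/x", 50L));

    System.out.println(killOlderThan(store, "logs/", 150L)); // prints [logs/a]
    System.out.println(killAll(store, "segments/"));         // prints [segments/x]
    System.out.println(store.keySet());                      // prints [logs/b]
  }
}
```

Keeping the scan in one helper means the two operations differ only in the predicate they pass, which keeps each caller short.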
   
   This PR has:
   - [ ] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zachjsh commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage
URL: https://github.com/apache/druid/pull/9523#discussion_r394680498
 
 

 ##########
 File path: extensions-core/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureTaskLogs.java
 ##########
 @@ -151,14 +167,36 @@ private String getTaskReportsKey(String taskid)
   }
 
   @Override
-  public void killAll()
+  public void killAll() throws IOException
   {
-    throw new UnsupportedOperationException("not implemented");
+    log.info("Deleting all task logs from Azure storage location [bucket: %s    prefix: %s].",
+             config.getContainer(), config.getPrefix()
+    );
 
 Review comment:
   done



[GitHub] [druid] zachjsh commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage
URL: https://github.com/apache/druid/pull/9523#discussion_r394680565
 
 

 ##########
 File path: extensions-core/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentKiller.java
 ##########
 @@ -72,9 +86,26 @@ public void kill(DataSegment segment) throws SegmentLoadingException
   }
 
   @Override
-  public void killAll()
+  public void killAll() throws IOException
   {
-    throw new UnsupportedOperationException("not implemented");
+    log.info("Deleting all segment files from Azure storage location [bucket: '%s' prefix: '%s']",
+             segmentConfig.getContainer(), segmentConfig.getPrefix()
+    );
+    try {
+      AzureUtils.deleteObjectsInPath(
+          azureStorage,
+          inputDataConfig,
+          accountConfig,
+          azureCloudBlobIterableFactory,
+          segmentConfig.getContainer(),
+          segmentConfig.getPrefix(),
+          Predicates.alwaysTrue()
+      );
+    }
+    catch (Exception e) {
+      log.error("Error occurred while deleting segment files from s3. Error: %s", e.getMessage());
 
 Review comment:
   done



[GitHub] [druid] zachjsh commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage
URL: https://github.com/apache/druid/pull/9523#discussion_r394680536
 
 

 ##########
 File path: extensions-core/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureTaskLogs.java
 ##########
 @@ -151,14 +167,36 @@ private String getTaskReportsKey(String taskid)
   }
 
   @Override
-  public void killAll()
+  public void killAll() throws IOException
   {
-    throw new UnsupportedOperationException("not implemented");
+    log.info("Deleting all task logs from Azure storage location [bucket: %s    prefix: %s].",
+             config.getContainer(), config.getPrefix()
+    );
+
+    long now = timeSupplier.getAsLong();
+    killOlderThan(now);
   }
 
   @Override
-  public void killOlderThan(long timestamp)
+  public void killOlderThan(long timestamp) throws IOException
   {
-    throw new UnsupportedOperationException("not implemented");
+    log.info("Deleting all task logs from Azure storage location [bucket: '%s' prefix: '%s'] older than %s.",
+             config.getContainer(), config.getPrefix(), new Date(timestamp)
+    );
+    try {
+      AzureUtils.deleteObjectsInPath(
+          azureStorage,
+          inputDataConfig,
+          accountConfig,
+          azureCloudBlobIterableFactory,
+          config.getContainer(),
+          config.getPrefix(),
+          (object) -> object.getLastModifed().getTime() < timestamp
+      );
+    }
+    catch (Exception e) {
+      log.error("Error occurred while deleting task log files from s3. Error: %s", e.getMessage());
 
 Review comment:
   done
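
In the hunk above, `killAll()` reads the current time from `timeSupplier` and delegates to `killOlderThan(now)`. A minimal sketch of why an injected time source is convenient, assuming `timeSupplier` is a `java.util.function.LongSupplier` (the diff does not show its declaration). The class and its list of timestamps are hypothetical and replace the real Azure calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.LongSupplier;

class InjectedClockSketch
{
  // Assumption: the time source is a LongSupplier that returns epoch millis.
  private final LongSupplier timeSupplier;
  // Hypothetical store: one last-modified time per task log.
  private final List<Long> lastModifiedTimes;

  InjectedClockSketch(LongSupplier timeSupplier, List<Long> lastModifiedTimes)
  {
    this.timeSupplier = timeSupplier;
    this.lastModifiedTimes = new ArrayList<>(lastModifiedTimes);
  }

  /** Removes logs modified strictly before the timestamp and returns how many were removed. */
  int killOlderThan(long timestamp)
  {
    int before = lastModifiedTimes.size();
    lastModifiedTimes.removeIf(t -> t < timestamp);
    return before - lastModifiedTimes.size();
  }

  /** Treats "all" as "everything older than now", where now comes from the injected supplier. */
  int killAll()
  {
    return killOlderThan(timeSupplier.getAsLong());
  }

  int remaining()
  {
    return lastModifiedTimes.size();
  }

  public static void main(String[] args)
  {
    // A fixed clock makes the result deterministic; production code could pass System::currentTimeMillis.
    InjectedClockSketch logs = new InjectedClockSketch(() -> 1_000L, Arrays.asList(10L, 999L, 1_000L));
    System.out.println(logs.killAll());   // prints 2
    System.out.println(logs.remaining()); // prints 1
  }
}
```

With a strict `<` comparison, a log whose last-modified time equals the supplied "now" is kept, so in this sketch `killAll()` removes only what is older than the moment it was called.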



[GitHub] [druid] clintropolis commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage
URL: https://github.com/apache/druid/pull/9523#discussion_r394182304
 
 

 ##########
 File path: extensions-core/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentKiller.java
 ##########
 @@ -72,9 +86,26 @@ public void kill(DataSegment segment) throws SegmentLoadingException
   }
 
   @Override
-  public void killAll()
+  public void killAll() throws IOException
   {
-    throw new UnsupportedOperationException("not implemented");
+    log.info("Deleting all segment files from Azure storage location [bucket: '%s' prefix: '%s']",
+             segmentConfig.getContainer(), segmentConfig.getPrefix()
+    );
 
 Review comment:
   nit: formatting
   
   ```suggestion
       log.info(
           "Deleting all segment files from Azure storage location [bucket: '%s' prefix: '%s']",
           segmentConfig.getContainer(),
           segmentConfig.getPrefix()
       );
   ```



[GitHub] [druid] clintropolis commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage
URL: https://github.com/apache/druid/pull/9523#discussion_r394182851
 
 

 ##########
 File path: extensions-core/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentKiller.java
 ##########
 @@ -72,9 +86,26 @@ public void kill(DataSegment segment) throws SegmentLoadingException
   }
 
   @Override
-  public void killAll()
+  public void killAll() throws IOException
   {
-    throw new UnsupportedOperationException("not implemented");
+    log.info("Deleting all segment files from Azure storage location [bucket: '%s' prefix: '%s']",
+             segmentConfig.getContainer(), segmentConfig.getPrefix()
+    );
+    try {
+      AzureUtils.deleteObjectsInPath(
+          azureStorage,
+          inputDataConfig,
+          accountConfig,
+          azureCloudBlobIterableFactory,
+          segmentConfig.getContainer(),
+          segmentConfig.getPrefix(),
+          Predicates.alwaysTrue()
+      );
+    }
+    catch (Exception e) {
+      log.error("Error occurred while deleting segment files from s3. Error: %s", e.getMessage());
 
 Review comment:
   Heh, should probably be
   
   ```suggestion
         log.error("Error occurred while deleting segment files from Azure. Error: %s", e.getMessage());
   ```



[GitHub] [druid] clintropolis merged pull request #9523: Ability to Delete task logs and segments from Azure Storage

Posted by GitBox <gi...@apache.org>.
clintropolis merged pull request #9523: Ability to Delete task logs and segments from Azure Storage
URL: https://github.com/apache/druid/pull/9523
 
 
   



[GitHub] [druid] zachjsh commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage
URL: https://github.com/apache/druid/pull/9523#discussion_r394680588
 
 

 ##########
 File path: extensions-core/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentKiller.java
 ##########
 @@ -72,9 +86,26 @@ public void kill(DataSegment segment) throws SegmentLoadingException
   }
 
   @Override
-  public void killAll()
+  public void killAll() throws IOException
   {
-    throw new UnsupportedOperationException("not implemented");
+    log.info("Deleting all segment files from Azure storage location [bucket: '%s' prefix: '%s']",
+             segmentConfig.getContainer(), segmentConfig.getPrefix()
+    );
 
 Review comment:
   done



[GitHub] [druid] clintropolis commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage
URL: https://github.com/apache/druid/pull/9523#discussion_r394184380
 
 

 ##########
 File path: extensions-core/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureTaskLogs.java
 ##########
 @@ -151,14 +167,36 @@ private String getTaskReportsKey(String taskid)
   }
 
   @Override
-  public void killAll()
+  public void killAll() throws IOException
   {
-    throw new UnsupportedOperationException("not implemented");
+    log.info("Deleting all task logs from Azure storage location [bucket: %s    prefix: %s].",
+             config.getContainer(), config.getPrefix()
+    );
+
+    long now = timeSupplier.getAsLong();
+    killOlderThan(now);
   }
 
   @Override
-  public void killOlderThan(long timestamp)
+  public void killOlderThan(long timestamp) throws IOException
   {
-    throw new UnsupportedOperationException("not implemented");
+    log.info("Deleting all task logs from Azure storage location [bucket: '%s' prefix: '%s'] older than %s.",
+             config.getContainer(), config.getPrefix(), new Date(timestamp)
+    );
 
 Review comment:
   same formatting nit:
   ```suggestion
       log.info(
           "Deleting all task logs from Azure storage location [bucket: '%s' prefix: '%s'] older than %s.",
           config.getContainer(),
           config.getPrefix(),
           new Date(timestamp)
       );
   ```
   Sorry, there isn't a style rule for this. Last we checked it wasn't possible to enforce one, but it's been a while, so it might be worth checking again.



[GitHub] [druid] clintropolis commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage
URL: https://github.com/apache/druid/pull/9523#discussion_r394183307
 
 

 ##########
 File path: extensions-core/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureTaskLogs.java
 ##########
 @@ -151,14 +167,36 @@ private String getTaskReportsKey(String taskid)
   }
 
   @Override
-  public void killAll()
+  public void killAll() throws IOException
   {
-    throw new UnsupportedOperationException("not implemented");
+    log.info("Deleting all task logs from Azure storage location [bucket: %s    prefix: %s].",
+             config.getContainer(), config.getPrefix()
+    );
+
+    long now = timeSupplier.getAsLong();
+    killOlderThan(now);
   }
 
   @Override
-  public void killOlderThan(long timestamp)
+  public void killOlderThan(long timestamp) throws IOException
   {
-    throw new UnsupportedOperationException("not implemented");
+    log.info("Deleting all task logs from Azure storage location [bucket: '%s' prefix: '%s'] older than %s.",
+             config.getContainer(), config.getPrefix(), new Date(timestamp)
+    );
+    try {
+      AzureUtils.deleteObjectsInPath(
+          azureStorage,
+          inputDataConfig,
+          accountConfig,
+          azureCloudBlobIterableFactory,
+          config.getContainer(),
+          config.getPrefix(),
+          (object) -> object.getLastModifed().getTime() < timestamp
+      );
+    }
+    catch (Exception e) {
+      log.error("Error occurred while deleting task log files from s3. Error: %s", e.getMessage());
 
 Review comment:
   same wrong cloud
   ```suggestion
         log.error("Error occurred while deleting task log files from Azure. Error: %s", e.getMessage());
   ```



[GitHub] [druid] clintropolis commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage
URL: https://github.com/apache/druid/pull/9523#discussion_r394183754
 
 

 ##########
 File path: extensions-core/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureTaskLogs.java
 ##########
 @@ -151,14 +167,36 @@ private String getTaskReportsKey(String taskid)
   }
 
   @Override
-  public void killAll()
+  public void killAll() throws IOException
   {
-    throw new UnsupportedOperationException("not implemented");
+    log.info("Deleting all task logs from Azure storage location [bucket: %s    prefix: %s].",
+             config.getContainer(), config.getPrefix()
+    );
 
 Review comment:
   same comment about one-argument-per-line formatting when a call spans multiple lines
   ```suggestion
       log.info(
           "Deleting all task logs from Azure storage location [bucket: %s    prefix: %s].",
           config.getContainer(),
           config.getPrefix()
       );
   ```



[GitHub] [druid] zachjsh commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9523: Ability to Delete task logs and segments from Azure Storage
URL: https://github.com/apache/druid/pull/9523#discussion_r394680467
 
 

 ##########
 File path: extensions-core/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureTaskLogs.java
 ##########
 @@ -151,14 +167,36 @@ private String getTaskReportsKey(String taskid)
   }
 
   @Override
-  public void killAll()
+  public void killAll() throws IOException
   {
-    throw new UnsupportedOperationException("not implemented");
+    log.info("Deleting all task logs from Azure storage location [bucket: %s    prefix: %s].",
+             config.getContainer(), config.getPrefix()
+    );
+
+    long now = timeSupplier.getAsLong();
+    killOlderThan(now);
   }
 
   @Override
-  public void killOlderThan(long timestamp)
+  public void killOlderThan(long timestamp) throws IOException
   {
-    throw new UnsupportedOperationException("not implemented");
+    log.info("Deleting all task logs from Azure storage location [bucket: '%s' prefix: '%s'] older than %s.",
+             config.getContainer(), config.getPrefix(), new Date(timestamp)
+    );
 
 Review comment:
   done
