You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2022/10/28 06:26:54 UTC

[GitHub] [druid] LakshSingla opened a new pull request, #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

LakshSingla opened a new pull request, #13269:
URL: https://github.com/apache/druid/pull/13269

   ### Description
   
   (Updating the PR with docs and tests)
   
   MSQ uses durable storage for storing temporary data and stage outputs in fault-tolerant mode. In case of a successful exit, the durable storage would get cleaned up once these files are no longer required. However if the worker/controller tasks exit abruptly, the durable storage wouldn't get cleaned up automatically which can result in unrequited temporary folders polluting the durable storage. 
   This PR adds an Overlord helper which periodically cleans up these stray directories from the durable storage if enabled.
   
   
   #### Release note
   To be added
   <hr>
   
   ##### Key changed/added classes in this PR
    * `DurableStorageCleaner`
    * `DurableStorageCleanerConfig`
    * `TheirBaz`
   
   <hr>
   
   <!-- Check the items by putting "x" in the brackets for the done things. Not all of these items apply to every PR. Remove the items which are not done or not relevant to the PR. None of the items from the checklist below are strictly necessary, but it would be very helpful if you at least self-review the PR. -->
   
   This PR has:
   
   - [x] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] a release note entry in the PR description.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] added integration tests.
   - [x] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] LakshSingla commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
LakshSingla commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1016391968


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/guice/MSQDurableStorageModule.java:
##########
@@ -47,6 +51,7 @@ public class MSQDurableStorageModule implements DruidModule
   @Inject
   private Properties properties;
 
+  // Dummy constructor so that it can get injected dynamically at runtime

Review Comment:
   I don't have an idea of the same, but I will check. Thanks for the suggestion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
cryptoe commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1015069820


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/DurableStorageCleanerConfig.java:
##########
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.msq.indexing;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.Preconditions;
+
+public class DurableStorageCleanerConfig
+{
+
+  private static final long DEFAULT_INITIAL_DELAY_SECONDS = 86400L;
+  private static final long DEFAULT_DELAY_SECONDS = 86400L;
+
+  /**
+   * Whether the {@link DurableStorageCleaner} helper should be enabled or not
+   */
+  @JsonProperty
+  private final boolean enabled;
+
+  /**
+   * Initial delay in seconds post which the durable storage cleaner should run
+   */
+  @JsonProperty
+  private final long initialDelaySeconds;
+
+  /**
+   * The delay (in seconds) after the last run post which the durable storage cleaner would clean the outputs
+   */
+  @JsonProperty
+  private final long delaySeconds;
+
+  @JsonCreator
+  public DurableStorageCleanerConfig(
+      @JsonProperty("enabled") final Boolean enabled,
+      @JsonProperty("initialDelay") final Long initialDelaySeconds,
+      @JsonProperty("delay") final Long delaySeconds
+  )
+  {
+    this.enabled = enabled != null && enabled;

Review Comment:
   IMHO let's bake it in with disabled and enable it by default in future releases.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] adarshsanjeev commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
adarshsanjeev commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1010377540


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/DurableStorageCleaner.java:
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.msq.indexing;
+
+import com.google.common.collect.Sets;
+import com.google.inject.Inject;
+import org.apache.druid.indexing.overlord.TaskRunner;
+import org.apache.druid.indexing.overlord.TaskRunnerWorkItem;
+import org.apache.druid.indexing.overlord.helpers.OverlordHelper;
+import org.apache.druid.java.util.common.concurrent.ScheduledExecutors;
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.msq.shuffle.DurableStorageOutputChannelFactory;
+import org.apache.druid.storage.StorageConnector;
+import org.joda.time.Duration;
+
+import java.io.IOException;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.stream.Collectors;
+
+/**
+ * This method polls the durable storage for any stray directories, i.e. the ones that donot have a controller task
+ * associated with it and cleans them periodically.
+ * This ensures that the tasks which that have exited abruptly or have failed to clean up the durable storage themselves
+ * donot pollute it with worker outputs and temporary files. See {@link DurableStorageCleanerConfig} for the configs.
+ */
+public class DurableStorageCleaner implements OverlordHelper
+{
+
+  private static final Logger LOG = new Logger(DurableStorageCleaner.class);
+
+  private final DurableStorageCleanerConfig config;
+  private final StorageConnector storageConnector;
+  private final TaskRunner taskRunner;
+
+  @Inject
+  public DurableStorageCleaner(
+      final DurableStorageCleanerConfig config,
+      final StorageConnector storageConnector,
+      final TaskRunner taskRunner
+  )
+  {
+    this.config = config;
+    this.storageConnector = storageConnector;
+    this.taskRunner = taskRunner;
+  }
+
+  @Override
+  public boolean isEnabled()
+  {
+    return config.isEnabled();
+  }
+
+  @Override
+  public void schedule(ScheduledExecutorService exec)
+  {
+    LOG.info("Starting the DurableStorageCleaner with the config [%s]", config);
+
+    ScheduledExecutors.scheduleWithFixedDelay(
+        exec,
+        Duration.standardSeconds(config.getInitialDelaySeconds()),
+        Duration.standardSeconds(config.getDelaySeconds()),
+        () -> {
+          try {
+            Set<String> allDirectories = new HashSet<>(storageConnector.listDir(""));
+            Set<String> runningTaskIds = taskRunner.getRunningTasks()
+                                                   .stream()
+                                                   .map(TaskRunnerWorkItem::getTaskId)
+                                                   .map(DurableStorageOutputChannelFactory::getControllerDirectory)
+                                                   .collect(Collectors.toSet());
+            Set<String> unknownDirectories = Sets.difference(allDirectories, runningTaskIds);
+            LOG.info(
+                "Following directories donot have a corresponding MSQ task associated with it:\n%s\nThese will get cleaned up.",

Review Comment:
   Nit: do not



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
cryptoe commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1012070101


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/DurableStorageCleanerConfig.java:
##########
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.msq.indexing;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.Preconditions;
+
+public class DurableStorageCleanerConfig
+{
+
+  private static final long DEFAULT_INITIAL_DELAY_SECONDS = 86400L;
+  private static final long DEFAULT_DELAY_SECONDS = 86400L;
+
+  /**
+   * Whether the {@link DurableStorageCleaner} helper should be enabled or not
+   */
+  @JsonProperty
+  private final boolean enabled;
+
+  /**
+   * Initial delay in seconds post which the durable storage cleaner should run
+   */
+  @JsonProperty
+  private final long initialDelaySeconds;
+
+  /**
+   * The delay (in seconds) after the last run post which the durable storage cleaner would clean the outputs
+   */
+  @JsonProperty
+  private final long delaySeconds;
+
+  @JsonCreator
+  public DurableStorageCleanerConfig(
+      @JsonProperty("enabled") final Boolean enabled,
+      @JsonProperty("initialDelay") final Long initialDelaySeconds,
+      @JsonProperty("delay") final Long delaySeconds
+  )
+  {
+    this.enabled = enabled != null && enabled;
+    this.initialDelaySeconds = initialDelaySeconds != null ? initialDelaySeconds : DEFAULT_INITIAL_DELAY_SECONDS;

Review Comment:
   Nit: I would be curious to understand why is an InitialDelay required. Maybe I am missing a use case?
   



##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/DurableStorageCleanerConfig.java:
##########
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.msq.indexing;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.Preconditions;
+
+public class DurableStorageCleanerConfig
+{
+
+  private static final long DEFAULT_INITIAL_DELAY_SECONDS = 86400L;
+  private static final long DEFAULT_DELAY_SECONDS = 86400L;
+
+  /**
+   * Whether the {@link DurableStorageCleaner} helper should be enabled or not
+   */
+  @JsonProperty
+  private final boolean enabled;
+
+  /**
+   * Initial delay in seconds post which the durable storage cleaner should run
+   */
+  @JsonProperty
+  private final long initialDelaySeconds;
+
+  /**
+   * The delay (in seconds) after the last run post which the durable storage cleaner would clean the outputs
+   */
+  @JsonProperty
+  private final long delaySeconds;
+
+  @JsonCreator
+  public DurableStorageCleanerConfig(
+      @JsonProperty("enabled") final Boolean enabled,
+      @JsonProperty("initialDelay") final Long initialDelaySeconds,
+      @JsonProperty("delay") final Long delaySeconds
+  )
+  {
+    this.enabled = enabled != null && enabled;

Review Comment:
   Nit: Shouldn't this be enabled by default?



##########
docs/multi-stage-query/reference.md:
##########
@@ -205,6 +205,16 @@ The following table lists the context parameters for the MSQ task engine:
 | sqlTimeZone | Sets the time zone for this connection, which affects how time functions and timestamp literals behave. Use a time zone name like "America/Los_Angeles" or offset like "-08:00".| `druid.sql.planner.sqlTimeZone` on the Broker (default: UTC)|
 | useApproximateCountDistinct | Whether to use an approximate cardinality algorithm for `COUNT(DISTINCT foo)`.| `druid.sql.planner.useApproximateCountDistinct` on the Broker (default: true)|
 
+## Durable Storage
+This section enumates the advantages and performance implications of enabling durable storage while executing MSQ tasks.
+
+|Parameter          |Default                                 | Description          |
+|-------------------|----------------------------------------|----------------------|
+|`druid.msq.intermediate.storage.enable` | false | Whether or not to enable durable storage for the cluster |
+|`druid.msq.intermediate.storage.cleaner.enabled`| false | Whether durable storage cleaner should be enabled for the cluster|

Review Comment:
   We might want to mention that this properties are overlord only. 



##########
extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/output/S3StorageConnector.java:
##########
@@ -127,10 +129,43 @@ public void deleteRecursively(String dirName)
     }
   }
 
+  @Override
+  public List<String> listDir(String dirName)
+  {
+    ListObjectsV2Request listObjectsRequest = new ListObjectsV2Request()
+        .withBucketName(config.getBucket())
+        .withPrefix(objectPath(dirName))
+        .withDelimiter(DELIM);
+
+    List<String> lsResult = new ArrayList<>();
+
+    ListObjectsV2Result objectListing = s3Client.listObjectsV2(listObjectsRequest);

Review Comment:
   Nit: can we add some test cases here ?



##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/DurableStorageCleaner.java:
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.msq.indexing;
+
+import com.google.common.collect.Sets;
+import com.google.inject.Inject;
+import org.apache.druid.indexing.overlord.TaskRunner;
+import org.apache.druid.indexing.overlord.TaskRunnerWorkItem;
+import org.apache.druid.indexing.overlord.helpers.OverlordHelper;
+import org.apache.druid.java.util.common.concurrent.ScheduledExecutors;
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.msq.shuffle.DurableStorageOutputChannelFactory;
+import org.apache.druid.storage.StorageConnector;
+import org.joda.time.Duration;
+
+import java.io.IOException;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.stream.Collectors;
+
+/**
+ * This method polls the durable storage for any stray directories, i.e. the ones that donot have a controller task
+ * associated with it and cleans them periodically.
+ * This ensures that the tasks which that have exited abruptly or have failed to clean up the durable storage themselves
+ * donot pollute it with worker outputs and temporary files. See {@link DurableStorageCleanerConfig} for the configs.
+ */
+public class DurableStorageCleaner implements OverlordHelper
+{
+
+  private static final Logger LOG = new Logger(DurableStorageCleaner.class);
+
+  private final DurableStorageCleanerConfig config;
+  private final StorageConnector storageConnector;
+  private final TaskRunner taskRunner;
+
+  @Inject
+  public DurableStorageCleaner(
+      final DurableStorageCleanerConfig config,
+      final StorageConnector storageConnector,
+      final TaskRunner taskRunner
+  )
+  {
+    this.config = config;
+    this.storageConnector = storageConnector;
+    this.taskRunner = taskRunner;
+  }
+
+  @Override
+  public boolean isEnabled()
+  {
+    return config.isEnabled();
+  }
+
+  @Override
+  public void schedule(ScheduledExecutorService exec)
+  {
+    LOG.info("Starting the DurableStorageCleaner with the config [%s]", config);
+
+    ScheduledExecutors.scheduleWithFixedDelay(
+        exec,
+        Duration.standardSeconds(config.getInitialDelaySeconds()),
+        Duration.standardSeconds(config.getDelaySeconds()),
+        () -> {
+          try {
+            Set<String> allDirectories = new HashSet<>(storageConnector.listDir(""));
+            Set<String> runningTaskIds = taskRunner.getRunningTasks()
+                                                   .stream()
+                                                   .map(TaskRunnerWorkItem::getTaskId)
+                                                   .map(DurableStorageOutputChannelFactory::getControllerDirectory)
+                                                   .collect(Collectors.toSet());
+            Set<String> unknownDirectories = Sets.difference(allDirectories, runningTaskIds);
+            LOG.info(
+                "Following directories donot have a corresponding MSQ task associated with it:\n%s\nThese will get cleaned up.",
+                unknownDirectories
+            );
+            for (String unknownDirectory : unknownDirectories) {
+              storageConnector.deleteRecursively(unknownDirectory);
+            }
+          }
+          catch (IOException e) {
+            throw new RuntimeException("Error while running the scheduled durable storage cleanup helper", e);

Review Comment:
   Does the overlord fail if the exception is thrown?
   
   if yes? then we might want to change this behavior to not cause an overlord switch. 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] LakshSingla commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
LakshSingla commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1009061243


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/DurableStorageCleaner.java:
##########
@@ -0,0 +1,106 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.msq.indexing;
+
+import com.google.common.collect.Sets;
+import com.google.inject.Inject;
+import org.apache.druid.indexing.overlord.TaskRunner;
+import org.apache.druid.indexing.overlord.TaskRunnerWorkItem;
+import org.apache.druid.indexing.overlord.helpers.OverlordHelper;
+import org.apache.druid.java.util.common.concurrent.ScheduledExecutors;
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.msq.shuffle.DurableStorageOutputChannelFactory;
+import org.apache.druid.storage.StorageConnector;
+import org.joda.time.Duration;
+
+import java.io.IOException;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.stream.Collectors;
+
+/**
+ * This method polls the durable storage for any stray directories, i.e. the ones that donot have a controller task
+ * associated with it and cleans them periodically.
+ * This ensures that the tasks which that have exited abruptly or have failed to clean up the durable storage themselves
+ * donot pollute it with worker outputs and temporary files. See {@link DurableStorageCleanerConfig} for the configs.
+ */
+public class DurableStorageCleaner implements OverlordHelper
+{
+
+  private static final Logger LOG = new Logger(DurableStorageCleaner.class);
+
+  private final DurableStorageCleanerConfig config;
+  private final StorageConnector storageConnector;
+  private final TaskRunner taskRunner;
+
+  @Inject
+  public DurableStorageCleaner(
+      final DurableStorageCleanerConfig config,
+      final StorageConnector storageConnector,
+      final TaskRunner taskRunner
+  )
+  {
+    this.config = config;
+    this.storageConnector = storageConnector;
+    this.taskRunner = taskRunner;
+  }
+
+  @Override
+  public boolean isEnabled()
+  {
+    return config.isEnabled();
+  }
+
+  @Override
+  public void schedule(ScheduledExecutorService exec)
+  {
+    LOG.info("Starting the DurableStorageCleaner with the config [%s]", config);
+
+    ScheduledExecutors.scheduleWithFixedDelay(
+        exec,
+        Duration.standardSeconds(config.getInitialDelaySeconds()),
+        Duration.standardSeconds(config.getDelaySeconds()),
+        () -> {
+          try {
+            Set<String> allDirectories = new HashSet<>(storageConnector.ls(""));
+            Set<String> runningTaskIds = taskRunner.getRunningTasks()
+                                                   .stream()
+                                                   .map(TaskRunnerWorkItem::getTaskId)
+                                                   .map(DurableStorageOutputChannelFactory::getControllerDirectory)
+                                                   .collect(Collectors.toSet());
+            Set<String> unknownDirectories = Sets.difference(allDirectories, runningTaskIds);
+            LOG.info(
+                "Following directories donot have a corresponding MSQ task associated with it:\n%s\nThese will get cleaned up.",
+                unknownDirectories
+            );
+            for (String unknownDirectory : unknownDirectories) {
+              LOG.info("");

Review Comment:
   Accidentally left the stray line 😓 . The directories to be deleted are getting logged a couple of lines above this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] LakshSingla commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
LakshSingla commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1012523992


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/DurableStorageCleanerConfig.java:
##########
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.msq.indexing;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.Preconditions;
+
+public class DurableStorageCleanerConfig
+{
+
+  private static final long DEFAULT_INITIAL_DELAY_SECONDS = 86400L;
+  private static final long DEFAULT_DELAY_SECONDS = 86400L;
+
+  /**
+   * Whether the {@link DurableStorageCleaner} helper should be enabled or not
+   */
+  @JsonProperty
+  private final boolean enabled;
+
+  /**
+   * Initial delay in seconds post which the durable storage cleaner should run
+   */
+  @JsonProperty
+  private final long initialDelaySeconds;
+
+  /**
+   * The delay (in seconds) after the last run post which the durable storage cleaner would clean the outputs
+   */
+  @JsonProperty
+  private final long delaySeconds;
+
+  @JsonCreator
+  public DurableStorageCleanerConfig(
+      @JsonProperty("enabled") final Boolean enabled,
+      @JsonProperty("initialDelay") final Long initialDelaySeconds,
+      @JsonProperty("delay") final Long delaySeconds
+  )
+  {
+    this.enabled = enabled != null && enabled;
+    this.initialDelaySeconds = initialDelaySeconds != null ? initialDelaySeconds : DEFAULT_INITIAL_DELAY_SECONDS;

Review Comment:
   I added the `initialDelay` based off the existing TaskLogCleaner's configs. The use case that I thought of was that setting it to a lower value (or 0) for the initial run would ensure that the durable storage is cleaned as soon as the helper is scheduled for an initial cleanup.
   If it is adding complexity, I can remove this and set the initial cleanup to either `0` or `delaySeconds`. WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
cryptoe commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1015069070


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/DurableStorageCleanerConfig.java:
##########
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.msq.indexing;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.Preconditions;
+
+public class DurableStorageCleanerConfig
+{
+
+  private static final long DEFAULT_INITIAL_DELAY_SECONDS = 86400L;
+  private static final long DEFAULT_DELAY_SECONDS = 86400L;
+
+  /**
+   * Whether the {@link DurableStorageCleaner} helper should be enabled or not
+   */
+  @JsonProperty
+  private final boolean enabled;
+
+  /**
+   * Initial delay in seconds post which the durable storage cleaner should run
+   */
+  @JsonProperty
+  private final long initialDelaySeconds;
+
+  /**
+   * The delay (in seconds) after the last run post which the durable storage cleaner would clean the outputs
+   */
+  @JsonProperty
+  private final long delaySeconds;
+
+  @JsonCreator
+  public DurableStorageCleanerConfig(
+      @JsonProperty("enabled") final Boolean enabled,
+      @JsonProperty("initialDelay") final Long initialDelaySeconds,
+      @JsonProperty("delay") final Long delaySeconds
+  )
+  {
+    this.enabled = enabled != null && enabled;
+    this.initialDelaySeconds = initialDelaySeconds != null ? initialDelaySeconds : DEFAULT_INITIAL_DELAY_SECONDS;

Review Comment:
   I think lets keep it simple and remove it. The initial delay can be `delaySeconds`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
cryptoe commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1017834549


##########
core/src/main/java/org/apache/druid/storage/local/LocalFileStorageConnector.java:
##########
@@ -107,6 +111,23 @@ public void deleteRecursively(String dirName) throws IOException
     FileUtils.deleteDirectory(fileWithBasePath(dirName));
   }
 
+  @Override
+  public List<String> listDir(String dirName)
+  {
+    File directory = fileWithBasePath(dirName);
+    if (!directory.exists()) {
+      throw new IAE("No directory exists on path [%s]", dirName);
+    }
+    if (!directory.isDirectory()) {
+      throw new IAE("Cannot list contents of [%s] since it is not a directory", dirName);
+    }
+    File[] files = directory.listFiles();
+    if (files == null) {

Review Comment:
   I think empty files should not return an exception but that can be done as part of a follow up PR



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] a2l007 commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
a2l007 commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1008503684


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/DurableStorageCleaner.java:
##########
@@ -0,0 +1,106 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.msq.indexing;
+
+import com.google.common.collect.Sets;
+import com.google.inject.Inject;
+import org.apache.druid.indexing.overlord.TaskRunner;
+import org.apache.druid.indexing.overlord.TaskRunnerWorkItem;
+import org.apache.druid.indexing.overlord.helpers.OverlordHelper;
+import org.apache.druid.java.util.common.concurrent.ScheduledExecutors;
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.msq.shuffle.DurableStorageOutputChannelFactory;
+import org.apache.druid.storage.StorageConnector;
+import org.joda.time.Duration;
+
+import java.io.IOException;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.stream.Collectors;
+
+/**
+ * This method polls the durable storage for any stray directories, i.e. the ones that donot have a controller task
+ * associated with it and cleans them periodically.
+ * This ensures that the tasks which that have exited abruptly or have failed to clean up the durable storage themselves
+ * donot pollute it with worker outputs and temporary files. See {@link DurableStorageCleanerConfig} for the configs.
+ */
+public class DurableStorageCleaner implements OverlordHelper
+{
+
+  private static final Logger LOG = new Logger(DurableStorageCleaner.class);
+
+  private final DurableStorageCleanerConfig config;
+  private final StorageConnector storageConnector;
+  private final TaskRunner taskRunner;
+
+  @Inject
+  public DurableStorageCleaner(
+      final DurableStorageCleanerConfig config,
+      final StorageConnector storageConnector,
+      final TaskRunner taskRunner
+  )
+  {
+    this.config = config;
+    this.storageConnector = storageConnector;
+    this.taskRunner = taskRunner;
+  }
+
+  @Override
+  public boolean isEnabled()
+  {
+    return config.isEnabled();
+  }
+
+  @Override
+  public void schedule(ScheduledExecutorService exec)
+  {
+    LOG.info("Starting the DurableStorageCleaner with the config [%s]", config);
+
+    ScheduledExecutors.scheduleWithFixedDelay(
+        exec,
+        Duration.standardSeconds(config.getInitialDelaySeconds()),
+        Duration.standardSeconds(config.getDelaySeconds()),
+        () -> {
+          try {
+            Set<String> allDirectories = new HashSet<>(storageConnector.ls(""));
+            Set<String> runningTaskIds = taskRunner.getRunningTasks()
+                                                   .stream()
+                                                   .map(TaskRunnerWorkItem::getTaskId)
+                                                   .map(DurableStorageOutputChannelFactory::getControllerDirectory)
+                                                   .collect(Collectors.toSet());
+            Set<String> unknownDirectories = Sets.difference(allDirectories, runningTaskIds);
+            LOG.info(
+                "Following directories donot have a corresponding MSQ task associated with it:\n%s\nThese will get cleaned up.",
+                unknownDirectories
+            );
+            for (String unknownDirectory : unknownDirectories) {
+              LOG.info("");

Review Comment:
   Can we log the directory name here?



##########
core/src/main/java/org/apache/druid/storage/StorageConnector.java:
##########
@@ -104,4 +105,6 @@
    * @throws IOException
    */
   void deleteRecursively(String path) throws IOException;
+
+  List<String> ls(String dirName) throws IOException;

Review Comment:
   Can we make the method name more readable, `listFiles` maybe? Javadocs and unit tests would be useful for this method as well.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] rohangarg commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
rohangarg commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1016519903


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/guice/MSQDurableStorageModule.java:
##########
@@ -47,6 +51,7 @@ public class MSQDurableStorageModule implements DruidModule
   @Inject
   private Properties properties;
 
+  // Dummy constructor so that it can get injected dynamically at runtime

Review Comment:
   discussed offline - removed constructors and moved the properties injection to a separate method like other extensions. thanks for the changes!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe merged pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
cryptoe merged PR #13269:
URL: https://github.com/apache/druid/pull/13269


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] LakshSingla commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
LakshSingla commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1017462893


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/DurableStorageCleaner.java:
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.msq.indexing;
+
+import com.google.common.collect.Sets;
+import com.google.inject.Inject;
+import org.apache.druid.indexing.overlord.TaskRunner;
+import org.apache.druid.indexing.overlord.TaskRunnerWorkItem;
+import org.apache.druid.indexing.overlord.helpers.OverlordHelper;
+import org.apache.druid.java.util.common.concurrent.ScheduledExecutors;
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.msq.shuffle.DurableStorageOutputChannelFactory;
+import org.apache.druid.storage.StorageConnector;
+import org.joda.time.Duration;
+
+import java.io.IOException;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.stream.Collectors;
+
+/**
+ * This method polls the durable storage for any stray directories, i.e. the ones that donot have a controller task
+ * associated with it and cleans them periodically.
+ * This ensures that the tasks which that have exited abruptly or have failed to clean up the durable storage themselves
+ * donot pollute it with worker outputs and temporary files. See {@link DurableStorageCleanerConfig} for the configs.
+ */
+public class DurableStorageCleaner implements OverlordHelper
+{
+
+  private static final Logger LOG = new Logger(DurableStorageCleaner.class);
+
+  private final DurableStorageCleanerConfig config;
+  private final StorageConnector storageConnector;
+  private final TaskRunner taskRunner;
+
+  @Inject
+  public DurableStorageCleaner(
+      final DurableStorageCleanerConfig config,
+      final StorageConnector storageConnector,
+      final TaskRunner taskRunner
+  )
+  {
+    this.config = config;
+    this.storageConnector = storageConnector;
+    this.taskRunner = taskRunner;
+  }
+
+  @Override
+  public boolean isEnabled()
+  {
+    return config.isEnabled();
+  }
+
+  @Override
+  public void schedule(ScheduledExecutorService exec)
+  {
+    LOG.info("Starting the DurableStorageCleaner with the config [%s]", config);
+
+    ScheduledExecutors.scheduleWithFixedDelay(
+        exec,
+        Duration.standardSeconds(config.getInitialDelaySeconds()),
+        Duration.standardSeconds(config.getDelaySeconds()),
+        () -> {
+          try {
+            Set<String> allDirectories = new HashSet<>(storageConnector.listDir(""));
+            Set<String> runningTaskIds = taskRunner.getRunningTasks()
+                                                   .stream()
+                                                   .map(TaskRunnerWorkItem::getTaskId)
+                                                   .map(DurableStorageOutputChannelFactory::getControllerDirectory)
+                                                   .collect(Collectors.toSet());
+            Set<String> unknownDirectories = Sets.difference(allDirectories, runningTaskIds);
+            LOG.info(
+                "Following directories donot have a corresponding MSQ task associated with it:\n%s\nThese will get cleaned up.",
+                unknownDirectories
+            );
+            for (String unknownDirectory : unknownDirectories) {
+              storageConnector.deleteRecursively(unknownDirectory);
+            }
+          }
+          catch (IOException e) {
+            throw new RuntimeException("Error while running the scheduled durable storage cleanup helper", e);

Review Comment:
   Overlord won't fail if the exception is thrown inside the method. The exception would get logged.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] LakshSingla commented on pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
LakshSingla commented on PR #13269:
URL: https://github.com/apache/druid/pull/13269#issuecomment-1296590889

   Thanks for the review @a2l007. I could not find the documentation for durable storage in the docs (probably since it's a newer feature). I have created a separate section for it and added the relevant configuration documentation for the cleaner. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] rohangarg commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
rohangarg commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1016389633


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/guice/MSQDurableStorageModule.java:
##########
@@ -47,6 +51,7 @@ public class MSQDurableStorageModule implements DruidModule
   @Inject
   private Properties properties;
 
+  // Dummy constructor so that it can get injected dynamically at runtime

Review Comment:
   does moving `@Inject` from member variable `properties` to `MSQDurableStorageModule(Properties properties)` constructor also work, which could make the default constructor obsolete?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] LakshSingla commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
LakshSingla commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1016482579


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/guice/MSQDurableStorageModule.java:
##########
@@ -47,6 +51,7 @@ public class MSQDurableStorageModule implements DruidModule
   @Inject
   private Properties properties;
 
+  // Dummy constructor so that it can get injected dynamically at runtime

Review Comment:
   I tried the suggestion and it didn't work. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
cryptoe commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1017838463


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/DurableStorageCleanerConfig.java:
##########
@@ -19,59 +19,31 @@
 
 package org.apache.druid.msq.indexing;
 
-import com.fasterxml.jackson.annotation.JsonCreator;
 import com.fasterxml.jackson.annotation.JsonProperty;
-import com.google.common.base.Preconditions;
+
+import javax.validation.constraints.Min;
 
 public class DurableStorageCleanerConfig
 {
 
-  private static final long DEFAULT_INITIAL_DELAY_SECONDS = 86400L;
-  private static final long DEFAULT_DELAY_SECONDS = 86400L;
-
   /**
    * Whether the {@link DurableStorageCleaner} helper should be enabled or not
    */
   @JsonProperty
-  private final boolean enabled;
-
-  /**
-   * Initial delay in seconds post which the durable storage cleaner should run
-   */
-  @JsonProperty
-  private final long initialDelaySeconds;
+  public boolean enabled = false;

Review Comment:
   Yeah, it should be private but I think that can be done as part of a follow-up PR. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
cryptoe commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1012082815


##########
docs/multi-stage-query/reference.md:
##########
@@ -205,6 +205,16 @@ The following table lists the context parameters for the MSQ task engine:
 | sqlTimeZone | Sets the time zone for this connection, which affects how time functions and timestamp literals behave. Use a time zone name like "America/Los_Angeles" or offset like "-08:00".| `druid.sql.planner.sqlTimeZone` on the Broker (default: UTC)|
 | useApproximateCountDistinct | Whether to use an approximate cardinality algorithm for `COUNT(DISTINCT foo)`.| `druid.sql.planner.useApproximateCountDistinct` on the Broker (default: true)|
 
+## Durable Storage
+This section enumates the advantages and performance implications of enabling durable storage while executing MSQ tasks.
+
+|Parameter          |Default                                 | Description          |
+|-------------------|----------------------------------------|----------------------|
+|`druid.msq.intermediate.storage.enable` | false | Whether or not to enable durable storage for the cluster |
+|`druid.msq.intermediate.storage.cleaner.enabled`| false | Whether durable storage cleaner should be enabled for the cluster|

Review Comment:
   We might want to mention that these properties are overlord only. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] LakshSingla commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
LakshSingla commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1012524958


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/DurableStorageCleanerConfig.java:
##########
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.msq.indexing;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.Preconditions;
+
+public class DurableStorageCleanerConfig
+{
+
+  private static final long DEFAULT_INITIAL_DELAY_SECONDS = 86400L;
+  private static final long DEFAULT_DELAY_SECONDS = 86400L;
+
+  /**
+   * Whether the {@link DurableStorageCleaner} helper should be enabled or not
+   */
+  @JsonProperty
+  private final boolean enabled;
+
+  /**
+   * Initial delay in seconds post which the durable storage cleaner should run
+   */
+  @JsonProperty
+  private final long initialDelaySeconds;
+
+  /**
+   * The delay (in seconds) after the last run post which the durable storage cleaner would clean the outputs
+   */
+  @JsonProperty
+  private final long delaySeconds;
+
+  @JsonCreator
+  public DurableStorageCleanerConfig(
+      @JsonProperty("enabled") final Boolean enabled,
+      @JsonProperty("initialDelay") final Long initialDelaySeconds,
+      @JsonProperty("delay") final Long delaySeconds
+  )
+  {
+    this.enabled = enabled != null && enabled;

Review Comment:
   I kept it disabled in case the location for the temp storage is also used for storing any other files.
   This might be not be prevalant in local storage but for S3 or other cloud storages, I think it would be better if this property is explicitly setup. I don't have a hard opinion on this but this was my reasoning behind this. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] rohangarg commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
rohangarg commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1015069934


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/guice/MSQDurableStorageModule.java:
##########
@@ -76,6 +75,16 @@ public void configure(Binder binder)
       binder.bind(Key.get(StorageConnector.class, MultiStageQuery.class))
             .toProvider(Key.get(StorageConnectorProvider.class, MultiStageQuery.class))
             .in(LazySingleton.class);
+
+      Multibinder.newSetBinder(binder, OverlordHelper.class)
+                 .addBinding()
+                 .to(OverlordHelper.class);

Review Comment:
   should this be `DurableStorageCleaner`? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] adarshsanjeev commented on a diff in pull request #13269: Add a OverlordHelper that cleans up durable storage objects in MSQ

Posted by GitBox <gi...@apache.org>.
adarshsanjeev commented on code in PR #13269:
URL: https://github.com/apache/druid/pull/13269#discussion_r1017502240


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/DurableStorageCleanerConfig.java:
##########
@@ -19,59 +19,31 @@
 
 package org.apache.druid.msq.indexing;
 
-import com.fasterxml.jackson.annotation.JsonCreator;
 import com.fasterxml.jackson.annotation.JsonProperty;
-import com.google.common.base.Preconditions;
+
+import javax.validation.constraints.Min;
 
 public class DurableStorageCleanerConfig
 {
 
-  private static final long DEFAULT_INITIAL_DELAY_SECONDS = 86400L;
-  private static final long DEFAULT_DELAY_SECONDS = 86400L;
-
   /**
    * Whether the {@link DurableStorageCleaner} helper should be enabled or not
    */
   @JsonProperty
-  private final boolean enabled;
-
-  /**
-   * Initial delay in seconds post which the durable storage cleaner should run
-   */
-  @JsonProperty
-  private final long initialDelaySeconds;
+  public boolean enabled = false;

Review Comment:
   The config properties could be private



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org