You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@gobblin.apache.org by GitBox <gi...@apache.org> on 2020/10/26 16:17:35 UTC

[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #3140: [GOBBLIN-1302] add COMBINE_RETENTION_POLICIES config

sv2000 commented on a change in pull request #3140:
URL: https://github.com/apache/incubator-gobblin/pull/3140#discussion_r512081332



##########
File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/retention/policy/CombineRetentionPolicy.java
##########
@@ -62,10 +66,16 @@
  */
 public class CombineRetentionPolicy<T extends DatasetVersion> implements RetentionPolicy<T> {
 
+  public static final String COMBINE_RETENTION_POLICIES =
+      DatasetCleaner.CONFIGURATION_KEY_PREFIX + "combine.retention.policy.classes";
+  /**
+   * @Deprecated , use COMBINE_RETENTION_POLICIES instead.

Review comment:
       +1. 

##########
File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/HelixUtils.java
##########
@@ -204,7 +204,7 @@ static void waitJobCompletion(HelixManager helixManager, String workFlowName, St
       } else {
         // We have waited for WorkflowContext to get initialized,
         // so it is found null here, it must have been deleted in job cancellation process.
-        log.info("WorkflowContext not found. Job is probably cancelled.");
+        log.info("WorkflowContext not found. Job {} is probably cancelled.", jobName);

Review comment:
       Seems unrelated to the change proposed in this PR. Can we remove this?

##########
File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/retention/policy/CombineRetentionPolicy.java
##########
@@ -110,6 +105,30 @@ public CombineRetentionPolicy(Properties props) throws IOException {
 
   }
 
+  private List<RetentionPolicy<T>> findRetentionPolicies(Properties props) {
+    List<String> retentionPolicyClasses;
+    ImmutableList.Builder<RetentionPolicy<T>> builder = ImmutableList.builder();
+    ClassAliasResolver<?> aliasResolver = new ClassAliasResolver<>(RetentionPolicy.class);
+
+    if (props.containsKey(COMBINE_RETENTION_POLICIES)) {
+      retentionPolicyClasses = COMMA_BASED_SPLITTER.splitToList(props.getProperty(COMBINE_RETENTION_POLICIES));
+    } else {
+      retentionPolicyClasses = PropertiesUtils.getValuesAsList(props, Optional.of(RETENTION_POLICIES_PREFIX));
+    }
+
+    for (String retentionPolicyClass : retentionPolicyClasses) {
+      try {
+        builder.add((RetentionPolicy<T>) ConstructorUtils.invokeConstructor(
+            Class.forName(aliasResolver.resolve(retentionPolicyClass)), props));
+      } catch (NoSuchMethodException | IllegalAccessException | InvocationTargetException | InstantiationException

Review comment:
       catch can be simplified to catch(ReflectiveOperationException).

##########
File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/retention/policy/CombineRetentionPolicy.java
##########
@@ -110,6 +105,30 @@ public CombineRetentionPolicy(Properties props) throws IOException {
 
   }
 
+  private List<RetentionPolicy<T>> findRetentionPolicies(Properties props) {
+    List<String> retentionPolicyClasses;
+    ImmutableList.Builder<RetentionPolicy<T>> builder = ImmutableList.builder();
+    ClassAliasResolver<?> aliasResolver = new ClassAliasResolver<>(RetentionPolicy.class);
+
+    if (props.containsKey(COMBINE_RETENTION_POLICIES)) {
+      retentionPolicyClasses = COMMA_BASED_SPLITTER.splitToList(props.getProperty(COMBINE_RETENTION_POLICIES));

Review comment:
       We should strip leading and trailing whitespaces using Splitter.on(',').trimResults().splitToList(...).

##########
File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/retention/policy/NewestKRetentionPolicy.java
##########
@@ -27,13 +27,15 @@
 import com.google.common.collect.Lists;
 import com.typesafe.config.Config;
 
+import org.apache.gobblin.annotation.Alias;
 import org.apache.gobblin.data.management.retention.DatasetCleaner;
 import org.apache.gobblin.data.management.version.DatasetVersion;
 
 
 /**
  * Retains the newest k versions of the dataset.
  */
+@Alias("NewestK")

Review comment:
       +1. Can we go all the way and add aliases to other retention policies too? There must be 3-4 more of them. 

##########
File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/retention/policy/CombineRetentionPolicy.java
##########
@@ -110,6 +105,30 @@ public CombineRetentionPolicy(Properties props) throws IOException {
 
   }
 
+  private List<RetentionPolicy<T>> findRetentionPolicies(Properties props) {
+    List<String> retentionPolicyClasses;
+    ImmutableList.Builder<RetentionPolicy<T>> builder = ImmutableList.builder();
+    ClassAliasResolver<?> aliasResolver = new ClassAliasResolver<>(RetentionPolicy.class);
+
+    if (props.containsKey(COMBINE_RETENTION_POLICIES)) {
+      retentionPolicyClasses = COMMA_BASED_SPLITTER.splitToList(props.getProperty(COMBINE_RETENTION_POLICIES));

Review comment:
       Perhaps a good idea to do omitEmptyStrings() too. 

##########
File path: gobblin-utility/src/main/java/org/apache/gobblin/util/PropertiesUtils.java
##########
@@ -84,6 +84,20 @@ public static long getPropAsLong(Properties properties, String key, long default
     return LIST_SPLITTER.splitToList(properties.getProperty(key));
   }
 
+  /**
+   * Extract all the values whose keys start with a <code>prefix</code>
+   * @param properties the given {@link Properties} instance
+   * @param prefix of keys to be extracted
+   * @return a list of values in the properties
+   */
+  public static List<String> getValuesAsList(Properties properties, Optional<String> prefix) {

Review comment:
       Can we add a unit test for this method? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org