You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@gobblin.apache.org by "ZihanLi58 (via GitHub)" <gi...@apache.org> on 2023/03/08 20:00:51 UTC

[GitHub] [gobblin] ZihanLi58 commented on a diff in pull request #3656: [GOBBLIN-1797] Skip scheduling flows far into future

ZihanLi58 commented on code in PR #3656:
URL: https://github.com/apache/gobblin/pull/3656#discussion_r1129951496


##########
gobblin-api/src/main/java/org/apache/gobblin/configuration/ConfigurationKeys.java:
##########
@@ -90,7 +90,9 @@ public class ConfigurationKeys {
   public static final String JOB_RETRIGGERING_ENABLED = "job.retriggering.enabled";
   public static final String DEFAULT_JOB_RETRIGGERING_ENABLED = "true";
   public static final String LOAD_SPEC_BATCH_SIZE = "load.spec.batch.size";
-  public static final int DEFAULT_LOAD_SPEC_BATCH_SIZE = 100;
+  public static final int DEFAULT_LOAD_SPEC_BATCH_SIZE = 500;
+  public static final String SKIP_SCHEDULING_FLOWS_AFTER_NUM_DAYS = "skip.scheduling.flows.after.num.days";
+  public static final int DEFAULT_NUM_DAYS_TO_SKIP_AFTER = 100;

Review Comment:
   Given one year to make it safer? 



##########
gobblin-service/src/main/java/org/apache/gobblin/service/modules/scheduler/GobblinServiceJobScheduler.java:
##########
@@ -290,19 +345,30 @@ private void scheduleSpecsFromCatalog() {
 
       while (batchOfSpecsIterator.hasNext()) {
         Spec spec = batchOfSpecsIterator.next();
-        try {
-          addSpecHelperMethod(spec);
-          urisLeftToSchedule.remove(spec.getUri());
-        } catch (Exception e) {
-          // If there is an uncaught error thrown during compilation, log it and continue adding flows
-          _log.error("Could not schedule spec {} from flowCatalog due to ", spec, e);
+        FlowSpec flowSpec = (FlowSpec) spec;
+        String cronExpression = flowSpec.getConfig().getString(ConfigurationKeys.JOB_SCHEDULE_KEY);
+        // Empty string cron expressions should be scheduled by default
+        if (isNextRunWithinRangeToSchedule(cronExpression, this.thresholdToSkipSchedulingFlowsAfter)) {

Review Comment:
   Do you want to remove it from urisLeftToSchedule even if it's not scheduled because of the junk flowspec?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org