You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@gobblin.apache.org by GitBox <gi...@apache.org> on 2021/04/27 19:16:31 UTC

[GitHub] [gobblin] aplex commented on a change in pull request #3268: [GOBBLIN-1437] Segregate FlowConfigs/Delete and FlowExecutions/Delete functions

aplex commented on a change in pull request #3268:
URL: https://github.com/apache/gobblin/pull/3268#discussion_r621508901



##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/job_monitor/KafkaJobMonitor.java
##########
@@ -100,17 +100,29 @@ protected void shutdownMetrics()
   @Override
   protected void processMessage(DecodeableKafkaRecord<byte[],byte[]> message) {
     try {
-      Collection<Either<JobSpec, URI>> parsedCollection = parseJobSpec(message.getValue());
-      for (Either<JobSpec, URI> parsedMessage : parsedCollection) {
-        if (parsedMessage instanceof Either.Left) {
-          this.newSpecs.inc();
-          this.jobCatalog.put(((Either.Left<JobSpec, URI>) parsedMessage).getLeft());
-        } else if (parsedMessage instanceof Either.Right) {
-          this.removedSpecs.inc();
-          URI jobSpecUri = ((Either.Right<JobSpec, URI>) parsedMessage).getRight();
-          this.jobCatalog.remove(jobSpecUri, true);
-          // Delete the job state if it is a delete spec request
-          deleteStateStore(jobSpecUri);
+      Collection<JobSpec> parsedCollection = parseJobSpec(message.getValue());
+      for (JobSpec parsedMessage : parsedCollection) {
+        SpecExecutor.Verb verb = SpecExecutor.Verb.valueOf(parsedMessage.getMetadata().get(SpecExecutor.VERB_KEY));
+
+        switch (verb) {
+          case ADD:
+          case UPDATE:
+          case UNKNOWN: // unknown are considered as add request to maintain backward compatibility

Review comment:
       Treating all unknown messages as "add" seems risky. I'm thinking about the case when we add a new verb, messages are generated for it, and then we need to rollback GaaS. Will the old GaaS see those messages as "unknown" and treat them as "add"?
   
   Also, is this a backward-compatibility with recent version of the service, or  with some ancient one? If we still need this compatibility, does it make sense to log something, so we'll at least see that this branch is executed?

##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/job_catalog/NonObservingFSJobCatalog.java
##########
@@ -105,6 +105,14 @@ public synchronized void remove(URI jobURI) {
       remove(jobURI, false);
   }
 
+  /**
+   * Removes a job from job catalog.
+   * If it is a flow delete request from service side, then alwaysTriggerListeners should be set to true so that a
+   * running job will also be cancelled. If it is a remove request because a job finishes or has been submitted to helix,
+   * then alwaysTriggerListeners should be set to false so that job cancellation is not triggered.
+   * @param jobURI job uri
+   * @param alwaysTriggerListeners if it is true, a running job will be cancelled.

Review comment:
       This architecture looks a bit strange to me. And looks like this API ambiguity was the source of the original bug.
   
   Since removal from the catalog does not always mean cancelling the job, should this job cancellation be moved out of the catalog code and be done explicitly by the callers? E.g. caller can first cancel the job, if they want, and then remove it from the catalog.

##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/job_monitor/KafkaJobMonitor.java
##########
@@ -100,17 +100,29 @@ protected void shutdownMetrics()
   @Override
   protected void processMessage(DecodeableKafkaRecord<byte[],byte[]> message) {
     try {
-      Collection<Either<JobSpec, URI>> parsedCollection = parseJobSpec(message.getValue());
-      for (Either<JobSpec, URI> parsedMessage : parsedCollection) {
-        if (parsedMessage instanceof Either.Left) {
-          this.newSpecs.inc();
-          this.jobCatalog.put(((Either.Left<JobSpec, URI>) parsedMessage).getLeft());
-        } else if (parsedMessage instanceof Either.Right) {
-          this.removedSpecs.inc();
-          URI jobSpecUri = ((Either.Right<JobSpec, URI>) parsedMessage).getRight();
-          this.jobCatalog.remove(jobSpecUri, true);
-          // Delete the job state if it is a delete spec request
-          deleteStateStore(jobSpecUri);
+      Collection<JobSpec> parsedCollection = parseJobSpec(message.getValue());
+      for (JobSpec parsedMessage : parsedCollection) {
+        SpecExecutor.Verb verb = SpecExecutor.Verb.valueOf(parsedMessage.getMetadata().get(SpecExecutor.VERB_KEY));
+
+        switch (verb) {
+          case ADD:
+          case UPDATE:
+          case UNKNOWN: // unknown are considered as add request to maintain backward compatibility
+            this.newSpecs.inc();
+            this.jobCatalog.put(parsedMessage);
+            break;
+          case DELETE:
+            this.removedSpecs.inc();
+            URI jobSpecUri = parsedMessage.getUri();
+            this.jobCatalog.remove(jobSpecUri, true);
+            // Delete the job state if it is a delete spec request
+            deleteStateStore(jobSpecUri);

Review comment:
       What will happen in the service process fails after removal of the job, but before state store is deleted? Will the state store eventually be deleted/cleaned up?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org