You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@gobblin.apache.org by GitBox <gi...@apache.org> on 2020/08/26 21:56:57 UTC

[GitHub] [incubator-gobblin] autumnust opened a new pull request #3092: [GOBBLIN-1251] Propagate exception in TaskStateTracker for caller to trigger Helix retry

autumnust opened a new pull request #3092:
URL: https://github.com/apache/incubator-gobblin/pull/3092


   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!
   
   
   ### JIRA
   - https://issues.apache.org/jira/browse/GOBBLIN-1251
   
   
   ### Description
   - [ ] Here are some details about my PR, including screenshots (if applicable):
   `registerNewTask` called in `org.apache.gobblin.runtime.GobblinMultiTaskAttempt#runWorkUnits(org.apache.gobblin.runtime.CountUpAndDownLatch)` would swallow the exception if scheduling failed before this change. Post the change, the exception-handling logic within `runWorkunits` will handle it and retry if necessary. 
   
   Also did some comment-enhancement and variable renaming which doesn't break atomicity for this PR since they are relevant.
   
   ### Tests
   - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
   
   
   ### Commits
   - [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
       1. Subject is separated from body by a blank line
       2. Subject is limited to 50 characters
       3. Subject does not end with a period
       4. Subject uses the imperative mood ("add", not "adding")
       5. Body wraps at 72 characters
       6. Body explains "what" and "why", not "how"
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #3092: [GOBBLIN-1251] Propagate exception in TaskStateTracker for caller to trigger Helix retry

Posted by GitBox <gi...@apache.org>.
sv2000 commented on a change in pull request #3092:
URL: https://github.com/apache/incubator-gobblin/pull/3092#discussion_r477621504



##########
File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixTaskStateTracker.java
##########
@@ -59,7 +58,11 @@ public void registerNewTask(Task task) {
     try {
       this.scheduledReporters.put(task.getTaskId(), scheduleTaskMetricsUpdater(new TaskMetricsUpdater(task), task));
     } catch (RejectedExecutionException ree) {
+      // Propagate the exception to caller that has full control of the life-cycle of a helix task.
       LOGGER.error(String.format("Scheduling of task state reporter for task %s was rejected", task.getTaskId()));
+      Throwables.propagate(ree);

Review comment:
       One question worth considering: Do we want to make the failure in scheduling TaskMetricsUpdater a fatal failure? or can it be made configurable? 

##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/GobblinMultiTaskAttempt.java
##########
@@ -426,7 +431,7 @@ private boolean taskSuccessfulInPriorAttempt(String taskId) {
         if (task == null) {
           if (e instanceof RetryException) {
             // Indicating task being null due to failure in creation even after retrying.
-            isTaskCreatedSuccessfully = false;
+            areAllTaskSubmitted = false;

Review comment:
       areAllTaskSubmitted -> areAllTasksSubmitted.

##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/GobblinMultiTaskAttempt.java
##########
@@ -385,7 +387,10 @@ private boolean taskSuccessfulInPriorAttempt(String taskId) {
   private Pair<List<Task>, Boolean> runWorkUnits(CountUpAndDownLatch countDownLatch) {
 
     List<Task> tasks = Lists.newArrayList();
-    boolean isTaskCreatedSuccessfully = true;
+
+    // A flag indicating if there are any tasks not submitted successfully.
+    // Caller of this method should handler un-submitted task accordingly.

Review comment:
       handler -> handle.
   
   un-submitted task -> tasks with submission failures




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-gobblin] asfgit closed pull request #3092: [GOBBLIN-1251] Propagate exception in TaskStateTracker for caller to trigger Helix retry

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #3092:
URL: https://github.com/apache/incubator-gobblin/pull/3092


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #3092: [GOBBLIN-1251] Propagate exception in TaskStateTracker for caller to trigger Helix retry

Posted by GitBox <gi...@apache.org>.
sv2000 commented on a change in pull request #3092:
URL: https://github.com/apache/incubator-gobblin/pull/3092#discussion_r478716594



##########
File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixTaskStateTracker.java
##########
@@ -42,24 +45,39 @@
  *
  * @author Yinan Li
  */
-@Alpha
+@Slf4j
 public class GobblinHelixTaskStateTracker extends AbstractTaskStateTracker {
-
-  private static final Logger LOGGER = LoggerFactory.getLogger(GobblinHelixTaskStateTracker.class);
+  @VisibleForTesting
+  static final String IS_TASK_METRICS_SCHEDULING_FAILURE_FATAl = "helixTaskTracker.isNewTaskRegFailureFatal";

Review comment:
       FATAl -> FATAL.

##########
File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixTaskStateTracker.java
##########
@@ -42,24 +45,39 @@
  *
  * @author Yinan Li
  */
-@Alpha
+@Slf4j
 public class GobblinHelixTaskStateTracker extends AbstractTaskStateTracker {
-
-  private static final Logger LOGGER = LoggerFactory.getLogger(GobblinHelixTaskStateTracker.class);
+  @VisibleForTesting
+  static final String IS_TASK_METRICS_SCHEDULING_FAILURE_FATAl = "helixTaskTracker.isNewTaskRegFailureFatal";
+  private static final String DEFAULT_TASK_METRICS_SCHEDULING_FAILURE_FATAl = "false";

Review comment:
       FATAl -> FATAL




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #3092: [GOBBLIN-1251] Propagate exception in TaskStateTracker for caller to trigger Helix retry

Posted by GitBox <gi...@apache.org>.
autumnust commented on a change in pull request #3092:
URL: https://github.com/apache/incubator-gobblin/pull/3092#discussion_r478704131



##########
File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixTaskStateTracker.java
##########
@@ -59,7 +58,11 @@ public void registerNewTask(Task task) {
     try {
       this.scheduledReporters.put(task.getTaskId(), scheduleTaskMetricsUpdater(new TaskMetricsUpdater(task), task));
     } catch (RejectedExecutionException ree) {
+      // Propagate the exception to caller that has full control of the life-cycle of a helix task.
       LOGGER.error(String.format("Scheduling of task state reporter for task %s was rejected", task.getTaskId()));
+      Throwables.propagate(ree);

Review comment:
       addressed. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org