Posted to dev@gobblin.apache.org by GitBox <gi...@apache.org> on 2021/06/05 01:53:40 UTC

[GitHub] [gobblin] aplex opened a new pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

aplex opened a new pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299


   Note: this branch depends on https://github.com/apache/gobblin/pull/3296 and includes changes from it. After the base PR is merged, this one needs to be rebased on the master branch.
   
   In previous commits, we added automatic troubleshooting to Gobblin
   Azkaban jobs. This PR collects the discovered issues and exposes them
   in the Gobblin service.
   
   The initial implementation stores issues for a limited number of jobs in
   memory; future commits will add persistence.
   
   https://issues.apache.org/jira/browse/GOBBLIN-1457


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [gobblin] aplex commented on a change in pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

Posted by GitBox <gi...@apache.org>.
aplex commented on a change in pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299#discussion_r669184965



##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/InMemoryMultiContextIssueRepository.java
##########
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.runtime.troubleshooter;
+
+import java.util.Collections;
+import java.util.List;
+
+import org.apache.commons.collections4.map.LRUMap;
+
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigFactory;
+
+import javax.inject.Inject;
+import javax.inject.Singleton;
+
+import org.apache.gobblin.util.ConfigUtils;
+
+/**
+ * Stores issues from multiple jobs, flows or other contexts in memory.
+ *
+ * To limit the memory consumption, it will keep only the last {@link #MAX_CONTEXT_COUNT} contexts,
+ * and older ones will be discarded.
+ * */
+@Singleton
+public class InMemoryMultiContextIssueRepository implements MultiContextIssueRepository {
+  public static final int DEFAULT_MAX_CONTEXT_COUNT = 100;

Review comment:
       We'll use persistent issue storage in the next PR (https://github.com/apache/gobblin/pull/3327), which will store more issues in a DB. The in-memory repository is a fallback implementation in case we want to reduce the DB load for some reason.
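
       The bounded in-memory behaviour described in the class Javadoc above (keep only the most recent contexts, discard older ones) can be sketched as follows. This is a minimal illustration, not the code from this PR: the PR imports `LRUMap` from commons-collections4, whereas this sketch swaps in a plain `java.util.LinkedHashMap` in access order so that it has no dependencies. The class name `BoundedContextIssueStore`, its methods, and the use of `String` for issues are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of an LRU-bounded, per-context issue store (not the PR's class). */
class BoundedContextIssueStore {
  private final Map<String, List<String>> issuesByContext;

  BoundedContextIssueStore(final int maxContextCount) {
    // accessOrder = true keeps entries ordered from least to most recently used,
    // so removeEldestEntry drops the context that was touched longest ago.
    this.issuesByContext = new LinkedHashMap<String, List<String>>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, List<String>> eldest) {
        return size() > maxContextCount;
      }
    };
  }

  // All methods are synchronized because even get() reorders an access-ordered map.
  synchronized void put(String contextId, String issue) {
    issuesByContext.computeIfAbsent(contextId, k -> new ArrayList<>()).add(issue);
  }

  synchronized List<String> getAll(String contextId) {
    List<String> issues = issuesByContext.get(contextId);
    return issues == null ? Collections.<String>emptyList() : new ArrayList<>(issues);
  }

  synchronized int contextCount() {
    return issuesByContext.size();
  }
}
```

       With a limit of 2, adding issues for a third context evicts whichever of the first two was used least recently, which is the trade-off the comment above accepts for a fallback store: memory stays bounded, but issues for older jobs are lost.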







[GitHub] [gobblin] codecov-commenter edited a comment on pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299#issuecomment-872608040


   # [Codecov](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3299](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (a322606) into [master](https://codecov.io/gh/apache/gobblin/commit/8f5718a00f43771a6ad04ffc4bca0fd86cfa17ec?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (8f5718a) will **decrease** coverage by `4.47%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/gobblin/pull/3299/graphs/tree.svg?width=650&height=150&src=pr&token=4MgURJ0bGc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3299      +/-   ##
   ============================================
   - Coverage     47.50%   43.03%   -4.48%     
   + Complexity     8182     1941    -6241     
   ============================================
     Files          1654      394    -1260     
     Lines         62532    16871   -45661     
     Branches       6792     2072    -4720     
   ============================================
   - Hits          29708     7260   -22448     
   + Misses        30199     8811   -21388     
   + Partials       2625      800    -1825     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../runtime/job\_catalog/NonObservingFSJobCatalog.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvam9iX2NhdGFsb2cvTm9uT2JzZXJ2aW5nRlNKb2JDYXRhbG9nLmphdmE=) | | |
   | [...vice/modules/flowgraph/datanodes/HttpDataNode.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9mbG93Z3JhcGgvZGF0YW5vZGVzL0h0dHBEYXRhTm9kZS5qYXZh) | | |
   | [...\_catalog/PackagedTemplatesJobCatalogDecorator.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvam9iX2NhdGFsb2cvUGFja2FnZWRUZW1wbGF0ZXNKb2JDYXRhbG9nRGVjb3JhdG9yLmphdmE=) | | |
   | [...rg/apache/gobblin/restli/throttling/QPSPolicy.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1yZXN0bGkvZ29iYmxpbi10aHJvdHRsaW5nLXNlcnZpY2UvZ29iYmxpbi10aHJvdHRsaW5nLXNlcnZpY2Utc2VydmVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3Jlc3RsaS90aHJvdHRsaW5nL1FQU1BvbGljeS5qYXZh) | | |
   | [.../gobblin/kafka/client/BaseKafkaConsumerRecord.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2thZmthL2NsaWVudC9CYXNlS2Fma2FDb25zdW1lclJlY29yZC5qYXZh) | | |
   | [.../gobblin/restli/throttling/DynamicTokenBucket.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1yZXN0bGkvZ29iYmxpbi10aHJvdHRsaW5nLXNlcnZpY2UvZ29iYmxpbi10aHJvdHRsaW5nLXNlcnZpY2Utc2VydmVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3Jlc3RsaS90aHJvdHRsaW5nL0R5bmFtaWNUb2tlbkJ1Y2tldC5qYXZh) | | |
   | [...he/gobblin/runtime/JobExecutionEventSubmitter.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvSm9iRXhlY3V0aW9uRXZlbnRTdWJtaXR0ZXIuamF2YQ==) | | |
   | [...doop/HadoopKerberosKeytabAuthenticationPlugin.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lLWhhZG9vcC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9ydW50aW1lL2luc3RhbmNlL3BsdWdpbi9oYWRvb3AvSGFkb29wS2VyYmVyb3NLZXl0YWJBdXRoZW50aWNhdGlvblBsdWdpbi5qYXZh) | | |
   | [...rce/extractor/extract/kafka/KafkaSimpleSource.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NvdXJjZS9leHRyYWN0b3IvZXh0cmFjdC9rYWZrYS9LYWZrYVNpbXBsZVNvdXJjZS5qYXZh) | | |
   | [...e/extractor/filebased/TokenizedFileDownloader.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NvdXJjZS9leHRyYWN0b3IvZmlsZWJhc2VkL1Rva2VuaXplZEZpbGVEb3dubG9hZGVyLmphdmE=) | | |
   | ... and [2027 more](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [8f5718a...a322606](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [gobblin] aplex commented on a change in pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

Posted by GitBox <gi...@apache.org>.
aplex commented on a change in pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299#discussion_r662652132



##########
File path: gobblin-runtime/src/test/java/org/apache/gobblin/runtime/troubleshooter/InMemoryIssueRepositoryTest.java
##########
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.runtime.troubleshooter;
+
+import java.util.List;
+
+import org.testng.annotations.Test;
+
+import static org.testng.AssertJUnit.assertEquals;
+import static org.testng.AssertJUnit.assertTrue;
+
+
+public class InMemoryIssueRepositoryTest {
+
+  @Test
+  public void canPutIssue()
+      throws Exception {
+
+    InMemoryIssueRepository repository = new InMemoryIssueRepository();
+
+    Issue testIssue = getTestIssue("first", "code1");
+    repository.put(testIssue);
+
+    List<Issue> issues = repository.getAll();
+    assertEquals(1, issues.size());
+    assertEquals(testIssue, issues.get(0));
+  }
+
+  @Test
+  public void canPutMultipleIssues()
+      throws Exception {
+
+    InMemoryIssueRepository repository = new InMemoryIssueRepository();
+
+    repository.put(getTestIssue("first", "code1"));
+    repository.put(getTestIssue("second", "code2"));
+    repository.put(getTestIssue("third", "code3"));
+
+    List<Issue> issues = repository.getAll();
+    assertEquals(3, issues.size());
+    assertTrue(issues.stream().anyMatch(i -> i.getCode().equals("code2")));
+  }
+
+  @Test
+  public void canRemoveIssue()
+      throws Exception {
+
+    InMemoryIssueRepository repository = new InMemoryIssueRepository();
+
+    repository.put(getTestIssue("first", "code1"));
+    repository.put(getTestIssue("second", "code2"));
+    repository.put(getTestIssue("third", "code3"));
+
+    List<Issue> issues = repository.getAll();
+    assertEquals(3, issues.size());
+
+    repository.remove("code2");
+
+    issues = repository.getAll();
+    assertEquals(2, issues.size());
+  }
+
+  @Test
+  public void canDeduplicateIssues()
+      throws Exception {
+
+    InMemoryIssueRepository repository = new InMemoryIssueRepository();
+
+    repository.put(getTestIssue("first", "code1"));
+    repository.put(getTestIssue("second", "code2"));
+    repository.put(getTestIssue("second-2", "code2"));
+    repository.put(getTestIssue("second-3", "code2"));
+
+    List<Issue> issues = repository.getAll();
+    assertEquals(2, issues.size());
+    assertTrue(issues.stream().anyMatch(i -> i.getCode().equals("code1")));
+    assertTrue(issues.stream().anyMatch(i -> i.getCode().equals("code2")));

Review comment:
       For each job, we store only the first issue with a specific code. So if we had 1000 "AccessDenied" errors, we would capture only the first one and discard the rest. This should ensure that the user is not overwhelmed with repeated errors.
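
       The "first issue per code wins" rule described here can be sketched as below. This is an illustration under assumptions, not the repository implementation from this PR: `FirstIssuePerCodeStore` and the two-field `Issue` class are hypothetical stand-ins (the real `Issue` type is not shown in this thread), and the rule is expressed with `Map.putIfAbsent` on a `LinkedHashMap` keyed by issue code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: keeps only the first issue seen for each issue code. */
class FirstIssuePerCodeStore {

  /** Simplified stand-in for an issue; the real class likely carries more fields. */
  static final class Issue {
    final String summary;
    final String code;

    Issue(String summary, String code) {
      this.summary = summary;
      this.code = code;
    }
  }

  // LinkedHashMap preserves the order in which codes were first reported.
  private final Map<String, Issue> issuesByCode = new LinkedHashMap<>();

  synchronized void put(Issue issue) {
    // putIfAbsent leaves an existing entry untouched, so repeats are discarded.
    issuesByCode.putIfAbsent(issue.code, issue);
  }

  synchronized void remove(String code) {
    issuesByCode.remove(code);
  }

  synchronized List<Issue> getAll() {
    return new ArrayList<>(issuesByCode.values());
  }
}
```

       Putting issues "second", "second-2" and "second-3" with the same code leaves a single entry whose summary is "second", which matches what the `canDeduplicateIssues` test above checks through the issue count.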







[GitHub] [gobblin] codecov-commenter edited a comment on pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299#issuecomment-872608040


   # [Codecov](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3299](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (542d8a1) into [master](https://codecov.io/gh/apache/gobblin/commit/8f5718a00f43771a6ad04ffc4bca0fd86cfa17ec?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (8f5718a) will **increase** coverage by `1.01%`.
   > The diff coverage is `71.23%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/gobblin/pull/3299/graphs/tree.svg?width=650&height=150&src=pr&token=4MgURJ0bGc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3299      +/-   ##
   ============================================
   + Coverage     47.50%   48.51%   +1.01%     
   + Complexity     8182     7537     -645     
   ============================================
     Files          1654     1421     -233     
     Lines         62532    55778    -6754     
     Branches       6792     6415     -377     
   ============================================
   - Hits          29708    27063    -2645     
   + Misses        30199    26195    -4004     
   + Partials       2625     2520     -105     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...g/apache/gobblin/service/monitoring/JobStatus.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9uaXRvcmluZy9Kb2JTdGF0dXMuamF2YQ==) | `18.18% <0.00%> (-0.87%)` | :arrow_down: |
   | [...gobblin/service/monitoring/JobStatusRetriever.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9uaXRvcmluZy9Kb2JTdGF0dXNSZXRyaWV2ZXIuamF2YQ==) | `2.38% <10.00%> (+2.38%)` | :arrow_up: |
   | [...n/runtime/troubleshooter/JobIssueEventHandler.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvdHJvdWJsZXNob290ZXIvSm9iSXNzdWVFdmVudEhhbmRsZXIuamF2YQ==) | `72.22% <72.22%> (ø)` | |
   | [...leshooter/InMemoryMultiContextIssueRepository.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvdHJvdWJsZXNob290ZXIvSW5NZW1vcnlNdWx0aUNvbnRleHRJc3N1ZVJlcG9zaXRvcnkuamF2YQ==) | `95.23% <95.23%> (ø)` | |
   | [...in/runtime/troubleshooter/TroubleshooterUtils.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvdHJvdWJsZXNob290ZXIvVHJvdWJsZXNob290ZXJVdGlscy5qYXZh) | `100.00% <100.00%> (ø)` | |
   | [.../apache/gobblin/runtime/api/JobExecutionState.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvYXBpL0pvYkV4ZWN1dGlvblN0YXRlLmphdmE=) | `79.43% <0.00%> (-0.94%)` | :arrow_down: |
   | [...main/java/org/apache/gobblin/yarn/YarnService.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vWWFyblNlcnZpY2UuamF2YQ==) | `14.36% <0.00%> (-0.79%)` | :arrow_down: |
   | [...n/java/org/apache/gobblin/yarn/YarnHelixUtils.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vWWFybkhlbGl4VXRpbHMuamF2YQ==) | `26.66% <0.00%> (ø)` | |
   | [.../java/org/apache/gobblin/async/BufferedRecord.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2FzeW5jL0J1ZmZlcmVkUmVjb3JkLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [.../java/org/apache/gobblin/writer/ConsoleWriter.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3dyaXRlci9Db25zb2xlV3JpdGVyLmphdmE=) | `50.00% <0.00%> (ø)` | |
   | ... and [1091 more](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [8f5718a...542d8a1](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [gobblin] aplex merged pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

Posted by GitBox <gi...@apache.org>.
aplex merged pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299


   





[GitHub] [gobblin] codecov-commenter edited a comment on pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299#issuecomment-872608040


   # [Codecov](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3299](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (542d8a1) into [master](https://codecov.io/gh/apache/gobblin/commit/8f5718a00f43771a6ad04ffc4bca0fd86cfa17ec?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (8f5718a) will **decrease** coverage by `1.72%`.
   > The diff coverage is `71.23%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/gobblin/pull/3299/graphs/tree.svg?width=650&height=150&src=pr&token=4MgURJ0bGc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3299      +/-   ##
   ============================================
   - Coverage     47.50%   45.78%   -1.73%     
   - Complexity     8182     9014     +832     
   ============================================
     Files          1654     1802     +148     
     Lines         62532    71334    +8802     
     Branches       6792     7952    +1160     
   ============================================
   + Hits          29708    32662    +2954     
   - Misses        30199    35694    +5495     
   - Partials       2625     2978     +353     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...g/apache/gobblin/service/monitoring/JobStatus.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9uaXRvcmluZy9Kb2JTdGF0dXMuamF2YQ==) | `18.18% <0.00%> (-0.87%)` | :arrow_down: |
   | [...gobblin/service/monitoring/JobStatusRetriever.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9uaXRvcmluZy9Kb2JTdGF0dXNSZXRyaWV2ZXIuamF2YQ==) | `2.38% <10.00%> (+2.38%)` | :arrow_up: |
   | [...n/runtime/troubleshooter/JobIssueEventHandler.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvdHJvdWJsZXNob290ZXIvSm9iSXNzdWVFdmVudEhhbmRsZXIuamF2YQ==) | `72.22% <72.22%> (ø)` | |
   | [...leshooter/InMemoryMultiContextIssueRepository.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvdHJvdWJsZXNob290ZXIvSW5NZW1vcnlNdWx0aUNvbnRleHRJc3N1ZVJlcG9zaXRvcnkuamF2YQ==) | `95.23% <95.23%> (ø)` | |
   | [...in/runtime/troubleshooter/TroubleshooterUtils.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvdHJvdWJsZXNob290ZXIvVHJvdWJsZXNob290ZXJVdGlscy5qYXZh) | `100.00% <100.00%> (ø)` | |
   | [...lin/elasticsearch/writer/FutureCallbackHolder.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tZWxhc3RpY3NlYXJjaC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9lbGFzdGljc2VhcmNoL3dyaXRlci9GdXR1cmVDYWxsYmFja0hvbGRlci5qYXZh) | `61.42% <0.00%> (-1.43%)` | :arrow_down: |
   | [.../apache/gobblin/runtime/api/JobExecutionState.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvYXBpL0pvYkV4ZWN1dGlvblN0YXRlLmphdmE=) | `79.43% <0.00%> (-0.94%)` | :arrow_down: |
   | [...main/java/org/apache/gobblin/yarn/YarnService.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vWWFyblNlcnZpY2UuamF2YQ==) | `14.36% <0.00%> (-0.79%)` | :arrow_down: |
   | [...n/java/org/apache/gobblin/yarn/YarnHelixUtils.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vWWFybkhlbGl4VXRpbHMuamF2YQ==) | `26.66% <0.00%> (ø)` | |
   | [.../java/org/apache/gobblin/async/BufferedRecord.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2FzeW5jL0J1ZmZlcmVkUmVjb3JkLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | ... and [730 more](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [gobblin] umustafi commented on a change in pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

Posted by GitBox <gi...@apache.org>.
umustafi commented on a change in pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299#discussion_r649435102



##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/IssueEventBuilder.java
##########
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.runtime.troubleshooter;
+
+import java.util.Map;
+
+import org.apache.commons.lang.StringUtils;
+
+import lombok.Getter;
+import lombok.Setter;
+
+import org.apache.gobblin.metrics.GobblinTrackingEvent;
+import org.apache.gobblin.metrics.event.GobblinEventBuilder;
+import org.apache.gobblin.runtime.util.GsonUtils;
+
+
+/**
+ * The builder builds a specific {@link GobblinTrackingEvent} whose metadata has
+ * {@value GobblinEventBuilder#EVENT_TYPE} to be {@value #ISSUE_EVENT_TYPE}
+ *
+ * <p>
+ * Note: A {@link IssueEventBuilder} instance is not reusable
+ */
+public class IssueEventBuilder extends GobblinEventBuilder {
+  public static final String JOB_ISSUE = "JobIssue";
+
+  private static final String ISSUE_EVENT_TYPE = "IssueEvent";
+  private static final String METADATA_ISSUE = "issue";
+
+  @Getter
+  @Setter
+  private Issue issue;
+
+  public IssueEventBuilder(String name) {
+    this(name, NAMESPACE);
+  }
+
+  public IssueEventBuilder(String name, String namespace) {
+    super(name, namespace);
+    metadata.put(EVENT_TYPE, ISSUE_EVENT_TYPE);
+  }
+
+  public static boolean isIssueEvent(GobblinTrackingEvent event) {
+    String eventType = (event.getMetadata() == null) ? "" : event.getMetadata().get(EVENT_TYPE);
+    return StringUtils.isNotEmpty(eventType) && eventType.equals(ISSUE_EVENT_TYPE);
+  }
+
+  /**
+   * Create a {@link IssueEventBuilder} from a {@link GobblinTrackingEvent}.
+   * Will return null if the event is not of the correct type.
+   */
+  public static IssueEventBuilder fromEvent(GobblinTrackingEvent event) {
+    if (!isIssueEvent(event)) {
+      return null;
+    }
+    Map<String, String> metadata = event.getMetadata();
+    IssueEventBuilder issueEventBuilder = new IssueEventBuilder(event.getName(), event.getNamespace());
+
+    metadata.forEach((key, value) -> {
+      if (METADATA_ISSUE.equals(key)) {
+        issueEventBuilder.issue = GsonUtils.GSON_WITH_DATE_HANDLING.fromJson(value, Issue.class);
+      } else {
+        issueEventBuilder.addMetadata(key, value);
+      }
+    });
+    return issueEventBuilder;
+  }
+
+  public static Issue getIssueFromEvent(GobblinTrackingEvent event) {
+    String serializedIssue = event.getMetadata().get(METADATA_ISSUE);
+    return GsonUtils.GSON_WITH_DATE_HANDLING.fromJson(serializedIssue, Issue.class);
+  }
+
+  public GobblinTrackingEvent build() {

Review comment:
       Previously we never returned GobblinTrackingEvents for exception/error propagation, right?







[GitHub] [gobblin] aplex commented on pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

Posted by GitBox <gi...@apache.org>.
aplex commented on pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299#issuecomment-872616247


   @arjun4084346 , @Will-Lo , can you take a look?





[GitHub] [gobblin] aplex commented on a change in pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

Posted by GitBox <gi...@apache.org>.
aplex commented on a change in pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299#discussion_r663285475



##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/InMemoryMultiContextIssueRepository.java
##########
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.runtime.troubleshooter;
+
+import java.util.Collections;
+import java.util.List;
+
+import org.apache.commons.collections4.map.LRUMap;
+
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigFactory;
+
+import javax.inject.Inject;
+import javax.inject.Singleton;
+
+import org.apache.gobblin.util.ConfigUtils;
+
+/**
+ * Stores issues from multiple jobs, flows or other contexts in memory.
+ *
+ * To limit the memory consumption, it will keep only the last {@link #MAX_CONTEXT_COUNT} contexts,
+ * and older ones will be discarded.
+ * */
+@Singleton
+public class InMemoryMultiContextIssueRepository implements MultiContextIssueRepository {
+  public static final int DEFAULT_MAX_CONTEXT_COUNT = 100;

Review comment:
       Each job will produce no more than 100 issues (that limit is hardcoded in another class, InMemoryIssueRepository), but realistically it will be 0-10. This number controls how many jobs will have their issues kept in memory. So by default it will be up to 100 jobs with up to 100 issues each: 10,000 issues max. If each issue takes 20 KB (with stack trace and message), that is about 200 MB of extra memory, which should be OK. The number of tracked jobs can also be changed with a config setting below.
   
   In the next PR we'll use a database to store the issues. The in-memory repo is a fallback system.
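
The worst-case figure in this comment is a plain multiplication. A minimal sketch of that arithmetic, assuming decimal units (20 KB = 20,000 bytes); the class and method names are hypothetical and not part of the PR:

```java
// Hypothetical helper, not part of the PR: reproduces the back-of-envelope estimate.
class IssueMemoryEstimate {

  /** Worst-case bytes held = tracked contexts * issues per context * bytes per issue. */
  static long worstCaseBytes(int maxContexts, int maxIssuesPerContext, long bytesPerIssue) {
    return (long) maxContexts * maxIssuesPerContext * bytesPerIssue;
  }

  public static void main(String[] args) {
    // 100 jobs, up to 100 issues each, roughly 20 KB per issue (assumed, decimal units).
    long bytes = worstCaseBytes(100, 100, 20_000L);
    System.out.println(bytes / 1_000_000 + " MB");
  }
}
```

Raising the context limit scales the estimate linearly, so 1,000 tracked jobs would come to roughly 2 GB under the same assumptions.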

##########
File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/Orchestrator.java
##########
@@ -104,8 +104,8 @@
   private Map<String, FlowCompiledState> flowGauges = Maps.newHashMap();
 
 
-  public Orchestrator(Config config, FlowStatusGenerator flowStatusGenerator, Optional<TopologyCatalog> topologyCatalog,
-      Optional<DagManager> dagManager, Optional<Logger> log, boolean instrumentationEnabled) {
+  public Orchestrator(Config config, Optional<TopologyCatalog> topologyCatalog, Optional<DagManager> dagManager, Optional<Logger> log,

Review comment:
       reverted this file

##########
File path: gobblin-restli/gobblin-flow-config-service/gobblin-flow-config-service-server/src/main/java/org/apache/gobblin/service/FlowExecutionResourceLocalHandler.java
##########
@@ -168,6 +174,24 @@ public static FlowExecution convertFlowStatus(org.apache.gobblin.service.monitor
         .setJobStatuses(jobStatusArray);
   }
 
+  private static org.apache.gobblin.service.Issue convertIssue(Issue issues) {

Review comment:
       renamed the method for better clarity







[GitHub] [gobblin] codecov-commenter edited a comment on pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299#issuecomment-872608040


   # [Codecov](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3299](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (542d8a1) into [master](https://codecov.io/gh/apache/gobblin/commit/8f5718a00f43771a6ad04ffc4bca0fd86cfa17ec?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (8f5718a) will **decrease** coverage by `4.48%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/gobblin/pull/3299/graphs/tree.svg?width=650&height=150&src=pr&token=4MgURJ0bGc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3299      +/-   ##
   ============================================
   - Coverage     47.50%   43.02%   -4.49%     
   + Complexity     8182     1939    -6243     
   ============================================
     Files          1654      394    -1260     
     Lines         62532    16871   -45661     
     Branches       6792     2072    -4720     
   ============================================
   - Hits          29708     7259   -22449     
   + Misses        30199     8812   -21387     
   + Partials       2625      800    -1825     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...e/modules/flowgraph/datanodes/fs/AdlsDataNode.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9mbG93Z3JhcGgvZGF0YW5vZGVzL2ZzL0FkbHNEYXRhTm9kZS5qYXZh) | | |
   | [...n/java/org/apache/gobblin/yarn/YarnHelixUtils.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vWWFybkhlbGl4VXRpbHMuamF2YQ==) | | |
   | [...a/org/apache/gobblin/azkaban/AzkabanJobRunner.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tYXprYWJhbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9hemthYmFuL0F6a2FiYW5Kb2JSdW5uZXIuamF2YQ==) | | |
   | [...t/retention/version/HiveDatasetVersionCleaner.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L3JldGVudGlvbi92ZXJzaW9uL0hpdmVEYXRhc2V0VmVyc2lvbkNsZWFuZXIuamF2YQ==) | | |
   | [...alitychecker/row/RowLevelPolicyCheckerBuilder.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3F1YWxpdHljaGVja2VyL3Jvdy9Sb3dMZXZlbFBvbGljeUNoZWNrZXJCdWlsZGVyLmphdmE=) | | |
   | [.../org/apache/gobblin/runtime/std/JobSpecFilter.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvc3RkL0pvYlNwZWNGaWx0ZXIuamF2YQ==) | | |
   | [...ache/gobblin/source/workunit/ImmutableExtract.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1hcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vc291cmNlL3dvcmt1bml0L0ltbXV0YWJsZUV4dHJhY3QuamF2YQ==) | | |
   | [...troubleshooter/AutomaticTroubleshooterFactory.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvdHJvdWJsZXNob290ZXIvQXV0b21hdGljVHJvdWJsZXNob290ZXJGYWN0b3J5LmphdmE=) | | |
   | [...a/org/apache/gobblin/compliance/ComplianceJob.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tY29tcGxpYW5jZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9jb21wbGlhbmNlL0NvbXBsaWFuY2VKb2IuamF2YQ==) | | |
   | [...runtime/mapreduce/GobblinWorkUnitsInputFormat.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvZ29iYmxpbi9ydW50aW1lL21hcHJlZHVjZS9Hb2JibGluV29ya1VuaXRzSW5wdXRGb3JtYXQuamF2YQ==) | | |
   | ... and [2027 more](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [8f5718a...542d8a1](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [gobblin] jack-moseley commented on a change in pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

Posted by GitBox <gi...@apache.org>.
jack-moseley commented on a change in pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299#discussion_r669136411



##########
File path: gobblin-restli/gobblin-flow-config-service/gobblin-flow-config-service-api/src/main/pegasus/org/apache/gobblin/service/Issue.pdl
##########
@@ -0,0 +1,46 @@
+namespace org.apache.gobblin.service
+
+/**
+ * Issue describes a specific unique problem in the job or application.
+ *
+ * Issue can be generated from log entries, health checks, and other places.
+ */
+record Issue {
+
+  /**
+   * Time when the issue occurred
+   */
+  time: Timestamp

Review comment:
       Is there a reason to create a new `Timestamp` type here instead of just directly having the field as a long?

##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/InMemoryMultiContextIssueRepository.java
##########
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.runtime.troubleshooter;
+
+import java.util.Collections;
+import java.util.List;
+
+import org.apache.commons.collections4.map.LRUMap;
+
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigFactory;
+
+import javax.inject.Inject;
+import javax.inject.Singleton;
+
+import org.apache.gobblin.util.ConfigUtils;
+
+/**
+ * Stores issues from multiple jobs, flows or other contexts in memory.
+ *
+ * To limit the memory consumption, it will keep only the last {@link #MAX_CONTEXT_COUNT} contexts,
+ * and older ones will be discarded.
+ * */
+@Singleton
+public class InMemoryMultiContextIssueRepository implements MultiContextIssueRepository {
+  public static final int DEFAULT_MAX_CONTEXT_COUNT = 100;

Review comment:
       So just curious, are we planning to use a much larger value than 100? Because I can imagine if one job of a flow runs, the flow takes several hours to finish, then a user queries the status, it's possible that 100 other GaaS jobs have already run in that time and then the issue information will be missing for the early jobs of the flow.
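
The eviction scenario described here can be reproduced in a few lines. This is only a sketch: it uses the JDK's LinkedHashMap as a stand-in for the LRUMap imported in the PR, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a bounded multi-context repository; not the PR's class.
class BoundedContextIssues {
  private final Map<String, List<String>> contexts;

  BoundedContextIssues(final int maxContexts) {
    // accessOrder=true keeps entries in least-recently-used order.
    this.contexts = new LinkedHashMap<String, List<String>>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, List<String>> eldest) {
        // Drop the least recently used context once the limit is exceeded.
        return size() > maxContexts;
      }
    };
  }

  synchronized void put(String contextId, String issue) {
    contexts.computeIfAbsent(contextId, k -> new ArrayList<>()).add(issue);
  }

  synchronized List<String> getAll(String contextId) {
    List<String> issues = contexts.get(contextId);
    return issues == null ? Collections.<String>emptyList() : new ArrayList<>(issues);
  }
}
```

With a limit of 2, storing issues for a third job silently drops the first job's issues, which is the situation a long-running flow would hit once more jobs than the limit have reported in the meantime.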







[GitHub] [gobblin] umustafi commented on a change in pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

Posted by GitBox <gi...@apache.org>.
umustafi commented on a change in pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299#discussion_r649438980



##########
File path: gobblin-runtime/src/test/java/org/apache/gobblin/runtime/troubleshooter/InMemoryIssueRepositoryTest.java
##########
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.runtime.troubleshooter;
+
+import java.util.List;
+
+import org.testng.annotations.Test;
+
+import static org.testng.AssertJUnit.assertEquals;
+import static org.testng.AssertJUnit.assertTrue;
+
+
+public class InMemoryIssueRepositoryTest {
+
+  @Test
+  public void canPutIssue()
+      throws Exception {
+
+    InMemoryIssueRepository repository = new InMemoryIssueRepository();
+
+    Issue testIssue = getTestIssue("first", "code1");
+    repository.put(testIssue);
+
+    List<Issue> issues = repository.getAll();
+    assertEquals(1, issues.size());
+    assertEquals(testIssue, issues.get(0));
+  }
+
+  @Test
+  public void canPutMultipleIssues()
+      throws Exception {
+
+    InMemoryIssueRepository repository = new InMemoryIssueRepository();
+
+    repository.put(getTestIssue("first", "code1"));
+    repository.put(getTestIssue("second", "code2"));
+    repository.put(getTestIssue("third", "code3"));
+
+    List<Issue> issues = repository.getAll();
+    assertEquals(3, issues.size());
+    assertTrue(issues.stream().anyMatch(i -> i.getCode().equals("code2")));
+  }
+
+  @Test
+  public void canRemoveIssue()
+      throws Exception {
+
+    InMemoryIssueRepository repository = new InMemoryIssueRepository();
+
+    repository.put(getTestIssue("first", "code1"));
+    repository.put(getTestIssue("second", "code2"));
+    repository.put(getTestIssue("third", "code3"));
+
+    List<Issue> issues = repository.getAll();
+    assertEquals(3, issues.size());
+
+    repository.remove("code2");
+
+    issues = repository.getAll();
+    assertEquals(2, issues.size());
+  }
+
+  @Test
+  public void canDeduplicateIssues()
+      throws Exception {
+
+    InMemoryIssueRepository repository = new InMemoryIssueRepository();
+
+    repository.put(getTestIssue("first", "code1"));
+    repository.put(getTestIssue("second", "code2"));
+    repository.put(getTestIssue("second-2", "code2"));
+    repository.put(getTestIssue("second-3", "code2"));
+
+    List<Issue> issues = repository.getAll();
+    assertEquals(2, issues.size());
+    assertTrue(issues.stream().anyMatch(i -> i.getCode().equals("code1")));
+    assertTrue(issues.stream().anyMatch(i -> i.getCode().equals("code2")));

Review comment:
       Does this account for a case where we have multiple issues with the same code that originated from two different places or does this usually map to propagated errors?
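
To make the question concrete, here is a minimal sketch of deduplication keyed only by issue code. It assumes the first issue seen for a code is the one kept, which the quoted test does not confirm; all names are hypothetical and not taken from the PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of code-keyed deduplication; not the PR's InMemoryIssueRepository.
class CodeKeyedIssues {

  /** Minimal issue: a summary, a code, and the class that reported it. */
  static final class SimpleIssue {
    final String summary;
    final String code;
    final String sourceClass;

    SimpleIssue(String summary, String code, String sourceClass) {
      this.summary = summary;
      this.code = code;
      this.sourceClass = sourceClass;
    }
  }

  private final Map<String, SimpleIssue> byCode = new LinkedHashMap<>();

  void put(SimpleIssue issue) {
    // Assumption: the first issue recorded for a code wins; later ones are ignored.
    byCode.putIfAbsent(issue.code, issue);
  }

  List<SimpleIssue> getAll() {
    return new ArrayList<>(byCode.values());
  }
}
```

Under this scheme two issues that share a code but come from different classes collapse into one entry. Keeping them apart would require a wider key, for example the code combined with the source class.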







[GitHub] [gobblin] codecov-commenter commented on pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299#issuecomment-872608040


   # [Codecov](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3299](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c7150dd) into [master](https://codecov.io/gh/apache/gobblin/commit/8f5718a00f43771a6ad04ffc4bca0fd86cfa17ec?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (8f5718a) will **decrease** coverage by `4.67%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/gobblin/pull/3299/graphs/tree.svg?width=650&height=150&src=pr&token=4MgURJ0bGc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3299      +/-   ##
   ============================================
   - Coverage     47.50%   42.83%   -4.68%     
   + Complexity     8182     1931    -6251     
   ============================================
     Files          1654      394    -1260     
     Lines         62532    16871   -45661     
     Branches       6792     2072    -4720     
   ============================================
   - Hits          29708     7226   -22482     
   + Misses        30199     8846   -21353     
   + Partials       2625      799    -1826     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...management/policy/SelectBeforeTimeBasedPolicy.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L3BvbGljeS9TZWxlY3RCZWZvcmVUaW1lQmFzZWRQb2xpY3kuamF2YQ==) | | |
   | [...in/source/extractor/exception/SchemaException.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NvdXJjZS9leHRyYWN0b3IvZXhjZXB0aW9uL1NjaGVtYUV4Y2VwdGlvbi5qYXZh) | | |
   | [...in/source/extractor/extract/sftp/SftpFsHelper.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NvdXJjZS9leHRyYWN0b3IvZXh0cmFjdC9zZnRwL1NmdHBGc0hlbHBlci5qYXZh) | | |
   | [...org/apache/gobblin/service/FlowConfigV2Client.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1yZXN0bGkvZ29iYmxpbi1mbG93LWNvbmZpZy1zZXJ2aWNlL2dvYmJsaW4tZmxvdy1jb25maWctc2VydmljZS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vc2VydmljZS9GbG93Q29uZmlnVjJDbGllbnQuamF2YQ==) | | |
   | [...modules/flowgraph/DatasetDescriptorConfigKeys.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9mbG93Z3JhcGgvRGF0YXNldERlc2NyaXB0b3JDb25maWdLZXlzLmphdmE=) | | |
   | [.../data/management/trash/ImmediateDeletionTrash.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L3RyYXNoL0ltbWVkaWF0ZURlbGV0aW9uVHJhc2guamF2YQ==) | | |
   | [...blin/compaction/verify/InputRecordCountHelper.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1jb21wYWN0aW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NvbXBhY3Rpb24vdmVyaWZ5L0lucHV0UmVjb3JkQ291bnRIZWxwZXIuamF2YQ==) | | |
   | [.../management/policy/SelectAfterTimeBasedPolicy.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L3BvbGljeS9TZWxlY3RBZnRlclRpbWVCYXNlZFBvbGljeS5qYXZh) | | |
   | [.../data/management/version/StringDatasetVersion.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L3ZlcnNpb24vU3RyaW5nRGF0YXNldFZlcnNpb24uamF2YQ==) | | |
   | [...g/apache/gobblin/binary\_creation/OrcTestTools.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1iaW5hcnktbWFuYWdlbWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9iaW5hcnlfY3JlYXRpb24vT3JjVGVzdFRvb2xzLmphdmE=) | | |
   | ... and [2029 more](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [8f5718a...c7150dd](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [gobblin] aplex commented on a change in pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

Posted by GitBox <gi...@apache.org>.
aplex commented on a change in pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299#discussion_r662651751



##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/IssueEventBuilder.java
##########
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.runtime.troubleshooter;
+
+import java.util.Map;
+
+import org.apache.commons.lang.StringUtils;
+
+import lombok.Getter;
+import lombok.Setter;
+
+import org.apache.gobblin.metrics.GobblinTrackingEvent;
+import org.apache.gobblin.metrics.event.GobblinEventBuilder;
+import org.apache.gobblin.runtime.util.GsonUtils;
+
+
+/**
+ * The builder builds a specific {@link GobblinTrackingEvent} whose metadata has
+ * {@value GobblinEventBuilder#EVENT_TYPE} to be {@value #ISSUE_EVENT_TYPE}
+ *
+ * <p>
+ * Note: A {@link IssueEventBuilder} instance is not reusable
+ */
+public class IssueEventBuilder extends GobblinEventBuilder {
+  public static final String JOB_ISSUE = "JobIssue";
+
+  private static final String ISSUE_EVENT_TYPE = "IssueEvent";
+  private static final String METADATA_ISSUE = "issue";
+
+  @Getter
+  @Setter
+  private Issue issue;
+
+  public IssueEventBuilder(String name) {
+    this(name, NAMESPACE);
+  }
+
+  public IssueEventBuilder(String name, String namespace) {
+    super(name, namespace);
+    metadata.put(EVENT_TYPE, ISSUE_EVENT_TYPE);
+  }
+
+  public static boolean isIssueEvent(GobblinTrackingEvent event) {
+    String eventType = (event.getMetadata() == null) ? "" : event.getMetadata().get(EVENT_TYPE);
+    return StringUtils.isNotEmpty(eventType) && eventType.equals(ISSUE_EVENT_TYPE);
+  }
+
+  /**
+   * Create a {@link IssueEventBuilder} from a {@link GobblinTrackingEvent}.
+   * Will return null if the event is not of the correct type.
+   */
+  public static IssueEventBuilder fromEvent(GobblinTrackingEvent event) {
+    if (!isIssueEvent(event)) {
+      return null;
+    }
+    Map<String, String> metadata = event.getMetadata();
+    IssueEventBuilder issueEventBuilder = new IssueEventBuilder(event.getName(), event.getNamespace());
+
+    metadata.forEach((key, value) -> {
+      if (METADATA_ISSUE.equals(key)) {
+        issueEventBuilder.issue = GsonUtils.GSON_WITH_DATE_HANDLING.fromJson(value, Issue.class);
+      } else {
+        issueEventBuilder.addMetadata(key, value);
+      }
+    });
+    return issueEventBuilder;
+  }
+
+  public static Issue getIssueFromEvent(GobblinTrackingEvent event) {
+    String serializedIssue = event.getMetadata().get(METADATA_ISSUE);
+    return GsonUtils.GSON_WITH_DATE_HANDLING.fromJson(serializedIssue, Issue.class);
+  }
+
+  public GobblinTrackingEvent build() {

Review comment:
       We have a "job failed" GTE, but it contains only the single latest exception. A GTE with issues will also capture warnings and other errors that occurred in the Azkaban job or mapper.







[GitHub] [gobblin] codecov-commenter edited a comment on pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299#issuecomment-872608040


   # [Codecov](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3299](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (a322606) into [master](https://codecov.io/gh/apache/gobblin/commit/8f5718a00f43771a6ad04ffc4bca0fd86cfa17ec?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (8f5718a) will **increase** coverage by `1.01%`.
   > The diff coverage is `71.23%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/gobblin/pull/3299/graphs/tree.svg?width=650&height=150&src=pr&token=4MgURJ0bGc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3299      +/-   ##
   ============================================
   + Coverage     47.50%   48.52%   +1.01%     
   + Complexity     8182     7539     -643     
   ============================================
     Files          1654     1421     -233     
     Lines         62532    55778    -6754     
     Branches       6792     6415     -377     
   ============================================
   - Hits          29708    27064    -2644     
   + Misses        30199    26194    -4005     
   + Partials       2625     2520     -105     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...g/apache/gobblin/service/monitoring/JobStatus.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9uaXRvcmluZy9Kb2JTdGF0dXMuamF2YQ==) | `18.18% <0.00%> (-0.87%)` | :arrow_down: |
   | [...gobblin/service/monitoring/JobStatusRetriever.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9uaXRvcmluZy9Kb2JTdGF0dXNSZXRyaWV2ZXIuamF2YQ==) | `2.38% <10.00%> (+2.38%)` | :arrow_up: |
   | [...n/runtime/troubleshooter/JobIssueEventHandler.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvdHJvdWJsZXNob290ZXIvSm9iSXNzdWVFdmVudEhhbmRsZXIuamF2YQ==) | `72.22% <72.22%> (ø)` | |
   | [...leshooter/InMemoryMultiContextIssueRepository.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvdHJvdWJsZXNob290ZXIvSW5NZW1vcnlNdWx0aUNvbnRleHRJc3N1ZVJlcG9zaXRvcnkuamF2YQ==) | `95.23% <95.23%> (ø)` | |
   | [...in/runtime/troubleshooter/TroubleshooterUtils.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvdHJvdWJsZXNob290ZXIvVHJvdWJsZXNob290ZXJVdGlscy5qYXZh) | `100.00% <100.00%> (ø)` | |
   | [.../apache/gobblin/runtime/api/JobExecutionState.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvYXBpL0pvYkV4ZWN1dGlvblN0YXRlLmphdmE=) | `79.43% <0.00%> (-0.94%)` | :arrow_down: |
   | [...main/java/org/apache/gobblin/yarn/YarnService.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vWWFyblNlcnZpY2UuamF2YQ==) | `14.36% <0.00%> (-0.79%)` | :arrow_down: |
   | [...n/java/org/apache/gobblin/yarn/YarnHelixUtils.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vWWFybkhlbGl4VXRpbHMuamF2YQ==) | `26.66% <0.00%> (ø)` | |
   | [.../java/org/apache/gobblin/async/BufferedRecord.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2FzeW5jL0J1ZmZlcmVkUmVjb3JkLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [.../java/org/apache/gobblin/writer/ConsoleWriter.java](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3dyaXRlci9Db25zb2xlV3JpdGVyLmphdmE=) | `50.00% <0.00%> (ø)` | |
   | ... and [1091 more](https://codecov.io/gh/apache/gobblin/pull/3299/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [8f5718a...a322606](https://codecov.io/gh/apache/gobblin/pull/3299?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [gobblin] aplex commented on a change in pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

Posted by GitBox <gi...@apache.org>.
aplex commented on a change in pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299#discussion_r669187949



##########
File path: gobblin-restli/gobblin-flow-config-service/gobblin-flow-config-service-api/src/main/pegasus/org/apache/gobblin/service/Issue.pdl
##########
@@ -0,0 +1,46 @@
+namespace org.apache.gobblin.service
+
+/**
+ * Issue describes a specific unique problem in the job or application.
+ *
+ * Issue can be generated from log entries, health checks, and other places.
+ */
+record Issue {
+
+  /**
+   * Time when the issue occurred
+   */
+  time: Timestamp

Review comment:
       It's actually unclear what "long" means for a time value. In the most common interpretation, a Unix timestamp is the number of seconds since 1970, but in our API it is milliseconds. A type reference with a description of what it means should be clearer. The Timestamp type comment says that it is in milliseconds.
   
   Personally, I think we should have used an ISO format like "2015-04-14T11:07:36.639Z", which is both human- and computer-readable, but it's too late to change it since other parts of the API use timestamps. https://stackoverflow.com/a/29626123
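
To illustrate the two representations discussed above, here is a minimal sketch using only `java.time.Instant` from the JDK. The class name `TimestampFormats` and its methods are hypothetical helpers for illustration and are not part of the PR or of the Gobblin API.

```java
import java.time.Instant;

// Hypothetical helper, not part of Gobblin: converts between the
// epoch-millisecond form used by the API and an ISO-8601 string.
public class TimestampFormats {

  /** Renders epoch milliseconds as an ISO-8601 string in UTC. */
  public static String toIso(long epochMillis) {
    return Instant.ofEpochMilli(epochMillis).toString();
  }

  /** Parses an ISO-8601 UTC string back into epoch milliseconds. */
  public static long toEpochMillis(String iso) {
    return Instant.parse(iso).toEpochMilli();
  }

  /**
   * Shows the ambiguity of a bare long: the same number read as seconds
   * instead of milliseconds yields a completely different instant.
   */
  public static Instant misreadAsSeconds(long epochMillis) {
    return Instant.ofEpochSecond(epochMillis);
  }
}
```

Calling `toEpochMillis("2015-04-14T11:07:36.639Z")` and passing the result to `toIso` round-trips the value, while `misreadAsSeconds` on the same number lands thousands of years in the future. That is the kind of confusion a documented `Timestamp` type reference is meant to prevent.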







[GitHub] [gobblin] Will-Lo commented on a change in pull request #3299: [GOBBLIN-1457] Add automatic troubleshooter to Gobblin service

Posted by GitBox <gi...@apache.org>.
Will-Lo commented on a change in pull request #3299:
URL: https://github.com/apache/gobblin/pull/3299#discussion_r662661207



##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/InMemoryMultiContextIssueRepository.java
##########
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.runtime.troubleshooter;
+
+import java.util.Collections;
+import java.util.List;
+
+import org.apache.commons.collections4.map.LRUMap;
+
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigFactory;
+
+import javax.inject.Inject;
+import javax.inject.Singleton;
+
+import org.apache.gobblin.util.ConfigUtils;
+
+/**
+ * Stores issues from multiple jobs, flows or other contexts in memory.
+ *
+ * To limit the memory consumption, it will keep only the last {@link #MAX_CONTEXT_COUNT} contexts,
+ * and older ones will be discarded.
+ * */
+@Singleton
+public class InMemoryMultiContextIssueRepository implements MultiContextIssueRepository {
+  public static final int DEFAULT_MAX_CONTEXT_COUNT = 100;

Review comment:
       If there is a long-running job that emits a large number of failures, is it possible that this number is too high?
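
For reference, a minimal sketch of how a bound on the number of contexts behaves. It uses `java.util.LinkedHashMap` from the JDK in place of the `LRUMap` that the PR imports, and the class `BoundedContextStore`, its methods, and the use of plain strings for issues are all hypothetical. In this sketch the limit applies to the number of contexts only; the number of issues stored within a single context is not capped.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an in-memory multi-context store.
// Issues are plain strings here to keep the example self-contained.
public class BoundedContextStore {
  private final Map<String, List<String>> issuesByContext;

  public BoundedContextStore(final int maxContextCount) {
    // accessOrder=true makes iteration order least-recently-used first,
    // so the "eldest" entry is the least recently touched context.
    this.issuesByContext = new LinkedHashMap<String, List<String>>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, List<String>> eldest) {
        return size() > maxContextCount;
      }
    };
  }

  /** Adds an issue to a context, evicting the oldest context if over the limit. */
  public synchronized void put(String contextId, String issue) {
    issuesByContext.computeIfAbsent(contextId, k -> new ArrayList<>()).add(issue);
  }

  /** Returns a copy of the issues for a context, or an empty list if evicted or unknown. */
  public synchronized List<String> getAll(String contextId) {
    List<String> issues = issuesByContext.get(contextId);
    return issues == null ? new ArrayList<>() : new ArrayList<>(issues);
  }

  public synchronized int contextCount() {
    return issuesByContext.size();
  }
}
```

With a limit of 2, adding issues for three different contexts evicts the first one, while a single context can still accumulate an arbitrary number of issues. A per-context cap would need to be a separate setting.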

##########
File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/Orchestrator.java
##########
@@ -104,8 +104,8 @@
   private Map<String, FlowCompiledState> flowGauges = Maps.newHashMap();
 
 
-  public Orchestrator(Config config, FlowStatusGenerator flowStatusGenerator, Optional<TopologyCatalog> topologyCatalog,
-      Optional<DagManager> dagManager, Optional<Logger> log, boolean instrumentationEnabled) {
+  public Orchestrator(Config config, Optional<TopologyCatalog> topologyCatalog, Optional<DagManager> dagManager, Optional<Logger> log,

Review comment:
       Was this file modified by the IntelliJ autoformatter?

##########
File path: gobblin-restli/gobblin-flow-config-service/gobblin-flow-config-service-server/src/main/java/org/apache/gobblin/service/FlowExecutionResourceLocalHandler.java
##########
@@ -168,6 +174,24 @@ public static FlowExecution convertFlowStatus(org.apache.gobblin.service.monitor
         .setJobStatuses(jobStatusArray);
   }
 
+  private static org.apache.gobblin.service.Issue convertIssue(Issue issues) {

Review comment:
       Add a Javadoc noting that this converts a runtime issue to a Gobblin service reported issue?



