You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@gobblin.apache.org by GitBox <gi...@apache.org> on 2020/06/12 00:18:42 UTC

[GitHub] [incubator-gobblin] sv2000 opened a new pull request #3039: GOBBLIN-1191: Reuse Helix instance names when containers are released…

sv2000 opened a new pull request #3039:
URL: https://github.com/apache/incubator-gobblin/pull/3039


   … by Gobblin Application Master
   
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!
   
   
   ### JIRA
   - [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
       - https://issues.apache.org/jira/browse/GOBBLIN-1191
   
   
   ### Description
   - [x] Here are some details about my PR, including screenshots (if applicable):
   Currently, when containers are released by Gobblin Application Master, the Helix instance names are not released back to the pool of unused instance names. The effect of this is that the number of unique Helix instances keeps growing over time. Since Helix creates one znode per instance, this can result in the number of znodes to grow unboundedly. 
   
   
   
   ### Tests
   - [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
   
   
   ### Commits
   - [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
       1. Subject is separated from body by a blank line
       2. Subject is limited to 50 characters
       3. Subject does not end with a period
       4. Subject uses the imperative mood ("add", not "adding")
       5. Body wraps at 72 characters
       6. Body explains "what" and "why", not "how"
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-gobblin] asfgit closed pull request #3039: GOBBLIN-1191: Reuse Helix instance names when containers are released…

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #3039:
URL: https://github.com/apache/incubator-gobblin/pull/3039


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-gobblin] codecov-commenter edited a comment on pull request #3039: GOBBLIN-1191: Reuse Helix instance names when containers are released…

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3039:
URL: https://github.com/apache/incubator-gobblin/pull/3039#issuecomment-643484827


   # [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/3039?src=pr&el=h1) Report
   > Merging [#3039](https://codecov.io/gh/apache/incubator-gobblin/pull/3039?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/8b9744e21f9bfb7ff2d01391eed28bb0b8799b33&el=desc) will **increase** coverage by `36.52%`.
   > The diff coverage is `6.66%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/graphs/tree.svg?width=650&height=150&src=pr&token=4MgURJ0bGc)](https://codecov.io/gh/apache/incubator-gobblin/pull/3039?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #3039       +/-   ##
   =============================================
   + Coverage      9.27%   45.80%   +36.52%     
   - Complexity     1695     9333     +7638     
   =============================================
     Files          1956     1956               
     Lines         74402    74423       +21     
     Branches       8247     8250        +3     
   =============================================
   + Hits           6903    34086    +27183     
   + Misses        66826    37145    -29681     
   - Partials        673     3192     +2519     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-gobblin/pull/3039?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...main/java/org/apache/gobblin/yarn/YarnService.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vWWFyblNlcnZpY2UuamF2YQ==) | `14.73% <6.66%> (+14.73%)` | `4.00 <0.00> (+4.00)` | |
   | [.../org/apache/gobblin/writer/AvroHdfsDataWriter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3dyaXRlci9BdnJvSGRmc0RhdGFXcml0ZXIuamF2YQ==) | `77.77% <0.00%> (ø)` | `5.00% <0.00%> (+1.00%)` | |
   | [...gobblin/service/monitoring/JobStatusRetriever.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9uaXRvcmluZy9Kb2JTdGF0dXNSZXRyaWV2ZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | `2.00% <0.00%> (+2.00%)` | |
   | [...blin/data/management/retention/DatasetCleaner.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L3JldGVudGlvbi9EYXRhc2V0Q2xlYW5lci5qYXZh) | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (ø%)` | |
   | [...ata/management/copy/hive/HivePartitionFileSet.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L2NvcHkvaGl2ZS9IaXZlUGFydGl0aW9uRmlsZVNldC5qYXZh) | `0.94% <0.00%> (+0.94%)` | `1.00% <0.00%> (+1.00%)` | |
   | [...ain/java/org/apache/gobblin/runtime/TaskState.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvVGFza1N0YXRlLmphdmE=) | `81.97% <0.00%> (+1.16%)` | `32.00% <0.00%> (ø%)` | |
   | [...in/java/org/apache/gobblin/util/FileListUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvRmlsZUxpc3RVdGlscy5qYXZh) | `71.42% <0.00%> (+1.29%)` | `28.00% <0.00%> (ø%)` | |
   | [.../org/apache/gobblin/cluster/GobblinTaskRunner.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpblRhc2tSdW5uZXIuamF2YQ==) | `63.03% <0.00%> (+1.32%)` | `33.00% <0.00%> (+1.00%)` | |
   | [...in/source/extractor/extract/kafka/KafkaSource.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NvdXJjZS9leHRyYWN0b3IvZXh0cmFjdC9rYWZrYS9LYWZrYVNvdXJjZS5qYXZh) | `1.34% <0.00%> (+1.34%)` | `1.00% <0.00%> (+1.00%)` | |
   | [...ava/org/apache/gobblin/runtime/MultiConverter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvTXVsdGlDb252ZXJ0ZXIuamF2YQ==) | `83.60% <0.00%> (+1.63%)` | `9.00% <0.00%> (+1.00%)` | |
   | ... and [1029 more](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/3039?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/3039?src=pr&el=footer). Last update [8b9744e...86f58fc](https://codecov.io/gh/apache/incubator-gobblin/pull/3039?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-gobblin] codecov-commenter commented on pull request #3039: GOBBLIN-1191: Reuse Helix instance names when containers are released…

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #3039:
URL: https://github.com/apache/incubator-gobblin/pull/3039#issuecomment-643484827


   # [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/3039?src=pr&el=h1) Report
   > Merging [#3039](https://codecov.io/gh/apache/incubator-gobblin/pull/3039?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/8b9744e21f9bfb7ff2d01391eed28bb0b8799b33&el=desc) will **increase** coverage by `36.52%`.
   > The diff coverage is `6.66%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/graphs/tree.svg?width=650&height=150&src=pr&token=4MgURJ0bGc)](https://codecov.io/gh/apache/incubator-gobblin/pull/3039?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #3039       +/-   ##
   =============================================
   + Coverage      9.27%   45.80%   +36.52%     
   - Complexity     1695     9333     +7638     
   =============================================
     Files          1956     1956               
     Lines         74402    74423       +21     
     Branches       8247     8250        +3     
   =============================================
   + Hits           6903    34086    +27183     
   + Misses        66826    37145    -29681     
   - Partials        673     3192     +2519     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-gobblin/pull/3039?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...main/java/org/apache/gobblin/yarn/YarnService.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vWWFyblNlcnZpY2UuamF2YQ==) | `14.73% <6.66%> (+14.73%)` | `4.00 <0.00> (+4.00)` | |
   | [.../org/apache/gobblin/writer/AvroHdfsDataWriter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3dyaXRlci9BdnJvSGRmc0RhdGFXcml0ZXIuamF2YQ==) | `77.77% <0.00%> (ø)` | `5.00% <0.00%> (+1.00%)` | |
   | [...gobblin/service/monitoring/JobStatusRetriever.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9uaXRvcmluZy9Kb2JTdGF0dXNSZXRyaWV2ZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | `2.00% <0.00%> (+2.00%)` | |
   | [...blin/data/management/retention/DatasetCleaner.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L3JldGVudGlvbi9EYXRhc2V0Q2xlYW5lci5qYXZh) | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (ø%)` | |
   | [...ata/management/copy/hive/HivePartitionFileSet.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L2NvcHkvaGl2ZS9IaXZlUGFydGl0aW9uRmlsZVNldC5qYXZh) | `0.94% <0.00%> (+0.94%)` | `1.00% <0.00%> (+1.00%)` | |
   | [...ain/java/org/apache/gobblin/runtime/TaskState.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvVGFza1N0YXRlLmphdmE=) | `81.97% <0.00%> (+1.16%)` | `32.00% <0.00%> (ø%)` | |
   | [...in/java/org/apache/gobblin/util/FileListUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvRmlsZUxpc3RVdGlscy5qYXZh) | `71.42% <0.00%> (+1.29%)` | `28.00% <0.00%> (ø%)` | |
   | [.../org/apache/gobblin/cluster/GobblinTaskRunner.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpblRhc2tSdW5uZXIuamF2YQ==) | `63.03% <0.00%> (+1.32%)` | `33.00% <0.00%> (+1.00%)` | |
   | [...in/source/extractor/extract/kafka/KafkaSource.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NvdXJjZS9leHRyYWN0b3IvZXh0cmFjdC9rYWZrYS9LYWZrYVNvdXJjZS5qYXZh) | `1.34% <0.00%> (+1.34%)` | `1.00% <0.00%> (+1.00%)` | |
   | [...ava/org/apache/gobblin/runtime/MultiConverter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvTXVsdGlDb252ZXJ0ZXIuamF2YQ==) | `83.60% <0.00%> (+1.63%)` | `9.00% <0.00%> (+1.00%)` | |
   | ... and [1029 more](https://codecov.io/gh/apache/incubator-gobblin/pull/3039/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/3039?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/3039?src=pr&el=footer). Last update [8b9744e...86f58fc](https://codecov.io/gh/apache/incubator-gobblin/pull/3039?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #3039: GOBBLIN-1191: Reuse Helix instance names when containers are released…

Posted by GitBox <gi...@apache.org>.
autumnust commented on a change in pull request #3039:
URL: https://github.com/apache/incubator-gobblin/pull/3039#discussion_r439142383



##########
File path: gobblin-yarn/src/main/java/org/apache/gobblin/yarn/YarnService.java
##########
@@ -192,7 +190,7 @@
   // A queue of unused Helix instance names. An unused Helix instance name gets put

Review comment:
       Can you fix the comment here? 

##########
File path: gobblin-yarn/src/main/java/org/apache/gobblin/yarn/YarnService.java
##########
@@ -699,7 +700,8 @@ protected void handleContainerCompletion(ContainerStatus containerStatus) {
 
       // Add the Helix instance name of the completed container to the queue of unused

Review comment:
       Same for the comment here

##########
File path: gobblin-yarn/src/main/java/org/apache/gobblin/yarn/YarnService.java
##########
@@ -753,8 +755,19 @@ public void onContainersAllocated(List<Container> containers) {
 
         LOGGER.info(String.format("Container %s has been allocated", container.getId()));
 
-        String instanceName = unusedHelixInstanceNames.poll();
-        while (Strings.isNullOrEmpty(instanceName) || HelixUtils.isInstanceLive(helixManager, instanceName)) {
+        //Iterate over the (thread-safe) set of unused instances to find the first instance that is not currently live.
+        //Once we find a candidate instance, it is removed from the set.
+        String instanceName = null;
+        for (Iterator<String> iterator = unusedHelixInstanceNames.iterator(); iterator.hasNext(); ) {

Review comment:
       for loop doesn't seem to fit here but it seems more intuitive to use while ? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-gobblin] sv2000 closed pull request #3039: GOBBLIN-1191: Reuse Helix instance names when containers are released…

Posted by GitBox <gi...@apache.org>.
sv2000 closed pull request #3039:
URL: https://github.com/apache/incubator-gobblin/pull/3039


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org