You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@helix.apache.org by GitBox <gi...@apache.org> on 2022/04/27 12:19:57 UTC

[GitHub] [helix] richardstartin opened a new pull request, #2065: fix bug where generateJobList would iterate until the parallelism is reached

richardstartin opened a new pull request, #2065:
URL: https://github.com/apache/helix/pull/2065

   ### Issues
   
   - [x] My PR addresses the following Helix issues and references them in the PR description:
   
   Fixes #2064
   
   ### Description
   
   When there is a high level of parallelism configured and the JobDag is a job queue, building the ready job list would take a long time because the loop iterates until the level of parallelism is reached, and doesn't terminate when all the jobs have been processed
   
   ### Tests
   
   `TestRuntimeJobDag#testBuildJobQueueWithParallelismExceedingJobCount` was added to verify that the correct job order is still produced. It is not a performance test, but before the fix it took 22 seconds to complete, afterwards it takes ~1ms.
   
   ### Changes that Break Backward Compatibility (Optional)
   
   None.
   
   ### Documentation (Optional)
   
   This is a bug fix, no documentation required.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] junkaixue commented on a diff in pull request #2065: fix bug where generateJobList would iterate until the parallelism is reached

Posted by GitBox <gi...@apache.org>.
junkaixue commented on code in PR #2065:
URL: https://github.com/apache/helix/pull/2065#discussion_r860286211


##########
helix-core/src/main/java/org/apache/helix/task/RuntimeJobDag.java:
##########
@@ -198,11 +198,15 @@ public void generateJobList() {
     resetJobListAndDependencyMaps();
     computeIndependentNodes();
     _readyJobList.addAll(_independentNodes);
-    if (_isJobQueue && _readyJobList.size() > 0) {
+    if (_isJobQueue && !_readyJobList.isEmpty()) {

Review Comment:
   Not quite sure whether this is just a style change? If we dont change it, would it cause any problem?



##########
helix-core/src/main/java/org/apache/helix/task/RuntimeJobDag.java:
##########
@@ -198,11 +198,15 @@ public void generateJobList() {
     resetJobListAndDependencyMaps();
     computeIndependentNodes();
     _readyJobList.addAll(_independentNodes);
-    if (_isJobQueue && _readyJobList.size() > 0) {
+    if (_isJobQueue && !_readyJobList.isEmpty()) {
       // For job queue, only get number of parallel jobs to run in the ready list.
       for (int i = 1; i < _numParallelJobs; i++) {
-        if (_parentsToChildren.containsKey(_readyJobList.peekLast())) {
-          _readyJobList.offer(_parentsToChildren.get(_readyJobList.peekLast()).iterator().next());
+        String parent = _readyJobList.peekLast();
+        Set<String> children = _parentsToChildren.get(parent);
+        if (children != null) {
+          _readyJobList.offer(children.iterator().next());
+        } else {
+          break;

Review Comment:
   NIT: it would be better make it as:
   
   if (children == null) {
   break;
   } 
   
   _readyJobList.offer(children.iterator().next());



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] richardstartin commented on a diff in pull request #2065: fix bug where generateJobList would iterate until the parallelism is reached

Posted by GitBox <gi...@apache.org>.
richardstartin commented on code in PR #2065:
URL: https://github.com/apache/helix/pull/2065#discussion_r860303720


##########
helix-core/src/main/java/org/apache/helix/task/RuntimeJobDag.java:
##########
@@ -198,11 +198,15 @@ public void generateJobList() {
     resetJobListAndDependencyMaps();
     computeIndependentNodes();
     _readyJobList.addAll(_independentNodes);
-    if (_isJobQueue && _readyJobList.size() > 0) {
+    if (_isJobQueue && !_readyJobList.isEmpty()) {

Review Comment:
   ArrayDeque.size has a nontrivial implementation (https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/ArrayDeque.java#L649) but this is an emptiness check.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] junkaixue merged pull request #2065: fix bug where generateJobList would iterate until the parallelism is reached

Posted by GitBox <gi...@apache.org>.
junkaixue merged PR #2065:
URL: https://github.com/apache/helix/pull/2065


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] junkaixue commented on pull request #2065: fix bug where generateJobList would iterate until the parallelism is reached

Posted by GitBox <gi...@apache.org>.
junkaixue commented on PR #2065:
URL: https://github.com/apache/helix/pull/2065#issuecomment-1112430572

   @richardstartin Please let us know when it is ready to merge. Helix usually has the last step to let author to confirm when it is ready to merge if you check the PR guidance.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] narendly commented on a diff in pull request #2065: fix bug where generateJobList would iterate until the parallelism is reached

Posted by GitBox <gi...@apache.org>.
narendly commented on code in PR #2065:
URL: https://github.com/apache/helix/pull/2065#discussion_r860332884


##########
helix-core/src/main/java/org/apache/helix/task/RuntimeJobDag.java:
##########
@@ -198,11 +198,15 @@ public void generateJobList() {
     resetJobListAndDependencyMaps();
     computeIndependentNodes();
     _readyJobList.addAll(_independentNodes);
-    if (_isJobQueue && _readyJobList.size() > 0) {
+    if (_isJobQueue && !_readyJobList.isEmpty()) {

Review Comment:
   Little did I know! I think this is good.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] richardstartin commented on a diff in pull request #2065: fix bug where generateJobList would iterate until the parallelism is reached

Posted by GitBox <gi...@apache.org>.
richardstartin commented on code in PR #2065:
URL: https://github.com/apache/helix/pull/2065#discussion_r860632650


##########
helix-core/src/main/java/org/apache/helix/task/RuntimeJobDag.java:
##########
@@ -198,11 +198,15 @@ public void generateJobList() {
     resetJobListAndDependencyMaps();
     computeIndependentNodes();
     _readyJobList.addAll(_independentNodes);
-    if (_isJobQueue && _readyJobList.size() > 0) {
+    if (_isJobQueue && !_readyJobList.isEmpty()) {
       // For job queue, only get number of parallel jobs to run in the ready list.
       for (int i = 1; i < _numParallelJobs; i++) {
-        if (_parentsToChildren.containsKey(_readyJobList.peekLast())) {
-          _readyJobList.offer(_parentsToChildren.get(_readyJobList.peekLast()).iterator().next());
+        String parent = _readyJobList.peekLast();
+        Set<String> children = _parentsToChildren.get(parent);
+        if (children != null) {
+          _readyJobList.offer(children.iterator().next());
+        } else {
+          break;

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] richardstartin commented on a diff in pull request #2065: fix bug where generateJobList would iterate until the parallelism is reached

Posted by GitBox <gi...@apache.org>.
richardstartin commented on code in PR #2065:
URL: https://github.com/apache/helix/pull/2065#discussion_r860303939


##########
helix-core/src/main/java/org/apache/helix/task/RuntimeJobDag.java:
##########
@@ -198,11 +198,15 @@ public void generateJobList() {
     resetJobListAndDependencyMaps();
     computeIndependentNodes();
     _readyJobList.addAll(_independentNodes);
-    if (_isJobQueue && _readyJobList.size() > 0) {
+    if (_isJobQueue && !_readyJobList.isEmpty()) {
       // For job queue, only get number of parallel jobs to run in the ready list.
       for (int i = 1; i < _numParallelJobs; i++) {
-        if (_parentsToChildren.containsKey(_readyJobList.peekLast())) {
-          _readyJobList.offer(_parentsToChildren.get(_readyJobList.peekLast()).iterator().next());
+        String parent = _readyJobList.peekLast();
+        Set<String> children = _parentsToChildren.get(parent);
+        if (children != null) {
+          _readyJobList.offer(children.iterator().next());
+        } else {
+          break;

Review Comment:
   Sure I can change this if you like.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] richardstartin commented on pull request #2065: fix bug where generateJobList would iterate until the parallelism is reached

Posted by GitBox <gi...@apache.org>.
richardstartin commented on PR #2065:
URL: https://github.com/apache/helix/pull/2065#issuecomment-1112433737

   > @richardstartin Please let us know when it is ready to merge. Helix usually has the last step to let author to confirm when it is ready to merge if you check the PR guidance.
   
   @junkaixue I am happy for this to be merged now, thanks for letting me know.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org