Posted to reviews@helix.apache.org by GitBox <gi...@apache.org> on 2020/09/17 20:44:04 UTC
[GitHub] [helix] kaisun2000 opened a new issue #1372: Fix TestTaskRebalancerStopResume.stopDeleteJobAndResumeNamedQueue
kaisun2000 opened a new issue #1372:
URL: https://github.com/apache/helix/issues/1372
LOG 976 touch 7
> 2020-09-17T11:27:23.8512026Z [ERROR] stopDeleteJobAndResumeNamedQueue(org.apache.helix.integration.task.TestTaskRebalancerStopResume) Time elapsed: 900.001 s <<< FAILURE!
2020-09-17T11:27:23.8515885Z org.testng.internal.thread.ThreadTimeoutException: Method org.testng.internal.TestNGMethod.stopDeleteJobAndResumeNamedQueue() didn't finish within the time-out 900000
2020-09-17T11:27:23.8519077Z
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org
[GitHub] [helix] kaisun2000 commented on issue #1372: Fix TestTaskRebalancerStopResume.stopDeleteJobAndResumeNamedQueue
Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on issue #1372:
URL: https://github.com/apache/helix/issues/1372#issuecomment-699175490
Ran the test on its own 50 times with debug enabled.
2000 JobConfig.Builder jobBuilder = new JobConfig.Builder().setCommand(MockTask.TASK_COMMAND)
    .setTargetResource(WorkflowGenerator.DEFAULT_TGT_DB)
    .setTargetPartitionStates(Sets.newHashSet(targetPartition))
    .setJobCommandConfigMap(Collections.singletonMap(MockTask.JOB_DELAY, "200"));
String jobName = targetPartition.toLowerCase() + "Job" + i;
LOG.info("Enqueuing job: " + jobName);
queueBuilder.enqueueJob(jobName, jobBuilder);
currentJobNames.add(i, jobName);
[GitHub] [helix] kaisun2000 commented on issue #1372: Fix TestTaskRebalancerStopResume.stopDeleteJobAndResumeNamedQueue
Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on issue #1372:
URL: https://github.com/apache/helix/issues/1372#issuecomment-695147425
LOG 1074
>2020-09-19T01:20:52.9828081Z [ERROR] stopDeleteJobAndResumeNamedQueue(org.apache.helix.integration.task.TestTaskRebalancerStopResume) Time elapsed: 607.514 s <<< FAILURE!
2020-09-19T01:20:52.9839860Z org.apache.helix.HelixException: Workflow "stopDeleteJobAndResumeNamedQueue", job "stopDeleteJobAndResumeNamedQueue_slaveJob2_second" timed out
2020-09-19T01:20:52.9915658Z at org.apache.helix.integration.task.TestTaskRebalancerStopResume.stopDeleteJobAndResumeNamedQueue(TestTaskRebalancerStopResume.java:247)
2020-09-19T01:20:52.9919495Z
2020-09-19T01:20:53.3830418Z [ERROR] Failures:
2020-09-19T01:20:53.3836636Z [ERROR] TestTaskRebalancerStopResume.stopDeleteJobAndResumeNamedQueue:247 » Helix Work...
[GitHub] [helix] kaisun2000 commented on issue #1372: Fix TestTaskRebalancerStopResume.stopDeleteJobAndResumeNamedQueue
Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on issue #1372:
URL: https://github.com/apache/helix/issues/1372#issuecomment-700400474
Adding the job appears to hit a race condition; it can be avoided by pausing the queue first.
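A minimal sketch of the pause-then-enqueue ordering mentioned above. `QueueControl` and `RecordingControl` are hypothetical stand-ins written only for this illustration; they are not Helix classes, and this is not necessarily the change that was made to the test. With the real driver a stop request is asynchronous, so the test would presumably also need to wait until the queue is actually stopped before enqueuing.

```java
import java.util.ArrayList;
import java.util.List;

final class PausedEnqueueSketch {
  private PausedEnqueueSketch() {}

  /** Hypothetical stand-in for the three driver operations the pattern needs. */
  interface QueueControl {
    void stop(String queue);
    void enqueueJob(String queue, String job);
    void resume(String queue);
  }

  /** Pause the queue, add the job, then resume, so scheduling never sees a half-added job. */
  static void enqueueWhilePaused(QueueControl control, String queue, String job) {
    control.stop(queue);
    try {
      control.enqueueJob(queue, job);
    } finally {
      // Always resume, even if the enqueue throws, so the queue is not left paused.
      control.resume(queue);
    }
  }

  /** In-memory fake that only records the order of calls. */
  static final class RecordingControl implements QueueControl {
    final List<String> calls = new ArrayList<>();

    @Override
    public void stop(String queue) {
      calls.add("stop " + queue);
    }

    @Override
    public void enqueueJob(String queue, String job) {
      calls.add("enqueue " + queue + "/" + job);
    }

    @Override
    public void resume(String queue) {
      calls.add("resume " + queue);
    }
  }
}
```

Calling `enqueueWhilePaused` with a `RecordingControl` records the three calls in stop, enqueue, resume order.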
[GitHub] [helix] kaisun2000 commented on issue #1372: Fix TestTaskRebalancerStopResume.stopDeleteJobAndResumeNamedQueue
Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on issue #1372:
URL: https://github.com/apache/helix/issues/1372#issuecomment-697125318
LOG 1252 09/22 touch 10 after JJ's rebalancer change
>2020-09-23T03:28:13.6548620Z [ERROR] stopDeleteJobAndResumeNamedQueue(org.apache.helix.integration.task.TestTaskRebalancerStopResume) Time elapsed: 607.521 s <<< FAILURE!
2020-09-23T03:28:13.6551573Z org.apache.helix.HelixException: Workflow "stopDeleteJobAndResumeNamedQueue", job "stopDeleteJobAndResumeNamedQueue_slaveJob2_second" timed out
2020-09-23T03:28:13.6555575Z at org.apache.helix.integration.task.TestTaskRebalancerStopResume.stopDeleteJobAndResumeNamedQueue(TestTaskRebalancerStopResume.java:247)
2020-09-23T03:28:13.6559777Z
[GitHub] [helix] kaisun2000 commented on issue #1372: Fix TestTaskRebalancerStopResume.stopDeleteJobAndResumeNamedQueue
Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on issue #1372:
URL: https://github.com/apache/helix/issues/1372#issuecomment-695147635
Note, this is a frequent failure case.
```
// add job 3 back
JobConfig.Builder job = new JobConfig.Builder().setCommand(MockTask.TASK_COMMAND)
    .setTargetResource(WorkflowGenerator.DEFAULT_TGT_DB)
    .setTargetPartitionStates(Sets.newHashSet("SLAVE"));
// the name here MUST be unique in order to avoid conflicts with the old job cached in
// RuntimeJobDag
String newJob = deletedJob2 + "_second";
LOG.info("Enqueuing job: " + newJob);
_driver.enqueueJob(queueName, newJob, job);
currentJobNames.add(newJob);

// Ensure the remaining jobs complete successfully in the correct order
long preJobFinish = 0;
for (int i = 0; i < currentJobNames.size(); i++) {
  String namedSpaceJobName = String.format("%s_%s", queueName, currentJobNames.get(i));
  _driver.pollForJobState(queueName, namedSpaceJobName, TaskState.COMPLETED); // <-- failed here
  JobContext jobContext = _driver.getJobContext(namedSpaceJobName);
  long jobStart = jobContext.getStartTime();
  Assert.assertTrue(jobStart >= preJobFinish);
  preJobFinish = jobContext.getFinishTime();
}
```
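For reference, the failing step is a poll-with-timeout. The sketch below shows the general shape of such a wait in plain Java. It is an illustration only, not Helix's `pollForJobState` implementation; `PollSketch` and `pollUntil` are hypothetical names. Including the last observed state in the timeout message would help distinguish a job that never started from one that is stuck in progress.

```java
import java.util.function.Supplier;

/** Hypothetical helper, not part of Helix: waits until a supplier reports the target value. */
final class PollSketch {
  private PollSketch() {}

  static <T> T pollUntil(Supplier<T> current, T target, long timeoutMs, long intervalMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    T last = current.get();
    while (!target.equals(last)) {
      if (System.nanoTime() - deadline >= 0) {
        // Report what was last seen so a timeout is easier to diagnose.
        throw new IllegalStateException(
            "timed out waiting for " + target + "; last seen " + last);
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting for " + target, e);
      }
      last = current.get();
    }
    return last;
  }
}
```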
[GitHub] [helix] jiajunwang closed issue #1372: Fix TestTaskRebalancerStopResume.stopDeleteJobAndResumeNamedQueue
Posted by GitBox <gi...@apache.org>.
jiajunwang closed issue #1372:
URL: https://github.com/apache/helix/issues/1372
[GitHub] [helix] kaisun2000 commented on issue #1372: Fix TestTaskRebalancerStopResume.stopDeleteJobAndResumeNamedQueue
Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on issue #1372:
URL: https://github.com/apache/helix/issues/1372#issuecomment-694490011
similar to #1366
[GitHub] [helix] kaisun2000 commented on issue #1372: Fix TestTaskRebalancerStopResume.stopDeleteJobAndResumeNamedQueue
Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on issue #1372:
URL: https://github.com/apache/helix/issues/1372#issuecomment-694496223
LOG 979
>2020-09-17T11:29:48.0010235Z [ERROR] stopAndResumeNamedQueue(org.apache.helix.integration.task.TestTaskRebalancerStopResume) Time elapsed: 900.013 s <<< FAILURE!
2020-09-17T11:29:48.0022641Z org.testng.internal.thread.ThreadTimeoutException: Method org.testng.internal.TestNGMethod.stopAndResumeNamedQueue() didn't finish within the time-out 900000
2020-09-17T11:29:48.0024922Z
2020-09-17T11:29:48.3789290Z [ERROR] Failures:
2020-09-17T11:29:48.3797170Z [ERROR] TestTaskRebalancerStopResume.stopAndResumeNamedQueue » ThreadTimeout Method or...
2020-09-17T11:29:48.3798445Z [ERROR] Tests run: 1196, Failures: 1, Errors: 0, Skipped: 1
[GitHub] [helix] jiajunwang commented on issue #1372: Fix TestTaskRebalancerStopResume.stopDeleteJobAndResumeNamedQueue
Posted by GitBox <gi...@apache.org>.
jiajunwang commented on issue #1372:
URL: https://github.com/apache/helix/issues/1372#issuecomment-849104158
Closing unstable-test tickets, since we now have an automatic mechanism (https://github.com/apache/helix/pull/1757) for tracking the most recent test issues.