Posted to issues@flink.apache.org by "mingleizhang (JIRA)" <ji...@apache.org> on 2017/06/13 03:08:00 UTC
[jira] [Comment Edited] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16047364#comment-16047364 ]
mingleizhang edited comment on FLINK-6682 at 6/13/17 3:07 AM:
--------------------------------------------------------------
Hi [~Zentol]. I put the check in the {{assignAttemptState}} method of {{StateAssignmentOperation}}. What do you think? I am not sure about the task info I log in the message. It would be a great help if you could check this.
{code}
private void assignAttemptState(ExecutionJobVertex executionJobVertex, List<OperatorState> operatorStates) {
    List<OperatorID> operatorIDs = executionJobVertex.getOperatorIDs();

    // 1. first compute the new parallelism
    checkParallelismPreconditions(operatorStates, executionJobVertex);

    int newParallelism = executionJobVertex.getParallelism();

    if (executionJobVertex.getMaxParallelism() < newParallelism) {
        LOG.error("Task info: {}, maxParallelism number: {}, and parallelism number: {}",
            this.tasks, executionJobVertex.getMaxParallelism(), newParallelism);
    }

    List<KeyGroupRange> keyGroupPartitions = createKeyGroupPartitions(
        executionJobVertex.getMaxParallelism(),
        newParallelism);
{code}
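Note that the snippet above only logs the problem, so the subsequent {{createKeyGroupPartitions}} call would presumably still fail with the same bare exception. A minimal sketch of failing fast with a descriptive message instead is shown here. The class {{ParallelismCheck}}, the method {{checkParallelism}}, the exception type, and the message wording are all hypothetical and not part of Flink:

```java
// Hypothetical, self-contained sketch; none of these names exist in Flink.
class ParallelismCheck {

    /**
     * Throws an exception with a descriptive message if the requested
     * parallelism is larger than the maximum parallelism of the task.
     */
    public static void checkParallelism(String taskName, int parallelism, int maxParallelism) {
        if (parallelism > maxParallelism) {
            throw new IllegalStateException(String.format(
                "The parallelism (%d) of task '%s' exceeds its maxParallelism (%d). "
                    + "Reduce the parallelism to at most %d.",
                parallelism, taskName, maxParallelism, maxParallelism));
        }
    }

    public static void main(String[] args) {
        // A valid configuration passes silently.
        checkParallelism("map", 4, 128);

        // An invalid configuration produces a readable message.
        try {
            checkParallelism("map", 200, 128);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Throwing before the key-group partitions are computed means the user sees the task name and both values instead of an {{IllegalArgumentException}} without a message.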
was (Author: mingleizhang):
Hi [~Zentol]. I put the check in the {{assignAttemptState}} method. What do you think? I am not sure about the task info I log in the message. It would be a great help if you could check this.
{code}
private void assignAttemptState(ExecutionJobVertex executionJobVertex, List<OperatorState> operatorStates) {
    List<OperatorID> operatorIDs = executionJobVertex.getOperatorIDs();

    // 1. first compute the new parallelism
    checkParallelismPreconditions(operatorStates, executionJobVertex);

    int newParallelism = executionJobVertex.getParallelism();

    if (executionJobVertex.getMaxParallelism() < newParallelism) {
        LOG.error("Task info: {}, maxParallelism number: {}, and parallelism number: {}",
            this.tasks, executionJobVertex.getMaxParallelism(), newParallelism);
    }

    List<KeyGroupRange> keyGroupPartitions = createKeyGroupPartitions(
        executionJobVertex.getMaxParallelism(),
        newParallelism);
{code}
> Improve error message in case parallelism exceeds maxParallelism
> ----------------------------------------------------------------
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
> Issue Type: Improvement
> Components: State Backends, Checkpointing
> Affects Versions: 1.3.0, 1.4.0
> Reporter: Chesnay Schepler
>
> When restoring a job with a parallelism that exceeds the maxParallelism, we do not provide a useful error message; all you get is an IllegalArgumentException without a message:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed
> at org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}
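For context on why a check like the one in the stack trace is needed: key groups are split into contiguous ranges, one per parallel subtask, so if there are more subtasks than key groups some subtasks necessarily get an empty range. The sketch below illustrates this with a simple ceiling-based split. It is a simplified illustration under that assumption, not Flink's actual {{createKeyGroupPartitions}} code, and the class {{KeyGroupSplitSketch}} and its methods are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not Flink's implementation.
class KeyGroupSplitSketch {

    /**
     * Splits key groups 0..numKeyGroups-1 into one contiguous range per subtask.
     * Each entry is {start, end} with inclusive bounds; end < start marks an empty range.
     */
    public static List<int[]> split(int numKeyGroups, int parallelism) {
        List<int[]> ranges = new ArrayList<>(parallelism);
        for (int i = 0; i < parallelism; i++) {
            // start = ceil(i * numKeyGroups / parallelism)
            int start = (i * numKeyGroups + parallelism - 1) / parallelism;
            // end = ceil((i + 1) * numKeyGroups / parallelism) - 1
            int end = ((i + 1) * numKeyGroups - 1) / parallelism;
            ranges.add(new int[] {start, end});
        }
        return ranges;
    }

    /** Counts the subtasks that would receive no key group at all. */
    public static int countEmpty(List<int[]> ranges) {
        int empty = 0;
        for (int[] range : ranges) {
            if (range[1] < range[0]) {
                empty++;
            }
        }
        return empty;
    }

    public static void main(String[] args) {
        // Parallelism within the limit: every subtask owns at least one key group.
        System.out.println("empty ranges (128 key groups, 4 subtasks): "
            + countEmpty(split(128, 4)));
        // Parallelism above the limit: some subtasks own nothing.
        System.out.println("empty ranges (2 key groups, 4 subtasks): "
            + countEmpty(split(2, 4)));
    }
}
```

With this split, a parallelism above the number of key groups always leaves exactly (parallelism - number of key groups) subtasks without state, which is why rejecting such a configuration with a clear message is preferable to a bare precondition failure.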
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)