Posted to issues@flink.apache.org by "Chesnay Schepler (Jira)" <ji...@apache.org> on 2022/10/05 10:22:00 UTC

[jira] [Comment Edited] (FLINK-29501) Allow overriding JobVertex parallelisms during job submission

    [ https://issues.apache.org/jira/browse/FLINK-29501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612935#comment-17612935 ] 

Chesnay Schepler edited comment on FLINK-29501 at 10/5/22 10:21 AM:
--------------------------------------------------------------------

> The user application doesn't have access to the JobGraph during the normal execution flow

It would set the parallelism the way user applications usually do: in the main() method where they define their workflow. You would of course have to parameterize the parallelism of every single operator (and expose that _somehow_), but that may not be such a bad idea anyway? (It could force certain operations to run with a specific parallelism.)
Yes, this isn't a good approach :)
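
For illustration, a minimal sketch of what "parameterize every operator" could look like in plain Java. The class name, the `--parallelism.<operator>=<n>` argument format, and the operator names are all hypothetical; nothing here is existing Flink API. In a real job, each looked-up value would be handed to the operator's setParallelism(...) call in main().

```java
import java.util.HashMap;
import java.util.Map;

public class ParallelismParams {
    // Hypothetical argument prefix, e.g. --parallelism.source=2
    static final String PREFIX = "--parallelism.";

    /** Collects all --parallelism.<operator>=<n> arguments; other arguments are ignored. */
    static Map<String, Integer> parse(String[] args) {
        Map<String, Integer> result = new HashMap<>();
        for (String arg : args) {
            if (!arg.startsWith(PREFIX)) {
                continue;
            }
            int eq = arg.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Expected --parallelism.<name>=<n>: " + arg);
            }
            String operatorName = arg.substring(PREFIX.length(), eq);
            int parallelism = Integer.parseInt(arg.substring(eq + 1).trim());
            if (parallelism < 1) {
                throw new IllegalArgumentException("Parallelism must be positive: " + arg);
            }
            result.put(operatorName, parallelism);
        }
        return result;
    }

    /** Returns the configured parallelism for an operator, or the given default. */
    static int parallelismFor(Map<String, Integer> params, String operatorName, int defaultParallelism) {
        return params.getOrDefault(operatorName, defaultParallelism);
    }

    public static void main(String[] args) {
        Map<String, Integer> params = parse(args);
        // Operator names are made up; the job author has to pick and document them.
        for (String name : new String[] {"source", "map", "sink"}) {
            System.out.println(name + " -> " + parallelismFor(params, name, 1));
        }
    }
}
```

The obvious downside is the one noted above: every operator needs a stable, documented name, and the job author has to wire each one through by hand.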

> redeploy the Flink JobGraph

I don't really follow. Will you suspend the job, and restart it from another JM with a different configuration?
Or is this meant to be specific to the YARN per-job mode (which loads the JobGraph from a file)?


On a related note, there were some ideas about adding a REST endpoint for the adaptive scheduler that allows the parallelism to be changed at runtime. Not sure if we ever wrote that down in a JIRA ticket though.



> Allow overriding JobVertex parallelisms during job submission
> -------------------------------------------------------------
>
>                 Key: FLINK-29501
>                 URL: https://issues.apache.org/jira/browse/FLINK-29501
>             Project: Flink
>          Issue Type: New Feature
>          Components: Runtime / Configuration, Runtime / REST
>            Reporter: Maximilian Michels
>            Assignee: Maximilian Michels
>            Priority: Minor
>
> It is a common scenario that users want to change the parallelisms in the JobGraph, for example because they discover that the job needs more or fewer resources. There is the option to do this globally via the job parallelism. However, for fine-tuned jobs with potentially many branches, tuning on the job vertex level is required.
> This is to propose a way for users to apply a mapping {{jobVertexId => parallelism}} before the job is submitted, without having to modify the JobGraph manually.
> One way to achieve this would be to add an optional map field to the REST API jobs endpoint. However, in deployment modes like the application mode, this might not make sense because users do not have control over the REST endpoint.
> Similarly to how other job parameters are passed in the application mode, we propose to add the overrides as a configuration parameter.
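
A rough sketch of the proposed {{jobVertexId => parallelism}} mapping, in plain Java and without any Flink classes. The `id:parallelism,id:parallelism` string format, the class and method names, and the vertex IDs are assumptions made for illustration; the ticket does not specify how the configuration value would be encoded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ParallelismOverrides {

    /** Parses a hypothetical "vertexId:parallelism,vertexId:parallelism" value into a map. */
    static Map<String, Integer> parseOverrides(String value) {
        Map<String, Integer> overrides = new LinkedHashMap<>();
        if (value == null || value.trim().isEmpty()) {
            return overrides;
        }
        for (String entry : value.split(",")) {
            String[] kv = entry.trim().split(":");
            if (kv.length != 2) {
                throw new IllegalArgumentException("Expected <vertexId>:<parallelism>: " + entry);
            }
            overrides.put(kv[0].trim(), Integer.parseInt(kv[1].trim()));
        }
        return overrides;
    }

    /**
     * Returns a copy of the vertex parallelisms with the overrides applied.
     * Overrides for vertex IDs that are not part of the job are ignored.
     */
    static Map<String, Integer> apply(Map<String, Integer> vertexParallelisms,
                                      Map<String, Integer> overrides) {
        Map<String, Integer> result = new LinkedHashMap<>(vertexParallelisms);
        for (Map.Entry<String, Integer> e : overrides.entrySet()) {
            if (result.containsKey(e.getKey())) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the vertices of a JobGraph; the IDs are made up.
        Map<String, Integer> vertices = new LinkedHashMap<>();
        vertices.put("vertex-a", 1);
        vertices.put("vertex-b", 1);
        System.out.println(apply(vertices, parseOverrides("vertex-b:8,unknown:3")));
    }
}
```

Applying the overrides to a copy keeps the submitted graph untouched until the mapping has been validated; whether unknown vertex IDs should be ignored or rejected is a design choice the ticket leaves open.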


