Posted to dev@flink.apache.org by Zheng Yu Chen <ja...@gmail.com> on 2022/09/01 04:08:03 UTC

Re: [DISCUSS] FLIP-256 Support Job Dynamic Parameter With Flink Rest Api

Hi @Hong

As @Chesnay said, we just follow the same order that the CLI currently
enforces.

@Gyula and I agree with this suggestion.

It also makes sense to mention that this FLIP is not about the
prioritization of config options, so that feature can be handled after this
FLIP in a later discussion, not now.

Once the addition is complete, and if there are no new comments, we will
move on to the voting stage.

What do you think?

Teoh, Hong <li...@amazon.co.uk.invalid> wrote on Sat, Aug 27, 2022 at 16:06:

> Hi Zheng Yu,
>
> Sorry for the late reply, I was on holiday last week.
>
> Could we propose the following config instead?
>
> rest.submit.job.override-config = true/false
>   - true: REST API will have priority over user code (i.e. Rest API /
> Flink CLI > User Code > Cluster Config)
>   - false (default): user code will have priority over REST API (i.e. User
> Code > Rest API / Flink CLI > Cluster Config)
>
> This way, the default behaviour will be according to the proposed FLIP,
> but we will have an additional toggle to ignore the configuration set in
> user code.
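A minimal sketch of the two orderings described above, in Python purely for illustration. The function name resolve_config and the sample values are hypothetical; only the option name rest.submit.job.override-config and the two orderings come from the proposal.

```python
from collections import ChainMap


def resolve_config(cluster, user_code, rest_api, override_config=False):
    """Merge three config layers; the first layer that defines a key wins.

    override_config=False (default): User Code > REST API / CLI > Cluster Config
    override_config=True:            REST API / CLI > User Code > Cluster Config
    """
    if override_config:
        layers = (rest_api, user_code, cluster)
    else:
        layers = (user_code, rest_api, cluster)
    # ChainMap looks keys up in the layers from left to right.
    return dict(ChainMap(*layers))


# Hypothetical sample values for the three layers.
cluster = {"execution.checkpointing.interval": "10min", "parallelism.default": "1"}
user_code = {"execution.checkpointing.interval": "30s"}
rest_api = {"execution.checkpointing.interval": "1min", "parallelism.default": "4"}

default_view = resolve_config(cluster, user_code, rest_api)
inverted_view = resolve_config(cluster, user_code, rest_api, override_config=True)
```

With the toggle off, the interval hardcoded in user code is kept; with it on, the value sent on submission replaces it. Keys that user code does not set fall through to the REST API layer in both cases.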
>
>     >     However, there will be a problem in doing this. If the user writes
>     >     this Flink Config in jar by hardcoding, the highest priority must be
>     >     the code at this time, so the problem may still exist, and the user
>     >     cannot be prevented from this behavior
>
> Hmm, would it be better if we set it such that the
> "rest.submit.job.override-config" cannot be overwritten in user code
> (ignored), just like configuration  "jobmanager.rpc.port"?
>
> What do you think?
>
> Regards,
> Hong
>
>
>
> On 26/08/2022, 12:05, "Zheng Yu Chen" <ja...@gmail.com> wrote:
>
>     Hi Hong,
>
>     Maybe you forgot this thread. I don't know whether the current FLIP
>     still needs to be modified. If not, we will go ahead with the current
>     plan. If you have related ideas later, please continue the discussion;
>     otherwise we will open voting at a later date.
>
>     Teoh, Hong <li...@amazon.co.uk.invalid> wrote on Mon, Aug 22, 2022 at 16:02:
>     >
>     > Hi Zheng Yu,
>     >
>     > We would have to take the same "cluster configuration" (cannot be
>     > set on job submission) vs "job configuration" approach in the User
>     > code as well. And we can classify
>     > jobmanager.rest.api.submit.job.allow-reset-config as a cluster
>     > configuration. That way, neither the REST API nor User code can
>     > override this configuration, and it can only be set in cluster
>     > configuration on startup.
>     >
>     > There are other cluster specific configurations that are already
>     > treated this way (e.g. jobmanager.rpc.address,
>     > jobmanager.memory.process.size). As part of this work, I wonder if we
>     > can update the Flink docs on configuration
>     > (https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/)
>     > to specify which configuration is "cluster" and which is "job".
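To illustrate the classification, here is a rough Python sketch. The set CLUSTER_ONLY_KEYS and the helper split_overrides are hypothetical names; the keys listed are the ones mentioned in this thread, and how Flink would actually enforce the split is not decided here.

```python
# Hypothetical list of options treated as "cluster" configuration.
CLUSTER_ONLY_KEYS = frozenset({
    "jobmanager.rpc.address",
    "jobmanager.memory.process.size",
    "jobmanager.rest.api.submit.job.allow-reset-config",
})


def split_overrides(overrides):
    """Split submitted overrides into accepted job-level entries and
    rejected cluster-level keys."""
    accepted = {k: v for k, v in overrides.items() if k not in CLUSTER_ONLY_KEYS}
    rejected = sorted(k for k in overrides if k in CLUSTER_ONLY_KEYS)
    return accepted, rejected


# Hypothetical overrides arriving with a job submission.
accepted, rejected = split_overrides({
    "execution.checkpointing.interval": "30s",
    "jobmanager.rpc.address": "some-other-host",
    "jobmanager.rest.api.submit.job.allow-reset-config": "true",
})
```

Only the checkpointing interval survives as a job-level override; the two cluster-level keys are reported back rather than applied.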
>     >
>     > Regards,
>     > Hong
>     >
>     >
>     >
>     > On 20/08/2022, 09:16, "Zheng Yu Chen" <ja...@gmail.com> wrote:
>     >
>     >
>     >     @Gyula is right that we should add a config for this but not
>     >     expose it on the REST API.
>     >     When the user starts up a session cluster, we can set a config
>     >     (e.g. jobmanager.rest.api.submit.job.allow-reset-config = true/false)
>     >     to control this behavior.
>     >
>     >     Suppose we have an existing cluster, and the admin configures the
>     >     checkpointing interval for this session cluster:
>     >
>     >     jobmanager.rest.api.submit.job.allow-reset-config = true/false
>     >
>     >     - true: allow the user to reset the config to a new value
>     >     - false (default): do not allow the user to set a new config; throw
>     >       an exception telling the user which properties were set by the
>     >       admin
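A small Python sketch of the true/false behavior listed above, under the assumption that "set by the admin" means "present in the cluster configuration". The exception class and function names are hypothetical.

```python
class ConfigResetNotAllowed(Exception):
    """Raised when a submission tries to change an admin-set property."""


def check_overrides(admin_config, requested, allow_reset=False):
    """Return the effective config, or raise if resets are not allowed."""
    conflicts = sorted(
        key for key, value in requested.items()
        if key in admin_config and admin_config[key] != value
    )
    if conflicts and not allow_reset:
        # Tell the user exactly which properties the admin has fixed.
        raise ConfigResetNotAllowed(
            "These properties are set by the cluster admin and cannot be "
            "changed: " + ", ".join(conflicts)
        )
    return {**admin_config, **requested}


# Hypothetical admin setting for the session cluster.
admin = {"execution.checkpointing.interval": "10min"}
allowed = check_overrides(
    admin, {"execution.checkpointing.interval": "30s"}, allow_reset=True
)
```

With allow_reset left at its default, the same call raises ConfigResetNotAllowed and names execution.checkpointing.interval in the message.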
>     >
>     >     However, there will be a problem in doing this. If the user writes
>     >     this Flink Config in jar by hardcoding, the highest priority must be
>     >     the code at this time, so the problem may still exist, and the user
>     >     cannot be prevented from this behavior.
>     >
>     >     What do you think?
>     >
>     >     @Hong @Gyula @Teoh @Danny
>     >
>     >
>     >     Gyula Fóra <gy...@gmail.com> wrote on Thu, Aug 18, 2022 at 18:37:
>     >     >
>     >     > +1 for the proposal.
>     >     >
>     >     > @Hong : I feel that the inverted override flag should be a cluster
>     >     > setting and not something the user can override at will. I fear
>     >     > that this might defeat the purpose of the feature itself.
>     >     > So I think we should add a config for this but not expose it on
>     >     > the rest api
>     >     >
>     >     > Gyula
>     >     >
>     >     > On Thu, Aug 18, 2022 at 12:33 PM Teoh, Hong <li...@amazon.co.uk.invalid> wrote:
>     >     >
>     >     > > +1 to this FLIP.
>     >     > > This is very useful for teams building a Flink platform to run
>     >     > > jobs from an external user
>     >     > >
>     >     > > +1 on Danny's comment on adding a configuration to allow inverted
>     >     > > order of overrides.
>     >     > > However, might it be better to include an "override" toggle in
>     >     > > the REST API itself? That way we can change the flink
>     >     > > configuration override behaviour without restarting the Flink
>     >     > > cluster. This would make sense if we are thinking of a Session
>     >     > > cluster and deploying multiple Flink jobs to the same cluster.
>     >     > >
>     >     > > Regards,
>     >     > > Hong
>     >     > >
>     >     > >
>     >     > > On 18/08/2022, 10:46, "Danny Cranmer" <dannycranmer@apache.org> wrote:
>     >     > >
>     >     > >     +1 thanks for driving this FLIP.
>     >     > >
>     >     > >     We actually have an internally forked equivalent of this,
>     >     > >     which is on our list to try to upstream. Your proposal would
>     >     > >     *almost* work "off the shelf" for us. We have an inverted
>     >     > >     order of priority:
>     >     > >     - Rest API / Flink CLI > User Code > Cluster Config
>     >     > >
>     >     > >     This is because our end users do not submit their jobs
>     >     > >     directly, our service does it on their behalf. For our use
>     >     > >     case we do not want to allow users to override certain values
>     >     > >     we set, since some are managed from our service
>     >     > >     configuration, for example the checkpointing interval.
>     >     > >
>     >     > >     How would you feel about including a cluster configuration to
>     >     > >     invert the order? This could be decoupled to a follow-up FLIP.
>     >     > >
>     >     > >     Thanks,
>     >     > >     Danny Cranmer
>     >     > >
>     >     > >
>     >     > >     On Tue, Aug 16, 2022 at 7:21 AM Zheng Yu Chen <jam.gzczy@gmail.com> wrote:
>     >     > >
>     >     > >     > This is a good suggestion, we can start this work after we
>     >     > >     > finish the discussion and vote
>     >     > >     >
>     >     > >     > --
>     >     > >     > Best
>     >     > >     >
>     >     > >     > ConradJam
>     >     > >     >
>     >     > >     >
>     >     > >     > <zh...@outlook.com> wrote on Mon, Aug 15, 2022 at 21:01:
>     >     > >     >
>     >     > >     > > I've created a new ticket, [FLINK-28973] Extending
>     >     > >     > > /jars/:jarid/plan API to support setting Flink configs
>     >     > >     > > <https://issues.apache.org/jira/browse/FLINK-28973>, for
>     >     > >     > > the job plan API.
>     >     > >     > > Maybe we can create a new umbrella ticket for FLIP-256 and
>     >     > >     > > put FLINK-27060 & FLINK-28973 as subtasks. WDYT?
>     >     > >     > >
>     >     > >     > > Best,
>     >     > >     > > Zhanghao Chen
>     >     > >     > > ________________________________
>     >     > >     > > From: zhengyu chen <ja...@gmail.com>
>     >     > >     > > Sent: Monday, August 15, 2022 18:06
>     >     > >     > > To: dev@flink.apache.org <de...@flink.apache.org>
>     >     > >     > > Subject: [DISCUSS] FLIP-256 Support Job Dynamic Parameter With Flink Rest Api
>     >     > >     > >
>     >     > >     > > Hi all,
>     >     > >     > >
>     >     > >     > >
>     >     > >     > > We would like to start a discussion thread on FLIP-256
>     >     > >     > > Support Job Dynamic Parameter With Flink Rest Api [1]
>     >     > >     > >
>     >     > >     > > After the user submits the jar package, running a job
>     >     > >     > > through the REST API (/jars/:jarid/run) [2] can only pass
>     >     > >     > > in the (allowNonRestoredState, savepointPath, programArg,
>     >     > >     > > entry-class, parallelism) parameters, which is clearly
>     >     > >     > > not enough given the growing variety of job parameters
>     >     > >     > > (e.g. checkpoint address).
>     >     > >     > >
>     >     > >     > > This FLIP lets the user pass in other parameters when
>     >     > >     > > submitting a job, so the user no longer has to define
>     >     > >     > > these job parameters in the code and repackage the job
>     >     > >     > > for each modification.
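As a sketch of what such a submission could look like, the Python snippet below builds a JSON body for /jars/:jarid/run. The flinkConfiguration field name is an assumption made for illustration, as are the helper name, the entry class and the bucket path; the other fields mirror the parameters listed above and no request is actually sent.

```python
import json


def build_run_request(entry_class, parallelism, flink_configuration):
    """Build a JSON request body carrying extra Flink config entries."""
    body = {
        "entryClass": entry_class,
        "parallelism": parallelism,
        # Assumed field: a flat map of Flink config keys to string values.
        "flinkConfiguration": dict(flink_configuration),
    }
    return json.dumps(body, sort_keys=True)


# Hypothetical job: the checkpoint address is passed at submission time
# instead of being hardcoded in the jar.
payload = build_run_request(
    "com.example.MyJob",
    2,
    {"state.checkpoints.dir": "s3://example-bucket/checkpoints"},
)
```

Changing the checkpoint address then only means changing the submitted map, not rebuilding the jar.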
>     >     > >     > >
>     >     > >     > > There was some interest from users [3] from a meetup and
>     >     > >     > > the mailing list.
>     >     > >     > > Looking forward to comments and feedback, thanks!
>     >     > >     > >
>     >     > >     > >
>     >     > >     > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-256+Support+Job+Dynamic+Parameter+With+Flink+Rest+Api
>     >     > >     > >
>     >     > >     > > [2] https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jars-jarid-run
>     >     > >     > >
>     >     > >     > > [3] https://issues.apache.org/jira/browse/FLINK-27060
>     >     > >     > >
>     >     > >     > >
>     >     > >     > >
>     >     > >     > >
>     >     > >     > > --
>     >     > >     > > Best
>     >     > >     > >
>     >     > >     > > ConradJam
>     >     > >     > >
>     >     > >     >
>     >     > >
>     >     > >
>     >
>
>     --
>     Best
>
>     ConradJam
>
>

-- 
Best

ConradJam

Re: [DISCUSS] FLIP-256 Support Job Dynamic Parameter With Flink Rest Api

Posted by "Teoh, Hong" <li...@amazon.co.uk.INVALID>.
That sounds good to me. Thanks for working on this, Zheng Yu!

Hong