
Posted to user@flink.apache.org by Esa Heikkinen <es...@student.tut.fi> on 2018/05/29 10:31:33 UTC

env.execute() ?

Hi

Is env.execute() mandatory at the end of an application? Is it possible to run the application without it?

I found some examples where it is missing.

Best, Esa

RE: env.execute() ?

Posted by Esa Heikkinen <es...@student.tut.fi>.

My final goal is to implement an application like the one in the attachment. I don’t know why this is so hard for me (maybe because I am such a beginner with Flink). It may be somewhat difficult to build an “upper-level” state machine outside of the streams, because everything in Flink is so stream-oriented. Could every execution step of the state machine use its own env.execute()? Is that a good idea, a bad idea, or impossible? I already asked this on this mailing list before, but the answer I got was that it is a “piece of cake” and that I should do my homework, with no details.

I would be very grateful for any assistance. I can even pay a little money if someone does it (using CEP?). I am a PhD student at Tampere University of Technology (Finland), and I have selected Flink as a benchmark for my (very simple) analyzer, which is very state-machine-oriented. I don’t know whether that was a good or a bad choice, but it is very hard to find a suitable analyzer for comparison.

Best, Esa


Re: env.execute() ?

Posted by Shuyi Chen <su...@gmail.com>.

I think you might be looking for the functionality provided by the
ClusterClient [1]. But I am not sure I fully understand what you mean by
"do internally in sync with application". If the ClusterClient is not what
you want, maybe you can give a concrete use case so we can help better.

[1]
https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/client/program/ClusterClient.html


-- 
"So you have to trust that the dots will somehow connect in your future."

RE: env.execute() ?

Posted by Esa Heikkinen <es...@student.tut.fi>.

Hi

OK, thanks for the clarification. But can savepoints only be controlled from the command line (or a script)? Or is it possible to do it internally, in sync with the application?

Esa


Re: env.execute() ?

Posted by Shuyi Chen <su...@gmail.com>.

Hi Esa,

I think having more than one env.execute() is an anti-pattern in Flink.

env.execute() behaves differently depending on the environment. In the local
case, it generates the Flink job graph and starts a local mini cluster in the
background to run the job graph directly.
In the remote case, it generates the Flink job graph and submits it to a
remote cluster, e.g. one running on YARN/Mesos; depending on the options, the
local process either stays attached to the job on the remote cluster or
detaches from it. So it's not a simple "unstoppable forever loop", and I
don't think "stop env.execute() and then do something and after that restart
it" will work in general.

But I think you can take a look at savepoints [1] and checkpoints [2] in
Flink. With savepoints, you can stop the running job, do something else, and
then restart from the savepoint to resume processing.

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/state/savepoints.html
[2]
https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/state/checkpoints.html

Thanks
Shuyi
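
The stop-and-resume idea behind savepoints can be pictured with a toy model. The sketch below is plain Java and does not use Flink at all: SavepointSketch, Snapshot and runFrom are hypothetical names invented for illustration, and the "job" is just a running sum over a list. It only shows the principle that a job's progress (input position plus state) can be captured, the job stopped, and a later run started from exactly that point.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: these classes are hypothetical and are not Flink APIs.
final class SavepointSketch {

    // Immutable snapshot of the job's progress: input position plus operator state.
    static final class Snapshot {
        final int offset;
        final long runningSum;

        Snapshot(int offset, long runningSum) {
            this.offset = offset;
            this.runningSum = runningSum;
        }
    }

    // Processes at most maxEvents events starting from the given snapshot,
    // then stops and returns a new snapshot of where it got to.
    static Snapshot runFrom(Snapshot start, List<Integer> events, int maxEvents) {
        int offset = start.offset;
        long sum = start.runningSum;
        // Use long arithmetic so a large maxEvents cannot overflow.
        int end = (int) Math.min((long) events.size(), (long) offset + maxEvents);
        while (offset < end) {
            sum += events.get(offset);
            offset++;
        }
        return new Snapshot(offset, sum);
    }

    public static void main(String[] args) {
        List<Integer> events = Arrays.asList(1, 2, 3, 4, 5);

        // First run: process two events, then stop with a snapshot.
        Snapshot savepoint = runFrom(new Snapshot(0, 0L), events, 2);

        // The job is stopped here; the application could do something else.

        // Second run: resume from the snapshot and process the rest.
        Snapshot done = runFrom(savepoint, events, events.size());
        System.out.println("sum after resume = " + done.runningSum);
    }
}
```

In real Flink the snapshot is taken and restored by the framework rather than by application code, so this sketch says nothing about how savepoints are triggered.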


Re: env.execute() ?

Posted by Rong Rong <wa...@gmail.com>.

Hi Esa,

According to the Flink documentation [1], what you specify before
env.execute() is the job graph:
"Once you specified the complete program you need to *trigger the program
execution* by calling execute()".

Whether execute() runs for a finite or an infinite time depends on whether
your data source is finite and on whether you interrupt the program.

Best,
Rong

[1]:
https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/api_concepts.html#anatomy-of-a-flink-program
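
The point about finite and infinite execution can be illustrated with a small plain-Java sketch. It is not Flink code: BoundedVsUnbounded, execute and unboundedSource are hypothetical names, and the "interrupt" is simulated by a stop condition that becomes true after a fixed number of polls. The same processing loop returns on its own when the source ends, and otherwise only when something outside the loop stops it.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.BooleanSupplier;

// Toy model only: hypothetical names, not Flink APIs.
final class BoundedVsUnbounded {

    // Drains the source until it ends or shouldStop reports true.
    // Returns the number of records processed.
    static long execute(Iterator<Integer> source, BooleanSupplier shouldStop) {
        long processed = 0;
        while (!shouldStop.getAsBoolean() && source.hasNext()) {
            source.next();
            processed++;
        }
        return processed;
    }

    // Endless source: hasNext() never returns false.
    static Iterator<Integer> unboundedSource() {
        return new Iterator<Integer>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return true;
            }

            @Override
            public Integer next() {
                return next++;
            }
        };
    }

    public static void main(String[] args) {
        // Finite source: the loop ends by itself when the data runs out.
        long finite = execute(Arrays.asList(1, 2, 3).iterator(), () -> false);
        System.out.println("finite source processed " + finite);

        // Infinite source: the loop ends only because of the simulated interrupt.
        int[] polls = {0};
        long interrupted = execute(unboundedSource(), () -> ++polls[0] > 1000);
        System.out.println("unbounded source processed " + interrupted);
    }
}
```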



RE: env.execute() ?

Posted by Esa Heikkinen <es...@student.tut.fi>.

Hi

Is there only one env.execute() in an application?

Is it an unstoppable forever loop?

Or can I stop env.execute() and then do something and after that restart it?

Best, Esa


Re: env.execute() ?

Posted by Fabian Hueske <fh...@gmail.com>.

Hi,

It is mandatory for all DataStream programs and most DataSet programs.

Exceptions are DataSet.print() and DataSet.collect().
Both methods are defined on the DataSet class (not on the
ExecutionEnvironment) and call execute() internally.

Best, Fabian
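
Why some examples seem to work without an explicit execute() call can be shown with a toy model. The sketch below is plain Java and not Flink's API: LazyPipeline and its methods are hypothetical names chosen for illustration. Transformations are only recorded, nothing runs until execute() is called, and a convenience method like collect() triggers the run itself, which is the behaviour described above for print() and collect().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Toy model only: LazyPipeline is a hypothetical class, not part of Flink.
final class LazyPipeline {
    private final List<Integer> source;
    private final List<Function<Integer, Integer>> steps = new ArrayList<>();
    private int executions = 0;

    LazyPipeline(List<Integer> source) {
        this.source = source;
    }

    // Records a transformation; no data is processed here.
    LazyPipeline map(Function<Integer, Integer> step) {
        steps.add(step);
        return this;
    }

    // Runs the recorded plan over the source, like an explicit execute() call.
    List<Integer> execute() {
        executions++;
        List<Integer> out = new ArrayList<>();
        for (Integer value : source) {
            Integer current = value;
            for (Function<Integer, Integer> step : steps) {
                current = step.apply(current);
            }
            out.add(current);
        }
        return out;
    }

    // Eager convenience method: it triggers the execution internally.
    List<Integer> collect() {
        return execute();
    }

    // Number of times the plan has actually been run.
    int executionCount() {
        return executions;
    }

    public static void main(String[] args) {
        LazyPipeline pipeline = new LazyPipeline(Arrays.asList(1, 2, 3)).map(x -> x * 10);
        System.out.println("runs before collect: " + pipeline.executionCount());
        System.out.println("result: " + pipeline.collect());
        System.out.println("runs after collect: " + pipeline.executionCount());
    }
}
```

A program that only builds the pipeline and never calls execute() or an eager method would exit without processing any data, which is the situation the original question is about.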
