Posted to user@flink.apache.org by Jordan Ganoff <jo...@ganoff.com> on 2016/05/30 14:14:24 UTC

Executing detached data stream programs

Hi,

I plan to leverage Flink data stream programs within a larger application. I’d like to be able to execute a data stream program in detached mode directly from the StreamExecutionEnvironment, similar to how I can execute a program in blocking mode. I was expecting to find StreamExecutionEnvironment.executeDetached(). What’s the best practice for executing a detached data stream program and monitoring its progress?

From what I can tell, one approach could be to refactor the client configuration and interaction logic from CliFrontend into a more embed-friendly API.

Thoughts?

Thanks,
Jordan Ganoff

Re: Executing detached data stream programs

Posted by Robert Metzger <rm...@apache.org>.
Hi Jordan,

The community is definitely open to discussing this further (in particular
if users start asking for the feature).
Here is the related JIRA issue:
https://issues.apache.org/jira/browse/FLINK-2313

On Tue, May 31, 2016 at 5:19 PM, jganoff <jo...@corvana.com> wrote:

> Hi Robert,
>
> Thanks for the suggestion. Threading out a blocking
> RemoteStreamEnvironment.execute() call and polling the monitoring REST API
> will work for now. Once the job transitions to running I will kill the
> thread and monitor the job through the REST API.
>
> As for metrics, accumulators, and other job state I think I can get away
> with monitoring jobs through a combination of my own metrics system and the
> data exposed through the Flink Monitoring REST APIs [0].
>
> I didn't find any JIRA issues related to my original request for being able
> to easily submit detached jobs through an ExecutionEnvironment. Does that
> sound like something the Flink team would be open to discussing?
>
> Thanks!
>
> [0]
>
> https://ci.apache.org/projects/flink/flink-docs-master/internals/monitoring_rest_api.html
>

Re: Executing detached data stream programs

Posted by jganoff <jo...@corvana.com>.
Hi Robert,

Thanks for the suggestion. Threading out a blocking
RemoteStreamEnvironment.execute() call and polling the monitoring REST API
will work for now. Once the job transitions to running I will kill the
thread and monitor the job through the REST API.
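
A minimal sketch of the polling step described above, in Java. The fetchJson
supplier is a hypothetical stand-in for an HTTP GET against the job's status
resource in the monitoring REST API. The response shape (JSON with a top-level
"state" field) and the regex used to read it are assumptions made for
illustration; a real client should use a proper JSON parser.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JobStatePoller {

    // Assumption: the status response contains a field such as "state":"RUNNING".
    private static final Pattern STATE =
            Pattern.compile("\"state\"\\s*:\\s*\"([A-Z_]+)\"");

    /** Returns the first "state" value found in the body, or null if there is none. */
    public static String parseState(String json) {
        Matcher m = STATE.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    /**
     * Polls fetchJson until the reported state is RUNNING or the timeout elapses.
     * fetchJson is hypothetical: in practice it would issue the REST request.
     */
    public static boolean awaitRunning(Supplier<String> fetchJson,
                                       long pollMillis,
                                       long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        // Subtraction keeps the comparison correct even if nanoTime wraps around.
        while (System.nanoTime() - deadline < 0) {
            if ("RUNNING".equals(parseState(fetchJson.get()))) {
                return true;
            }
            Thread.sleep(pollMillis);
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        // Canned responses standing in for successive REST replies.
        Iterator<String> replies = Arrays.asList(
                "{\"jid\":\"abc\",\"state\":\"CREATED\"}",
                "{\"jid\":\"abc\",\"state\":\"RUNNING\"}").iterator();
        boolean running = awaitRunning(replies::next, 10, 1000);
        System.out.println("job running: " + running);
    }
}
```

Keeping the HTTP call behind a Supplier makes the polling loop easy to test
without a cluster, and lets the caller choose the HTTP client.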

As for metrics, accumulators, and other job state I think I can get away
with monitoring jobs through a combination of my own metrics system and the
data exposed through the Flink Monitoring REST APIs [0].

I didn't find any JIRA issues related to my original request for being able
to easily submit detached jobs through an ExecutionEnvironment. Does that
sound like something the Flink team would be open to discussing?

Thanks!

[0]
https://ci.apache.org/projects/flink/flink-docs-master/internals/monitoring_rest_api.html




Re: Executing detached data stream programs

Posted by Robert Metzger <rm...@apache.org>.
Hi Jordan,

The ./bin/flink client's run command has a -d / --detached flag for
detached execution.
However, this doesn't let you control the running job programmatically.
What you probably have to do is use the RemoteEnvironment to submit the
job in a blocking way from a separate thread.
Then another thread can monitor the application's progress, either through
the JobManager's REST APIs or through some custom monitoring.
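
As a rough sketch of the separate-thread approach, in Java: the blocking
submission is wrapped in a Callable and handed to a background executor, so
the caller gets a Future back immediately. The class and method names
(DetachedSubmit, submitDetached) are hypothetical, and the Callable in main is
only a stand-in for a blocking execute() call on the environment. No Flink API
is used here.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class DetachedSubmit {

    // Daemon threads, so a submission that is still blocked never keeps the JVM alive.
    private static final ExecutorService SUBMITTER =
            Executors.newCachedThreadPool(r -> {
                Thread t = new Thread(r, "job-submitter");
                t.setDaemon(true);
                return t;
            });

    /**
     * Runs a blocking submission on a background thread and returns at once.
     * Assumption: in a real application the Callable would wrap the blocking
     * execute() call of the environment; here it is left abstract.
     */
    public static <T> Future<T> submitDetached(Callable<T> blockingSubmit) {
        return SUBMITTER.submit(blockingSubmit);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a blocking execute(): sleeps, then returns a placeholder result.
        Future<String> job = submitDetached(() -> {
            Thread.sleep(500);
            return "finished";
        });
        System.out.println("submitted; caller is free to start monitoring");
        System.out.println("result: " + job.get());
    }
}
```

The returned Future also gives the caller a handle for the background thread:
cancel(true) interrupts it, although whether a blocked submission reacts to
the interrupt depends on the client and is not assumed here.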

I know that this is not very handy right now, and there are more arguments
for having a nice API for other features like metrics, accumulators, state
queries or data samples. But I think such features will be added to Flink
eventually.

