Posted to user@flink.apache.org by Markus Nentwig <ne...@informatik.uni-leipzig.de> on 2016/09/26 15:58:25 UTC

Re: Complex batch workflow needs (too) much time to create executionPlan

Hi Fabian, 

first of all, sorry for the late answer. The given execution plan was
created after 20 minutes, and it is still missing one vertex-centric
iteration. I can optimize the program, because some operators are only
needed to create intermediate debug results, but that is still not enough to
run it as one Flink job. My current "solution" is to split the program into
several parts and execute them separately, writing intermediate results to
disk, which works.

As for the stack traces, I took several for the "big" workflow, which does
not finish the optimization phase at all. Here are three of them:
shortly after start: http://pastebin.com/i7SMGbxa
after ~32 minutes: http://pastebin.com/yJrYETxi
after ~76 minutes: http://pastebin.com/fCsv8bie

I am no expert in analyzing these stack traces, but perhaps they help you in
some way! ;)



--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Complex-batch-workflow-needs-too-much-time-to-create-executionPlan-tp8596p9177.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.

Re: Complex batch workflow needs (too) much time to create executionPlan

Posted by Fabian Hueske <fh...@gmail.com>.
Hi Markus,

thanks for the stacktraces!
The client is indeed stuck in the optimizer. I have to look a bit more into
this.
Did you try to set JoinHints in your plan? That should reduce the plan
space that is enumerated and therefore reduce the optimization time (maybe
enough to run your application as a single program).
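
To make the intuition concrete, here is a toy model of why fixing join
strategies can shrink the search space. It is only a sketch and not how
Flink's optimizer actually enumerates or prunes plans: the class
PlanSpaceSketch, the method candidatePlans, and the figure of five
candidate strategies per join are all assumptions made for illustration.
The model assumes every un-hinted join independently contributes a fixed
number of strategy choices, while a hinted join contributes exactly one.

```java
public class PlanSpaceSketch {

    /**
     * Hypothetical toy model, NOT Flink's real enumeration logic:
     * counts strategy combinations for a program with `joins` joins,
     * of which `hintedJoins` have their strategy fixed by a hint.
     */
    public static long candidatePlans(int joins, int hintedJoins, int strategiesPerJoin) {
        if (hintedJoins < 0 || hintedJoins > joins) {
            throw new IllegalArgumentException("hintedJoins must be between 0 and joins");
        }
        long plans = 1;
        // Only joins without a hint multiply the number of combinations.
        for (int i = 0; i < joins - hintedJoins; i++) {
            plans = Math.multiplyExact(plans, (long) strategiesPerJoin);
        }
        return plans;
    }

    public static void main(String[] args) {
        int joins = 10;      // assumed number of joins in the program
        int strategies = 5;  // assumed candidate strategies per un-hinted join
        for (int hinted = 0; hinted <= joins; hinted += 5) {
            System.out.println(hinted + " hinted joins -> "
                    + candidatePlans(joins, hinted, strategies) + " combinations");
        }
    }
}
```

In the DataSet API, a hint is passed as the second argument of the join
call, for example a.join(b, JoinHint.REPARTITION_HASH_FIRST), where a and b
stand for any two DataSets. Which hint is appropriate depends on the sizes
of the two inputs.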

Nonetheless, I think we should look into the optimizer and check how we can
improve the plan enumeration.
I created the JIRA issue FLINK-4688 [1] to track this.

Thanks for reporting this,
Fabian

[1] https://issues.apache.org/jira/browse/FLINK-4688


