You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Yassine MARZOUGUI <y....@mindlytix.com> on 2016/11/23 13:03:04 UTC

Sink not switched to "RUNUNG" even though a task slot is available

Hi all,

My batch job has the follwoing plan in the end (figure attached):


​
I have a total of 32 task slots, and I have set the parallelism of the last
two operators before the sink to 31. The sink parallelism is 1. The last
operator before the sink is a MapOperator, so it doesn't need to buffer all
elements before emitting the output.

Looking at the dashboard, I see that one task slot is available, but the
sink subtask stays in the state "CREATED".

Any explanation for this behaviour? Thank you.

Best,
Yassine

Re: Sink not switched to "RUNUNG" even though a task slot is available

Posted by Yassine MARZOUGUI <y....@mindlytix.com>.
Hi Robert,

I see, so the join needs to consume all data first and process it.
In my case, I couldn't wait long because the first join quickly generated a
lot of data that can't fit in the memory or in the disk. The solution was
then to manually specify a JoinHint and broadcast the small dataset, that
way I was able to get an output of the join as soon as data is received.

Thanks a lot for the clarification!

Best,
Yassine

2016-11-23 17:35 GMT+01:00 Robert Metzger <rm...@apache.org>:

> Hi Yassine,
>
> you don't necessarily need to set the parallelism of the last two
> operators of 31, the sink with parallelism 1 will fit still into the slots.
> A task slot can, by default, hold an entire "slice" or parallel instance
> of a job.
>
> The reason why the sink stays in state CREATE in the beginning is because
> it didn't receive any data yet. In batch mode, an operator is switching to
> RUNNING once it received the first record. In your case, there are some
> operations (reduce, join) before the sink that first need to consume all
> data and process it.
> If you wait long enough, you should see the sink to become active.
>
> Regards,
> Robert
>
>
>
> On Wed, Nov 23, 2016 at 2:03 PM, Yassine MARZOUGUI <
> y.marzougui@mindlytix.com> wrote:
>
>> Hi all,
>>
>> My batch job has the follwoing plan in the end (figure attached):
>>
>>
>> ​
>> I have a total of 32 task slots, and I have set the parallelism of the
>> last two operators before the sink to 31. The sink parallelism is 1. The
>> last operator before the sink is a MapOperator, so it doesn't need to
>> buffer all elements before emitting the output.
>>
>> Looking at the dashboard, I see that one task slot is available, but the
>> sink subtask stays in the state "CREATED".
>>
>> Any explanation for this behaviour? Thank you.
>>
>> Best,
>> Yassine
>>
>
>

Re: Sink not switched to "RUNUNG" even though a task slot is available

Posted by Robert Metzger <rm...@apache.org>.
Hi Yassine,

you don't necessarily need to set the parallelism of the last two operators
of 31, the sink with parallelism 1 will fit still into the slots.
A task slot can, by default, hold an entire "slice" or parallel instance of
a job.

The reason why the sink stays in state CREATE in the beginning is because
it didn't receive any data yet. In batch mode, an operator is switching to
RUNNING once it received the first record. In your case, there are some
operations (reduce, join) before the sink that first need to consume all
data and process it.
If you wait long enough, you should see the sink to become active.

Regards,
Robert



On Wed, Nov 23, 2016 at 2:03 PM, Yassine MARZOUGUI <
y.marzougui@mindlytix.com> wrote:

> Hi all,
>
> My batch job has the follwoing plan in the end (figure attached):
>
>
> ​
> I have a total of 32 task slots, and I have set the parallelism of the
> last two operators before the sink to 31. The sink parallelism is 1. The
> last operator before the sink is a MapOperator, so it doesn't need to
> buffer all elements before emitting the output.
>
> Looking at the dashboard, I see that one task slot is available, but the
> sink subtask stays in the state "CREATED".
>
> Any explanation for this behaviour? Thank you.
>
> Best,
> Yassine
>