Posted to user@flink.apache.org by mars <sk...@yahoo.com> on 2020/07/29 14:51:58 UTC

Flink jobs getting finished because of "Could not allocate the required slot within slot request timeout"

Hi All,

 I have an EMR Cluster with one Master Node and 3 Worker Nodes (it has auto
scaling enabled and the max number of worker nodes can go up to 8).

I have 3 Spark Jobs that are running currently on the Cluster.

I submitted 3 Flink Jobs and all of them finished with an error saying the
slots are not available ("Could not allocate the required slot within slot
request timeout").

In flink-conf.yaml I have

jobmanager.heap.mb: 4096
taskmanager.heap.mb: 4096

The Master Node has 16 vCores and 64 GB memory, and each Worker Node has 4
vCores and 16 GB memory.

When I submit the Flink job I pass the arg (-p 2), which should set the
parallelism to 2.
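For reference, the submission looks roughly like the following (the jar name
is just a placeholder, not the exact command):

    flink run -m yarn-cluster -p 2 myjob.jar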

And YARN UI is showing the following stats

Containers Running : 7
Memory Used        : 21.63 GB
Memory Total       : 36 GB
vCores Used        : 7
vCores Total       : 12
Active Nodes       : 3

I cannot figure out why the slots cannot be allocated to the Flink Jobs. First
of all, even with 3 Active Nodes there are still 5 vCores available, and
moreover Auto Scaling is enabled for this Cluster, so EMR should allocate up
to 8 Nodes, i.e. 5 more new nodes should be allocated if required.
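Per-node used/available resources can also be checked from the master node
with something like the following (the exact output format depends on the
Hadoop version; <node-id> is a placeholder):

    yarn node -list -all
    yarn node -status <node-id>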

Appreciate any insights.

Also, I cannot find the TaskManager logs on any of the nodes.
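If log aggregation is enabled in YARN, the TaskManager logs can usually be
fetched once the application has finished with something like this
(<application-id> is a placeholder):

    yarn logs -applicationId <application-id>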

Thanks
Sateesh





--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Flink jobs getting finished because of "Could not allocate the required slot within slot request timeout"

Posted by Zhu Zhu <re...@gmail.com>.
Hi Sateesh,

Would you check the Flink jobmanager log to see whether it has sent container
requests to the YARN RM?
If the requests were sent but not fulfilled, you will need to check the YARN
RM logs or the YARN cluster resources at that time to see whether those
container requests were fulfillable.
The resources for a requested container can be found in the Flink JM log.
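For example, something along these lines should show whether container
requests were issued and whether they stayed pending (the exact wording of
these log messages differs between Flink versions, so treat the patterns
below as rough guesses):

    grep -i "Requesting new TaskExecutor container" jobmanager.log
    grep -i "pending container requests" jobmanager.log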

Thanks,
Zhu Zhu

mars <sk...@yahoo.com> wrote on Wed, Jul 29, 2020 at 10:52 PM:

> [quoted original message trimmed]