Posted to dev@flink.apache.org by Timo Walther <fl...@twalthr.com> on 2014/07/22 14:05:28 UTC
Which is the right degree of parallelism?
Hey everyone,
I want to get the maximum performance out of my small 2-node cluster. At the
moment my execution plan has a "parallelism" of "1" at each operator.
What "-p XX" argument should I pass to the job? The number of nodes,
number of CPUs or number of slots?
Thanks and regards,
Timo
Re: Which is the right degree of parallelism?
Posted by Stephan Ewen <se...@apache.org>.
The number of slots that a machine offers is what you define in the config.
Setting it to #cores is reasonable in many cases.
Have a look at the default config under "/conf"; it has an entry
where you set the slots per machine.
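To make the arithmetic concrete, here is a minimal sketch. It assumes the value passed to "-p" should match the total number of slots in the cluster, and that every machine offers the same number of slots (set to its core count, as suggested above). The function name and the 4-cores-per-node figure are made up for illustration; neither comes from this thread.

```python
# Sketch only: derive a starting value for "-p" from the cluster size.
# Assumption: all nodes offer the same number of slots.

def suggested_parallelism(nodes: int, slots_per_node: int) -> int:
    """Return the total number of slots across the cluster."""
    if nodes < 1 or slots_per_node < 1:
        raise ValueError("nodes and slots_per_node must be positive")
    return nodes * slots_per_node


# Hypothetical cluster: 2 nodes (as in the question), 4 cores each (assumed),
# with slots per node set to the core count.
print(suggested_parallelism(nodes=2, slots_per_node=4))  # prints 8
```

With those assumed numbers the job would be started with "-p 8". Treat this as a starting point; the best value still depends on the job.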
Re: Which is the right degree of parallelism?
Posted by Ufuk Celebi <u....@fu-berlin.de>.
On 22 Jul 2014, at 14:58, Aljoscha Krettek <al...@apache.org> wrote:
> Do the slots correlate with the number of cores? I think the slots business
> might be confusing for some users.
I think it depends as well. The number of cores would be a reasonable default for the slots, though.
Re: Which is the right degree of parallelism?
Posted by Aljoscha Krettek <al...@apache.org>.
Do the slots correlate with the number of cores? I think the slots business
might be confusing for some users.
Re: Which is the right degree of parallelism?
Posted by Stephan Ewen <se...@apache.org>.
Hey!
That depends on the job, but in general, #cores is a good starting point.
Stephan